[GH-ISSUE #120] How to iterate for best performance #100

Closed
opened 2026-05-05 11:40:46 -06:00 by gitea-mirror · 4 comments
Owner

Originally created by @jeroen on GitHub (Aug 21, 2017).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/120

Originally assigned to: @jmcnamara on GitHub.

The examples iterate over rows first, and within each row over the fields. Is it also possible to fill the sheet column-wise, or is this much slower? Column wise is often easier and faster to implement if the original data have typed columns (e.g. sql tables).

Additional question: if I know in advance the that the data is n by m is there an option to preallocate the lxw_worksheet of the right size, rather than dynamically growing it?

Originally created by @jeroen on GitHub (Aug 21, 2017). Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/120 Originally assigned to: @jmcnamara on GitHub. The examples iterate over rows first, and within each row over the fields. Is it also possible to fill the sheet column-wise, or is this much slower? Column wise is often easier and faster to implement if the original data have typed columns (e.g. sql tables). Additional question: if I know in advance the that the data is `n` by `m` is there an option to preallocate the `lxw_worksheet` of the right size, rather than dynamically growing it?
gitea-mirror 2026-05-05 11:40:46 -06:00
Author
Owner

@jmcnamara commented on GitHub (Aug 21, 2017):

Hi Jeroen,

The examples iterate over rows first, and within each row over the fields. Is it also possible to fill the sheet column-wise, or is this much slower?

The worksheet cell data is stored in Red-Black trees by row so iterating by row is faster. And when using constant_memory mode you have to write the data in row x column order.

Additional question: if I know in advance the that the data is n by m is there an option to preallocate the lxw_worksheet of the right size, rather than dynamically growing it?

Unfortunately not. As usual the choice of internal data structure is a trade off. For ease of use libxlsxwriter, and for sparse data, the data is stored in Red-Black trees which can't be preallocated ahead of time.

Regards,

John

<!-- gh-comment-id:323769641 --> @jmcnamara commented on GitHub (Aug 21, 2017): Hi Jeroen, > The examples iterate over rows first, and within each row over the fields. Is it also possible to fill the sheet column-wise, or is this much slower? The worksheet cell data is stored in Red-Black trees by row so iterating by row is faster. And when using [`constant_memory`](https://libxlsxwriter.github.io/workbook_8h.html#a8ca9bd8c30c618b81ca6180f78b03323) mode you have to write the data in row x column order. > Additional question: if I know in advance the that the data is n by m is there an option to preallocate the lxw_worksheet of the right size, rather than dynamically growing it? Unfortunately not. As usual the choice of internal data structure is a trade off. For ease of use libxlsxwriter, and for sparse data, the data is stored in Red-Black trees which can't be preallocated ahead of time. Regards, John
Author
Owner

@jeroen commented on GitHub (Aug 21, 2017):

OK that makes sense thank you. One final performance question:

My input timestamps are stored as (standard unix) double with seconds since 1970-01-01 00:00:00. From your documentation I read that excel also uses doubles but a different scale:

The integer part of the number stores the number of days since the epoch, which is generally 1900, and the fractional part stores the percentage of the day.

Is there an obvious way to insert unix doubles as dates without first converting everything to lxw_datetime structs?

<!-- gh-comment-id:323775258 --> @jeroen commented on GitHub (Aug 21, 2017): OK that makes sense thank you. One final performance question: My input timestamps are stored as (standard unix) double with seconds since 1970-01-01 00:00:00. From your [ documentation](https://libxlsxwriter.github.io/working_with_dates.html) I read that excel also uses doubles but a different scale: > The integer part of the number stores the number of days since the epoch, which is generally 1900, and the fractional part stores the percentage of the day. Is there an obvious way to insert unix doubles as dates without first converting everything to `lxw_datetime` structs?
Author
Owner

@jmcnamara commented on GitHub (Aug 21, 2017):

Is there an obvious way to insert unix doubles as dates without first converting everything to lxw_datetime structs?

Excel dates are stored as an integer and fraction part where the integer part is the number of days since the Excel epoch (~1/1/1900) and the fractional part is time as the percentage of the day in seconds.

So to convert from Unix time (seconds since Unix epoch) you could do the following:

# Excel date for 1/1/1970 = 25569.0
# Number of seconds in day =  (24*60*60)

# Unixtime to Excel time (convert/cast everything to doubles)
= 25569 + unix_time / (24*60*60)

Then write the result using worksheet_write_number() with an appropriate date format.

<!-- gh-comment-id:323782708 --> @jmcnamara commented on GitHub (Aug 21, 2017): > Is there an obvious way to insert unix doubles as dates without first converting everything to lxw_datetime structs? Excel dates are stored as an integer and fraction part where the integer part is the number of days since the Excel epoch (~1/1/1900) and the fractional part is time as the percentage of the day in seconds. So to convert from Unix time (seconds since Unix epoch) you could do the following: # Excel date for 1/1/1970 = 25569.0 # Number of seconds in day = (24*60*60) # Unixtime to Excel time (convert/cast everything to doubles) = 25569 + unix_time / (24*60*60) Then write the result using `worksheet_write_number()` with an appropriate date format.
Author
Owner

@jeroen commented on GitHub (Aug 22, 2017):

Thanks!

<!-- gh-comment-id:324124865 --> @jeroen commented on GitHub (Aug 22, 2017): Thanks!
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/libxlsxwriter#100
No description provided.