[GH-ISSUE #165] File size is very large comparing to OpenXML SDK generated #137

Closed
opened 2026-05-05 11:45:42 -06:00 by gitea-mirror · 3 comments

Originally created by @igorko on GitHub (Apr 27, 2018).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/165

When generating an xlsx file with the OpenXML SDK, the file is much smaller than the libxlsxwriter equivalent, even when compression level 9 is used with libxlsxwriter. The size of the sheet.xml document differs considerably.

  • 100 000 rows × 230 columns: 7.5MB vs 2.4MB
  • 400 000 rows × 230 columns: 327MB vs 107MB. In addition, a file this large generated with libxlsxwriter is corrupted.

@igorko commented on GitHub (Apr 27, 2018):

The problem is that the OpenXML SDK doesn't write the row number and cell name attributes (at least for inlineStr cells), so an xlsx file generated by the OpenXML SDK compresses better (by about 3 times). Are the row and cell name attributes required?


@jmcnamara commented on GitHub (Apr 27, 2018):

> When using OpenXML SDK for generating xlsx, size is much smaller even when using 9 compression level in libxlsxwriter.

It is a design goal of libxlsxwriter that it replicates the OpenXML variant used by Excel to ensure compatibility and to allow testing against files created in Excel.

In general libxlsxwriter files are smaller than equivalent Excel files with the default zlib compression level.


@jmcnamara commented on GitHub (Apr 27, 2018):

> Also such large file generated with libxlsxwriter is corrupted

I've had no reports of that. And when I run the following test program it works fine (although it is slow to load in Excel):

#include "xlsxwriter.h"

int main() {

    lxw_workbook  *workbook  = workbook_new("test.xlsx");
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
    int row;
    int col;

    for (row = 0; row < 400000; row++)
        for (col = 0; col < 230; col++)
            worksheet_write_string(worksheet, row, col, "Hello", NULL);

    workbook_close(workbook);

    return 0;
}

Here is the output file loaded in Excel.
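[Image: aa_image (https://user-images.githubusercontent.com/94267/39360807-8e434d9a-4a17-11e8-9420-c7ad44781df9.png)]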

If you have an example program that produces a corrupt file that Excel can't read, then open a new bug report.

BTW, the file produced by the above program is ~187MB. The same file saved by Excel is ~243MB.

Closing.
