[GH-ISSUE #306] constant_memory not working as expected #245

Closed
opened 2026-05-05 12:01:39 -06:00 by gitea-mirror · 12 comments
Owner

Originally created by @oliviera9 on GitHub (Sep 9, 2020).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/306

Originally assigned to: @jmcnamara on GitHub.

Hello,
I am successfully using libxlsxwriter for creating Excel files in an embedded system.
Because of this I am using the constant_memory option but I have noticed the memory usage is not like what I would expect.
From my understanding, the memory usage should be limited to a row size with this options.
Anyway, monitoring the free memory available in the system, it appears that the memory continues to increase as long as I write rows (I can see this with examples/constant_memory.c).
I digged into the code, and I think this is because the tmpfile, created with constant_memory on, is never rewound but when closing the workbook. I would expect the rewind to occur every time a now row is created.
Is this the intended behavior?
Thanks,
Alain.

Originally created by @oliviera9 on GitHub (Sep 9, 2020). Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/306 Originally assigned to: @jmcnamara on GitHub. Hello, I am successfully using libxlsxwriter for creating Excel files in an embedded system. Because of this I am using the constant_memory option but I have noticed the memory usage is not like what I would expect. From my understanding, the memory usage should be limited to a row size with this options. Anyway, monitoring the free memory available in the system, it appears that the memory continues to increase as long as I write rows (I can see this with examples/constant_memory.c). I digged into the code, and I think this is because the tmpfile, created with constant_memory on, is never rewound but when closing the workbook. I would expect the rewind to occur every time a now row is created. Is this the intended behavior? Thanks, Alain.
gitea-mirror 2026-05-05 12:01:39 -06:00
Author
Owner

@jmcnamara commented on GitHub (Sep 9, 2020):

Anyway, monitoring the free memory available in the system, it appears that the memory continues to increase as long as I write rows (I can see this with examples/constant_memory.c).

That shouldn't happen. How are you monitoring the memory usage?

I modified the constant_memory.c example to push up the row x column limits:

#include "xlsxwriter.h"

int main() {

    lxw_row_t row;
    lxw_col_t col;
    lxw_row_t max_row = 100000;
    lxw_col_t max_col = 500;

    /* Set the worksheet options. */
    lxw_workbook_options options = {.constant_memory = LXW_TRUE,
                                    .tmpdir = NULL,
                                    .use_zip64 = LXW_FALSE};

    /* Create a new workbook with options. */
    lxw_workbook  *workbook  = workbook_new_opt("constant_memory.xlsx", &options);
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);

    for (row = 0; row < max_row; row++) {
        for (col = 0; col < max_col; col++) {
            worksheet_write_number(worksheet, row, col, 123.45, NULL);
        }
    }

    return workbook_close(workbook);
}

Running this and monitoring it with top-o cpu shows content memory of 912K. At the final stages of assembly the file the memory jumps up to 1368K. That isn't related to the writing row data however, it is just the overhead of creating the files that make up the xlsx file and adding them to a zip container.

So, strictly speaking the memory isn't constant for the entire lifetime of the program, but it should 100% be constant while writing row data.

<!-- gh-comment-id:689481093 --> @jmcnamara commented on GitHub (Sep 9, 2020): > Anyway, monitoring the free memory available in the system, it appears that the memory continues to increase as long as I write rows (I can see this with examples/constant_memory.c). That shouldn't happen. How are you monitoring the memory usage? I modified the constant_memory.c example to push up the row x column limits: ```C #include "xlsxwriter.h" int main() { lxw_row_t row; lxw_col_t col; lxw_row_t max_row = 100000; lxw_col_t max_col = 500; /* Set the worksheet options. */ lxw_workbook_options options = {.constant_memory = LXW_TRUE, .tmpdir = NULL, .use_zip64 = LXW_FALSE}; /* Create a new workbook with options. */ lxw_workbook *workbook = workbook_new_opt("constant_memory.xlsx", &options); lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL); for (row = 0; row < max_row; row++) { for (col = 0; col < max_col; col++) { worksheet_write_number(worksheet, row, col, 123.45, NULL); } } return workbook_close(workbook); } ``` Running this and monitoring it with `top-o cpu` shows content memory of 912K. At the final stages of assembly the file the memory jumps up to 1368K. That isn't related to the writing row data however, it is just the overhead of creating the files that make up the xlsx file and adding them to a zip container. So, strictly speaking the memory isn't constant for the entire lifetime of the program, but it should 100% be constant while writing row data.
Author
Owner

@oliviera9 commented on GitHub (Sep 9, 2020):

I just did a 'watch -n1 free' in a shell. My embedded system has just 64 MB of RAM, so RAM consumption is crearly visible.
I see top showing a constant memory usage as you. Anyway I think this is because data is written to the temporary file which is not counted on the process address space.

<!-- gh-comment-id:689483590 --> @oliviera9 commented on GitHub (Sep 9, 2020): I just did a 'watch -n1 free' in a shell. My embedded system has just 64 MB of RAM, so RAM consumption is crearly visible. I see top showing a constant memory usage as you. Anyway I think this is because data is written to the temporary file which is not counted on the process address space.
Author
Owner

@jmcnamara commented on GitHub (Sep 9, 2020):

Anyway I think this is because data is written to the temporary file which is not counted on the process address space.

The disk space usage shouldn't have any effect on the memory usage unless / or /tmp are mapped into memory. Are they on your system.

Either way I don't think there is anything I can fix here. Are you okay to close the issue.

<!-- gh-comment-id:689501806 --> @jmcnamara commented on GitHub (Sep 9, 2020): > Anyway I think this is because data is written to the temporary file which is not counted on the process address space. The disk space usage shouldn't have any effect on the memory usage unless / or /tmp are mapped into memory. Are they on your system. Either way I don't think there is anything I can fix here. Are you okay to close the issue.
Author
Owner

@oliviera9 commented on GitHub (Sep 9, 2020):

In UNIX, the temporary directory /tmp/ is supposed to be a tmpfs which resides in RAM indeed. So, I think the constant_memory options is trasferring the memory usage from inside the application (using calloc for raws) to the temporary file.
Unfortunately I don't know how an xlsx file is made up, but, I suppose, a solution for the problem would be to flush to temporary file to the final xlsx file once a row is completed, and rewind it. This would keep its size limited to a row size permitting to maintain the RAM usage stable.

<!-- gh-comment-id:689517865 --> @oliviera9 commented on GitHub (Sep 9, 2020): In UNIX, the temporary directory /tmp/ is supposed to be a tmpfs which resides in RAM indeed. So, I think the constant_memory options is trasferring the memory usage from inside the application (using calloc for raws) to the temporary file. Unfortunately I don't know how an xlsx file is made up, but, I suppose, a solution for the problem would be to flush to temporary file to the final xlsx file once a row is completed, and rewind it. This would keep its size limited to a row size permitting to maintain the RAM usage stable.
Author
Owner

@jmcnamara commented on GitHub (Sep 9, 2020):

In UNIX, the temporary directory /tmp/ is supposed to be a tmpfs which resides in RAM indeed.

That isn't/wasn't always the case, although I believe that a lot of Linux distro are enabling that by default.

If that the case then constant_memory = LXW_TRUE will probably consume more memory than constant_memory = LXW_FALSE. I'll put an update in the doc about that.

You can try specifying an alternative non-ram based folder for temp files using the tmpdir option in lxw_workbook_options, like this:

    /* Set the worksheet options. */
    lxw_workbook_options options = {.constant_memory = LXW_TRUE,
                                    .tmpdir = "/some/writeable/directory",
                                    .use_zip64 = LXW_FALSE};

    /* Create a new workbook with options. */

Try that and see how you get on.

<!-- gh-comment-id:689532680 --> @jmcnamara commented on GitHub (Sep 9, 2020): > In UNIX, the temporary directory /tmp/ is supposed to be a tmpfs which resides in RAM indeed. That isn't/wasn't always the case, although I believe that a lot of Linux distro are enabling that by default. If that the case then `constant_memory = LXW_TRUE` will probably consume more memory than `constant_memory = LXW_FALSE`. I'll put an update in the doc about that. You can try specifying an alternative non-ram based folder for temp files using the `tmpdir` option in `lxw_workbook_options`, like this: ```C /* Set the worksheet options. */ lxw_workbook_options options = {.constant_memory = LXW_TRUE, .tmpdir = "/some/writeable/directory", .use_zip64 = LXW_FALSE}; /* Create a new workbook with options. */ ``` Try that and see how you get on.
Author
Owner

@oliviera9 commented on GitHub (Sep 9, 2020):

Yeah, that's the case in my system: /tmp is RAM.
I will try with constant_memory = LXW_FALSE too.
Anyway, don't you see any chance to flush and rewind the temporary file once a row is completed?
I suppose this would require to move the xls creation from the workbook close to the workbook creation, and writing row data to the final xls every time. Quite a hard refactoring, probably.

<!-- gh-comment-id:689536001 --> @oliviera9 commented on GitHub (Sep 9, 2020): Yeah, that's the case in my system: /tmp is RAM. I will try with constant_memory = LXW_FALSE too. Anyway, don't you see any chance to flush and rewind the temporary file once a row is completed? I suppose this would require to move the xls creation from the workbook close to the workbook creation, and writing row data to the final xls every time. Quite a hard refactoring, probably.
Author
Owner

@jmcnamara commented on GitHub (Sep 9, 2020):

Anyway, don't you see any chance to flush and rewind the temporary file once a row is completed?
I suppose this would require to move the xls creation from the workbook close to the workbook creation, and writing row data to the final xls every time.

Unfortunately, that wouldn't work. You would end up with only 1 row of data written to the file.

The trade off in constant_memory mode is between memory and disk space. If both of those are the same thing (as in your case) then there isn't any trade off.

Instead, try setting .tmpdir to a writeable directly and re-running your test case.

<!-- gh-comment-id:689550864 --> @jmcnamara commented on GitHub (Sep 9, 2020): > Anyway, don't you see any chance to flush and rewind the temporary file once a row is completed? > I suppose this would require to move the xls creation from the workbook close to the workbook creation, and writing row data to the final xls every time. Unfortunately, that wouldn't work. You would end up with only 1 row of data written to the file. The trade off in `constant_memory` mode is between memory and disk space. If both of those are the same thing (as in your case) then there isn't any trade off. Instead, try setting `.tmpdir ` to a writeable directly and re-running your test case.
Author
Owner

@oliviera9 commented on GitHub (Sep 9, 2020):

I'll do some tests and let you know.
If there's no solution I think the issue could be closed.
Thanks for your support.
Alain.

<!-- gh-comment-id:689554288 --> @oliviera9 commented on GitHub (Sep 9, 2020): I'll do some tests and let you know. If there's no solution I think the issue could be closed. Thanks for your support. Alain.
Author
Owner

@oliviera9 commented on GitHub (Nov 27, 2020):

Using .tmpdir as a writeable directory the cached memory decreases but does not lead to an out-of-memory condition: I suppose the kernel frees the cached pages and effectively writes them into the temporary file.
This is not the case where .tmpdir is in tmpfs.
Alain.

<!-- gh-comment-id:734767053 --> @oliviera9 commented on GitHub (Nov 27, 2020): Using .tmpdir as a writeable directory the cached memory decreases but does not lead to an out-of-memory condition: I suppose the kernel frees the cached pages and effectively writes them into the temporary file. This is not the case where .tmpdir is in tmpfs. Alain.
Author
Owner

@oliviera9 commented on GitHub (Nov 30, 2020):

Using .tmpdir as a writeable directory the cached memory decreases but does not lead to an out-of-memory condition: I suppose the kernel frees the cached pages and effectively writes them into the temporary file.
This is not the case where .tmpdir is in tmpfs.
Alain.

<!-- gh-comment-id:735623095 --> @oliviera9 commented on GitHub (Nov 30, 2020): Using .tmpdir as a writeable directory the cached memory decreases but does not lead to an out-of-memory condition: I suppose the kernel frees the cached pages and effectively writes them into the temporary file. This is not the case where .tmpdir is in tmpfs. Alain.
Author
Owner

@jmcnamara commented on GitHub (Dec 1, 2020):

Thanks for the followup. I need to add something about this to the docs so I'll reopen the issue until that is complete.

<!-- gh-comment-id:736702167 --> @jmcnamara commented on GitHub (Dec 1, 2020): Thanks for the followup. I need to add something about this to the docs so I'll reopen the issue until that is complete.
Author
Owner

@jmcnamara commented on GitHub (Mar 23, 2021):

This issue and workaround is now documented: https://libxlsxwriter.github.io/working_with_memory.html#ww_mem_temp

<!-- gh-comment-id:805167981 --> @jmcnamara commented on GitHub (Mar 23, 2021): This issue and workaround is now documented: https://libxlsxwriter.github.io/working_with_memory.html#ww_mem_temp
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/libxlsxwriter#245
No description provided.