mirror of
https://github.com/jmcnamara/libxlsxwriter.git
synced 2026-05-15 14:15:54 -06:00
[GH-ISSUE #306] constant_memory not working as expected #245
Originally created by @oliviera9 on GitHub (Sep 9, 2020).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/306
Originally assigned to: @jmcnamara on GitHub.
Hello,
I am successfully using libxlsxwriter for creating Excel files in an embedded system.
Because of this I am using the constant_memory option but I have noticed the memory usage is not like what I would expect.
From my understanding, the memory usage should be limited to the size of one row with this option.
However, monitoring the free memory available in the system, it appears that the memory usage continues to increase as long as I write rows (I can see this with examples/constant_memory.c).
I dug into the code, and I think this is because the tmpfile, created with constant_memory on, is never rewound except when closing the workbook. I would expect the rewind to occur every time a new row is created.
Is this the intended behavior?
Thanks,
Alain.
@jmcnamara commented on GitHub (Sep 9, 2020):
That shouldn't happen. How are you monitoring the memory usage?
I modified the constant_memory.c example to push up the row x column limits:
Running this and monitoring it with `top -o cpu` shows constant memory of 912K. At the final stages of assembling the file the memory jumps up to 1368K. That isn't related to the writing of row data, however; it is just the overhead of creating the files that make up the xlsx file and adding them to a zip container.
So, strictly speaking the memory isn't constant for the entire lifetime of the program, but it should 100% be constant while writing row data.
@oliviera9 commented on GitHub (Sep 9, 2020):
I just did a 'watch -n1 free' in a shell. My embedded system has just 64 MB of RAM, so RAM consumption is clearly visible.
I see top showing constant memory usage, as you do. However, I think this is because data is written to the temporary file, which is not counted in the process address space.
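The difference between what top and free report can be seen by reading the system-wide counters directly. A minimal sketch, assuming Linux's /proc/meminfo format, where tmpfs pages are counted under the Shmem field rather than in any process's resident size; the helper names are illustrative and not part of libxlsxwriter:

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB of `key` (e.g. "Shmem") from text in
 * /proc/meminfo format, or -1 if the key is absent or malformed. */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* A field matches only if the key is followed directly by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Read /proc/meminfo into buf (Linux only). Returns 0 on success. */
int read_meminfo(char *buf, size_t size)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

Sampling meminfo_kb(buf, "Shmem") and meminfo_kb(buf, "MemFree") before and after writing rows should show whether the growth is in tmpfs rather than in the process itself.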
@jmcnamara commented on GitHub (Sep 9, 2020):
The disk space usage shouldn't have any effect on the memory usage unless / or /tmp are mapped into memory. Are they on your system?
Either way I don't think there is anything I can fix here. Are you okay to close the issue?
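Whether a directory such as /tmp is RAM-backed can be checked from C on Linux. A minimal sketch, assuming Linux's statfs() call and the tmpfs and ramfs magic numbers from the kernel headers; the function and macro names are illustrative and not part of libxlsxwriter:

```c
#include <sys/vfs.h>

/* Filesystem magic numbers, as defined in <linux/magic.h>. */
#define FS_MAGIC_TMPFS 0x01021994UL
#define FS_MAGIC_RAMFS 0x858458f6UL

/* Returns 1 if path is on tmpfs or ramfs, 0 if it is on another
 * filesystem, and -1 if statfs() fails (e.g. the path does not exist). */
int is_ram_backed(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) != 0)
        return -1;

    /* Mask to 32 bits so the comparison is not affected by sign extension. */
    unsigned long type = (unsigned long)fs.f_type & 0xFFFFFFFFUL;
    return type == FS_MAGIC_TMPFS || type == FS_MAGIC_RAMFS;
}
```

If is_ram_backed("/tmp") returns 1, temporary files written there consume RAM rather than disk space.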
@oliviera9 commented on GitHub (Sep 9, 2020):
In UNIX, the temporary directory /tmp/ is supposed to be a tmpfs, which does indeed reside in RAM. So, I think the constant_memory option is transferring the memory usage from inside the application (using calloc for rows) to the temporary file.
Unfortunately I don't know how an xlsx file is made up, but I suppose a solution for the problem would be to flush the temporary file to the final xlsx file once a row is completed, and rewind it. This would keep its size limited to one row, keeping the RAM usage stable.
@jmcnamara commented on GitHub (Sep 9, 2020):
That isn't/wasn't always the case, although I believe that a lot of Linux distros are enabling that by default.
If that's the case then `constant_memory = LXW_TRUE` will probably consume more memory than `constant_memory = LXW_FALSE`. I'll put an update in the doc about that.
You can try specifying an alternative non-RAM-based folder for temp files using the `tmpdir` option in `lxw_workbook_options`, like this:
Try that and see how you get on.
@oliviera9 commented on GitHub (Sep 9, 2020):
Yeah, that's the case in my system: /tmp is RAM.
I will try with constant_memory = LXW_FALSE too.
Anyway, don't you see any chance to flush and rewind the temporary file once a row is completed?
I suppose this would require moving the xlsx creation from the workbook close to the workbook creation, and writing row data to the final xlsx every time. Quite a hard refactoring, probably.
@jmcnamara commented on GitHub (Sep 9, 2020):
Unfortunately, that wouldn't work. You would end up with only 1 row of data written to the file.
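The effect can be illustrated with a toy model. This is a sketch using only the C standard library, not libxlsxwriter's actual code, and it assumes that the temporary file is read back once at close: if the file is rewound after every row, each row overwrites the previous one and only the last survives.

```c
#include <stdio.h>

/* Toy model: write `rows` fixed-width records to an anonymous temp file,
 * optionally rewinding after each one, then return how many records the
 * file holds when it is read back from the start. Returns -1 on error. */
long rows_left_in_tmpfile(int rows, int rewind_each_row)
{
    FILE *tmp = tmpfile();
    if (!tmp)
        return -1;

    for (int i = 0; i < rows; i++) {
        fprintf(tmp, "row %06d\n", i);
        if (rewind_each_row)
            rewind(tmp);   /* the next row overwrites this one */
    }

    /* Read the file back from the start, as would happen at close. */
    fflush(tmp);
    rewind(tmp);

    long count = 0;
    int c;
    while ((c = fgetc(tmp)) != EOF)
        if (c == '\n')
            count++;

    fclose(tmp);
    return count;
}
```

Here rows_left_in_tmpfile(1000, 0) yields 1000 records, while rows_left_in_tmpfile(1000, 1) yields 1.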
The trade off in `constant_memory` mode is between memory and disk space. If both of those are the same thing (as in your case) then there isn't any trade off.
Instead, try setting `.tmpdir` to a writeable directory and re-running your test case.
@oliviera9 commented on GitHub (Sep 9, 2020):
I'll do some tests and let you know.
If there's no solution I think the issue could be closed.
Thanks for your support.
Alain.
@oliviera9 commented on GitHub (Nov 27, 2020):
Using .tmpdir as a writeable directory the cached memory decreases but does not lead to an out-of-memory condition: I suppose the kernel frees the cached pages and effectively writes them into the temporary file.
This is not the case where .tmpdir is in tmpfs.
Alain.
@jmcnamara commented on GitHub (Dec 1, 2020):
Thanks for the followup. I need to add something about this to the docs so I'll reopen the issue until that is complete.
@jmcnamara commented on GitHub (Mar 23, 2021):
This issue and workaround is now documented: https://libxlsxwriter.github.io/working_with_memory.html#ww_mem_temp