[GH-ISSUE #287] Feature request: Eliminate temporary files from worksheet_insert_image_buffer #229

Closed
opened 2026-05-05 12:00:14 -06:00 by gitea-mirror · 6 comments

Originally created by @evanmiller on GitHub (May 6, 2020).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/287

Originally assigned to: @jmcnamara on GitHub.

I'd like to create workbooks that potentially contain thousands of small images. These images are generated in memory, and so I'm inserting them into the sheet with worksheet_insert_image_buffer_opt. However I noticed that this function just writes the buffer to a temporary file and reads it back in.

I'd like to avoid creating thousands of temporary files when constructing my spreadsheet. I believe the needed change could be as simple as replacing:

https://github.com/jmcnamara/libxlsxwriter/blob/a0e6a362baf0afc545c7bc020464ed3068d1f03a/src/worksheet.c#L6306

With:

image_stream = fmemopen(NULL, image_size, "w+b");

See the fmemopen man page (http://man7.org/linux/man-pages/man3/fmemopen.3.html). Passing NULL as the first argument makes it allocate a (second) buffer that is managed by the file handle and freed when the file handle is closed.
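To illustrate the behaviour described above, here is a minimal sketch of a round trip through an anonymous in-memory stream on a POSIX system. The helper name `roundtrip_in_memory` is hypothetical and not part of libxlsxwriter. The extra byte in the size is a defensive assumption of this sketch, so that a null terminator appended by the stream has room and cannot disturb the payload.

```c
#define _POSIX_C_SOURCE 200809L /* ask <stdio.h> to declare fmemopen() */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxlsxwriter): copy `size` bytes into an
 * anonymous in-memory stream, rewind, and read them back into `out`.
 * Returns 0 on success, -1 on any failure.
 */
static int roundtrip_in_memory(const unsigned char *in, unsigned char *out,
                               size_t size)
{
    FILE *stream;
    size_t got;

    assert(in != NULL && out != NULL);

    /* NULL asks fmemopen() to allocate its own buffer. The "+ 1" is an
     * assumption of this sketch: it leaves room for the null byte the
     * stream may append after a write. */
    stream = fmemopen(NULL, size + 1, "w+b");
    if (stream == NULL)
        return -1;

    if (fwrite(in, 1, size, stream) != size) {
        fclose(stream);
        return -1;
    }

    rewind(stream); /* flushes the pending write and seeks back to offset 0 */

    got = fread(out, 1, size, stream);
    fclose(stream); /* also releases the buffer fmemopen() allocated */

    return got == size ? 0 : -1;
}
```

No file is created at any point; the only cost is the second copy of the data held by the stream until `fclose`.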

Happy to open a PR if you think it's worthwhile.


@jmcnamara commented on GitHub (May 6, 2020):

> I'd like to avoid creating thousands of temporary files when constructing my spreadsheet.

The code should only keep the filehandle open for the duration of the read and parsing of the metadata but maybe you want to avoid even that.

> Happy to open a PR if you think it's worthwhile.

I'd say create a PR anyway, just to see if it passes all the compatibility and other tests.

I think that it probably isn't ANSI-C compatible, which is probably why I didn't use it. If you plug it in and run a build you will soon find out.

I mainly try to maintain ANSI-C compatibility so that the code is consumable by MSVC, so it would need to work there as well (or at least with some #define to map it to the equivalent Windows function). I haven't checked but I presume that is possible. Could you check that as well?


@evanmiller commented on GitHub (May 6, 2020):

Okay, opened a pull request with the full rationale, awaiting results from Travis. I would indeed prefer to avoid hitting the file system with every single image, even if the images are immediately deleted.

I'm not sure where to look re: Windows compatibility. It might be worth setting up the project on AppVeyor.


@jmcnamara commented on GitHub (May 6, 2020):

> I would indeed prefer to avoid hitting the file system with every single image, even if the images are immediately deleted.

Ok. Understandable. Just in case it is an issue, I should point out that the library uses tmp files for the XML data before assembling the file.

> It might be worth setting up the project on AppVeyor.

There were AppVeyor and Tea-CI build files but they failed for opaque and inconsistent reasons so I removed them in commit 8abc4ecd19.


@jmcnamara commented on GitHub (May 6, 2020):

The builds are failing on Travis (https://travis-ci.org/github/jmcnamara/libxlsxwriter/builds/683953027) due to the ANSI-C issue.

worksheet.c:6306:20: error: implicit declaration of function 'fmemopen'
      [-Werror,-Wimplicit-function-declaration]
    image_stream = fmemopen(NULL, image_size, "w+b");

I was a bit surprised that it builds on MacOS (even with gcc-9) but it does.

Anyway, incompatibility with ANSI-C is a deal breaker. I'd suggest:

  1. Just patching your copy of the library since the change is small.
  2. We add an undocumented (because the documentation and build system changes aren't worth the effort) compile time option like this:
#ifdef USE_FMEMOPEN
    image_stream = fmemopen(NULL, image_size, "w+b");
#else
    image_stream = lxw_tmpfile(self->tmpdir);
#endif
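
For readers who want to try the compile-time switch outside the library, the following is a self-contained sketch of the same idea. It is not the code that was merged: `open_image_stream` is a hypothetical helper, the standard `tmpfile()` stands in for the library-internal `lxw_tmpfile(self->tmpdir)`, and the extra byte passed to `fmemopen` is a defensive assumption that leaves room for a trailing null byte.

```c
#define _POSIX_C_SOURCE 200809L /* only needed for the fmemopen() branch */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return a stream holding a copy of the image data,
 * positioned at the start, or NULL on failure. The caller must fclose() it.
 */
static FILE *open_image_stream(const unsigned char *image_buffer,
                               size_t image_size)
{
    FILE *image_stream;

    assert(image_buffer != NULL);

#ifdef USE_FMEMOPEN
    /* In-memory stream: nothing touches the file system. */
    image_stream = fmemopen(NULL, image_size + 1, "w+b");
#else
    /* Portable fallback: tmpfile() stands in for lxw_tmpfile(). */
    image_stream = tmpfile();
#endif

    if (image_stream == NULL)
        return NULL;

    /* From here on both branches behave the same: write, then rewind. */
    if (fwrite(image_buffer, 1, image_size, image_stream) != image_size) {
        fclose(image_stream);
        return NULL;
    }

    rewind(image_stream);
    return image_stream;
}
```

Building with `-DUSE_FMEMOPEN` selects the in-memory branch; without it the sketch falls back to a temporary file, so the rest of the code path is identical either way.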

@evanmiller commented on GitHub (May 6, 2020):

If you're OK with branching on an #ifdef then I will look into getting the compile-time machinery in place. I think we can do this inside the CMakeLists script without adding a user-facing flag.

Might be worth getting a MacOS target going on Travis so that both code branches get exercised.


@jmcnamara commented on GitHub (May 8, 2020):

Merged in PR #288

Reference: github-starred/libxlsxwriter#229