[GH-ISSUE #513] worksheet_add_table allocates O(n²) memory for default column headers, which is excessive for wide tables #399

Open
opened 2026-05-05 12:14:43 -06:00 by gitea-mirror · 0 comments

Originally created by @billdenney on GitHub (Apr 6, 2026).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/513

Summary

When worksheet_add_table() is called without user-supplied column options, the internal function that generates default "Column1", "Column2", … headers allocates num_cols copies of lxw_table_column per loop iteration instead of one. Total allocation grows as num_cols², causing calloc to fail and the function to return LXW_ERROR_MEMORY_MALLOC_FAILED for any table wide enough that num_cols² × sizeof(lxw_table_column) exceeds available memory.

Reproduction

#include "xlsxwriter.h"
#include <stdio.h>

int main(void) {
    lxw_workbook  *workbook  = workbook_new("table_cols_bug.xlsx");
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);

    /*
     * A table spanning all 16,384 columns with default headers.
     * Before the fix: attempts to allocate ~15 GB (16384² × ~56 bytes).
     * On Linux with memory overcommit the allocations silently "succeed"
     * but consume large amounts of virtual address space. Run with a
     * virtual memory cap to force the failure:
     *
     *   ulimit -v 2097152 && ./table_cols_bug
     *
     * After the fix: allocates ~896 KB and succeeds with or without the cap.
     */
    lxw_error err = worksheet_add_table(worksheet,
                                        0, 0,
                                        1, LXW_COL_MAX - 1,
                                        NULL);
    printf("worksheet_add_table returned: %d (%s)\n", err, lxw_strerror(err));
    workbook_close(workbook);
    return err;
}

Expected behaviour

worksheet_add_table() succeeds (returns LXW_NO_ERROR) for any column count within Excel's limits.

Actual behaviour

On Linux with the default overcommit policy, calloc returns non-NULL for each
allocation (virtual pages are not backed by physical memory until accessed, and
only the first element of each allocation is written). The function therefore
appears to succeed, but silently maps up to ~15 GB of virtual address space.

On systems where overcommit is disabled, or when virtual memory is capped (e.g.
ulimit -v 2097152), the call fails and returns LXW_ERROR_MEMORY_MALLOC_FAILED.
The failure also occurs on embedded targets and any allocator that enforces
commit limits.

16,384 iterations × calloc(16,384, sizeof(lxw_table_column))
≈ 16,384 × 16,384 × 56 bytes
≈ 15 GB
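
The arithmetic above can be checked with a short sketch. The 56-byte struct size is an assumption taken from the estimate in this report; the real sizeof(lxw_table_column) depends on the platform and library version. The function names are hypothetical and are not part of libxlsxwriter.

```c
#include <stdint.h>

/* Assumed struct size, taken from the ~56-byte estimate in this report.
 * The real sizeof(lxw_table_column) may differ. */
#define ASSUMED_COLUMN_SIZE 56u

/* Total bytes requested by the buggy loop:
 * num_cols iterations, each calling calloc(num_cols, size). */
uint64_t bytes_before_fix(uint64_t num_cols)
{
    return num_cols * num_cols * ASSUMED_COLUMN_SIZE;
}

/* Total bytes requested by the corrected loop:
 * num_cols iterations, each calling calloc(1, size). */
uint64_t bytes_after_fix(uint64_t num_cols)
{
    return num_cols * ASSUMED_COLUMN_SIZE;
}
```

For num_cols = 16,384, bytes_before_fix() gives 15,032,385,536 bytes (about 15 GB) and bytes_after_fix() gives 917,504 bytes (896 KiB), which matches the figures quoted in the reproduction comment.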

Root cause

In _set_default_table_columns() (called by worksheet_add_table() when no lxw_table_options are provided), the inner loop contains:

for (i = 0; i < num_cols; i++) {
    column = calloc(num_cols, sizeof(lxw_table_column));  /* bug: should be 1 */
    ...
    columns[i] = column;
}

The first argument to calloc (the element count) is num_cols instead of 1, so every iteration allocates num_cols structs when only one is needed. Only the first element of each over-allocated block, the one that columns[i] points to, is ever used; the remaining num_cols - 1 structs per iteration are never accessed. The existing cleanup path (_free_worksheet_table_column → free(columns[i])) correctly frees each block, so there is no memory leak, only the excessive upfront allocation.
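
A minimal sketch of the corrected allocation pattern follows. It is not the library's code: demo_table_column, demo_default_columns() and demo_free_columns() are hypothetical stand-ins that model only the header string, to show one struct being allocated per iteration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for lxw_table_column; only the header is modelled. */
typedef struct demo_table_column {
    char *header;
} demo_table_column;

/* Free the first `count` columns and then the pointer array itself. */
void demo_free_columns(demo_table_column **columns, size_t count)
{
    size_t i;

    if (!columns)
        return;

    for (i = 0; i < count; i++) {
        if (columns[i]) {
            free(columns[i]->header);   /* free(NULL) is a no-op */
            free(columns[i]);
        }
    }
    free(columns);
}

/* Build default "Column1" .. "ColumnN" entries, allocating exactly one
 * struct per iteration. Returns NULL if any allocation fails. */
demo_table_column **demo_default_columns(size_t num_cols)
{
    size_t i;
    demo_table_column **columns = calloc(num_cols, sizeof(*columns));

    if (!columns)
        return NULL;

    for (i = 0; i < num_cols; i++) {
        char name[32];
        /* Element count is 1, not num_cols. */
        demo_table_column *column = calloc(1, sizeof(*column));

        if (!column) {
            demo_free_columns(columns, i);
            return NULL;
        }
        columns[i] = column;

        snprintf(name, sizeof(name), "Column%zu", i + 1);
        column->header = malloc(strlen(name) + 1);
        if (!column->header) {
            demo_free_columns(columns, i + 1);
            return NULL;
        }
        strcpy(column->header, name);
    }

    return columns;
}
```

Calling demo_default_columns(16384) requests one pointer array plus 16,384 small structs, so the total grows linearly with the column count rather than quadratically.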
