[GH-ISSUE #151] Problem with number formats #125

Closed
opened 2026-05-05 11:43:58 -06:00 by gitea-mirror · 11 comments
Owner

Originally created by @franciscopajares on GitHub (Mar 13, 2018).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/151

Originally assigned to: @jmcnamara on GitHub.

Hello.

I am using libslxswriter version 0.7.6 in Windows 7, compiled with Microsoft Visual Studio 2005, and I have found a problem with numeric formats.
The problem is:

When you add two numeric formats with the function format_set_num_format and these formats are similar, i.e "#,##0.0" and "#,##0.00000", sometimes the cells are not formatted with the correct format.
The bug is located in the function _prepare_num_formats in workbook.c .
This function calls to lxw_hash_key_exists, and is not returning the correct format, because the function lxw_hash_key_exists compares the hash with memcmp with the length of the first format. If both formats have different length, but the first characters are equal, the return value is TRUE and should be FALSE.
The formats "#,##0.0" and "#,##0.00000" are different, but when compare with memcmp("#,##0.0", "#,##0.00000", strlen("#,##0.0")) the function returns that they are equal.

Example to reproduce the bug:

lxw_format * MyXLSXAddStyle(lxw_workbook  *workbook, COLORREF backColor, COLORREF foreColor, char * fontName, int fontSize, int bold, int italic, int underline, int strikeOut, char * dataMask			)
{

	lxw_format *styleFormat;
	styleFormat = workbook_add_format(workbook);
	format_set_font_color(styleFormat, foreColor);
	format_set_bg_color(styleFormat, backColor);
	format_set_font_name(styleFormat, fontName);
	format_set_font_size(styleFormat, fontSize);
	if (bold) {
		format_set_bold(styleFormat);
	}
	if (italic) {
		format_set_italic(styleFormat);
	}
	if (underline) {
		format_set_underline(styleFormat, LXW_UNDERLINE_SINGLE);
	}
	if (strikeOut) {
		format_set_font_strikeout(styleFormat);
	}

	if (dataMask) {
		format_set_num_format(styleFormat, dataMask);
	}

	format_set_border(styleFormat, LXW_BORDER_THIN);
	format_set_border_color(styleFormat, 0XC0C0C0);	

	return styleFormat;
}

int main() {
/* Create a workbook and add a worksheet. */
    lxw_workbook  *workbook  = workbook_new("c:\\tmp\\xlsx\\pruestilos.xlsx");
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);
    int row = 0;
    int col = 0;
    int numcolumns = 2;
    int i;
    lxw_color_t color_1, color_2;

    lxw_format *style_01, *style_02, *style_03, *style_04;
	 
    for (i = 0; i < numcolumns; i++) {
        worksheet_set_column(worksheet, i, i, 20, NULL);
    }

    color_1 = 16777170;
    color_2 = 16777215;

    style_01 = MyXLSXAddStyle(workbook, color_1, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.0");
    style_02 = MyXLSXAddStyle(workbook, color_1, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.00000");
    style_03 = MyXLSXAddStyle(workbook, color_2, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.0");
    style_04 = MyXLSXAddStyle(workbook, color_2, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.00000");

    //First Row
    row = 0;
    worksheet_write_number  (worksheet, row, 0, 1234.5,     style_01); // "1,234.5" -> OK
    worksheet_write_number  (worksheet, row, 1, 1234.5,     style_02); // "1,234.50000" ->OK

    //Second Row . The problem is in the cell A2
    row = 1;
    worksheet_write_number  (worksheet, row, 0, 1234.5,     style_03); // "1,234.50000" -> NOT OK Should be "1,234.5"
    worksheet_write_number  (worksheet, row, 1, 1234.5,     style_04); // "1,234.50000" -> OK

    /* Save the workbook and free any allocated memory. */
    return workbook_close(workbook);
}

The error is in the function _prepare_num_formats located in workbook.c.
The block of code that causes the problem is:

        if (*num_format) {
            /* Look up the num_format in the hash table. */
            hash_element = lxw_hash_key_exists(num_formats, num_format,
                                               strlen(num_format));

            if (hash_element) {
                /* Num_Format has already been used. */
                format->num_format_index = *(uint16_t *) hash_element->value;
            }
            else {
                /* This is a new num_format. */
                num_format_index = calloc(1, sizeof(uint16_t));
                *num_format_index = index;
                format->num_format_index = index;
                lxw_insert_hash_element(num_formats, num_format,
                                        num_format_index, strlen(num_format));
                index++;
                num_format_count++;
            }
        }

I think that could be replaced by the next piece of code, that compares two hashes with the same length:

        if (*num_format) {
            /* Look up the num_format in the hash table. */
	    char aux_format[LXW_FORMAT_FIELD_LEN];

	    memset(aux_format,0,LXW_FORMAT_FIELD_LEN);
	    strcpy(aux_format, num_format);
            hash_element = lxw_hash_key_exists(num_formats, aux_format,
                                               LXW_FORMAT_FIELD_LEN);

            if (hash_element) {
                /* Num_Format has already been used. */
                format->num_format_index = *(uint16_t *) hash_element->value;
            }
            else {
                /* This is a new num_format. */
                num_format_index = calloc(1, sizeof(uint16_t));
                *num_format_index = index;
                format->num_format_index = index;
                lxw_insert_hash_element(num_formats, aux_format,
                                        num_format_index, LXW_FORMAT_FIELD_LEN);
                index++;
                num_format_count++;
            }
        }

I hope it helps.

Kind regards.

Originally created by @franciscopajares on GitHub (Mar 13, 2018). Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/151 Originally assigned to: @jmcnamara on GitHub. Hello. I am using libslxswriter version 0.7.6 in Windows 7, compiled with Microsoft Visual Studio 2005, and I have found a problem with numeric formats. The problem is: When you add two numeric formats with the function **format_set_num_format** and these formats are similar, i.e **"#,##0.0"** and **"#,##0.00000"**, sometimes the cells are not formatted with the correct format. The bug is located in the function **_prepare_num_formats** in **workbook.c** . This function calls to **lxw_hash_key_exists**, and is not returning the correct format, because the function **lxw_hash_key_exists** compares the hash with **memcmp** with the length of the first format. If both formats have different length, but the first characters are equal, the return value is TRUE and should be FALSE. The formats "#,##0.0" and "#,##0.00000" are different, but when compare with memcmp("#,##0.0", "#,##0.00000", strlen("#,##0.0")) the function returns that they are equal. Example to reproduce the bug: ```c++ lxw_format * MyXLSXAddStyle(lxw_workbook *workbook, COLORREF backColor, COLORREF foreColor, char * fontName, int fontSize, int bold, int italic, int underline, int strikeOut, char * dataMask ) { lxw_format *styleFormat; styleFormat = workbook_add_format(workbook); format_set_font_color(styleFormat, foreColor); format_set_bg_color(styleFormat, backColor); format_set_font_name(styleFormat, fontName); format_set_font_size(styleFormat, fontSize); if (bold) { format_set_bold(styleFormat); } if (italic) { format_set_italic(styleFormat); } if (underline) { format_set_underline(styleFormat, LXW_UNDERLINE_SINGLE); } if (strikeOut) { format_set_font_strikeout(styleFormat); } if (dataMask) { format_set_num_format(styleFormat, dataMask); } format_set_border(styleFormat, LXW_BORDER_THIN); format_set_border_color(styleFormat, 0XC0C0C0); return styleFormat; } int main() { /* Create a workbook and add a worksheet. */ lxw_workbook *workbook = workbook_new("c:\\tmp\\xlsx\\pruestilos.xlsx"); lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL); int row = 0; int col = 0; int numcolumns = 2; int i; lxw_color_t color_1, color_2; lxw_format *style_01, *style_02, *style_03, *style_04; for (i = 0; i < numcolumns; i++) { worksheet_set_column(worksheet, i, i, 20, NULL); } color_1 = 16777170; color_2 = 16777215; style_01 = MyXLSXAddStyle(workbook, color_1, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.0"); style_02 = MyXLSXAddStyle(workbook, color_1, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.00000"); style_03 = MyXLSXAddStyle(workbook, color_2, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.0"); style_04 = MyXLSXAddStyle(workbook, color_2, 0, "MS Sans Serif", 8, 0, 0, 0, 0, "#,##0.00000"); //First Row row = 0; worksheet_write_number (worksheet, row, 0, 1234.5, style_01); // "1,234.5" -> OK worksheet_write_number (worksheet, row, 1, 1234.5, style_02); // "1,234.50000" ->OK //Second Row . The problem is in the cell A2 row = 1; worksheet_write_number (worksheet, row, 0, 1234.5, style_03); // "1,234.50000" -> NOT OK Should be "1,234.5" worksheet_write_number (worksheet, row, 1, 1234.5, style_04); // "1,234.50000" -> OK /* Save the workbook and free any allocated memory. */ return workbook_close(workbook); } ``` The error is in the function **_prepare_num_formats** located in **workbook.c**. The block of code that causes the problem is: ```c++ if (*num_format) { /* Look up the num_format in the hash table. */ hash_element = lxw_hash_key_exists(num_formats, num_format, strlen(num_format)); if (hash_element) { /* Num_Format has already been used. */ format->num_format_index = *(uint16_t *) hash_element->value; } else { /* This is a new num_format. */ num_format_index = calloc(1, sizeof(uint16_t)); *num_format_index = index; format->num_format_index = index; lxw_insert_hash_element(num_formats, num_format, num_format_index, strlen(num_format)); index++; num_format_count++; } } ``` I think that could be replaced by the next piece of code, that compares two hashes with the same length: ```c++ if (*num_format) { /* Look up the num_format in the hash table. */ char aux_format[LXW_FORMAT_FIELD_LEN]; memset(aux_format,0,LXW_FORMAT_FIELD_LEN); strcpy(aux_format, num_format); hash_element = lxw_hash_key_exists(num_formats, aux_format, LXW_FORMAT_FIELD_LEN); if (hash_element) { /* Num_Format has already been used. */ format->num_format_index = *(uint16_t *) hash_element->value; } else { /* This is a new num_format. */ num_format_index = calloc(1, sizeof(uint16_t)); *num_format_index = index; format->num_format_index = index; lxw_insert_hash_element(num_formats, aux_format, num_format_index, LXW_FORMAT_FIELD_LEN); index++; num_format_count++; } } ``` I hope it helps. Kind regards.
gitea-mirror 2026-05-05 11:43:58 -06:00
Author
Owner

@jmcnamara commented on GitHub (Mar 13, 2018):

Hi,

Thanks for the detailed report and the debugging.

I'll put a fix in for it soon.

John

<!-- gh-comment-id:372721234 --> @jmcnamara commented on GitHub (Mar 13, 2018): Hi, Thanks for the detailed report and the debugging. I'll put a fix in for it soon. John
Author
Owner

@franciscopajares commented on GitHub (Mar 13, 2018):

Hi,

Thanks to you for saving me so much time developing this library.

Francisco.

<!-- gh-comment-id:372724111 --> @franciscopajares commented on GitHub (Mar 13, 2018): Hi, Thanks to you for saving me so much time developing this library. Francisco.
Author
Owner

@jmcnamara commented on GitHub (Mar 13, 2018):

Here is a quick fix in the meantime.

diff --git a/src/hash_table.c b/src/hash_table.c
index 36fa59e..65cca49 100644
--- a/src/hash_table.c
+++ b/src/hash_table.c
@@ -51,7 +51,9 @@ lxw_hash_key_exists(lxw_hash_table *lxw_hash, void *key, size_t key_len)
 
         /* Iterate over the keys in the bucket's linked list. */
         SLIST_FOREACH(element, list, lxw_hash_list_pointers) {
-            if (memcmp(element->key, key, key_len) == 0) {
+            if ((strlen(element->key) == strlen(key)) &&
+                (memcmp(element->key, key, key_len) == 0)) {
+
                 /* The key already exists in the table. */
                 return element;
             }
<!-- gh-comment-id:372731783 --> @jmcnamara commented on GitHub (Mar 13, 2018): Here is a quick fix in the meantime. ```diff diff --git a/src/hash_table.c b/src/hash_table.c index 36fa59e..65cca49 100644 --- a/src/hash_table.c +++ b/src/hash_table.c @@ -51,7 +51,9 @@ lxw_hash_key_exists(lxw_hash_table *lxw_hash, void *key, size_t key_len) /* Iterate over the keys in the bucket's linked list. */ SLIST_FOREACH(element, list, lxw_hash_list_pointers) { - if (memcmp(element->key, key, key_len) == 0) { + if ((strlen(element->key) == strlen(key)) && + (memcmp(element->key, key, key_len) == 0)) { + /* The key already exists in the table. */ return element; } ```
Author
Owner

@franciscopajares commented on GitHub (Mar 13, 2018):

Hi John.

This fix is valid when comparing strings but, is it safe when comparing structs, pointers or another data types?

Thanks again.

<!-- gh-comment-id:372741922 --> @franciscopajares commented on GitHub (Mar 13, 2018): Hi John. This fix is valid when comparing strings but, is it safe when comparing structs, pointers or another data types? Thanks again.
Author
Owner

@jmcnamara commented on GitHub (Mar 13, 2018):

is it safe when comparing structs, pointers or another data

No. Not fully. I'll need to rethink it. Maybe a solution like yours is better in this case.

<!-- gh-comment-id:372758493 --> @jmcnamara commented on GitHub (Mar 13, 2018): > is it safe when comparing structs, pointers or another data No. Not fully. I'll need to rethink it. Maybe a solution like yours is better in this case.
Author
Owner

@jmcnamara commented on GitHub (Mar 16, 2018):

I've pushed a test and fix for this to master.

<!-- gh-comment-id:373561830 --> @jmcnamara commented on GitHub (Mar 16, 2018): I've pushed a test and fix for this to master.
Author
Owner

@franciscopajares commented on GitHub (Mar 16, 2018):

Hello.

Seems to work fine.

A little question.
Do the lxw_strcpy function fills the memory at the right of the '\0' delimiter with the same value every time that this function is executed?

Thanks,

<!-- gh-comment-id:373650885 --> @franciscopajares commented on GitHub (Mar 16, 2018): Hello. Seems to work fine. A little question. Do the **lxw_strcpy function** fills the memory at the right of the **'\0'** delimiter with the same value every time that this function is executed? Thanks,
Author
Owner

@jmcnamara commented on GitHub (Mar 16, 2018):

Do the lxw_strcpy function fills the memory at the right

Good eyes.

The value being copied (format->num_format) will almost always be NULL padded (from initialization) so effectively the stored num_format value will be NULL padded too.

However, it is possible that it could contain some garbage if the user reset the num_format in the format to a shorter string. This wouldn't affect the formatting in the Excel file however, it would just create a potentially duplicate format.

Your solution of memset + strcpy would avoid even that edge case so maybe I should just use that instead.

Thanks,

John

<!-- gh-comment-id:373664534 --> @jmcnamara commented on GitHub (Mar 16, 2018): > Do the lxw_strcpy function fills the memory at the right Good eyes. The value being copied (`format->num_format`) will almost always be NULL padded (from initialization) so effectively the stored `num_format` value will be NULL padded too. However, it is possible that it could contain some garbage if the user reset the `num_format` in the format to a shorter string. This wouldn't affect the formatting in the Excel file however, it would just create a potentially duplicate format. Your solution of `memset + strcpy` would avoid even that edge case so maybe I should just use that instead. Thanks, John
Author
Owner

@jmcnamara commented on GitHub (Mar 16, 2018):

I've change the code around a bit to ensure that the stored buffer is alway NULL padded.

Take a look when you get a chance.

<!-- gh-comment-id:373754444 --> @jmcnamara commented on GitHub (Mar 16, 2018): I've change the code around a bit to ensure that the stored buffer is alway NULL padded. Take a look when you get a chance.
Author
Owner

@franciscopajares commented on GitHub (Mar 21, 2018):

Hello.

It works fine.
The memory issues are the most difficult to trace and solve, and I think that it was worth it to ensure that the padding on the right was with the same character.

I think that this issue could be marked as solved.

Thank you again for your help.

<!-- gh-comment-id:374881874 --> @franciscopajares commented on GitHub (Mar 21, 2018): Hello. It works fine. The memory issues are the most difficult to trace and solve, and I think that it was worth it to ensure that the padding on the right was with the same character. I think that this issue could be marked as solved. Thank you again for your help.
Author
Owner

@jmcnamara commented on GitHub (Mar 21, 2018):

Thanks for the report and follow-up. Closing.

<!-- gh-comment-id:374906945 --> @jmcnamara commented on GitHub (Mar 21, 2018): Thanks for the report and follow-up. Closing.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/libxlsxwriter#125
No description provided.