[GH-ISSUE #189] Issue with string cells containing \r\n, show up as _x000D_\n in Excel #151

Closed
opened 2026-05-05 11:47:45 -06:00 by gitea-mirror · 3 comments
Owner

Originally created by @j15e on GitHub (Jul 5, 2018).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/189

Originally assigned to: @jmcnamara on GitHub.

I am using the latest version (well, our fork https://github.com/hooktstudios/libxlsxwriter) and via the ruby wrapper (https://github.com/Paxa/fast_excel).

When I insert strings from my database (PostgreSQL) with multiple lines, the generated XLSX contains weird _x000D_ at the end of each new line inside each cell :

[Image: screenshot, 2018-07-04 17:59:16]

I made a quick test modifying the test data01 :

https://github.com/jmcnamara/libxlsxwriter/blob/b616fb59ffe0268b1ee0cecf822bf8ed45cf4f30/test/functional/src/test_data01.c#L17

To test this string :

worksheet_write_string(worksheet, 0, 0, "Hello \r\nBonjour", NULL);

And I do get the string _x000D_ in the sheet too ;

[Image: screenshot, 2018-07-05 00:02:39]

As a quick fix I replace my \r\n with \n, but I would be glad to pinpoint what is the exact issue or a better solution.

I compared the resulting XML with that of the previous library I was using (RubyXL). I noticed it always wrote text as inlineString (which creates much larger files), but I did not have this issue. I am not sure this leads in the right direction to find the issue; it could be unrelated.


@jmcnamara commented on GitHub (Jul 5, 2018):

Hi,

This is the correct behavior in terms of how Excel behaves. Excel encodes \r to _x000D_ in the XML. \n is stored as the character \n. This is the same result you would get if you pasted a string with \r into Excel.
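To make the encoding described above concrete, here is a minimal sketch in C of replacing each carriage return with the literal text _x000D_ while leaving \n untouched. The function name encode_cr is hypothetical and this is not libxlsxwriter's actual implementation; it only illustrates the transformation the comment describes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not libxlsxwriter code).
 * Returns a newly allocated copy of `src` in which every carriage return
 * (0x0D) is replaced by the literal text "_x000D_". Line feeds and all
 * other characters are copied unchanged. The caller frees the result.
 * Returns NULL if the allocation fails. */
char *encode_cr(const char *src)
{
    static const char escape[] = "_x000D_";
    const size_t escape_len = sizeof(escape) - 1;
    size_t out_len = 0;
    const char *p;
    char *out;
    char *q;

    assert(src != NULL);

    /* First pass: compute the length of the encoded string. */
    for (p = src; *p != '\0'; p++)
        out_len += (*p == '\r') ? escape_len : 1;

    out = malloc(out_len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy characters, expanding each '\r'. */
    for (p = src, q = out; *p != '\0'; p++) {
        if (*p == '\r') {
            memcpy(q, escape, escape_len);
            q += escape_len;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';

    return out;
}
```

With the test string from the report, "Hello \r\nBonjour", this sketch produces "Hello _x000D_\nBonjour", which matches the text that non-Excel applications display.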

However, it shouldn't be visible in Excel. If you are using something other than Excel it may be.

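This behaviour is mentioned in a few places on the internet as well (https://www.google.com/search?&q=_x000D_+excel), such as https://stackoverflow.com/questions/36167807/access-newline-becoming-x000d, https://answers.microsoft.com/en-us/office/forum/office_2010-word/field-code-inserts-x000d-in-office-2010-word/9b384b43-89c0-4861-b359-858f0be3f694?auth=1 or https://forum.opencart.com/viewtopic.php?t=103108.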

If you don’t want the encoded "\r" in the output file you should strip it from the input data.

John
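
Stripping the carriage returns from the input before writing, as suggested above, could look like the following sketch. The helper name strip_cr is hypothetical and is not part of libxlsxwriter. It assumes the string lives in a writable buffer and that every \r should be dropped, so a lone \r without a following \n would be removed without leaving a line break.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not part of libxlsxwriter).
 * Removes every carriage return (0x0D) from `s` in place, so "\r\n"
 * becomes "\n". Returns the length of the resulting string. */
size_t strip_cr(char *s)
{
    const char *src;
    char *dst;

    assert(s != NULL);

    /* Copy each character that is not '\r' back over the same buffer. */
    for (src = s, dst = s; *src != '\0'; src++) {
        if (*src != '\r')
            *dst++ = *src;
    }
    *dst = '\0';

    return (size_t)(dst - s);
}
```

The cleaned buffer can then be passed to the call from the original report, for example worksheet_write_string(worksheet, 0, 0, buf, NULL), where buf is a char array that has been run through strip_cr.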


@j15e commented on GitHub (Jul 5, 2018):

If this is the expected behavior of libxlsxwriter, I guess you can close this.

Adding a closing comment with some references so other people don't have to search as long as I did:


I did find those internet mentions too, but I expected libxlsxwriter would maybe handle it as my previous library (RubyXL) did. I tried to find out why, it is not 100% clear.

I haven't found any code handling this specifically in RubyXL, so it looks like it is a side effect of the underlying XML library which was originally also escaping \r (see discussion here https://github.com/weshatheleopard/rubyXL/issues/202).

Per the XML spec (https://www.w3.org/TR/REC-xml/#sec-line-ends), line breaks in the document should indeed be only \n, but "tab, carriage return, line feed, and the legal characters of Unicode" are all legal characters inside XML node text.

So both results are correct.

Maybe I can add the removal of \r upstream in the Ruby wrapper. It would be convenient there, because data submitted through web forms always contains both CR and LF (https://www.w3.org/TR/html5/forms.html#the-textarea-element).

PS: It is effectively an issue only with non-Excel applications such as LibreOffice (shows _x000D_) and Numbers (shows 2 line breaks).


@jmcnamara commented on GitHub (Jul 5, 2018):

> If this is the expected behavior of libxlsxwriter

This is the expected (or unexpected) behavior of Excel. libxlsxwriter just implements it.
