mirror of
https://github.com/jmcnamara/libxlsxwriter.git
synced 2026-05-15 14:15:54 -06:00
[GH-ISSUE #189] Issue with string cells containing \r\n, show up as _x000D_\n in Excel #151
Originally created by @j15e on GitHub (Jul 5, 2018).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/189
Originally assigned to: @jmcnamara on GitHub.
I am using the latest version (well, our fork https://github.com/hooktstudios/libxlsxwriter) via the Ruby wrapper (https://github.com/Paxa/fast_excel).
When I insert multi-line strings from my database (PostgreSQL), the generated XLSX contains a weird `_x000D_` at the end of each line inside each cell.
I made a quick test by modifying the test data01:
b616fb59ff/test/functional/src/test_data01.c (L17)
To test this string:
And I do get the string `_x000D_` in the sheet too.
As a quick fix I replace my `\r\n` with `\n`, but I would be glad to pinpoint the exact issue or find a better solution.
I compared the resulting XML with the previous library I was using (RubyXL). I noticed it always wrote text as inlineString (which creates much larger files), but I did not have this issue there. I am not sure this leads in the right direction for finding the issue; it could be unrelated.
@jmcnamara commented on GitHub (Jul 5, 2018):
Hi,
This is the correct behavior in terms of how Excel behaves. Excel encodes `\r` to `_x000D_` in the XML. `\n` is stored as the character `\n`. This is the same result you would get if you pasted a string with `\r` into Excel.
However, it shouldn't be visible in Excel. If you are using something other than Excel it may be.
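To make the encoding described above concrete, here is a minimal sketch in C. It is not libxlsxwriter's actual implementation; `escape_carriage_returns` is a hypothetical helper written only to illustrate the mapping, where each `\r` becomes the literal text `_x000D_` and `\n` is copied through unchanged.

```c
#include <string.h>

/* Hypothetical helper (an illustration, not part of libxlsxwriter):
 * copy `in` to `out`, replacing each '\r' with the literal text "_x000D_"
 * and leaving every other byte, including '\n', untouched.
 * Returns 0 on success, or -1 if `out` is too small. */
int escape_carriage_returns(const char *in, char *out, size_t out_size)
{
    static const char escaped[] = "_x000D_";
    const size_t escaped_len = sizeof(escaped) - 1;
    size_t used = 0;

    if (out_size == 0)
        return -1;

    for (; *in != '\0'; in++) {
        size_t need = (*in == '\r') ? escaped_len : 1;

        /* Always keep one byte free for the terminating NUL. */
        if (used + need + 1 > out_size)
            return -1;

        if (*in == '\r')
            memcpy(out + used, escaped, escaped_len);
        else
            out[used] = *in;

        used += need;
    }

    out[used] = '\0';
    return 0;
}
```

With this sketch, the input `"a\r\nb"` is written as `a_x000D_` followed by a newline and `b`, which matches the text that viewers other than Excel end up displaying.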
This behaviour is mentioned a few places on the internet as well such as here or here or here.
If you don’t want the encoded "\r" in the output file you should strip it from the input data.
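Stripping the `\r` from the input could look like the following minimal sketch. `strip_carriage_returns` is a hypothetical helper, not a libxlsxwriter function, and it assumes you are free to modify the string in place. Note that it removes every `\r`, including one that is not followed by `\n`.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not library API):
 * remove every '\r' from `s` in place, so "\r\n" becomes "\n".
 * Returns the new length of the string. */
size_t strip_carriage_returns(char *s)
{
    char *dst = s;

    for (const char *src = s; *src != '\0'; src++) {
        if (*src != '\r')
            *dst++ = *src;
    }

    *dst = '\0';
    return (size_t)(dst - s);
}
```

The cleaned buffer can then be passed to `worksheet_write_string()` as usual. Doing this in the caller keeps the library's output identical to what Excel itself would write for the same string.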
John
@j15e commented on GitHub (Jul 5, 2018):
If this is the expected behavior of libxlsxwriter, I guess you can close this.
Adding a closing comment with some references so other people don't have to search as long as I did:
I did find those internet mentions too, but I expected libxlsxwriter might handle it the way my previous library (RubyXL) did. I tried to find out why RubyXL behaves differently, but it is not 100% clear.
I haven't found any code handling this specifically in RubyXL, so it looks like a side effect of the underlying XML library, which originally also escaped `\r` (see the discussion here: https://github.com/weshatheleopard/rubyXL/issues/202).
Per the XML spec, line breaks in document nodes should indeed be only `\n`, but "tab, carriage return, line feed, and the legal characters of Unicode" are all legal characters inside XML node text.
So both results are correct.
Maybe I can add the removal of `\r` upstream in the Ruby wrapper. It would be convenient there, since data coming from web forms always contains both `CR LF` (https://www.w3.org/TR/html5/forms.html#the-textarea-element).
PS: It is effectively an issue only with non-Excel applications such as LibreOffice (shows `_x000D_`) and Numbers (shows 2 line breaks).
@jmcnamara commented on GitHub (Jul 5, 2018):
This is the expected (or unexpected) behavior of Excel. libxlsxwriter just implements it.