mirror of
https://github.com/jmcnamara/libxlsxwriter.git
synced 2026-05-15 14:15:54 -06:00
[GH-ISSUE #290] German "Umlauts" corrupt the file. #233
Originally created by @FrankenApps on GitHub (May 19, 2020).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/290
Originally assigned to: @jmcnamara on GitHub.
Hi, thanks for this great library.
Whenever I try to write German umlauts to a cell, the result is a corrupted file that Excel cannot repair. This is a problem because quite a few German names contain these letters.
A basic example would be
I already tried to encode my source file with UTF-8, but unfortunately had no luck so far.
When I insert the Russian letters from the utf8.c example, I get an intact workbook.
I do not understand why this problem arises, because all the .xml files inside the .xlsx file seem to be UTF-8 encoded and should therefore support these characters.
@jmcnamara commented on GitHub (May 19, 2020):
@utelle Could you help with this question?
@jmcnamara commented on GitHub (May 19, 2020):
BTW, the first example works for me on MacOS:
Output:
@FrankenApps commented on GitHub (May 19, 2020):
That is interesting. When I run this code on Windows 10 64-bit
I get the attached file.
I will have to try this on iOS soon and then could give you an update.
BTW, this is where the problem occurs (in sharedStrings.xml):
utf8.xlsx
@utelle commented on GitHub (May 19, 2020):
The German umlaut ä in sharedStrings.xml is encoded in ISO 8859-1 or Windows-1252. That is, it is definitely not encoded as UTF-8, and therefore Excel refuses to open the file correctly.
If you see the string encoding="UTF-8" in the XML files, this simply means that strings are expected to be encoded in UTF-8. It is your responsibility as the developer to make sure that UTF-8 encoding is actually used for strings.
Evidently, your source code editor does not use UTF-8 encoding, but Windows-1252 or ISO 8859-1 encoding.
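To make the difference concrete: in ISO 8859-1 and Windows-1252 the letter ä is the single byte 0xE4, whereas in UTF-8 it is the two bytes 0xC3 0xA4. A lone 0xE4 is not well-formed UTF-8, which is why the XML becomes unreadable. The following is a minimal sketch of a check an application could run on a string before passing it to the library. The name is_valid_utf8 is a hypothetical helper introduced here for illustration; it is not part of libxlsxwriter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a libxlsxwriter function): returns true if the
 * NUL-terminated string is well-formed UTF-8, false otherwise. */
bool is_valid_utf8(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;

    while (*s) {
        size_t extra;      /* continuation bytes expected after the lead byte */
        unsigned long cp;  /* code point decoded so far */
        unsigned long min; /* smallest code point allowed for this length */

        if (*s < 0x80)                { cp = *s;        extra = 0; min = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; min = 0x80; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; min = 0x800; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; min = 0x10000; }
        else return false; /* stray continuation byte or invalid lead byte */

        s++;
        for (; extra > 0; extra--) {
            /* Every continuation byte must look like 10xxxxxx.
             * This also rejects a string that ends mid-sequence. */
            if ((*s & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (unsigned long)(*s & 0x3F);
            s++;
        }

        if (cp < min)                     return false; /* overlong encoding */
        if (cp > 0x10FFFF)                return false; /* beyond Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; /* surrogate */
    }
    return true;
}
```

With this sketch, is_valid_utf8("\xC3\xA4") accepts the UTF-8 form of ä, while is_valid_utf8("\xE4") rejects the Windows-1252 form.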
There are 2 possible approaches to overcome the problem:
The latter would be either
or
However, I'm not sure whether the second form really works for standard C strings. It could be that this form can be used only for wide strings (wchar_t).
If you enter the text strings containing German umlauts or other Unicode characters from other sources (user interface, files etc.), you will have to make sure that those sources are UTF-8 encoded.
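One way to be independent of the editor's encoding is to produce the UTF-8 bytes explicitly. Hex escapes in a C string literal are copied byte for byte, so "\xC3\xA4" always yields the UTF-8 form of ä. The sketch below shows how those bytes are derived from a Unicode code point. The function utf8_encode is a hypothetical helper written for this illustration, not part of libxlsxwriter, and it is not necessarily the form originally suggested in this comment.

```c
#include <stddef.h>

/* Hypothetical helper (not a libxlsxwriter function): encodes one Unicode
 * code point as UTF-8 into out, which must hold at least 5 bytes.
 * The output is NUL-terminated. Returns the number of bytes written
 * (excluding the NUL), or 0 if cp is not a valid Unicode scalar value. */
size_t utf8_encode(unsigned long cp, char *out)
{
    size_t n;

    if (cp < 0x80) {
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) { /* surrogates are not encodable */
            out[0] = '\0';
            return 0;
        }
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFF) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        out[0] = '\0';
        return 0;
    }

    out[n] = '\0';
    return n;
}
```

For ä (U+00E4) this produces the two bytes 0xC3 0xA4. A buffer filled this way, or a literal written with the equivalent hex escapes, could then be passed as the string argument of worksheet_write_string().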
@FrankenApps commented on GitHub (May 19, 2020):
@utelle Thanks a lot.
I figured it out; unfortunately, the problem was on my side.
I saved the source file as Unicode (UTF-8 with signature) - Codepage 65001 from Visual Studio. However, this does not seem to be the correct setting.
Saving the file as Unicode (UTF-8 without signature) - Codepage 65001 works as expected.
The method of using
works for me, too.
@jmcnamara commented on GitHub (May 19, 2020):
Thanks @utelle