[GH-ISSUE #311] Support for Dynamic Array Functions like FILTER #252

Closed
opened 2026-05-05 12:02:12 -06:00 by gitea-mirror · 12 comments
Owner

Originally created by @Robert-M-Muench on GitHub (Nov 5, 2020).
Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/311

Originally assigned to: @jmcnamara on GitHub.

Looks like the new dynamic array functions result in a bit different XML then before. Any chance to add these?

xmllint --format bsh/xl/worksheets/sheet3.xml | rg "filter"
        <f t="array" ref="E2:T326">_xlfn._xlws.FILTER(... <f>
Originally created by @Robert-M-Muench on GitHub (Nov 5, 2020). Original GitHub issue: https://github.com/jmcnamara/libxlsxwriter/issues/311 Originally assigned to: @jmcnamara on GitHub. Looks like the new dynamic array functions result in a bit different XML then before. Any chance to add these? ``` xmllint --format bsh/xl/worksheets/sheet3.xml | rg "filter" <f t="array" ref="E2:T326">_xlfn._xlws.FILTER(... <f> ```
gitea-mirror 2026-05-05 12:02:12 -06:00
Author
Owner

@jmcnamara commented on GitHub (Nov 5, 2020):

That is already supported. You just need to prefix the formula with _xlfn._xlws.

This is explained, somewhat, in this section of the docs: Formulas added in Excel 2010 and later.

For example:

#include "xlsxwriter.h"

int main() {

    /* Create a new workbook and add a worksheet. */
    lxw_workbook  *workbook  = workbook_new("array_formula.xlsx");
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);

    worksheet_write_array_formula(worksheet, 3, 5, 18, 8, "_xlfn._xlws.FILTER(A5:D20,C5:C20=H2,\"\")", NULL);

    /* It can also be written like this: */
    /* worksheet_write_array_formula(worksheet, 3, 5, 18, 8, "{=_xlfn._xlws.FILTER(A5:D20,C5:C20=H2,\"\")}", NULL); */

    workbook_close(workbook);

    return 0;
}

Output:

Screenshot 2020-11-05 at 11 38 55

<!-- gh-comment-id:722325057 --> @jmcnamara commented on GitHub (Nov 5, 2020): That is already supported. You just need to prefix the formula with `_xlfn._xlws.` This is explained, somewhat, in this section of the docs: [Formulas added in Excel 2010 and later](http://libxlsxwriter.github.io/working_with_formulas.html#ww_formulas_future). For example: ```C #include "xlsxwriter.h" int main() { /* Create a new workbook and add a worksheet. */ lxw_workbook *workbook = workbook_new("array_formula.xlsx"); lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL); worksheet_write_array_formula(worksheet, 3, 5, 18, 8, "_xlfn._xlws.FILTER(A5:D20,C5:C20=H2,\"\")", NULL); /* It can also be written like this: */ /* worksheet_write_array_formula(worksheet, 3, 5, 18, 8, "{=_xlfn._xlws.FILTER(A5:D20,C5:C20=H2,\"\")}", NULL); */ workbook_close(workbook); return 0; } ``` Output: ![Screenshot 2020-11-05 at 11 38 55](https://user-images.githubusercontent.com/94267/98236591-83e15600-1f5b-11eb-8f07-39db1e445eda.png)
Author
Owner

@Robert-M-Muench commented on GitHub (Nov 5, 2020):

Yes, I found this out after digging a bit around. Pretty cool. However, there is a little difference, as you can see:

image

In the upper case, you see that FILTER is not an array function. And I use a dynamic changing filter clause, which filters a list of 300+ rows down to the ones you see.

In the lower case, you see the effect written with libxlsxwriter, which uses an array function, which leads to all the "#NV" errors for rows that don't fit the filter criteria.

I took a look at the XML but can't find any difference between both versions. Maybe in the upper case, the cell->type is something different than what libxlsxwriter writes out (ARRAY_FORMULA_CELL):

image

Is there a way to write such an array formula without the {...} characters/type which indicates a (static) array function?

<!-- gh-comment-id:722487779 --> @Robert-M-Muench commented on GitHub (Nov 5, 2020): Yes, I found this out after digging a bit around. Pretty cool. However, there is a little difference, as you can see: ![image](https://user-images.githubusercontent.com/1826391/98267045-3c72be00-1f8b-11eb-9005-4be779c7a4a2.png) In the upper case, you see that FILTER is not an array function. And I use a dynamic changing filter clause, which filters a list of 300+ rows down to the ones you see. In the lower case, you see the effect written with libxlsxwriter, which uses an array function, which leads to all the "#NV" errors for rows that don't fit the filter criteria. I took a look at the XML but can't find any difference between both versions. Maybe in the upper case, the cell->type is something different than what libxlsxwriter writes out (ARRAY_FORMULA_CELL): ![image](https://user-images.githubusercontent.com/1826391/98267708-f833ed80-1f8b-11eb-90df-80112432f8f3.png) Is there a way to write such an array formula without the `{...}` characters/type which indicates a (static) array function?
Author
Owner

@Robert-M-Muench commented on GitHub (Nov 5, 2020):

The differences in the XML are for the row entries:

image

Left side is the one which doesn't produce the "#NV" errors.

<!-- gh-comment-id:722498838 --> @Robert-M-Muench commented on GitHub (Nov 5, 2020): The differences in the XML are for the row entries: ![image](https://user-images.githubusercontent.com/1826391/98270216-f0297d00-1f8e-11eb-9cae-a9578b6e9f69.png) Left side is the one which doesn't produce the "#NV" errors.
Author
Owner

@jmcnamara commented on GitHub (Nov 5, 2020):

I'll look into it a bit more and let you know what I find. I have some ideas about what the issue may be but I need to test them out.

<!-- gh-comment-id:722540948 --> @jmcnamara commented on GitHub (Nov 5, 2020): I'll look into it a bit more and let you know what I find. I have some ideas about what the issue may be but I need to test them out.
Author
Owner

@jmcnamara commented on GitHub (Nov 6, 2020):

I think the issue (or one of the issues) is that the cell contains a reference to a Metadata file that contains additional information about the formula.

For example here is the cell and formula elements from a file similar to the one created in the example above but created in Excel:

      <c r="F4" cm="1">
        <f t="array" ref="F4:I19">_xlfn._xlws.FILTER(A5:D20,C5:C20=H2,"")</f>
        <v>0</v>
      </c>

The difference between this and libxlsxwriter generated file is the cm="1" attribute. Removing this from the Excel file creates a {=FILTER()} style formula when opened in Excel.

The cm reference is to the xl/metadata.xml file.

This, or a secondary issue, may be the reason behind the #NV values. Could you attach the Excel file shown above and the libxlsxwriter equivalent to the GitHub issue and I'll dig into it a bit deeper. Or else email them to jmcnamara@cpan.org.

<!-- gh-comment-id:722724985 --> @jmcnamara commented on GitHub (Nov 6, 2020): I think the issue (or one of the issues) is that the cell contains a reference to a Metadata file that contains additional information about the formula. For example here is the cell and formula elements from a file similar to the one created in the example above but created in Excel: ```xml <c r="F4" cm="1"> <f t="array" ref="F4:I19">_xlfn._xlws.FILTER(A5:D20,C5:C20=H2,"")</f> <v>0</v> </c> ``` The difference between this and libxlsxwriter generated file is the `cm="1"` attribute. Removing this from the Excel file creates a `{=FILTER()}` style formula when opened in Excel. The `cm` reference is to the `xl/metadata.xml` file. This, or a secondary issue, may be the reason behind the `#NV` values. Could you attach the Excel file shown above and the libxlsxwriter equivalent to the GitHub issue and I'll dig into it a bit deeper. Or else email them to jmcnamara@cpan.org.
Author
Owner

@Robert-M-Muench commented on GitHub (Nov 6, 2020):

Example XLSX files send to you.

<!-- gh-comment-id:722992404 --> @Robert-M-Muench commented on GitHub (Nov 6, 2020): Example XLSX files send to you.
Author
Owner

@jmcnamara commented on GitHub (Nov 8, 2020):

Thanks for the files.

The bad news is that if you remove the cm=n reference from the Excel generated cell attributes the formula stops working as expected and starts to behave like the libxlsxwriter file. It still works for the original area but the {=FILTER()} is present and if you change the "Range to" value the displayed cell range doesn't change. And if the new range is smaller than the old range the #N/A values return. I'll send you the file for reference.

So unfortunately this won't be fixable without adding the metadata.xlm file, the metadata elements, the cell linkage and the .rels and Content_Types additions. That is quite a bit of work which doesn't have a good return on investment for such a small range of new dynamic array formulas. So, unfortunately, for now, this is probably a won't fix.

From an implementors point of view this is a rather disappointing change in the way formulas are handled in the file format. This is the first set of functions in 13 years that Excel didn't just load and re-evaluate directly from the worksheet.xml file.

<!-- gh-comment-id:723573203 --> @jmcnamara commented on GitHub (Nov 8, 2020): Thanks for the files. The bad news is that if you remove the `cm=n` reference from the Excel generated cell attributes the formula stops working as expected and starts to behave like the libxlsxwriter file. It still works for the original area but the `{=FILTER()}` is present and if you change the "Range to" value the displayed cell range doesn't change. And if the new range is smaller than the old range the `#N/A` values return. I'll send you the file for reference. So unfortunately this won't be fixable without adding the metadata.xlm file, the metadata elements, the cell linkage and the .rels and Content_Types additions. That is quite a bit of work which doesn't have a good return on investment for such a small range of new dynamic array formulas. So, unfortunately, for now, this is probably a won't fix. From an implementors point of view this is a rather disappointing change in the way formulas are handled in the file format. This is the first set of functions in 13 years that Excel didn't just load and re-evaluate directly from the worksheet.xml file.
Author
Owner

@Robert-M-Muench commented on GitHub (Nov 8, 2020):

The #N/A is ugly but would work in my use-case.

I think the dynamic array functions will become more and more mainstream because they are very powerful. And that MS did implement them in such a complicated way might be a hint, that they just don't want to let others use them...

How about estimating a sponsoring for this feature?

<!-- gh-comment-id:723587264 --> @Robert-M-Muench commented on GitHub (Nov 8, 2020): The `#N/A` is ugly but would work in my use-case. I think the dynamic array functions will become more and more mainstream because they are very powerful. And that MS did implement them in such a complicated way might be a hint, that they just don't want to let others use them... How about estimating a sponsoring for this feature?
Author
Owner

@jmcnamara commented on GitHub (Nov 9, 2020):

I'll try look into implementing it over the Christmas holidays.

<!-- gh-comment-id:724271555 --> @jmcnamara commented on GitHub (Nov 9, 2020): I'll try look into implementing it over the Christmas holidays.
Author
Owner

@Robert-M-Muench commented on GitHub (Nov 12, 2020):

Great! What sources do you use to dig through the XLSX file-format?

<!-- gh-comment-id:726389283 --> @Robert-M-Muench commented on GitHub (Nov 12, 2020): Great! What sources do you use to dig through the XLSX file-format?
Author
Owner

@jmcnamara commented on GitHub (Nov 14, 2020):

What sources do you use to dig through the XLSX file-format?

I mainly just unzip the files and diff them. I use a small Perl program to unzip the xlsx and fix some of the xml/vml files to make them more suitable for diffing: such as sort .rel entries. I then use Araxis Merge for a visual diff because it splits the single line xml data into multiple (virtual) lines which make the comparisons easier. I occasionally use xmllint --format for local inspection. And I sometimes refer to ECMA-376-1 Office Open XML File Formats — Fundamentals and Markup Language Reference specification (like in this case to see what the cm attribute was).

Nothing very fancy or complicated. :-)

<!-- gh-comment-id:727107943 --> @jmcnamara commented on GitHub (Nov 14, 2020): > What sources do you use to dig through the XLSX file-format? I mainly just unzip the files and diff them. I use a small Perl program to unzip the xlsx and fix some of the xml/vml files to make them more suitable for diffing: such as sort .rel entries. I then use Araxis Merge for a visual diff because it splits the single line xml data into multiple (virtual) lines which make the comparisons easier. I occasionally use `xmllint --format` for local inspection. And I sometimes refer to [ECMA-376-1 Office Open XML File Formats — Fundamentals and Markup Language Reference](http://www.ecma-international.org/publications/standards/Ecma-376.htm) specification (like in this case to see what the cm attribute was). Nothing very fancy or complicated. :-)
Author
Owner

@jmcnamara commented on GitHub (Apr 22, 2021):

Closing this and merging it with #327

<!-- gh-comment-id:824830873 --> @jmcnamara commented on GitHub (Apr 22, 2021): Closing this and merging it with #327
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/libxlsxwriter#252
No description provided.