feat(python): Write data at table level in write_excel
#17757
Ahh... the contention is that
Co-authored-by: Alexander Beedie <[email protected]>
Yeah it does (which is a bit weird, no?). I'm looking into the date formatting issue; it's coming back as YYYY-MM-DD regardless of the output format. I'm off to see a movie now but I'll look into this later.
It's coming back to me now;
Have fun ;) If you can find a way to address this issue then it looks like a sensible update to me.
@alexander-beedie I figured it out. This might be useful info for you in the future, and it doesn't really make sense, which is why it's confusing: when the workbook's
(failure) Write data (with no table), set column format
from datetime import date
import xlsxwriter
# specify default workbook date format
wb = xlsxwriter.Workbook("date_format.xlsx", {"default_date_format": "yyyy.dd.mm"})
ws = wb.add_worksheet("Date")
data = [date(2024, 1, 1), date(2024, 1, 2)]
ws.write_column(0, 0, data)
# assign column format (has no effect)
date_format = wb.add_format({"num_format": "mm-dd-yyyy"})
ws.set_column(0, 0, 10, date_format)
wb.close()
Result: column format completely ignored.
(failure) Write empty table, then write data, then set column format
Next, if we set the default date format, and use
from datetime import date
import xlsxwriter
wb = xlsxwriter.Workbook("date_format.xlsx", {"default_date_format": "yyyy.dd.mm"})
ws = wb.add_worksheet("Date")
data = [[date(2024, 1, 1)], [date(2024, 1, 2)]]
date_format = wb.add_format({"num_format": "mm-dd-yyyy"})
ws.add_table("A1:A2", options={
    "header_row": False,
    "columns": [{"format": date_format}]
})
# write the data afterwards
ws.write_column(0, 0, [date(2024, 1, 1), date(2024, 1, 2)])
ws.set_column(0, 0, 10, date_format)
wb.close()
(success) Supply the data with the table.
If we supply the data to the
from datetime import date
import xlsxwriter
wb = xlsxwriter.Workbook("date_format.xlsx", {"default_date_format": "yyyy.dd.mm"})
ws = wb.add_worksheet("Date")
data = [[date(2024, 1, 1)], [date(2024, 1, 2)]]
date_format = wb.add_format({"num_format": "mm-dd-yyyy"})
ws.add_table(
    "A1:A2",
    options={
        "data": data,  # !!! SEE HERE !!!
        "header_row": False,
        "columns": [{"format": date_format}],
    },
)
wb.close()
Moving the data write to the
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #17757 +/- ##
==========================================
+ Coverage 80.40% 80.53% +0.12%
==========================================
Files 1502 1503 +1
Lines 197041 197026 -15
Branches 2794 2800 +6
==========================================
+ Hits 158439 158676 +237
+ Misses 38088 37830 -258
- Partials 514 520 +6
☔ View full report in Codecov by Sentry.
Nice discovery - any potential downsides to this approach? The tests can't really validate formatting, so do we see the expected formatting when eyeballing some of the more sophisticated examples manually? (I'll start double-checking some ;)
I don't think so. The "heavily customized formatting/definition" parameter set in which includes the top/bottom formatting, column-specific formatting, and dtype-specific formatting. Also, I left in the column-formatting section, which is probably not strictly necessary any more, but I feel it probably can't hurt if someone decides to manually add data below the table for whatever reason.
Hmm, I've tested on some other cases and it looks like the new code is not behaving quite the same in several instances - specifically, issues with the table header.
For example:
pl.DataFrame({
    "id": ["a123", "b345", "c567", "d789"],
    "values": [99, 45, 50, 85],
    "misc": [1.2, 3.4, 5.6, 7.8],
}).write_excel(
    "~/output.xlsx",
    table_style={"style": "Table Style Medium 15"},
)
With this patch the second and third column header names end up black; they are present, but do not conform to the given table_style, which defines them as bold/white, so they look like they aren't there 🤔
Before / After: (screenshots)
Will need some further tinkering :)
(Interestingly there is also a 2-pixel per column increase in column width with the new code, though that's not important - just an odd observation!)
Thanks, I'll take a look at this tonight and do some more thorough testing. I wonder if applying the column-level formatting overrides the table formatting, since the numerical columns are applying a
Excel is a dark art 😆
@alexander-beedie I reproduced, and indeed removing the column formatting fixed it. I think that we should simply leave the formatting to
Nice; will take another run through it today 👍
Looks good :)
Resolves #17756.
Previously, write_column was used to write data to the table, which writes formatting to each cell individually. This update writes the data in the add_table step, which is much simpler and also supplies the formatting to the table, which in turn formats the columns. The result is a simpler data write, simpler formatting, and a (very minor) reduction in file size.
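For illustration only, here is a minimal sketch of the table-level write described above, using plain xlsxwriter rather than the actual write_excel code. The helper names (`columns_to_rows`, `build_table_options`, `write_table`) and the sample data are hypothetical and not taken from this PR. The sketch assumes xlsxwriter is installed and that `add_table` accepts row-major `data` plus per-column `header`/`format` options, as in the examples earlier in this thread.

```python
from datetime import date

# Hypothetical sample frame, column-major like a DataFrame.
SAMPLE = {
    "day": [date(2024, 1, 1), date(2024, 1, 2)],
    "value": [1, 2],
}


def columns_to_rows(columns):
    """Transpose {name: [values, ...]} into (names, row-major rows)."""
    names = list(columns)
    rows = [list(row) for row in zip(*(columns[name] for name in names))]
    return names, rows


def build_table_options(columns, column_formats=None):
    """Build an options dict for add_table with the data supplied up front.

    column_formats is an assumed {column name: format object} mapping.
    """
    column_formats = column_formats or {}
    names, rows = columns_to_rows(columns)
    col_defs = []
    for name in names:
        col_def = {"header": name}
        if name in column_formats:
            col_def["format"] = column_formats[name]
        col_defs.append(col_def)
    return {"data": rows, "columns": col_defs}


def write_table(path, columns, num_formats=None):
    """Write the columns as one table; no per-cell write_column calls."""
    import xlsxwriter  # third-party; imported lazily so the helpers run without it

    names, rows = columns_to_rows(columns)
    with xlsxwriter.Workbook(path) as wb:
        ws = wb.add_worksheet()
        formats = {
            name: wb.add_format({"num_format": fmt})
            for name, fmt in (num_formats or {}).items()
        }
        options = build_table_options(columns, formats)
        # Range covers one header row plus the data rows.
        ws.add_table(0, 0, len(rows), len(names) - 1, options)
```

Calling `write_table("table_level.xlsx", SAMPLE, {"day": "mm-dd-yyyy"})` should then produce a single table whose date column takes its number format from the table definition. As noted in the review, formatting still has to be checked by eye in Excel.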