feat(python): add "pyxlsb" engine support to read_excel
(for excel binary workbook files)
#11248
Merged
stinodego
merged 4 commits into
pola-rs:main
from
alexander-beedie:read-binary-excel-workbooks
Sep 26, 2023
alexander-beedie
requested review from
ritchie46 and
stinodego
as code owners
September 22, 2023 13:49
alexander-beedie
changed the title
feat(python): adds "pyxlsb" engine support to read_excel (for reading binary workbook files)
to
feat(python): add "pyxlsb" engine support to read_excel (for reading excel binary workbook files)
Sep 22, 2023
github-actions
bot
added the
enhancement
and
python
labels
Sep 22, 2023
alexander-beedie
changed the title
feat(python): add "pyxlsb" engine support to read_excel (for reading excel binary workbook files)
to
feat(python): add "pyxlsb" engine support to read_excel (for excel binary workbook files)
Sep 22, 2023
stinodego
reviewed
Sep 25, 2023
Great stuff as always! Some minor remarks/questions.
alexander-beedie
force-pushed
the
read-binary-excel-workbooks
branch
from
September 25, 2023 21:11
1d59209
to
a932419
stinodego
approved these changes
Sep 26, 2023
All right, good to go!
romanovacca
pushed a commit
to romanovacca/polars
that referenced
this pull request
Oct 1, 2023
…binary workbook files) (pola-rs#11248)
Closes #11181 (and also closes #11184 by adding a .. versionadded tag to the docs).

Adds support for the pyxlsb engine so that we can also read Excel Binary Workbook files (those with an ".xlsb" extension, which are not compatible with any of the existing ".xlsx" engines). Note that this engine does not currently autodetect datetime/date columns (it reads them in as Excel's native offset-Julian float), and therefore requires the use of schema_overrides to load them correctly (I have added a note in the docstring about this).

Also: improves Date parsing for OpenOffice files (via read_ods).

Also: slightly updates show_versions() docs/example with the latest libs.

Support for spreadsheet data is improving nicely; with this update read_excel can now read all of the major Excel formats (".xlsx", ".xlsm", ".xlsb"), and we can handle the OpenOffice ".ods" format via read_ods.

Example
Reading ".xlsb" files:
Before
After
FYI: @SaelKimberly has been experimenting with writing a potentially superior ".xlsb" reading engine; if & when this is ready (and well tested and available from pypi ;) we can look at including that as the default .xlsb engine instead - looking forward to it 👍
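
For context on the date caveat in the description: an Excel "offset-Julian float" is a count of days from a fixed epoch, with the fractional part giving the time of day. The following is a minimal stdlib-only sketch of that conversion, not the code added in this PR and not how Polars or pyxlsb implement it. It assumes the workbook uses Excel's default 1900 date system (not the 1904 system) and only handles serials from March 1900 onwards; the names EXCEL_EPOCH, excel_serial_to_datetime and excel_serial_to_date are hypothetical.

```python
from datetime import date, datetime, timedelta

# Assumption: the workbook uses Excel's 1900 date system. Using 1899-12-30 as
# day zero absorbs Excel's fictitious 1900-02-29, so results are only correct
# for serial values of 61 (1900-03-01) and above.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel serial number (whole days + fractional day) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


def excel_serial_to_date(serial: float) -> date:
    """Convert an Excel serial number to a date, discarding any time-of-day part."""
    return excel_serial_to_datetime(serial).date()


if __name__ == "__main__":
    print(excel_serial_to_date(45195.0))      # 2023-09-26
    print(excel_serial_to_datetime(45195.5))  # 2023-09-26 12:00:00
```

This is the kind of float a date column comes back as when no dtype is supplied, which is why the description recommends passing schema_overrides for such columns.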