sessionInfo() output
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8 LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
time zone: America/Chicago
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] magrittr_2.0.3 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[10] ggplot2_3.5.1 tidyverse_2.0.0 readxl_1.4.3 XLConnect_1.0.10
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 cli_3.6.3 rlang_1.1.4 stringi_1.8.4 generics_0.1.3 rJava_1.0-11 glue_1.7.0 colorspace_2.1-0 hms_1.1.3
[10] scales_1.3.0 fansi_1.0.6 grid_4.4.1 cellranger_1.1.0 munsell_0.5.1 tzdb_0.4.0 lifecycle_1.0.4 compiler_4.4.1 timechange_0.3.0
[19] pkgconfig_2.0.3 rstudioapi_0.16.0 R6_2.5.1 tidyselect_1.2.1 utf8_1.2.4 pillar_1.9.0 tools_4.4.1 withr_3.0.0 gtable_0.3.5
Additional environment information
No response
Description
I'm reading a large-ish Excel file and get an error from loadWorkbook(). I suspect the error originates upstream rather than in this package, but I'm unable to resolve it. As a workaround, I use rig to roll back to an earlier R version (and library) that doesn't produce the error. I'm posting here because I suspect other packages, e.g. in Python, have run into the same thing; see, for example, nightscape/spark-excel#231.
Expected behavior
Error: IOException (Java): The file appears to be potentially malicious. This file embeds more internal file entries than expected.
This may indicates that the file could pose a security risk.
You can adjust this limit via ZipSecureFile.setMaxFileCount() if you need to work with files which are very large.
Limits: MAX_FILE_COUNT: 1000
How to Reproduce
The linked file is an example of one that generates this error when I call loadWorkbook() on it with the most recent version of R: https://www.dropbox.com/scl/fi/9snzbt6sjrxz9noz3y1z2/University-of-Wisconsin-Madison.xlsx?rlkey=ron57jqpzcvhd9d7w9wg7aqdj&st=wfqohp76&dl=0 (This is public data that colleges of engineering submit to us annually about their programs.)
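A possible workaround, following the hint in the error message, is to raise the limit through rJava before loading the file. This is only a sketch and I have not confirmed it against this file: it assumes the Apache POI version bundled with XLConnect exposes the static method ZipSecureFile.setMaxFileCount(long) named in the error, and that loading the XLConnect namespace puts POI on the JVM class path. The helper names set_max_file_count and load_large_workbook, the limit of 10000, and the local file path are all made up for illustration.

```r
# Sketch only: raise Apache POI's zip entry limit before loading a workbook.
# Assumption: the POI jars shipped with XLConnect provide
# org.apache.poi.openxml4j.util.ZipSecureFile.setMaxFileCount(long).

set_max_file_count <- function(limit = 10000) {
  # Validate in R first so a bad value never reaches the JVM.
  stopifnot(is.numeric(limit), length(limit) == 1, limit > 0)
  # Static Java call: class name in JNI notation, "V" = void return type,
  # .jlong() wraps the R number as a Java long.
  rJava::.jcall(
    "org/apache/poi/openxml4j/util/ZipSecureFile",
    "V",
    "setMaxFileCount",
    rJava::.jlong(limit)
  )
  invisible(limit)
}

load_large_workbook <- function(path, limit = 10000) {
  # Assumption: loading the XLConnect namespace starts the JVM with POI available.
  requireNamespace("XLConnect")
  set_max_file_count(limit)
  XLConnect::loadWorkbook(path)
}

# Usage (hypothetical local copy of the file linked above):
# wb <- load_large_workbook("University-of-Wisconsin-Madison.xlsx")
```

The default limit reported in the error is 1000 entries, so any value comfortably above the number of entries in the workbook should be enough. Raising it relaxes a safety check, so it is probably only appropriate for files from a trusted source.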