ValueError ordinal must be >= 1 #79

Open
VolodyaCO opened this issue Feb 17, 2021 · 2 comments


@VolodyaCO

I'm trying to use parquet.reader(file_obj), but when I run it on my Parquet file I get this error:

    for row in parquet.reader(fo):
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 472, in reader
    dict_items = _read_dictionary_page(file_obj, schema_helper, page_header, cmd)
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 395, in _read_dictionary_page
    return convert_column(values, schema_element) if schema_element.converted_type is not None else values
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in convert_column
    return [datetime.date.fromordinal(d) for d in data]
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in <listcomp>
    return [datetime.date.fromordinal(d) for d in data]

What can I do?

@jcrobak
Owner

jcrobak commented Mar 27, 2021

Hi, did you open the file in binary mode? We recently updated the example in the README: https://github.com/jcrobak/parquet-python#example

@VolodyaCO
Author

The error remains:

>>> import parquet
>>> with open("victimas_union_recat.parquet", "rb") as fo:
...   for row in parquet.reader(fo):
...     pass
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 472, in reader
    dict_items = _read_dictionary_page(file_obj, schema_helper, page_header, cmd)
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 395, in _read_dictionary_page
    return convert_column(values, schema_element) if schema_element.converted_type is not None else values
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in convert_column
    return [datetime.date.fromordinal(d) for d in data]
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in <listcomp>
    return [datetime.date.fromordinal(d) for d in data]
ValueError: ordinal must be >= 1

In the end I used pyarrow instead (as recommended by the pandas.read_parquet method).
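
For anyone hitting the same error: the traceback above ends in `datetime.date.fromordinal(d)`, which counts days from 0001-01-01 (ordinal 1) and rejects anything below 1. Parquet's DATE type stores a signed number of days relative to the Unix epoch, so a date on or before 1970-01-01 would be stored as 0 or a negative number. Assuming the failing column is a DATE column with such values (the thread does not confirm this), that would explain the `ValueError`. The sketch below uses only the standard library to show the difference. `days_since_epoch_to_date` and `try_fromordinal` are hypothetical helpers written for illustration, not part of parquet-python.

```python
import datetime

# Assumption: the raw values are Parquet DATE values, i.e. signed day
# offsets relative to the Unix epoch (1970-01-01).
EPOCH = datetime.date(1970, 1, 1)


def days_since_epoch_to_date(days):
    """Convert a day offset from 1970-01-01 into a datetime.date."""
    return EPOCH + datetime.timedelta(days=days)


def try_fromordinal(value):
    """Return the date for a proleptic Gregorian ordinal, or the
    error message if date.fromordinal rejects the value."""
    try:
        return datetime.date.fromordinal(value)
    except ValueError as exc:
        return str(exc)


# fromordinal starts at 0001-01-01 == ordinal 1, so 0 is rejected.
print(try_fromordinal(0))            # the ValueError message from the traceback
print(days_since_epoch_to_date(0))   # 1970-01-01
print(days_since_epoch_to_date(-1))  # 1969-12-31
```

If this reading is right, positive values passed to `fromordinal` would not raise but would decode to dates in the first centuries AD rather than the intended ones, so an epoch-based conversion like the one sketched here seems to be what the column needs.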
