Exporting seems to convert some numbers to floats #306

Open
mikix opened this issue Oct 4, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug
I noticed this when uploading data: the version was being written as a float (e.g. 1.0) in the parquet file despite being defined as an int.

This only affects the parquet files, not the csv files. This may affect other code paths too, but I only happened to notice it with the version.

To Reproduce
Steps to reproduce the behavior:

  1. Take a study that has a version defined like:
CREATE TABLE data_metrics__meta_version AS
SELECT 1 AS data_package_version;

(You can even add CAST(1 AS INTEGER) if you like.)
  2. Build and export it.
  3. Inspect the parquet file - it holds a float type there.

Expected behavior
I'd expect integer type.

Additional context
This seems to be due to DuckDB's col_pyarrow_types_from_sql (called by exporter.py) converting NUMBER to float64. However, that mapping is not wrong per se, since the description DuckDB gives back reports NUMBER for ints and floats alike:

>>> duckdb.sql("SELECT 1 AS i")
┌───────┐
│   i   │
│ int32 │
├───────┤
│     1 │
└───────┘
>>> duckdb.sql("SELECT 1 AS i").description
[('i', 'NUMBER', None, None, None, None, None)]
>>> duckdb.sql("SELECT CAST(1 AS FLOAT) AS i").description
[('i', 'NUMBER', None, None, None, None, None)]

So... ideally we'd ask duckdb for a more precise schema description - this one is super vague, since it reports NUMBER for both the int and the float column. (Note that you can't do CAST(1 AS NUMBER) -- duckdb doesn't recognize it as a type -- don't know why it's using it in this description.)
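
One possible direction, as a rough sketch rather than a proposed patch: ask DuckDB to DESCRIBE the query and map the concrete type names it reports (INTEGER, FLOAT, ...) instead of going through the vague NUMBER code. Everything named below (DUCKDB_TO_PYARROW, pyarrow_aliases_from_describe) is hypothetical and not part of the existing exporter. The rows are written by hand in the shape I'd assume DESCRIBE returns (column name first, column type second), so the snippet runs without duckdb installed.

```python
# Hypothetical mapping from DuckDB column type names to pyarrow type aliases.
# Only a handful of common types are covered; extend as needed.
DUCKDB_TO_PYARROW = {
    "BOOLEAN": "bool",
    "INTEGER": "int32",
    "BIGINT": "int64",
    "FLOAT": "float32",
    "DOUBLE": "float64",
    "VARCHAR": "string",
}


def pyarrow_aliases_from_describe(rows):
    """Map column names to pyarrow type aliases.

    `rows` is assumed to look like the result of running
    `DESCRIBE <query>` in DuckDB: tuples whose first two fields are
    the column name and the column type name.
    """
    aliases = {}
    for name, type_name, *_ in rows:
        if type_name not in DUCKDB_TO_PYARROW:
            # Fail loudly instead of silently falling back to float64.
            raise ValueError(f"No pyarrow mapping for DuckDB type {type_name!r}")
        aliases[name] = DUCKDB_TO_PYARROW[type_name]
    return aliases


# Hand-written rows in the assumed DESCRIBE shape, e.g. what one might get from
#   duckdb.sql("DESCRIBE SELECT 1 AS i, CAST(1 AS FLOAT) AS f").fetchall()
rows = [
    ("i", "INTEGER", "YES", None, None, None),
    ("f", "FLOAT", "YES", None, None, None),
]
print(pyarrow_aliases_from_describe(rows))
```

The alias strings could then be turned into real pyarrow types (pyarrow.type_for_alias accepts names like these). Parameterized types such as DECIMAL(18,3) are deliberately not handled here and would raise, which at least makes an unmapped type visible.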

@mikix mikix added the bug Something isn't working label Oct 4, 2024