Add software tests using Polars #35

Merged
amotl merged 1 commit into main from amo/test-polars
Jan 29, 2024

@amotl (Member) commented Jan 29, 2024

About

Following up on validating pandas, this patch starts validating the Polars DataFrame library.

Details

Polars does not support writing to databases yet, nor does it support SQLAlchemy's AsyncEngine. So, this exercises pl.read_database() only, together with the crate and postgresql+psycopg_relaxed dialects, which use urllib3 and Psycopg3, respectively.

References

/cc @surister, @SStorm

codecov bot commented Jan 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d0c6791) 97.56% compared to head (93b2a8d) 97.56%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #35   +/-   ##
=======================================
  Coverage   97.56%   97.56%           
=======================================
  Files           3        3           
  Lines          41       41           
=======================================
  Hits           40       40           
  Misses          1        1           
Flag Coverage Δ
unittests 97.56% <ø> (ø)

@amotl requested review from seut and matriv January 29, 2024 02:44
@amotl marked this pull request as ready for review January 29, 2024 02:44
Comment on lines +30 to +43

def test_psycopg_read_sql(cratedb_psql_host, cratedb_psql_port):
    engine = sa.create_engine(
        url=f"postgresql+psycopg_relaxed://crate@{cratedb_psql_host}:{cratedb_psql_port}",
        isolation_level="AUTOCOMMIT",
        use_native_hstore=False,
        echo=True,
    )
    conn = engine.connect()
    df = pl.read_database(
        query=SQL_SELECT_STATEMENT,
        connection=conn,
        schema_overrides={"value": pl.Utf8},
    )
    assert_frame_equal(df, REFERENCE_FRAME)

@amotl (Member Author) commented Jan 29, 2024

Dear @surister,

I am only now reading your comment over at crate/crate#12952 (comment). Can I humbly ask you to check whether the operation on 600K rows worth of data is faster using the Psycopg3 driver, compared to the HTTP/JSON-based driver from crate-python?

pip install 'polars[pyarrow]<0.21' sqlalchemy-postgresql-relaxed

import polars as pl
import sqlalchemy as sa

engine = sa.create_engine("postgresql+psycopg_relaxed://crate@localhost/doc")

df = pl.read_database(
    query="SELECT * FROM metrics",
    connection=engine,
    schema_overrides={"value": pl.Utf8},
)
print(df)

With kind regards,
Andreas.

NB: This does not use the CrateDB SQLAlchemy dialect, so it will probably not support any special data types yet. There is another patch which will bring both together.

That patch will, for example, unlock connectivity using the CrateDB SQLAlchemy dialect together with the Psycopg3 database driver.

engine = sa.create_engine("crate+psycopg://crate@localhost/doc")

@surister commented Feb 6, 2024

  • read_database - postgresql+psycopg_relaxed 8.5s
  • read_database_uri - postgresql 5.13s
  • read_database - crate 8.8s
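
For anyone who wants to repeat this kind of comparison, a minimal timing harness could look like the sketch below. It is not the script that produced the numbers above. The function name `time_readers` and the stand-in workloads are hypothetical; in a real run, each callable would wrap one of the `pl.read_database()` or `pl.read_database_uri()` calls against a running CrateDB instance.

```python
import time
from typing import Callable, Dict


def time_readers(
    readers: Dict[str, Callable[[], object]], repeat: int = 3
) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each named reader."""
    results: Dict[str, float] = {}
    for name, reader in readers.items():
        best = float("inf")
        for _ in range(repeat):
            start = time.perf_counter()
            reader()  # Run the read operation; its result is discarded.
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


if __name__ == "__main__":
    # Stand-in workloads (assumption): replace these with callables that
    # invoke the actual database reads to be compared.
    demo = {
        "small": lambda: sum(range(1_000)),
        "large": lambda: sum(range(200_000)),
    }
    for name, seconds in time_readers(demo).items():
        print(f"{name}: {seconds:.4f}s")
```

Taking the best of several runs reduces the influence of one-off effects such as connection setup or cold caches, which matters when the differences between drivers are within a few seconds.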

@amotl merged commit ddf2ad2 into main Jan 29, 2024
8 checks passed
@amotl deleted the amo/test-polars branch January 29, 2024 17:28