This repository has been archived by the owner on May 17, 2024. It is now read-only.
v0.3.0 - New algorithm for in-db diffing (joindiff) + tons of new features and bugfixes!
Big points:
- Added a new algorithm for in-db diffing that uses OUTER JOIN, called "joindiff".
- Much faster than the original "hashdiff" algorithm!
- Automatically chosen if both dbs are the same
- Validates that the key column is unique and contains no NULLs (joindiff only)
- Explicitly switch between algorithms using the --algorithm parameter.
- New feature to materialize joindiff results to DB
- New feature that diffs the schemas when both dbs are the same
- Added DuckDB support (thanks @jardayn!)
- Better support for alphanumerics
- Better support for boolean types
- Added --version switch
- New and improved database and query interface, named "sqeleton"
- Tons of bugfixes and improvements!
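To make the joindiff idea more concrete, here is a minimal sketch of an outer-join diff with key validation. It is not data-diff's actual implementation: it uses Python's built-in sqlite3 module, and the helper names (validate_key, outer_join_diff) and the tables (src, dst) are hypothetical, chosen only for illustration. The full outer join is emulated with two LEFT JOINs so the sketch also runs on older SQLite versions.

```python
import sqlite3


def validate_key(conn, table, key):
    """Raise ValueError if the key column contains NULLs or duplicates."""
    # Hypothetical helper: COUNT(col) skips NULLs, COUNT(DISTINCT col) skips repeats.
    total, non_null, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT({key}), COUNT(DISTINCT {key}) FROM {table}"
    ).fetchone()
    if non_null != total:
        raise ValueError(f"{table}.{key} contains NULLs")
    if distinct != total:
        raise ValueError(f"{table}.{key} is not unique")


def outer_join_diff(conn, left, right, key, column):
    """Return (key, left_value, right_value) for every row that differs."""
    validate_key(conn, left, key)
    validate_key(conn, right, key)
    query = f"""
        -- Rows missing from the right table, or present with a different value.
        SELECT a.{key}, a.{column}, b.{column}
        FROM {left} a LEFT JOIN {right} b ON a.{key} = b.{key}
        WHERE b.{key} IS NULL OR a.{column} IS NOT b.{column}
        UNION ALL
        -- Rows that exist only in the right table.
        SELECT b.{key}, NULL, b.{column}
        FROM {right} b LEFT JOIN {left} a ON a.{key} = b.{key}
        WHERE a.{key} IS NULL
        ORDER BY 1
    """
    return conn.execute(query).fetchall()


# Hypothetical demo data: both tables live in the same (in-memory) database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER, name TEXT);
    CREATE TABLE dst (id INTEGER, name TEXT);
    INSERT INTO src VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO dst VALUES (1, 'a'), (2, 'B'), (4, 'd');
""")

diff = outer_join_diff(conn, "src", "dst", "id", "name")
for row in diff:
    print(row)
```

The whole comparison runs as a single query inside the database, which is why this style of diff only applies when both tables are reachable from the same connection. The key checks run first because duplicate or NULL keys would make the join produce misleading matches.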
What's Changed
- Join-diff (in-db) + new query builder by @erezsh in #242
- Bugfix: Joindiff crashed when no numeric columns were used. by @erezsh in #255
- Deprecate use of FixedAlphanum by @erezsh in #254
- Refactor tests oct2022 by @erezsh in #253
- General tests now include Presto, Trino & Vertica; Includes small fixes by @erezsh in #256
- Added --materialize-all-rows switch + tests by @erezsh in #258
- Various small fixes and refactors by @erezsh in #260
- Downgrade mysql-connector-python to 8.0.29 by @erezsh in #262
- Update documentation link by @williebsweet in #263
- Small changes by @erezsh in #264
- Added link on how to get a slack invite by @jardayn in #265
- link to docs and incorporate roman/gerard feedback by @leoebfolsom in #266
- Tiny Cleanup by @erezsh in #267
- tests for unique key constraints (if possible) instead of always actively validating (+ tests) by @erezsh in #257
- Attempt to fix PR #269 by @erezsh in #272
- Contrib improvements + Fixed Test by @jardayn in #269
- Refactor dialect by @erezsh in #271
- Tests: Improvements to CI flow + fixes by @erezsh in #274
- Bugfix in alphanums (reported by Guarav Singh) by @erezsh in #277
- Fix databricks by @pik94 in #273
- Added support for Boolean types by @erezsh in #282
- Fixed broken "How To Use" links in README. by @daniel-leicht in #290
- Fix for issue #286 by @erezsh in #291
- Materialize: rename and reorder columns by @erezsh in #287
- Revised CLI output to be more understandable and detailed by @erezsh in #292
- New DB Driver guide update by @jardayn in #288
- Duckdb driver for Issue #176 by @jardayn in #276
- Update typing of TableSegment().count() by @MattDelac in #293
- Refactor common database interface into Sqeleton (databases, queries) by @erezsh in #285
- Added DDB as an extra by @jardayn in #296
- More Sqeleton refactoring by @erezsh in #295
- Added InfoTree as a more descriptive alternative to .stats by @erezsh in #297
- Refactor tests to use insert_rows_in_batches(), instead of internally… by @erezsh in #299
- CLI: Better errors + tiny bugfix by @erezsh in #303
- Rudderstack poc by @kylemcnair in #298
- add databases we support to readme by @leoebfolsom in #309
- Nov22 sqeleton refactor by @erezsh in #308
- Fix readme link by @dlawin in #310
- List tables from schema by @erezsh in #311
- Tests: Set bisection_factor=2 for much faster tests; Fix random failures in test_string_keys by @erezsh in #312
- Nov24 - Small fixes to tests by @erezsh in #313
- Adjustments for PR #314 by @erezsh in #315
- return all duplicated rows by @pik94 in #314
- Cleanup by @erezsh in #320
- Added version and --version switch (issue #318) by @erezsh in #319
- data-diff now uses database A's now instead of cli's now. by @erezsh in #306
- extract methods for stats by @dlawin in #300
- connect(): Added support for shared connection; Database.is_closed property by @erezsh in #323
- Better error messages in databases; Default database in clickhouse is now 'default'. by @erezsh in #325
- diff_tables() now accepts all JoinDiffer params by @erezsh in #326
- CLI: Automatically choose joindiff if dbs are the same (don't rely just on syntax) by @erezsh in #328
- Add version module and add version to tracking by @kylemcnair in #327
- Dec2 cleanup by @erezsh in #329
- fix link to docs by @leoebfolsom in #330
- Fix _normalize_table_path to always return a pair by @erezsh in #333
New Contributors
- @williebsweet made their first contribution in #263
- @jardayn made their first contribution in #265
- @daniel-leicht made their first contribution in #290
- @MattDelac made their first contribution in #293
- @kylemcnair made their first contribution in #298
- @dlawin made their first contribution in #310
Full Changelog: v0.2.8...v0.3.0
Let us know what you think in Discussions!