🚀 UCM - OSS Unity Catalog Migration Assistant
UCM is a CLI for migrating catalogs, schemas, and tables from Databricks to the open-source (OSS) Unity Catalog.
- Python 3.9 or higher
- Databricks configuration file: https://docs.databricks.com/en/dev-tools/auth/config-profiles.html
To install the Unity Catalog Migrator, run:
pipx install unitycatalog-migrate
ucm migrate-catalog NAMES... --profile <databricks-profile>
ucm migrate-schema FULL_NAMES... --profile <databricks-profile>
where FULL_NAMES are in the format catalog.schema
ucm migrate-table FULL_NAMES... --profile <databricks-profile>
where FULL_NAMES are in the format catalog.schema.table
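Since migrate-schema and migrate-table expect differently shaped names, it can help to check a name before passing it on. The sketch below is only an illustration: has_parts is a hypothetical helper, not part of ucm, and it assumes plain identifiers that contain no dots or quoting of their own.

```shell
# has_parts NAME COUNT: succeed when NAME has exactly COUNT non-empty,
# dot-separated parts. Hypothetical helper, not part of ucm.
has_parts() {
  case $1 in
    ''|.*|*.|*..*) return 1 ;;  # reject empty names and empty parts
  esac
  # COUNT parts are separated by COUNT - 1 dots, so count the dots.
  dots=$(printf '%s' "$1" | tr -cd '.' | wc -c)
  [ "$dots" -eq "$(($2 - 1))" ]
}

# Classify a few names by the command they would fit.
for name in catalog1.schema1 catalog1.schema1.table1; do
  if has_parts "$name" 3; then
    echo "table name:  $name"
  elif has_parts "$name" 2; then
    echo "schema name: $name"
  else
    echo "unexpected:  $name" >&2
  fi
done
```

A check like this only guards against typos in the name shape; it does not confirm that the catalog, schema, or table exists.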
The Unity Catalog Migrator uses the following environment variables:
- UC_HOST_URL: The URL of the Unity Catalog server. Default is http://localhost:8080/api/2.1/unity-catalog.
- UC_TOKEN: The token to authenticate with the Unity Catalog server. Default is None. You can also pass the token as an argument to the command.
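As a minimal sketch, both variables can be exported in the shell before running ucm. The URL is the default listed above; the token value is a made-up placeholder, not a real credential.

```shell
# Point ucm at the Unity Catalog server (this is the documented default URL).
export UC_HOST_URL="http://localhost:8080/api/2.1/unity-catalog"

# Placeholder token (an assumption for illustration): replace it with your own.
export UC_TOKEN="my-secret-token"
```

Any ucm command started from the same shell session inherits both values.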
# First migrate catalog and schema
ucm migrate-catalog catalog1 --profile DATABRICKS_TEST
ucm migrate-schema catalog1.schema1 --profile DATABRICKS_TEST
# Use Databricks CLI to get all tables of schema
table_names=$(databricks tables list catalog1 schema1 --profile DATABRICKS_TEST | awk 'NR>1 {print $1}' | paste -sd ' ' -)
# Migrate the tables to Unity Catalog
echo $table_names | xargs ucm migrate-table --profile DATABRICKS_TEST
[13:45:46] [ Success ] catalog1.schema1.table1
[ Success ] catalog1.schema1.table2
[ Success ] catalog1.schema1.table3
[ Success ] catalog1.schema1.table4
[ Success ] catalog1.schema1.table5
[ Skipped ] catalog1.schema1.table6
[ Success ] catalog1.schema1.table7
[ Failed ] catalog1.schema1.table8
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Result                ┃ Count ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Successfully migrated │ 6     │
│ Skipped               │ 1     │
│ Errors                │ 1     │
└───────────────────────┴───────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Skipped                              ┃ Reason                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ catalog1.schema1.table6              │ Table already Exists         │
│                                      │                              │
└──────────────────────────────────────┴──────────────────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Error                                ┃ Reason                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ catalog1.schema1.table8              │ Variant Type not supported   │
│                                      │                              │
└──────────────────────────────────────┴──────────────────────────────┘
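For longer runs it can be handy to pull the failed names out of saved output. This is a minimal sketch that assumes the status lines look exactly like the sample above; failed_tables is a hypothetical helper, not a ucm command.

```shell
# failed_tables [FILE...]: print the name that follows each "[ Failed ]" marker.
# Reads standard input when no file is given. Hypothetical helper, not part of ucm.
failed_tables() {
  awk '/\[ Failed \]/ {
    # Find the "Failed" "]" token pair and print the token right after it.
    for (i = 2; i < NF; i++)
      if ($(i - 1) == "Failed" && $i == "]") print $(i + 1)
  }' "$@"
}

# Try it on a few lines copied from the sample output.
failed_tables <<'EOF'
[13:45:46] [ Success ] catalog1.schema1.table1
[ Skipped ] catalog1.schema1.table6
[ Failed ] catalog1.schema1.table8
EOF
```

Matching on the marker rather than on a fixed column keeps the helper working for lines with or without the leading timestamp.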
- ucm skips catalogs, schemas, and tables that already exist (a force option can be added if desired)
- Not supported: system tables
- Not supported: Variant data type
- Not yet implemented: migrate functions command
- Not yet implemented: migrate models command
- Not yet implemented: update existing tables (sync)
- Contributions are welcome. Please fork the repo and open a PR, and I'll take a look as soon as I can.
- Star the repo if you like it.