Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script to test ORM vs core + fixes #85

Merged
merged 12 commits into from
Sep 17, 2024

Conversation

MatBarba
Copy link
Contributor

  • Create a simple test script (to run manually) to test load all the tables defined in the ORM from a real core database, to flag any that fail to load
  • Fix a few table definitions by removing an extra primary key in the *_attrib tables, and replacing them with a primary_key parameter on the columns the core schema use as unique
  • NB: I also had to fix the string length to a definite value String(500), because SQLAlchemy needs that to create a primary key constraint in MySQL. This works now, but it might be problematic if we want to use the ORM to load very long attribs into a database. The best way to handle this really would be to alter the core SQL schema to have an explicit primary key as a simple autoincremented row ID.

Copy link
Collaborator

@JAlvarezJarreta JAlvarezJarreta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and thank you for the new test, quite useful.

src/python/tests/core/compare_core_model.py Outdated Show resolved Hide resolved
src/python/tests/core/compare_core_model.py Outdated Show resolved Hide resolved
src/python/tests/core/compare_core_model.py Outdated Show resolved Hide resolved
@JAlvarezJarreta
Copy link
Collaborator

Re. the primary key and Text, that restriction actually makes sense, and the core schema "is wrong": it does not make sense to have a primary key of 10000 characters. Something to fix in the new world IMO.

ens-LCampbell
ens-LCampbell previously approved these changes Sep 16, 2024
Copy link
Member

@ens-LCampbell ens-LCampbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, just wanted to ask if the changes to models will now be in place but still requires changes to the scehma itself to conform ? How does that affect models utility in the interim ?

Copy link
Collaborator

@JAlvarezJarreta JAlvarezJarreta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"URL vs server details" discussion left, otherwise looks good 👍

@MatBarba
Copy link
Contributor Author

Approved, just wanted to ask if the changes to models will now be in place but still requires changes to the scehma itself to conform ? How does that affect models utility in the interim ?

Right now the SQL schema from ensembl should be compatible with the ORM from ensembl-py, but with the possibility that we would not be able to handle long value for the *_attrib tables.

If we apply my recommendation to add an explicit key to all *_attrib tables, then we would just need to change the value fields back to a Text type, and define the primary_key=True only for the new row.

@JAlvarezJarreta JAlvarezJarreta merged commit f89c530 into Ensembl:main Sep 17, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants