Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: SqlImplementation.entity_metadata_map() #676

Open
joeflack4 opened this issue Nov 9, 2023 · 5 comments · Fixed by #679
Open

Performance: SqlImplementation.entity_metadata_map() #676

joeflack4 opened this issue Nov 9, 2023 · 5 comments · Fixed by #679

Comments

@joeflack4
Copy link
Contributor

Overview

This method seems far too slow.

Example

In this PR, using mondo.db:

    exclusion_rels: List[Tuple[str, str]] = []
    for rel in oi.entities_metadata_statements(mondo_ids, ['obo:mondo#excluded_subClassOf']):
        if rel:
            exclusion_rels.append((rel[0], rel[2]))

Each iteration of this loop took on average 15.5 seconds. Estimating that to get through ~25k mondo_ids would take ~100 hours at this rate.

@matentzn
Copy link
Contributor

Thank you @joeflack4 this is a high priority for me! @hrshdhgd how can we get this up the ladder of priorities?

@hrshdhgd
Copy link
Collaborator

#679: PR on the way!

@hrshdhgd
Copy link
Collaborator

The latest version - v0.5.22 should be quicker!

@matentzn
Copy link
Contributor

I am trying this on 0.6.6 and it is still much too slow..

for curie in all_descendants:
    metadata_map = adapter.entity_metadata_map(curie)
    if "oio:inSubset" in metadata_map:
        list_of_subsets = metadata_map["oio:inSubset"]
        for subset in list_of_subsets:
            row = {
                "id": curie,
                "subset": subset
            }
        data_subsets.append(row)

I stopped this after 16 minutes..

@matentzn matentzn reopened this May 15, 2024
@joeflack4
Copy link
Contributor Author

@matentzn and others, I'm not sure if the code for .entity_metadata_map() is similar to .relationships_metadata(), but if so, it might be worth looking at this PR #659 where Chris and I discuss a few different performance refactoring options for that method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants