Inconsistent results of materials query #964

Open
fxcoudert opened this issue Jan 25, 2025 · 4 comments
Labels
bug Something isn't working

Comments

@fxcoudert

Python version

3.12.8

Pymatgen version

2025.1.24

Operating system version

macOS 15.2

Current behavior

The following code:

from mp_api.client import MPRester  # assumed import; not shown in the original report

with MPRester(apikey) as mpr:
    mp_data = mpr.materials.summary.search(
        fields=["material_id", "deprecated", "formula_pretty", "nelements", "structure", "theoretical", "symmetry"]
    )
    print("Number of materials found:", len(mp_data))
    print("Database version", mpr.get_database_version())

returns

Number of materials found: 178580
Database version 2024.12.18

Of these, 169385 are non-deprecated materials, as counted by:

> sum(1 for x in mp_data if not x.deprecated)

This is consistent with the number shown on the web portal. Good. But now, consider this:

with MPRester(apikey) as mpr:
    mp_data = mpr.materials.summary.search(
        deprecated=False,
        fields=["material_id", "deprecated", "formula_pretty", "nelements", "structure", "theoretical", "symmetry"]
    )
    print("Number of materials found:", len(mp_data))
    print("Database version", mpr.get_database_version())

It is exactly the same query, except that I ask only for non-deprecated materials by passing deprecated=False. But it now returns:

Number of materials found: 153902
Database version 2024.12.18

Expected Behavior

I expect the two routes to return the same number (and the same list) of non-deprecated materials.
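One way to narrow down where the two routes diverge is to compare the material IDs rather than only the counts. The following is a minimal sketch, not a confirmed diagnosis: `compare_routes` is a hypothetical helper, and the stand-in documents only mimic the `material_id` and `deprecated` attributes of the results returned by the queries above.

```python
from types import SimpleNamespace


def compare_routes(unfiltered, filtered):
    """Return the material IDs that appear in only one of the two routes.

    unfiltered: documents from the query without deprecated=False
    filtered:   documents from the query with deprecated=False
    """
    # Route 1: filter on the client side using the `deprecated` attribute
    client_side = {str(d.material_id) for d in unfiltered if not d.deprecated}
    # Route 2: trust the server-side filter as-is
    server_side = {str(d.material_id) for d in filtered}
    return {
        "only_client_side": sorted(client_side - server_side),
        "only_server_side": sorted(server_side - client_side),
    }


# Stand-in documents (hypothetical IDs) so the sketch runs without an API key
unfiltered = [
    SimpleNamespace(material_id="mp-1", deprecated=False),
    SimpleNamespace(material_id="mp-2", deprecated=True),
    SimpleNamespace(material_id="mp-3", deprecated=False),
]
filtered = [SimpleNamespace(material_id="mp-1", deprecated=False)]

result = compare_routes(unfiltered, filtered)
print(result)
```

With the real `mp_data` lists from the two queries, inspecting a few of the IDs in `only_client_side` should show what the missing entries have in common.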

Minimal example

Relevant files to reproduce this bug

No response

fxcoudert added the bug label Jan 25, 2025
@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Jan 28, 2025

Hi @fxcoudert! Just to check: was this meant for pymatgen or for https://github.com/materialsproject/api? I want to make sure you get the quickest feedback possible.

CC @tschaume in case it's relevant to him.

tschaume transferred this issue from materialsproject/pymatgen Jan 28, 2025
@tschaume
Member

Thanks @Andrew-S-Rosen! The difference is due to the new ~15k GNoMe materials, which are included in the API response if the user has accepted the GNoMe terms on the website. You can set include_gnome=False in mpr.materials.summary.search() to exclude GNoMe materials regardless of whether the terms have been accepted. HTH
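As a rough sketch of how the `include_gnome` flag could be used to check this explanation, the helper below runs both routes with the same setting and returns the two counts. `count_by_route` and `FIELDS` are hypothetical names introduced for illustration; the only API calls assumed are `mpr.materials.summary.search()` with the `deprecated`, `include_gnome`, and `fields` arguments mentioned in this thread.

```python
# Only the fields needed for counting; the original query requested more
FIELDS = ["material_id", "deprecated"]


def count_by_route(mpr, include_gnome):
    """Count non-deprecated materials via client-side and server-side filtering.

    mpr is expected to be an open MPRester-like client. Returns a tuple
    (client_side_count, server_side_count) for the given include_gnome setting.
    """
    common = dict(include_gnome=include_gnome, fields=FIELDS)
    # Route 1: fetch everything, then drop deprecated entries locally
    unfiltered = mpr.materials.summary.search(**common)
    client_side = sum(1 for doc in unfiltered if not doc.deprecated)
    # Route 2: let the server apply deprecated=False
    filtered = mpr.materials.summary.search(deprecated=False, **common)
    return client_side, len(filtered)


# Usage, assuming a valid API key in `apikey`:
# with MPRester(apikey) as mpr:
#     print(count_by_route(mpr, include_gnome=False))
#     print(count_by_route(mpr, include_gnome=True))
```

If the two counts agree with include_gnome=False but differ with include_gnome=True, that would point to the GNoMe entries as the source of the discrepancy.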

@Andrew-S-Rosen
Member

The all-knowing Patrick has spoken!!

@fxcoudert
Author

fxcoudert commented Jan 28, 2025

Thanks @tschaume. I don't know whether I have accepted the new terms or not, but what I am sure of is that both queries were made at the same time, with the same function. So whether the terms were accepted or not, shouldn't the numbers be consistent? (169k in the first case, 154k in the second case)

PS: how can I check whether I have accepted the new terms or not? I can't seem to find the information in the dashboard for my account.

Re. @Andrew-S-Rosen: I have no idea whether it is an API bug or a pymatgen bug. I have only queried through the pymatgen functions and have not tried calling the API directly from other code.
