[BUG] #2179

Open
2 of 3 tasks
ved93 opened this issue Oct 15, 2024 · 0 comments
Labels
bug Something isn't working

ved93 commented Oct 15, 2024

Description

When running the example notebooks provided by the recommenders library, specifically when executing the following import statement:

from recommenders.datasets import movielens

I encounter an AttributeError stating that the module pandera has no attribute SchemaModel. This error prevents the notebook from running further and seems to be related to the movielens.py module within the recommenders.datasets package.

The full error message and traceback are provided in the Other Comments section below.

In which platform does it happen?

  • Platform: Databricks
  • Runtime Versions Tried:
    • Databricks Runtime 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12)
    • Databricks Runtime 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12)
    • Databricks Runtime 13.1 (includes Apache Spark 3.4.1, Scala 2.12)
    • Databricks Runtime 14.1 (includes Apache Spark 3.5.0, Scala 2.12)

Despite trying multiple runtimes, the issue persists across all of them.

How do we replicate the issue?

  1. Set Up Databricks Environment:

    • Launch a Databricks workspace.
    • Create a new cluster using any of the runtimes mentioned above.
  2. Install Required Libraries:

    • Install the recommenders library on the cluster:
      pip install recommenders
  3. Create a New Notebook:

    • In the workspace, create a new Python notebook attached to the cluster.
  4. Run the Following Code in a Notebook Cell:

    from recommenders.datasets import movielens
  5. Observe the Error:

    • The AttributeError should occur, indicating that pandera has no attribute SchemaModel.
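
To narrow down step 5, it may help to check which class-based schema names the installed pandera actually exposes. The helper below is a hypothetical diagnostic (`schema_api_report` is not part of recommenders or pandera) and relies only on `hasattr`. It assumes, without confirming it for this environment, that `DataFrameModel` is the name newer pandera releases offer in place of `SchemaModel`.

```python
import importlib


def schema_api_report(module):
    """Report which class-based schema names a pandera-like module exposes.

    `module` can be any object; only attribute presence is checked.
    """
    # "DataFrameModel" is an assumed replacement name, not confirmed by this report.
    names = ("SchemaModel", "DataFrameModel")
    return {name: hasattr(module, name) for name in names}


if __name__ == "__main__":
    try:
        pa = importlib.import_module("pandera")
    except ImportError:
        print("pandera is not installed in this environment")
    else:
        print("pandera", getattr(pa, "__version__", "unknown"))
        print(schema_api_report(pa))
```

If the report shows `SchemaModel` as missing and `DataFrameModel` as present, that would point to a renamed or removed class in pandera rather than a broken installation.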

Expected behavior (i.e. solution)

  • The import statement from recommenders.datasets import movielens should execute without any errors.
  • The movielens dataset module should be available for use in the notebook.
  • All example notebooks using the recommenders library should run successfully on Azure Databricks.

Willingness to contribute

  • Yes, I can contribute for this issue independently.
  • Yes, I can contribute for this issue with guidance from Recommenders community.
  • No, I cannot contribute at this time.

Other Comments

Full Error Message and Traceback:

AttributeError: module 'pandera' has no attribute 'SchemaModel'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-abc123456789> in <module>
      1 # Other imports (if any)
----> 2 from recommenders.datasets import movielens
      3 # Rest of the code (if any)

~/databricks/python/lib/python3.10/site-packages/recommenders/datasets/movielens.py in <module>
    582     return not df[columns].duplicated().any()
    583 
--> 584 class MockMovielensSchema(pa.SchemaModel):
    585     """
    586     Mock dataset schema to generate fake data for testing purpose.

AttributeError: module 'pandera' has no attribute 'SchemaModel'

Additional Information:

  • I have verified that pandera is installed and up-to-date in the environment:
    import pandera
    print(pandera.__version__)
  • Attempted solutions:
    • Upgrading pandera to the latest version did not resolve the issue.
    • Uninstalling and reinstalling pandera did not resolve the issue.
    • Downgrading pandera to earlier versions leads to other compatibility issues.
  • It appears that the recommenders library may not be compatible with the latest versions of pandera, or there may be an issue with how SchemaModel is imported or used in movielens.py.
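
If the incompatibility is indeed a renamed class, one possible stopgap is to alias the old name before importing recommenders. The sketch below is untested on Databricks and rests on an assumption: that the installed pandera provides `DataFrameModel` as a drop-in successor to `SchemaModel`. The function name `ensure_schema_model_alias` is hypothetical and does not exist in either library.

```python
def ensure_schema_model_alias(module):
    """Add a `SchemaModel` alias on a pandera-like module if it is missing.

    Returns True when an alias was added, False otherwise.
    """
    if hasattr(module, "SchemaModel"):
        return False  # the old name still exists, nothing to do
    # Assumption: DataFrameModel is the successor of SchemaModel.
    replacement = getattr(module, "DataFrameModel", None)
    if replacement is None:
        return False  # assumption did not hold; leave the module untouched
    module.SchemaModel = replacement
    return True


if __name__ == "__main__":
    try:
        import pandera as pa
    except ImportError:
        print("pandera is not installed")
    else:
        print("alias added:", ensure_schema_model_alias(pa))
        # After this, the failing import can be retried in the same session:
        # from recommenders.datasets import movielens
```

The alias only affects the current Python process, so it would need to run in the notebook before the recommenders import. It is a workaround for the reporter's side, not a fix for movielens.py itself.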

Environment Details:

  • Python Version: 3.10.x (as per the Databricks runtime)
  • recommenders Library Version: Latest available via pip as of the date of this report.
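
Exact version numbers would make this report easier to reproduce. A small sketch for collecting them in a notebook cell, using only the standard library (`installed_version` is a hypothetical helper name):

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name in ("recommenders", "pandera"):
        print(name, installed_version(name) or "not installed")
```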

Request:

  • Guidance on resolving this import error.
  • Confirmation on whether this is a known issue or a bug that needs fixing.
ved93 added the bug label Oct 15, 2024