Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Top-level module of namespace package incorrectly raises NotATopLevelModule #141

Open
draustin opened this issue Feb 4, 2024 · 10 comments

Comments

@draustin
Copy link

draustin commented Feb 4, 2024

I have a namespace package top with the following files:

  • top/a.py does not import anything.
  • top/b/__init__.py contains import top.a.

When building the graph, Grimp reports that top.a is not top level. top.b works fine.

import grimp

graph=grimp.build_graph("top.b") # Works fine.
graph=grimp.build_graph("top.a") # Raises NotATopLevelModule.

Here's the traceback. Note that the exception class is raised, not an instance. Is this intentional?

Traceback (most recent call last):
  File "/Users/dane/git/import-linter-test/issue.py", line 4, in <module>
    graph=grimp.build_graph("top.a")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dane/git/import-linter-test/.venv/lib/python3.11/site-packages/grimp/application/usecases.py", line 52, in build_graph
    found_packages = _find_packages(
                     ^^^^^^^^^^^^^^^
  File "/Users/dane/git/import-linter-test/.venv/lib/python3.11/site-packages/grimp/application/usecases.py", line 81, in _find_packages
    package_directory = package_finder.determine_package_directory(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dane/git/import-linter-test/.venv/lib/python3.11/site-packages/grimp/adaptors/packagefinder.py", line 29, in determine_package_directory
    raise exceptions.NotATopLevelModule

I'm using v3.2 of grimp, Python 3.11.5.

@seddonym
Copy link
Owner

seddonym commented Feb 5, 2024

Thanks for raising the issue. 😄

You're right, currently Grimp only treats packages as namespace packages - it explicitly checks that it's a package rather than a module.

An even simpler example is to try to build the graph on an empty standalone Python file. This has always failed by design, but arguably that file could be seen as being in a namespace package of its parent.

I suspect that adjusting this behaviour would require a rethink of Grimp's machinery, which is fundamentally oriented around the idea of the root item being a package with a directory, but it might not be too bad. Still, is it worth the effort?

Interested to hear more about your use case - I would have thought that this requirement is quite niche, but perhaps I'm wrong. Also, is it the kind of thing you'd be interested in contributing a fix for?

@draustin
Copy link
Author

You're right, currently Grimp only treats packages as namespace packages - it explicitly checks that it's a package rather than a module.

I didn't understand this comment. In my example, the namespace package is top. The single-file module top.a, which causes the problem, is a regular single-file module. Did you mean it only treats packages in namespace packages?

It is my understanding that namespace package can consist of packages and single-file modules. For example, in this discussion, the namespace package snake_corp consists of the single file modules date_util.py and magic.py. Also, PEP420 has an example in which the namespace package child contains single-file modules one.py and two.py.

It is a pretty common pattern for a module to start out life as a single .py file, and then be converted into a package as it grows. Functionally, there is little distinction in Python between a file foo.py and foo/__init__.py with the same contents.

My particular use case is imposing some organization on my company's Python codebase, which is a single namespace package (the company name) split by topic and dependency into different portions. There are a lot of single-file modules. I started with your import-linter, which led me to Grimp.

I am interested in working on this feature.

@seddonym
Copy link
Owner

I didn't understand this comment. In my example, the namespace package is top. The single-file module top.a, which causes the problem, is a regular single-file module. Did you mean it only treats packages in namespace packages?

You're right, I was thinking that top.b is a namespace package, but it's top that's the namespace package. I meant to say that if you give Grimp a standalone .py module then it won't treat it as a module within a namespace package.

My particular use case is imposing some organization on my company's Python codebase, which is a single namespace package (the company name) split by topic and dependency into different portions.

And what's the motivation for making that top level package a namespace package, rather than just adding an __init__.py file?

@ashb
Copy link

ashb commented Dec 6, 2024

@seddonym Great module, found it from Talk Python To Me.

I'm trying to use it on Apache Airflow (of which I'm a member of the core dev team) and our use of explicit namespace packages seems to confuse things.

If I put a .importlint file in https://github.com/apache/airflow/tree/450132bc/providers containing this:

[importlinter]
root_packages=
  airflow
  airflow.providers
exclude_type_checking_imports = True

And run it in verbose node (inside the uv managed venv, uv sync etc) I get this error:

Verbose mode.
Building import graph (cache directory is .import_linter_cache)...
Traceback (most recent call last):
  File "/Users/ash/code/airflow/airflow/.venv/bin/lint-imports", line 8, in <module>
    sys.exit(lint_imports_command())
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/cli.py", line 52, in lint_imports_command
    exit_code = lint_imports(
                ^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/cli.py", line 99, in lint_imports
    passed = use_cases.lint_imports(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/application/use_cases.py", line 57, in lint_imports
    raise e
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/application/use_cases.py", line 54, in lint_imports
    report = create_report(user_options, limit_to_contracts, cache_dir, show_timings, verbose)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/application/use_cases.py", line 112, in create_report
    graph = _build_graph(
            ^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/application/use_cases.py", line 162, in _build_graph
    return settings.GRAPH_BUILDER.build(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/importlinter/adapters/building.py", line 21, in build
    return grimp.build_graph(
           ^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/grimp/application/usecases.py", line 52, in build_graph
    found_packages = _find_packages(
                     ^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/grimp/application/usecases.py", line 81, in _find_packages
    package_directory = package_finder.determine_package_directory(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/grimp/adaptors/packagefinder.py", line 30, in determine_package_directory
    raise exceptions.NotATopLevelModule
grimp.exceptions.NotATopLevelModule

However, if I hack packagefinder.py to pass instead of raise exceptions.NotATopLevelModule it builds up a graph without problem.

Now, I'm about 30mins in to looking at grimp/import-linter so I don't understand much about it, but is that check even needed? If it is in some cases could we add a config option to turn it off? Will turning it off cause any other problems down the line?

@ashb
Copy link

ashb commented Dec 6, 2024

And if I don't put airflow.providers in the root_packages list, then it doesn't pick up any of the modules under it.

@seddonym
Copy link
Owner

seddonym commented Dec 6, 2024

Hi Ash,

Thanks for reaching out.

You should just be able to include airflow as the top level package (as there is an __init__.py file in there, I don't think namespace packages are relevant here, but correct me if I'm wrong).

I checked out the latest airflow code, pip installed grimp into a virtual env, and then ran the repl from airflow/providers/src:

>>> import grimp
>>> graph = grimp.build_graph("airflow")
>>> len(graph.modules)
1094
>>> "airflow.providers" in graph.modules
True

Could you say a bit more about what you're expecting to see instead, or does this resolve things for you?

@ashb
Copy link

ashb commented Dec 8, 2024

Ahhhh, the difference is where it's run from.

Even though airflow/providers/src is in python path, unless I run it from that folder it doesn't find the providers modules:

This is run from the root.

n [1]: import sys

In [2]: sys.path
Out[2]:
['/Users/ash/code/airflow/airflow/.venv/bin',
 '/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages/_pdbpp_path_hack',
 '/Users/ash/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python312.zip',
 '/Users/ash/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12',
 '/Users/ash/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/lib-dynload',
 '',
 '/Users/ash/code/airflow/airflow/.venv/lib/python3.12/site-packages',
 '/Users/ash/code/airflow/airflow',
 '/Users/ash/code/airflow/airflow/task_sdk/src',
 '/Users/ash/code/airflow/airflow/providers/src']

In [3]: import grimp; graph = grimp.build_graph("airflow")

In [4]: len(graph.modules)
Out[4]: 652

In [5]: import grimp; graph = grimp.build_graph("airflow.providers")
[2024-12-08T10:39:01.970+0000] {caching.py:179} INFO - No cache file: .grimp_cache/airflow.providers.meta.json.
[2024-12-08T10:39:01.971+0000] {caching.py:204} INFO - No cache file: .grimp_cache/765654077fc2e7c245cde17888779a0457adda13.data.json.
[2024-12-08T10:39:02.996+0000] {caching.py:148} INFO - Wrote data cache file .grimp_cache/765654077fc2e7c245cde17888779a0457adda13.data.json.
[2024-12-08T10:39:02.997+0000] {caching.py:161} INFO - Wrote meta cache file .grimp_cache/airflow.providers.meta.json.

In [6]: len(graph.modules)
Out[6]: 1097

Curious.

@seddonym
Copy link
Owner

seddonym commented Dec 9, 2024

I think it's because airflow is higher up in your Python path than airflow/providers/src. That's resolving to a different Python package (this one).

I don't think this is a bug - it's just following the standard Python import resolution rules, do you agree?

@ashb
Copy link

ashb commented Dec 9, 2024

So with namespace packages that isn't true:

In [4]: airflow.__file__
Out[4]: '/Users/ash/code/airflow/airflow/airflow/__init__.py'

In [5]: airflow.providers.standard.operators.python.__file__
Out[5]: '/Users/ash/code/airflow/airflow/providers/src/airflow/providers/standard/operators/python.py'

However due to $reasons (various tooling, IDE support etc) we've currently got to use the old-school explicit namespace packages mechanism https://github.com/apache/airflow/blob/ee07d6339a3b1a1b4451c20f5109ab51dcd13ebe/airflow/__init__.py#L23-L26

In ruff for example we've worked around this by being explicit about what is a namespace package https://github.com/apache/airflow/blob/ee07d6339a3b1a1b4451c20f5109ab51dcd13ebe/pyproject.toml#L214

@seddonym
Copy link
Owner

Oh interesting, ok. So it sounds like Grimp doesn't support pgutil-style namespace packages (TIL!).

I think this could do with its own issue, so I've created one here. Happy to consider a pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants