Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree contract #250

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Conversation

K4liber
Copy link

@K4liber K4liber commented Jan 19, 2025

A contract that checks whether the dependency graph of modules forms a tree structure.

This contract verifies that the directed graph of module imports does not contain any cycles. A cycle in the graph implies that there is a circular dependency between modules, which violates the tree structure requirement.

Copy link
Owner

@seddonym seddonym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking an look at this!

I've left a few comments. Overall, the direction of travel really would be to implement most of this in Rust, in the Grimp library - and then just have a thin wrapper around it in Import Linter. But there's a bit to do before we get there.

My current feeling is it might make more sense to release this as a separate package for the time being, but I could be talked around!

Interested to know your thoughts.

"True" or "true" will be treated as True. (Optional.)
"""

type_name = "tree"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had envisaged 'modular' being a good description of this structure, but the tree idea is an interesting take that I had not considered. I'd like to think a bit about whether we call it this, but maybe it's better than 'modular'.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "Tree" indicates directly what structure should modules/packages form in order to fill the contact. But I don't mind changing the name to "Modular". Discussion can be continued. Since it is just an easy to change naming, we can postpone the decision up until the day of the potential merge.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

v_of_edge=imported_module_family_member,
)

cycles = list(nx.simple_cycles(G=nx_graph))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be worth running this on a real-life Python project that doesn't follow a tree structure. I suspect this may return a huge number of results, and possibly never complete. Perhaps we should add some kind of break after a certain number of cycles.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I will run stress tests on some huge Python code bases.


def render_broken_contract(self, check: ContractCheck) -> None:
for cycle in check.metadata.get(self._CYCLES_METADATA_KEY, []):
output.print_error(text=f"Cycle found: {cycle}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be interesting to see what cycles in a real repo look like, and if this is a helpful way to present them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After running some real-life tests we can take a look at the log and conclude how to modify the logging to make the message easier to understand.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

A cycle in the graph implies that there is a circular dependency between modules, which
violates the tree structure requirement.

The `check` method constructs a directed graph using NetworkX, where nodes represent modules
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an implementation detail, probably doesn't belong in the docstring.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. To be removed.

@@ -16,6 +16,7 @@ dependencies = [
"grimp>=3.2",
"tomli>=1.2.1; python_version < '3.11'",
"typing-extensions>=3.10.0.0",
"networkx>=3",
Copy link
Owner

@seddonym seddonym Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reluctant to include networkx as a dependency, it used to be one, and we worked hard to remove it from Grimp (the underlying graph algorithm library, also maintained by me) so that Import Linter had fewer dependencies.

I would probably be more open to us copying the relevant simple cycles code from networkx - with a view to eventually moving the algorithm into Grimp (where it could use faster Rust graph libraries).

That said, it's not an unreasonable approach (though probably suboptimal performance-wise). One option you could consider is for you to create a separate Python library that defines this import linter contract - users could install that (along with networkx) if they want to use it. They could use the custom contract types mechanism to register the tree contract type. We might then be able to incorporate the work into Import Linter in due course.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only use simple_cycles algorithm from networkx it may be a good idea to just create such a function than creating an additional dependency. On the other hand, we delegate the maintenance of the functionality to a trusted and supported package. That was my thinking before I read this comment.

I guess the dependency has been already discussed and the decision is to not use networkx and implement graph-related functionalities from a scratch in Grimp. I will follow the concept.

So the next move will be to implement simple_cycles functionality in Grimp using Python. After some stress tests we will have data about potential algorithm optimization needs. I like the idea of using Rust to optimize. If the feature will not be useful for a huge code bases due to significant computational time, the Rust implementation seems like a really good approach to fix the problem.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, that's where we want to get to.

In the interests of not wasting your time unnecessarily, I'm actively working with others on a Rust rewrite which isn't far off. So if you did want to contribute to Grimp on this, it's probably better to hold off until that lands (or just wait for me to implement). If you'd like to see this feature sooner, as I said you could experiment with releasing it as a separate library with a custom contract.

@K4liber K4liber marked this pull request as draft January 28, 2025 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants