Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transform into a valid Python package and reorganize code as needed #157

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

abelbriggs1
Copy link

Background

Looking at many decomp repositories using asm-differ, it's very common for them to pull the asm-differ repository as a Git submodule, and invoke diff.py directly as a Python script. This works, but isn't ideal for several reasons:

  • Submodules can lead to pain for decomp contributors new to Git if they make a mistake when pulling the repository.
  • Submodules are not very compatible with Docker, and make it difficult to provide a universal, reproducible build environment for decompilation projects.
  • The user still needs to install the dependencies for asm-differ separately from the dependencies for the top-level repository. Most decomp repositories work around this by manually syncing their project's requirements.txt to asm-differ's dependencies, but this is a hassle and prone to breakage.

Instead, a better option for decomp repos would be to install asm-differ as a Python package (via pip install or poetry install). Though asm-differ doesn't have a PyPi package, it can still be installed from git using requirements.txt. Example:

asm-differ @ git+https://github.com/abelbriggs1/asm-differ@60a8a5d

This fixes all of the above logistical issues and improves the user experience, but requires some reorganization of the repository to support the new format.

Changes

This PR:

  • Reorganizes the repository as a Python package.
    • diff.py has been almost completely split up into several different loosely-organized Python files.
      • The top-level diff.py is now a wrapper around asm_differ/main.py.
    • No functionality or code has been (intentionally) changed.
  • Adds the asm-differ Python command.
    • The CLI is identical to the current version, just using a free-standing command instead of executing the ./diff.py script.
    • The top-level diff.py remains to ensure backwards compatibility with repos which use asm-differ as a submodule.

Notes

  • I'm not super familiar with the codebase, so some functions might not be organized in the correct location.
  • A couple of structures have circular dependencies, forcing us to define them in the same module. This leads to some awkward organization (Config, ArchSettings, AsmProcessor would ideally be separated).

Testing

I've verified that basic usage (asm-differ (start_func)) on a MIPSEL (PSX) repository works with both the old ./diff.py usage and the new asm-differ command. Auto-complete is still functional in both cases.

It would be ideal if we could get interested parties to test these changes on their own repositories and ensure nothing is broken.

@simonlindholm
Copy link
Owner

The suggested deployment story has historically not been submodules (I hate them too) but rather that projects copy diff.py into their own repo. Having diff.py be installable through pip may not be such a bad idea though... the number of places where it's used has grown quite a lot. Would it be possible to keep diff.py as a single file but still allow pip installing it?

I wonder if we can add a CI job to publish a pip release on each push. If e.g. pip requires some special directory structure that would help with that, and it would make install instructions easier.

(Two additional testing steps you can perform is to run mypy and also test.py.)

@abelbriggs1
Copy link
Author

The suggested deployment story has historically not been submodules (I hate them too) but rather that projects copy diff.py into their own repo. Having diff.py be installable through pip may not be such a bad idea though... the number of places where it's used has grown quite a lot. Would it be possible to keep diff.py as a single file but still allow pip installing it?

I did investigate this before splitting the file up (since I didn't want to try and split a 3500 line file!), but unfortunately I don't think this is possible. A Python package needs to have a directory structure similar to:

my_pkg/
-- __init__.py
-- my_module.py
-- my_subpkg/
---- __init__.py
---- my_submodule.py

in order to be valid. The minimum needed for asm-differ would have been:

asm_differ/
-- __init__.py
-- diff.py

This would allow the package to be installable, but it wouldn't be runnable as a standalone command without refactoring. The current diff.py relies on if __name__ == "__main__" to run correctly, and that isn't true if run from a module (e.g. [tool.poetry.scripts] entrypoint). Since a major refactor was required either way, I decided to split everything up so it would be easier to modify the code in the future.

I wonder if we can add a CI job to publish a pip release on each push. If e.g. pip requires some special directory structure that would help with that, and it would make install instructions easier.

With the package structure in this PR, it should be simple - all you would need to do is perform a GitHub release. pip can install Python packages from zip archives, so users can simply:

  1. download the GitHub release archive
  2. pip install (zip file)

if they want to install the package and its dependencies on their own.

(Two additional testing steps you can perform is to run mypy and also test.py.)

Will do this!

@simonlindholm
Copy link
Owner

This would allow the package to be installable, but it wouldn't be runnable as a standalone command without refactoring. The current diff.py relies on if __name__ == "__main__" to run correctly, and that isn't true if run from a module (e.g. [tool.poetry.scripts] entrypoint). Since a major refactor was required either way, I decided to split everything up so it would be easier to modify the code in the future.

That sounds like it could be a much smaller change, though? And still keep diff.py as a single file compatibility with the traditional distribution method, which I see as a big plus.

With the package structure in this PR, it should be simple - all you would need to do is perform a GitHub release. pip can install Python packages from zip archives, so users can simply:

  1. download the GitHub release archive
  2. pip install (zip file)

That sounds more complicated as a user than git cloning the repo, tbh, or pip installing with reference with a git link. I was moreso thinking if we could make an automated push to the pip registry so it's just pip install asm-differ.

@abelbriggs1
Copy link
Author

This would allow the package to be installable, but it wouldn't be runnable as a standalone command without refactoring. The current diff.py relies on if __name__ == "__main__" to run correctly, and that isn't true if run from a module (e.g. [tool.poetry.scripts] entrypoint). Since a major refactor was required either way, I decided to split everything up so it would be easier to modify the code in the future.

That sounds like it could be a much smaller change, though? And still keep diff.py as a single file compatibility with the traditional distribution method, which I see as a big plus.

If that's the preferred option, I can remove the reorganization/split and only make the changes necessary to install as package / run with standalone command.

With the package structure in this PR, it should be simple - all you would need to do is perform a GitHub release. pip can install Python packages from zip archives, so users can simply:

  1. download the GitHub release archive
  2. pip install (zip file)

That sounds more complicated as a user than git cloning the repo, tbh, or pip installing with reference with a git link.

The user also has those options. As a Python package, asm-differ would be installable via:

  1. Cloning the repository, then running pip install .
  2. Downloading a zipped GitHub release, then running pip install (zipfile)
  3. Running pip install git+https://github.com/simonlindholm/asm-differ.git@(commit)
  4. If they have a project which depends on asm-differ, adding asm-differ @ git+https://github.com/simonlindholm/asm-differ.git@(commit) to their requirements.txt

I was moreso thinking if we could make an automated push to the pip registry so it's just pip install asm-differ.

Oh, okay, you're talking about a PyPi release. I'm not very familiar with the process of registering/uploading a package to PyPi, but CI for automated deployment/upload is definitely possible.

@abelbriggs1 abelbriggs1 marked this pull request as ready for review May 19, 2024 18:40
@abelbriggs1
Copy link
Author

abelbriggs1 commented May 20, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants