feat(cli): Add support for multiple file exclusions and .gitingestignore file : issue #147 #150

Open: wants to merge 6 commits into main

@AbhiRam162105 AbhiRam162105 commented Jan 21, 2025

Enhanced CLI and Web Features: Multiple File Exclusion using CLI and .gitingestignore file

Description:

This PR adds three features to the gitingest CLI tool:

1. Support for Multiple Exclusion Patterns

  • New Feature: The CLI now accepts multiple file exclusion patterns via the -e option.
  • Enhancements:
    • Command line pattern parsing has been enhanced to handle space-separated patterns.
    • Users can exclude multiple files in a single command:
      gitingest -e "LICENSE README.md package.json"
      or using multiple flags:
      gitingest -e LICENSE -e README.md -e package.json
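
A minimal sketch of how the two forms above could be normalized into a single pattern list. The helper name `parse_patterns` is hypothetical and not taken from this PR's diff, and the sketch assumes that individual patterns never contain spaces.

```python
from typing import Iterable, List


def parse_patterns(values: Iterable[str]) -> List[str]:
    """Flatten repeated -e values, splitting each value on whitespace."""
    patterns: List[str] = []
    for value in values:
        for pattern in value.split():
            if pattern not in patterns:  # keep the first occurrence, drop duplicates
                patterns.append(pattern)
    return patterns
```

With this approach, `-e "LICENSE README.md package.json"` and `-e LICENSE -e README.md -e package.json` produce the same list.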

2. .gitingestignore File Support

  • New Feature: Introduced support for a .gitingestignore file to specify files and directories to be ignored.
  • Details:
    • The .gitingestignore file follows a format similar to .gitignore:
      node_modules/
      LICENSE
      package.json
      pnpm-lock.yaml
      pnpm-workspace.yaml
      tsconfig
      
    • Patterns from both command line arguments and the .gitingestignore file are combined for comprehensive exclusion.
    • Works with both local repositories and remote (web) repositories:
      # For local repositories
      gitingest ./my-local-repo --ignore-file .gitingestignore
      
      # For remote repositories
      gitingest https://github.com/user/repo --ignore-file .gitingestignore
    • When using with remote repositories, the tool will:
      1. First clone the repository
      2. Look for the .gitingestignore file in the cloned repository
      3. Apply the ignore patterns during analysis
      4. Clean up the temporary files after completion
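
As an illustration of the parsing and merging described in this section, here is a rough sketch. The function names `read_ignore_file` and `combine_patterns` are hypothetical, and treating lines that start with `#` as comments is an assumption borrowed from .gitignore, not something stated in this PR.

```python
from pathlib import Path
from typing import Iterable, List


def read_ignore_file(path: Path) -> List[str]:
    """Return the patterns listed in an ignore file, or [] if it is missing."""
    if not path.is_file():
        return []
    patterns = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are skipped, as in .gitignore
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def combine_patterns(cli_patterns: Iterable[str], file_patterns: Iterable[str]) -> List[str]:
    """Merge both sources, preserving order and dropping duplicates."""
    return list(dict.fromkeys([*cli_patterns, *file_patterns]))
```

Returning an empty list for a missing file keeps the default `--ignore-file .gitingestignore` harmless in repositories that do not ship one.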

3. Branch Selection Support

  • New Feature: Added ability to specify which branch to clone and analyze via the -b/--branch option.
  • Details:
    • Users can now specify a branch when analyzing a repository:
      gitingest <repo_url> --branch main
    • If no branch is specified, the default branch of the repository is used.
    • Branch selection can be combined with the other features, such as pattern exclusion.
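
One way the branch option could map onto the clone step is sketched below. `build_clone_command` is a hypothetical helper, and the use of a shallow clone (`--depth=1`) is an assumption for illustration; the actual clone logic in this PR may differ.

```python
from typing import List, Optional


def build_clone_command(url: str, dest: str, branch: Optional[str] = None) -> List[str]:
    """Build a shallow `git clone` command, optionally pinned to one branch."""
    cmd = ["git", "clone", "--depth=1"]
    if branch:
        # --branch picks the branch to check out; --single-branch fetches only that branch
        cmd += ["--single-branch", "--branch", branch]
    # With no branch given, git checks out the remote's default branch
    cmd += [url, dest]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`.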

Changes Made:

  1. CLI Updates:

    • Added new --branch/-b option
    • Enhanced pattern handling in --exclude-pattern/-e and --include-pattern/-i options
    • Added --ignore-file option with default value .gitingestignore
  2. Core Functionality:

    • Updated ingest function to handle branch selection
    • Implemented pattern parsing for space-separated patterns
    • Added ignore file parsing functionality with web repository support
    • Improved error handling and parameter validation
    • Added automatic cleanup for temporary files in web repository analysis
  3. Documentation:

    • Updated help messages for all CLI options
    • Added comprehensive docstrings for new parameters
    • Included usage examples for both local and web repositories
    • Added documentation for web repository behavior
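
The automatic cleanup for web repository analysis listed under Core Functionality could look roughly like the sketch below. `analyze_remote` and its `clone` callback are hypothetical names introduced for illustration; the analysis step itself is omitted and the function only returns the patterns it found.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def analyze_remote(url: str, clone: Callable[[str, Path], None],
                   ignore_name: str = ".gitingestignore") -> List[str]:
    """Clone into a temp dir, read the ignore file, and always clean up."""
    # TemporaryDirectory deletes the clone on exit, even if an exception is raised
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = Path(tmp)
        clone(url, repo_dir)
        ignore_file = repo_dir / ignore_name
        if not ignore_file.is_file():
            return []
        lines = ignore_file.read_text(encoding="utf-8").splitlines()
        return [ln.strip() for ln in lines if ln.strip()]
```

Passing the clone step in as a callable keeps the sketch testable without network access.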

Impact:
These enhancements make the gitingest CLI more flexible and user-friendly by:

  • Providing multiple ways to specify file exclusions
  • Supporting branch-specific analysis
  • Following familiar Git-like patterns for configuration
  • Supporting web repositories with the same feature set as local repositories
  • Improving the overall user experience and efficiency

Issue Addressed: #147

Usage Examples:

  1. Local Repository Analysis:
gitingest ./my-local-repo \
  -b develop \
  -e "*.pyc" \
  -e "node_modules/" \
  -i "*.py" \
  --ignore-file .gitingestignore
  2. Web Repository Analysis:
gitingest https://github.com/user/repo \
  -b develop \
  -e "*.pyc" \
  -e "node_modules/" \
  -i "*.py" \
  --ignore-file custom.gitingestignore
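
To make the interaction of `-e` and `-i` in these examples concrete, here is a simplified sketch of one possible filter. The names `should_include` and `_matches` are hypothetical, and the semantics (excludes take priority, a trailing `/` matches a directory name anywhere in the path) are assumptions rather than the behaviour implemented in this PR.

```python
from fnmatch import fnmatch
from typing import Iterable


def _matches(path: str, pattern: str) -> bool:
    """Match a relative POSIX-style path against one pattern."""
    if pattern.endswith("/"):
        # Directory pattern: exact directory name anywhere above the file
        return pattern.rstrip("/") in path.split("/")[:-1]
    # File pattern: compare against the full path and the base name
    return fnmatch(path, pattern) or fnmatch(path.rsplit("/", 1)[-1], pattern)


def should_include(path: str, include: Iterable[str], exclude: Iterable[str]) -> bool:
    """Excludes win over includes; an empty include list accepts everything."""
    if any(_matches(path, p) for p in exclude):
        return False
    include = list(include)
    return not include or any(_matches(path, p) for p in include)
```

Under these assumptions, with `-e "*.pyc" -e "node_modules/" -i "*.py"`, a file such as `src/app.py` is kept, while `src/app.pyc` and anything under `node_modules/` are dropped.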

Note: The .gitingestignore file works with both the gitingest CLI and the gitingest website, so file exclusions are managed the same way regardless of where the repository lives.

… and .gitingestignore support

- Added the ability to specify multiple file exclusion patterns using the `-e` option.
- Introduced support for a `.gitingestignore` file to define files and directories to be excluded.
- Improved command line pattern parsing to handle space-separated patterns.
- Combined exclusion patterns from both command line arguments and the `.gitingestignore` file for comprehensive exclusion.
- Removed the unused import 'os' to satisfy pylint checks.
- Corrected docstring issues to comply with darglint requirements.
- Updated documentation with clear examples illustrating the new features.

These enhancements provide users with a more flexible and user-friendly way to exclude files and directories, either through command line options or by using a `.gitingestignore` file.
@AbhiRam162105
Author

@cyclotruc please review this PR.

@AbhiRam162105
Author

@cyclotruc do you have any changes to suggest for this PR?

Owner

@cyclotruc cyclotruc left a comment

The goal for gitingest is to provide the exact same behaviour when it's accessed from the web or from the CLI

In that regard, it would be wise to also apply this ignore-pattern logic when a .gitingestignore file is found in the repo from the web UI

@cyclotruc
Owner

@AbhiRam162105 Thank you for this contribution!

I requested a change and I'll be happy to merge this PR once it's done

But I want to warn you that this file might be a temporary feature. There is in fact a plan to create a .gitingest file. I'm not sure about the format yet, but it will contain ignore patterns as well as all the other settings available for gitingest

When this happens, we'll emit a deprecation warning when .gitingestignore is used, until we finally make the move to .gitingest

I thought about naming this file .gitingest directly, but that would make the deprecation warning sound less natural

@AbhiRam162105
Author

AbhiRam162105 commented Jan 25, 2025

Hey @cyclotruc, I have made changes so that the .gitingestignore method of file exclusion works both from the CLI and from the web. Now, if a .gitingestignore file exists in the repository, it is parsed in both cases and the listed files are excluded. Please review the code, and feel free to continue with the .gitingestignore file while the .gitingest file method is in the works.
