feat: Add accurate codebase scope description to output header #330

Merged
yamadashy merged 3 commits into main from feat/accurate-codebase-description on Feb 1, 2025

Conversation

yamadashy
Owner

@yamadashy yamadashy commented Feb 1, 2025

This PR improves how Repomix describes the scope of packed files in output headers. When users specify files via --include or --ignore, the header now correctly indicates that it contains a subset rather than the entire codebase.

related: #328

Changes

  • Header indicates "subset of codebase" when appropriate
  • Added processing details (comment removal, line numbers, etc.)
  • Maintained important base notes about binary/excluded files

Checklist

  • Run npm run test
  • Run npm run lint

@yamadashy yamadashy changed the title from "Feat/accurate codebase description" to "feat: Add accurate codebase scope description to output header" Feb 1, 2025

codecov bot commented Feb 1, 2025

Codecov Report

Attention: Patch coverage is 95.40230% with 4 lines in your changes missing coverage. Please review.

Project coverage is 90.41%. Comparing base (92e510b) to head (f9ef427).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
src/core/output/outputStyleDecorate.ts 95.34% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #330      +/-   ##
==========================================
+ Coverage   90.27%   90.41%   +0.13%     
==========================================
  Files          48       48              
  Lines        2510     2588      +78     
  Branches      519      535      +16     
==========================================
+ Hits         2266     2340      +74     
- Misses        244      248       +4     


Contributor

coderabbitai bot commented Feb 1, 2025

Warning

Rate limit exceeded

@yamadashy has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 8 minutes and 55 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 225712b and f9ef427.

📒 Files selected for processing (8)
  • README.md (1 hunks)
  • src/core/output/outputGenerate.ts (3 hunks)
  • src/core/output/outputGeneratorTypes.ts (1 hunks)
  • src/core/output/outputStyleDecorate.ts (2 hunks)
  • tests/core/output/outputStyleDecorate.test.ts (1 hunks)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (2 hunks)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.xml (2 hunks)
  • website/server/cloudbuild.yaml (1 hunks)
📝 Walkthrough

Walkthrough

This pull request refactors the output generation process by restructuring how rendering contexts are handled. The previous inline RenderContext declaration in the output generation module is removed and redefined in a dedicated types file with additional, readonly properties. The createRenderContext and buildOutputGeneratorContext functions have been updated to integrate configuration data, ensuring that the generation header and other context elements now include the config parameter. In addition, the output styling module introduces a new ContentInfo interface and an analyzeContent function that assess content selection and processing details. The header and summary notes generation functions in this module are modified to leverage these enhancements, providing more dynamic and descriptive outputs. The changes also extend to updated unit tests, fixture text rewording, and a Cloud Run deployment configuration adjustment for increased CPU allocation.
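
As a rough illustration of the walkthrough above, the mapping from configuration to ContentInfo might look like the following sketch. The SketchConfig shape, the processing field names, and the rule that only include patterns and custom ignore patterns affect isEntireCodebase are assumptions made for illustration; only the selection field names appear in the review comments on this PR.

```typescript
// Simplified stand-in for RepomixConfigMerged (assumed shape, not the real type).
interface SketchConfig {
  include: string[];
  ignore: {
    useGitignore: boolean;
    useDefaultPatterns: boolean;
    customPatterns: string[];
  };
  output: {
    removeComments: boolean;
    removeEmptyLines: boolean;
    showLineNumbers: boolean;
  };
}

interface ContentInfo {
  selection: {
    isEntireCodebase: boolean;
    include: boolean;
    ignore: boolean;
    gitignore: boolean;
    defaultIgnore: boolean;
  };
  // Hypothetical field names for the processing details.
  processing: {
    commentsRemoved: boolean;
    emptyLinesRemoved: boolean;
    lineNumbersShown: boolean;
  };
}

// Derive selection and processing flags from the config alone.
const analyzeContent = (config: SketchConfig): ContentInfo => {
  const hasInclude = config.include.length > 0;
  const hasCustomIgnore = config.ignore.customPatterns.length > 0;
  return {
    selection: {
      // Assumption: only explicit include/ignore patterns narrow the scope.
      isEntireCodebase: !hasInclude && !hasCustomIgnore,
      include: hasInclude,
      ignore: hasCustomIgnore,
      gitignore: config.ignore.useGitignore,
      defaultIgnore: config.ignore.useDefaultPatterns,
    },
    processing: {
      commentsRemoved: config.output.removeComments,
      emptyLinesRemoved: config.output.removeEmptyLines,
      lineNumbersShown: config.output.showLineNumbers,
    },
  };
};

const subsetInfo = analyzeContent({
  include: ['src/**/*.ts'],
  ignore: { useGitignore: true, useDefaultPatterns: true, customPatterns: [] },
  output: { removeComments: true, removeEmptyLines: false, showLineNumbers: false },
});
console.log(subsetInfo.selection.isEntireCodebase); // false, because include is non-empty
```

Keeping the analysis separate from string generation means the header and the summary notes can both read the same flags instead of re-deriving them from the config.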

Sequence Diagram(s)

sequenceDiagram
    participant U as User/Caller
    participant B as buildOutputGeneratorContext
    participant C as createRenderContext

    U->>B: Call buildOutputGeneratorContext(rootDir, config, allFilePaths, processedFiles)
    B-->>U: Return OutputGeneratorContext (including config)
    U->>C: Call createRenderContext(OutputGeneratorContext)
    C-->>U: Return RenderContext with integrated config data
sequenceDiagram
    participant U as User/Caller
    participant G as generateHeader
    participant A as analyzeContent

    U->>G: Call generateHeader(config, generationDate)
    G->>A: Invoke analyzeContent(config)
    A-->>G: Return ContentInfo
    G-->>U: Return Generated Header with details from ContentInfo

Possibly related PRs

  • XML Escaping #287: Modifies the RenderContext interface and its usage in createRenderContext, aligning with changes related to the parsableStyle property in that PR.
  • feat(output): remove repository URL from output files #261: Impacts the createRenderContext function and the RenderContext interface through the removal of the generateSummaryAdditionalInfo function, directly affecting the output generation context.

@yamadashy yamadashy force-pushed the feat/accurate-codebase-description branch from 2098a61 to 225712b Compare February 1, 2025 09:27

cloudflare-workers-and-pages bot commented Feb 1, 2025

Deploying repomix with Cloudflare Pages

Latest commit: f9ef427
Status: ✅  Deploy successful!
Preview URL: https://ab0bf737.repomix.pages.dev
Branch Preview URL: https://feat-accurate-codebase-descr.repomix.pages.dev

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (1)

38-41: Clarify Exclusion Notes and Tweak Minor Wording
The “Notes:” section now explicitly outlines file exclusion rules based on configuration and ignore patterns. Consider the following minor suggestions:

  • Line 38: Evaluate whether a comma after “.gitignore rules” might improve readability.
  • Line 39: The phrase “binary files” appears twice; if this redundancy is not intentional, streamline the wording to avoid possible repetition.

Overall, these changes enhance clarity, but a quick review for stylistic consistency would be beneficial.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~38-~38: Possible missing comma found.
Context: ... have been excluded based on .gitignore rules and Repomix's configuration - Binary fi...

(AI_HYDRA_LEO_MISSING_COMMA)


[duplication] ~39-~39: Possible typo: you repeated a word.
Context: ...te list of file paths, including binary files - Files matching patterns in .gitignore are exc...

(ENGLISH_WORD_REPEAT_RULE)

src/core/output/outputStyleDecorate.ts (2)

3-18: Consider removing optional indicators for boolean fields in selection.
Currently, fields like include?, ignore?, etc., are declared as optional booleans. Since they are always assigned a definitive boolean value, you might simplify by removing ?, making them required booleans.

 interface ContentInfo {
   selection: {
     isEntireCodebase: boolean;
-    include?: boolean;
-    ignore?: boolean;
-    gitignore?: boolean;
-    defaultIgnore?: boolean;
+    include: boolean;
+    ignore: boolean;
+    gitignore: boolean;
+    defaultIgnore: boolean;
   };
   ...
 }

39-79: Refine processing description for clarity.
The phrase “processed where…” might read awkwardly if multiple transformations are performed. Replacing it with “processed with the following transformations…” can improve readability.

-    processingNotes.length > 0 ? ` The content has been processed where ${processingNotes.join(', ')}.` : '';
+    processingNotes.length > 0 ? ` The content has been processed with the following transformations: ${processingNotes.join(', ')}.` : '';
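
To see how the suggested wording reads once several notes are joined, here is a minimal sketch. The describeProcessing helper and the example note strings are hypothetical; the real notes are built inside outputStyleDecorate.ts and may be phrased differently.

```typescript
// Builds the trailing sentence of the header from a list of processing notes.
const describeProcessing = (processingNotes: readonly string[]): string =>
  processingNotes.length > 0
    ? ` The content has been processed with the following transformations: ${processingNotes.join(', ')}.`
    : '';

// Hypothetical note strings, used only to show the joined result.
const exampleNotes = ['comments have been removed', 'line numbers have been added'];
console.log(describeProcessing(exampleNotes));

// With no notes, nothing is appended to the header.
console.log(JSON.stringify(describeProcessing([])));
```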
tests/core/output/outputStyleDecorate.test.ts (1)

11-38: Consider adding an edge case test for default ignore patterns.
When ignore.useDefaultPatterns = true but no explicit ignores are set, confirm whether the code still detects the entire codebase correctly.

Would you like me to propose a test snippet for this scenario?
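
A test for this scenario could be sketched roughly as follows. The isEntireCodebase helper is a hypothetical stand-in that assumes default ignore patterns alone do not narrow the scope; the real check lives in analyzeContent and may differ. Plain throw statements are used instead of a test framework to keep the sketch self-contained.

```typescript
interface SelectionConfig {
  include: string[];
  ignore: { useDefaultPatterns: boolean; customPatterns: string[] };
}

// Hypothetical stand-in: default ignore patterns are assumed not to count
// as a user-chosen subset.
const isEntireCodebase = (config: SelectionConfig): boolean =>
  config.include.length === 0 && config.ignore.customPatterns.length === 0;

const edgeCases: Array<{ label: string; config: SelectionConfig; expected: boolean }> = [
  {
    label: 'default patterns only',
    config: { include: [], ignore: { useDefaultPatterns: true, customPatterns: [] } },
    expected: true,
  },
  {
    label: 'default patterns plus a custom ignore',
    config: { include: [], ignore: { useDefaultPatterns: true, customPatterns: ['docs/**'] } },
    expected: false,
  },
  {
    label: 'default patterns plus an include',
    config: { include: ['src/**'], ignore: { useDefaultPatterns: true, customPatterns: [] } },
    expected: false,
  },
];

// Fail loudly on the first case whose result differs from the expectation.
for (const edgeCase of edgeCases) {
  const actual = isEntireCodebase(edgeCase.config);
  if (actual !== edgeCase.expected) {
    throw new Error(`${edgeCase.label}: expected ${edgeCase.expected}, got ${actual}`);
  }
}
```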

src/core/output/outputGenerate.ts (1)

31-31: Convert Japanese comment to English.

For better maintainability and consistency, please translate the Japanese comment "configを追加" to English (e.g., "Added config parameter").

-    generationHeader: generateHeader(outputGeneratorContext.config, outputGeneratorContext.generationDate), // configを追加
+    generationHeader: generateHeader(outputGeneratorContext.config, outputGeneratorContext.generationDate), // Added config parameter
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92e510b and 2098a61.

📒 Files selected for processing (7)
  • src/core/output/outputGenerate.ts (3 hunks)
  • src/core/output/outputGeneratorTypes.ts (1 hunks)
  • src/core/output/outputStyleDecorate.ts (2 hunks)
  • tests/core/output/outputStyleDecorate.test.ts (1 hunks)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (2 hunks)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.xml (2 hunks)
  • website/server/cloudbuild.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.xml
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Test (windows-latest, 23.x)
  • GitHub Check: Test (windows-latest, 22.x)
  • GitHub Check: Test (windows-latest, 21.x)
  • GitHub Check: Test (windows-latest, 20.x)
  • GitHub Check: Test (windows-latest, 19.x)
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (9)
tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (1)

1-1: Header Wording Update & Subset Clarification
The introductory line now reads “This file is a merged representation of the entire codebase, combined into a single document.” While the simplified wording improves clarity, please verify that in scenarios where the --include or --ignore options are used the header is updated to explicitly indicate that the document represents a subset of the codebase rather than the entire repository.

src/core/output/outputStyleDecorate.ts (2)

20-37: Double-check isEntireCodebase logic.
When ignore.useDefaultPatterns = true, the codebase might still be partially excluded. Ensure this condition accurately represents the “entire codebase” scenario.

Would you like to run a script scanning references of isEntireCodebase for assumptions that default ignore patterns are excluded?


113-150: Code style and bullet usage are consistent.
The structure for notes is clear and easy to maintain. Great job!

src/core/output/outputGeneratorTypes.ts (1)

12-26: Good use of a read-only context interface.
The immutability and ReadonlyArray usage help prevent accidental mutations, which is a solid approach for maintaining a stable rendering configuration.
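
The effect of readonly properties and ReadonlyArray can be illustrated with a trimmed-down sketch. RenderContextSketch and ProcessedFileSketch are hypothetical types for illustration; the real RenderContext in outputGeneratorTypes.ts has more properties.

```typescript
// Hypothetical, trimmed-down types; not the actual RenderContext definition.
interface ProcessedFileSketch {
  readonly path: string;
  readonly content: string;
}

interface RenderContextSketch {
  readonly generationHeader: string;
  readonly processedFiles: ReadonlyArray<ProcessedFileSketch>;
}

const renderContext: RenderContextSketch = {
  generationHeader: 'This file is a merged representation of the entire codebase',
  processedFiles: [{ path: 'src/index.ts', content: 'export {};' }],
};

// Both statements below are rejected by the compiler if uncommented:
// renderContext.generationHeader = 'changed';                    // readonly property
// renderContext.processedFiles.push({ path: '', content: '' });  // no push on ReadonlyArray

// Reading and deriving new values remains fine.
const filePaths = renderContext.processedFiles.map((file) => file.path);
console.log(filePaths);
```

Note that these guarantees are compile-time only; nothing is frozen at runtime, so the protection holds as long as callers go through the typed interface.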

tests/core/output/outputStyleDecorate.test.ts (2)

40-85: Comprehensive coverage for generateHeader.
These tests effectively capture both entire codebase and subset usage, as well as security check toggles.


114-153: Thorough generateSummaryNotes testing.
All major scenarios for selection and processing are covered. This suite provides strong confidence in correctness.

src/core/output/outputGenerate.ts (2)

10-10: LGTM! Good separation of concerns.

Moving the RenderContext type to a dedicated types file improves code organization.


162-162: LGTM! Improved readability.

The added empty line improves readability by separating the error handling from the return statement.

website/server/cloudbuild.yaml (1)

41-41: Verify the need for increased CPU allocation.

The CPU allocation has been doubled from 1 to 2. While this could support the enhanced output generation process, please verify:

  1. Are there performance metrics justifying this increase?
  2. Have you considered the cost implications?
  3. Would vertical autoscaling be a more cost-effective solution?

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (3)

39-39: Improve binary file exclusion clarity.
This note indicates that binary files are not included in the packed representation while also directing users to the Repository Structure for a complete list (which includes binary files). To avoid potential confusion, consider rephrasing it for clarity. For example:

-Binary files are not included in this packed representation. Please refer to the Repository Structure section for a complete list of file paths, including binary files
+Binary files are omitted from this packed representation. For a complete list of all file paths—including those for binary files—please refer to the Repository Structure section.
🧰 Tools
🪛 LanguageTool

[duplication] ~39-~39: Possible typo: you repeated a word.
Context: ...te list of file paths, including binary files - Files matching patterns in .gitignore are exc...

(ENGLISH_WORD_REPEAT_RULE)


40-40: Evaluate potential redundancy in exclusion notes.
Line 40 mentions that files matching patterns in .gitignore are excluded. Since line 38 already touches on exclusions based on .gitignore and configuration, consider whether this point could be merged or reworded to reduce duplication and improve readability.


41-41: Confirm default ignore patterns note clarity.
The statement on line 41 is clear and concise, informing users that files matching default ignore patterns are excluded. Optionally, if further consolidation of exclusion notes is desired for a more unified style, you might merge it with the note on line 40.

src/core/output/outputStyleDecorate.ts (1)

42-55: Consider extracting description logic to a separate function.

The description generation logic could be more maintainable if extracted to a dedicated function.

+const generateSelectionDescription = (info: ContentInfo): string => {
+  if (info.selection.isEntireCodebase) {
+    return 'This file is a merged representation of the entire codebase';
+  }
+  const parts = [];
+  if (info.selection.include) {
+    parts.push('specifically included files');
+  }
+  if (info.selection.ignore) {
+    parts.push('files not matching ignore patterns');
+  }
+  return `This file is a merged representation of a subset of the codebase, containing ${parts.join(' and ')}`;
+};

 export const generateHeader = (config: RepomixConfigMerged, generationDate: string): string => {
   const info = analyzeContent(config);
-
-  // Generate selection description
-  let description: string;
-  if (info.selection.isEntireCodebase) {
-    description = 'This file is a merged representation of the entire codebase';
-  } else {
-    const parts = [];
-    if (info.selection.include) {
-      parts.push('specifically included files');
-    }
-    if (info.selection.ignore) {
-      parts.push('files not matching ignore patterns');
-    }
-    description = `This file is a merged representation of a subset of the codebase, containing ${parts.join(' and ')}`;
-  }
+  const description = generateSelectionDescription(info);
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 52-53: src/core/output/outputStyleDecorate.ts#L52-L53
Added lines #L52 - L53 were not covered by tests

src/core/output/outputGenerate.ts (1)

31-31: Replace Japanese comment with English.

-    generationHeader: generateHeader(outputGeneratorContext.config, outputGeneratorContext.generationDate), // configを追加
+    generationHeader: generateHeader(outputGeneratorContext.config, outputGeneratorContext.generationDate), // Added config parameter
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2098a61 and 225712b.

📒 Files selected for processing (6)
  • src/core/output/outputGenerate.ts (3 hunks)
  • src/core/output/outputGeneratorTypes.ts (1 hunks)
  • src/core/output/outputStyleDecorate.ts (2 hunks)
  • tests/core/output/outputStyleDecorate.test.ts (1 hunks)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (2 hunks)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.xml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/integration-tests/fixtures/packager/outputs/simple-project-output.xml
  • src/core/output/outputGeneratorTypes.ts
  • tests/core/output/outputStyleDecorate.test.ts
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/core/output/outputStyleDecorate.ts

[warning] 52-53: src/core/output/outputStyleDecorate.ts#L52-L53
Added lines #L52 - L53 were not covered by tests


[warning] 138-139: src/core/output/outputStyleDecorate.ts#L138-L139
Added lines #L138 - L139 were not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: Test (windows-latest, 23.x)
  • GitHub Check: Test (windows-latest, 22.x)
  • GitHub Check: Test (windows-latest, 21.x)
  • GitHub Check: Test (windows-latest, 20.x)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (8)
tests/integration-tests/fixtures/packager/outputs/simple-project-output.txt (2)

1-1: Verify header scope accuracy.
The header states "merged representation of the entire codebase, combined into a single document." Please verify that this description is accurate—especially in scenarios where file selection via --include or --ignore should result in a "subset of codebase" header as outlined in the PR objectives.


38-38: Approved file exclusion note.
The note on line 38 clearly explains that some files may be excluded based on both .gitignore rules and Repomix's configuration. This adds useful context for users reviewing the packed output.

src/core/output/outputStyleDecorate.ts (4)

3-18: LGTM! Well-structured interface design.

The ContentInfo interface provides a clear and logical separation between content selection and processing flags.


20-37: LGTM! Clean and efficient implementation.

The function correctly maps configuration values to the ContentInfo interface, with clear logic for determining content selection and processing details.


39-79: LGTM! Comprehensive header generation with accurate scope description.

The implementation aligns perfectly with the PR objectives by clearly distinguishing between entire codebase and subset representations.

Please add test coverage for the ignore patterns description at lines 52-53.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 52-53: src/core/output/outputStyleDecorate.ts#L52-L53
Added lines #L52 - L53 were not covered by tests
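
Assuming the uncovered lines correspond to the ignore-pattern branch of the description logic, the missing coverage could be sketched like this. The generateSelectionDescription function restates the extraction suggested earlier in this review so the example is self-contained; the SelectionInfo type is a reduced, hypothetical version of ContentInfo.

```typescript
interface SelectionInfo {
  selection: { isEntireCodebase: boolean; include: boolean; ignore: boolean };
}

// Same logic as the suggested extraction, restated here for a standalone example.
const generateSelectionDescription = (info: SelectionInfo): string => {
  if (info.selection.isEntireCodebase) {
    return 'This file is a merged representation of the entire codebase';
  }
  const parts: string[] = [];
  if (info.selection.include) {
    parts.push('specifically included files');
  }
  if (info.selection.ignore) {
    parts.push('files not matching ignore patterns');
  }
  return `This file is a merged representation of a subset of the codebase, containing ${parts.join(' and ')}`;
};

// Exercise the ignore-only branch, which the coverage report flags as untested.
const ignoreOnly = generateSelectionDescription({
  selection: { isEntireCodebase: false, include: false, ignore: true },
});
if (!ignoreOnly.endsWith('containing files not matching ignore patterns')) {
  throw new Error(`unexpected description: ${ignoreOnly}`);
}

// Exercise the combined branch as well, to check the ' and ' join.
const includeAndIgnore = generateSelectionDescription({
  selection: { isEntireCodebase: false, include: true, ignore: true },
});
if (!includeAndIgnore.endsWith('specifically included files and files not matching ignore patterns')) {
  throw new Error(`unexpected description: ${includeAndIgnore}`);
}
```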


113-150: LGTM! Comprehensive summary notes generation.

The implementation provides clear and well-organized notes about file selection and processing details.

Please add test coverage for the empty lines removal note at lines 138-139.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 138-139: src/core/output/outputStyleDecorate.ts#L138-L139
Added lines #L138 - L139 were not covered by tests

src/core/output/outputGenerate.ts (2)

10-10: LGTM! Correct type import.

The import statement correctly reflects the relocation of the RenderContext type.


Line range hint 162-169: LGTM! Correct context building.

The function correctly includes the config object needed for the updated header generation.

@yamadashy yamadashy force-pushed the feat/accurate-codebase-description branch from b5d274f to f9ef427 Compare February 1, 2025 09:45
@yamadashy yamadashy merged commit 8f9209c into main Feb 1, 2025
54 checks passed
@yamadashy yamadashy deleted the feat/accurate-codebase-description branch February 1, 2025 09:56