Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML Escaping #287

Merged
merged 11 commits into from
Jan 19, 2025
Merged

XML Escaping #287

merged 11 commits into from
Jan 19, 2025

Conversation

atollk
Copy link
Contributor

@atollk atollk commented Jan 12, 2025

See #282

Adds a parsableStyle flag to ensure the output strictly follows the specification of the output style.

Checklist

  • Run npm run test
  • Run npm run lint

Copy link

stackblitz bot commented Jan 12, 2025

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

Copy link
Contributor

coderabbitai bot commented Jan 12, 2025

📝 Walkthrough

Walkthrough

This pull request introduces enhancements for generating parsable XML output in the Repomix project. Key changes include the addition of the fast-xml-parser dependency, updates to the configuration schema to include a new parsableStyle option, and modifications to the output generation logic to facilitate XML-based output. The implementation allows users to generate structured XML representations of repository analysis, supported by new configuration options and improved output generation capabilities.

Changes

File Change Summary
package.json Added fast-xml-parser dependency (version ^4.5.1)
src/config/configSchema.ts - Added parsableStyle boolean option
- Updated multiple output configuration defaults
- Enhanced configuration schema
src/core/output/outputGenerate.ts - Introduced RenderContext interface
- Added generateParsableXmlOutput function
- Modified generateOutput to support different output styles
tests/cli/actions/defaultAction.test.ts Added parsableStyle: false to test configurations
tests/cli/cliRun.test.ts Added parsableStyle: false to mocked configurations
tests/config/configSchema.test.ts Updated test configurations to include parsableStyle
tests/core/output/outputGenerate.test.ts Added new test case for XML output generation
src/cli/actions/defaultAction.ts Introduced conditional logic for parsableStyle in CLI configuration
src/cli/cliRun.ts Added parsableStyle to CliOptions interface and command-line options
src/core/output/outputStyles/markdownStyle.ts Modified getMarkdownTemplate to use dynamic markdown code block delimiter
website/server/src/remoteRepo.ts Added parsableStyle to cliOptions
website/server/src/schemas/request.ts Added outputParsable field to packOptionsSchema
website/server/src/types.ts Added outputParsable property to PackOptions interface
README.md Made formatting adjustments for improved readability

Possibly related PRs

  • fix(template): Fix template break line #244: The changes in this PR involve modifications to the markdown template rendering, which is indirectly related to the new XML output generation introduced in the main PR that utilizes the fast-xml-parser library.
  • feat(output): remove repository URL from output files #261: This PR modifies the output files by removing the repository URL, which is relevant to the output generation changes in the main PR, particularly in the context of how output is structured and rendered.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@atollk
Copy link
Contributor Author

atollk commented Jan 12, 2025

Not yet done: Adding the new option to the website; making the Markdown output parsable as well, for consistency.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
src/core/output/outputGenerate.ts (2)

55-57: Remove unused 'config' parameter in 'generateParsableXmlOutput' function

The config parameter is not used within the generateParsableXmlOutput function. Removing it improves code clarity.

Apply this diff to remove the unused parameter:

-const generateParsableXmlOutput = async (
-  config: RepomixConfigMerged,
   renderContext: RenderContext,
 ): Promise<string> => {

92-95: Align function signature of 'generateParsableMarkdownOutput' with other output generators

For consistency and future maintainability, consider updating generateParsableMarkdownOutput to accept config and renderContext as parameters, similar to generateParsableXmlOutput and generateHandlebarOutput.

Apply this diff to align the function signature and update the call:

 const generateParsableMarkdownOutput = async (
+  config: RepomixConfigMerged,
+  renderContext: RenderContext,
 ): Promise<string> => {
   // TODO
   throw new Error('TODO');
 };

 // Update the call in generateOutput function
 switch (config.output.style) {
   case 'xml':
     return generateParsableXmlOutput(config, renderContext);
   case 'markdown':
-    return generateParsableMarkdownOutput();
+    return generateParsableMarkdownOutput(config, renderContext);
   default:
     return generateHandlebarOutput(config, renderContext);
 }
tests/cli/cliRun.test.ts (1)

50-50: Consider adding specific test cases for the parsableStyle flag.

While the property is correctly added to mock configurations, there are no test cases specifically verifying the behavior of this new flag.

Consider adding test cases like:

test('should handle --parsable-style flag', async () => {
  await executeAction('.', process.cwd(), { parsableStyle: true });

  expect(defaultAction.runDefaultAction).toHaveBeenCalledWith(
    '.',
    process.cwd(),
    expect.objectContaining({
      output: expect.objectContaining({
        parsableStyle: true,
      }),
    }),
  );
});

Also applies to: 88-88

tests/cli/actions/defaultAction.test.ts (1)

24-24: Add test cases for the parsableStyle flag following existing patterns.

The file already has comprehensive test coverage for similar output flags. Consider adding a new describe block for the parsableStyle flag following the same pattern as other output flags.

Example implementation:

describe('parsableStyle flag', () => {
  it('should handle --parsable-style flag', async () => {
    const options: CliOptions = {
      parsableStyle: true,
    };

    await runDefaultAction('.', process.cwd(), options);

    expect(configLoader.mergeConfigs).toHaveBeenCalledWith(
      process.cwd(),
      expect.anything(),
      expect.objectContaining({
        output: {
          parsableStyle: true,
        },
      }),
    );
  });

  it('should handle explicit --no-parsable-style flag', async () => {
    const options: CliOptions = {
      parsableStyle: false,
    };

    await runDefaultAction('.', process.cwd(), options);

    expect(configLoader.mergeConfigs).toHaveBeenCalledWith(
      process.cwd(),
      expect.anything(),
      expect.objectContaining({
        output: {
          parsableStyle: false,
        },
      }),
    );
  });
});
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fcbbd84 and a4af550.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • package.json (1 hunks)
  • src/config/configSchema.ts (1 hunks)
  • src/core/output/outputGenerate.ts (4 hunks)
  • tests/cli/actions/defaultAction.test.ts (1 hunks)
  • tests/cli/cliRun.test.ts (2 hunks)
  • tests/config/configSchema.test.ts (2 hunks)
  • tests/core/output/outputGenerate.test.ts (2 hunks)
🔇 Additional comments (4)
tests/core/output/outputGenerate.test.ts (1)

34-61: Good addition of test case for parsable XML output

The new test effectively verifies the XML output generation when parsableStyle is enabled, ensuring the functionality works as expected.

src/config/configSchema.ts (1)

59-59: Addition of 'parsableStyle' property enhances configurability

Adding the parsableStyle boolean property with a default value of false integrates well into the existing configuration schema and allows users to enable parsable output formats.

tests/config/configSchema.test.ts (1)

67-67: LGTM! Test coverage for the new parsableStyle flag looks good.

The new configuration property is consistently added across test cases with appropriate default values.

Also applies to: 151-151

package.json (1)

69-69: LGTM! The fast-xml-parser dependency is appropriate for XML handling.

The package is well-maintained and widely used in the community. The caret range allows for minor updates while maintaining compatibility.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/core/output/outputGenerate.test.ts (1)

34-61: Enhance XML test coverage.

While the basic XML parsing test is good, consider adding test cases for:

  1. XML content with special characters that need escaping (<, >, &, ", ')
  2. More complex XML structures with nested elements
   const mockProcessedFiles: ProcessedFile[] = [
-    { path: 'file1.txt', content: '<div>foo</div>' },
-    { path: 'dir/file2.txt', content: 'if (a && b)' },
+    { path: 'file1.txt', content: '<div>foo & bar</div>' },
+    { path: 'dir/file2.txt', content: 'if (a && b) { return "<xml>"; }' },
+    { path: 'file3.xml', content: '<root><child attr="value & quote">nested & content</child></root>' },
   ];
src/core/output/outputGenerate.ts (1)

55-69: Consider refactoring the markdown delimiter calculation for better readability.

The nested map/reduce operations for calculating markdownCodeBlockDelimiter could be simplified for better maintainability.

Consider extracting this logic into a separate function:

+ const calculateMarkdownDelimiter = (files: ReadonlyArray<ProcessedFile>): string => {
+   const maxBackticks = files
+     .flatMap(file => file.content.match(/`*/g) ?? [])
+     .reduce((max, match) => Math.max(max, match.length), 0);
+   return '`'.repeat(Math.max(3, maxBackticks + 1));
+ };

  const createRenderContext = (outputGeneratorContext: OutputGeneratorContext): RenderContext => {
    return {
      // ... other properties
-     markdownCodeBlockDelimiter: '`'.repeat(
-       Math.max(
-         3,
-         1 +
-           outputGeneratorContext.processedFiles
-             .map(
-               (file) =>
-                 file.content
-                   .match(/`*/g)
-                   ?.map((s) => s.length)
-                   .reduce((acc, l) => (l > acc ? l : acc), 0) ?? 0,
-             )
-             .reduce((acc, l) => (l > acc ? l : acc), 0),
-       ),
-     ),
+     markdownCodeBlockDelimiter: calculateMarkdownDelimiter(outputGeneratorContext.processedFiles),
    };
  };
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a4af550 and f448016.

📒 Files selected for processing (8)
  • src/cli/actions/defaultAction.ts (1 hunks)
  • src/cli/cliRun.ts (2 hunks)
  • src/config/configSchema.ts (2 hunks)
  • src/core/output/outputGenerate.ts (4 hunks)
  • src/core/output/outputStyles/markdownStyle.ts (1 hunks)
  • tests/cli/actions/defaultAction.test.ts (2 hunks)
  • tests/cli/cliRun.test.ts (3 hunks)
  • tests/core/output/outputGenerate.test.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/cli/actions/defaultAction.test.ts
  • tests/cli/cliRun.test.ts
  • src/config/configSchema.ts
🔇 Additional comments (8)
src/core/output/outputStyles/markdownStyle.ts (1)

43-45: Great improvement using dynamic code block delimiters!

The change from hardcoded delimiters to {{{../markdownCodeBlockDelimiter}}} is a good solution for handling nested code blocks and ensuring the output is properly parsable.

src/cli/cliRun.ts (2)

24-24: LGTM! Interface update is correct.

The optional boolean property is properly typed.


51-51: Well-documented CLI option!

The description clearly explains the purpose of the --parsableStyle flag.

src/cli/actions/defaultAction.ts (1)

115-117: LGTM! Consistent implementation pattern.

The handling of parsableStyle follows the established pattern and correctly preserves other output settings.

tests/core/output/outputGenerate.test.ts (1)

63-87: LGTM! Comprehensive markdown test coverage.

The test cases effectively verify:

  • Proper header rendering
  • Nested code block handling with increased delimiter count
  • Various content types
src/core/output/outputGenerate.ts (3)

22-36: Well-structured interface definition with proper immutability!

The RenderContext interface is well-designed with:

  • Clear and descriptive property names
  • Appropriate use of readonly for immutability
  • Proper typing for all properties

133-141: Verify the handling of parsableStyle for markdown output.

The code routes to generateHandlebarOutput for markdown style even when parsableStyle is true. This seems inconsistent with the PR objectives which mention making markdown output parsable.

Consider implementing a generateParsableMarkdownOutput function or document why markdown doesn't need special handling when parsableStyle is true.


124-142: Consider adding tests for edge cases.

The output generation logic handles multiple formats and configurations. Consider adding tests for:

  • Empty file content
  • Special characters in file paths
  • Very large files
  • Invalid XML characters in content

Would you like me to help create test cases for these scenarios?

@atollk
Copy link
Contributor Author

atollk commented Jan 16, 2025

@yamadashy I'll also add a checkbox with the new feature to the website, unless there is a reason not to?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
src/core/output/outputGenerate.ts (1)

Line range hint 106-121: Add error handling for template compilation.

While the implementation is correct, it lacks error handling for template compilation failures.

Apply this diff to add error handling:

   const compiledTemplate = Handlebars.compile(template);
-  return `${compiledTemplate(renderContext).trim()}\n`;
+  try {
+    return `${compiledTemplate(renderContext).trim()}\n`;
+  } catch (error) {
+    throw new RepomixError(
+      `Failed to compile template: ${error instanceof Error ? error.message : 'Unknown error'}`
+    );
+  }
🧹 Nitpick comments (8)
website/server/src/types.ts (1)

9-9: Consider aligning property name with PR description.

The property name outputParsable differs from parsableStyle mentioned in the PR description. Consider using consistent naming across the codebase to avoid confusion.

-  outputParsable?: boolean;
+  parsableStyle?: boolean;
website/server/src/schemas/request.ts (1)

26-26: Align schema property name with PR description.

The schema property name should match the naming used across the codebase. Additionally, the schema validation looks correct.

-    outputParsable: z.boolean().optional(),
+    parsableStyle: z.boolean().optional(),
website/server/src/remoteRepo.ts (2)

10-10: Review import path crossing package boundaries.

The import path ../../../src/cli/actions/defaultAction.js traverses outside the website directory, which could cause issues with package boundaries. Consider importing from the package instead.

-import type {DefaultActionRunnerResult} from "../../../src/cli/actions/defaultAction.js";
+import type {DefaultActionRunnerResult} from "repomix";

54-54: Consider consistent property naming.

The mapping between outputParsable and parsableStyle bridges the naming inconsistency. This could be simplified if the property names were consistent throughout the codebase.

-    parsableStyle: validatedData.options.outputParsable,
+    parsableStyle: validatedData.options.parsableStyle,
src/core/output/outputGenerate.ts (2)

38-43: Consider optimizing the regex pattern.

The implementation is correct and ensures proper code block escaping. However, the regex pattern could be optimized.

Apply this diff to optimize the regex pattern:

-    .flatMap((file) => file.content.match(/`*/g) ?? [])
+    .flatMap((file) => file.content.match(/`+/g) ?? [])

The + quantifier is more efficient than * here as we're only interested in sequences of one or more backticks.


123-141: Consider explicit default style handling.

The implementation correctly routes output generation based on style and parsable flag. However, the default case could be more explicit.

Apply this diff to improve default case handling:

   if (!config.output.parsableStyle) return generateHandlebarOutput(config, renderContext);
   switch (config.output.style) {
     case 'xml':
       return generateParsableXmlOutput(renderContext);
     case 'markdown':
       return generateHandlebarOutput(config, renderContext);
     default:
-      return generateHandlebarOutput(config, renderContext);
+      // Plain text style is the default
+      return generateHandlebarOutput(config, renderContext);
   }
README.md (2)

200-200: Add language identifiers to code blocks.

Several code blocks are missing language identifiers, which affects syntax highlighting.

Add appropriate language identifiers to the code blocks in the prompt examples section, for example:

- ```
+ ```text
This file contains my entire codebase...

Also applies to: 208-208, 216-216, 224-224

🧰 Tools
🪛 Markdownlint (0.37.0)

200-200: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


637-637: Use proper Markdown link syntax for URLs.

Replace bare URLs with proper Markdown link syntax.

Apply this change:

- https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips
+ [Long Context Tips](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips)
🧰 Tools
🪛 Markdownlint (0.37.0)

637-637: null
Bare URL used

(MD034, no-bare-urls)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f448016 and 55ef7e5.

📒 Files selected for processing (5)
  • README.md (22 hunks)
  • src/core/output/outputGenerate.ts (4 hunks)
  • website/server/src/remoteRepo.ts (3 hunks)
  • website/server/src/schemas/request.ts (1 hunks)
  • website/server/src/types.ts (1 hunks)
🧰 Additional context used
🪛 Markdownlint (0.37.0)
README.md

200-200: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


208-208: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


216-216: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


224-224: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


637-637: null
Bare URL used

(MD034, no-bare-urls)

🪛 LanguageTool
README.md

[style] ~246-~246: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...tting the most out of AI code analysis Feel free to join the discussion and share your own ...

(FEEL_FREE_TO_STYLE_ME)

🔇 Additional comments (4)
website/server/src/remoteRepo.ts (1)

68-68: LGTM! Good type safety practice.

Adding explicit type annotation for the result variable improves type safety and code clarity.

src/core/output/outputGenerate.ts (2)

3-3: LGTM! Well-structured interface design.

The RenderContext interface is comprehensive and clearly defines all necessary properties for rendering context. The addition of the XMLBuilder import supports the new XML output functionality.

Also applies to: 22-36


66-104: LGTM! Robust XML generation with proper error handling.

The implementation includes:

  • Well-structured XML document
  • Conditional sections based on configuration
  • Proper error handling with specific error messages
README.md (1)

478-498: LGTM! Clear documentation of configuration options.

The configuration options table is well-structured and provides clear descriptions for all options, including the new parsableStyle option.

Copy link

codecov bot commented Jan 18, 2025

Codecov Report

Attention: Patch coverage is 88.31169% with 9 lines in your changes missing coverage. Please review.

Project coverage is 92.12%. Comparing base (997f752) to head (84644cd).
Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
src/core/output/outputGenerate.ts 87.32% 9 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #287      +/-   ##
==========================================
- Coverage   92.20%   92.12%   -0.08%     
==========================================
  Files          44       44              
  Lines        2168     2236      +68     
  Branches      474      491      +17     
==========================================
+ Hits         1999     2060      +61     
- Misses        169      176       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@yamadashy
Copy link
Owner

yamadashy commented Jan 18, 2025

@atollk

@yamadashy I'll also add a checkbox with the new feature to the website, unless there is a reason not to?

Yes, you can add the checkbox to the website, but there are a few steps we need to follow:

  1. First, we need to release this feature to npm, as repomix.com uses the npm package version
  2. After the release, I'll need to update the website myself as I'm the only one who can deploy changes

So let's focus on merging and releasing this feature first, and then I'll handle the website update. Sorry for the complicated process!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/core/output/outputGenerate.ts (2)

38-43: Add input validation to calculateMarkdownDelimiter

While the function logic is solid, consider adding input validation to handle edge cases.

 const calculateMarkdownDelimiter = (files: ReadonlyArray<ProcessedFile>): string => {
+  if (!files?.length) {
+    return '```';
+  }
   const maxBackticks = files
     .flatMap((file) => file.content.match(/`+/g) ?? [])
     .reduce((max, match) => Math.max(max, match.length), 0);
   return '`'.repeat(Math.max(3, maxBackticks + 1));
 };

139-146: Simplify output style routing logic

The default case duplicates the markdown case. Consider simplifying the logic.

   if (!config.output.parsableStyle) return generateHandlebarOutput(config, renderContext);
   switch (config.output.style) {
     case 'xml':
       return generateParsableXmlOutput(renderContext);
-    case 'markdown':
-      return generateHandlebarOutput(config, renderContext);
     default:
       return generateHandlebarOutput(config, renderContext);
   }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55ef7e5 and 2ebb27c.

📒 Files selected for processing (2)
  • src/core/output/outputGenerate.ts (4 hunks)
  • website/server/src/remoteRepo.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • website/server/src/remoteRepo.ts
🔇 Additional comments (3)
src/core/output/outputGenerate.ts (3)

22-36: Well-structured interface definition!

The RenderContext interface is well-designed with:

  • Clear readonly properties preventing accidental mutations
  • Comprehensive coverage of all output requirements
  • Logical grouping of related properties

61-62: LGTM! Good integration of new features

The new properties are well-integrated:

  • escapeFileContent is properly derived from parsableStyle
  • markdownCodeBlockDelimiter uses the new calculation function

66-104: Consider XML content sanitization

While the XML structure is well-organized, direct insertion of file content without sanitization could lead to XML injection or malformed output if the content contains special XML characters.

Consider running this script to check for potential XML special characters in file contents:

@atollk
Copy link
Contributor Author

atollk commented Jan 18, 2025

Makes sense to me. I added the backend logic for the website already, which shouldn't be an issue, because it's basically invisible to users. Unless you want me to revert that, I think this is ready to be merged.

@atollk atollk marked this pull request as ready for review January 18, 2025 10:29
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
tests/core/output/outputGenerate.test.ts (1)

63-87: Consider adding assertions for nested code block escaping.

The test case is well-structured, but could be enhanced by explicitly verifying that nested code blocks are properly escaped.

Add assertions to verify the escaping of nested code blocks:

     expect(output).toContain('## File: dir/file2.txt');
     expect(output).toContain('````\n```\ncontent2\n```\n````');
+    // Verify nested code block escaping
+    expect(output).not.toContain('`````');  // Should not over-escape
+    expect(output.match(/`{4}\n[^`].*\n`{4}/s)).not.toBeNull();  // Should be properly wrapped
src/core/output/outputGenerate.ts (1)

38-43: Add input validation for files array.

The function handles the calculation well, but should validate the input array.

Add input validation:

 const calculateMarkdownDelimiter = (files: ReadonlyArray<ProcessedFile>): string => {
+  if (!Array.isArray(files)) {
+    throw new RepomixError('Invalid input: files must be an array');
+  }
   const maxBackticks = files
     .flatMap((file) => file.content.match(/`+/g) ?? [])
     .reduce((max, match) => Math.max(max, match.length), 0);
   return '`'.repeat(Math.max(3, maxBackticks + 1));
 };
README.md (2)

482-482: Add missing article "the".

Fix the grammar in the parsableStyle description.

-| `output.parsableStyle`           | Whether to escape the output based on the chosen style schema. Note that this can increase token count.                      | `false`                |
+| `output.parsableStyle`           | Whether to escape the output based on the chosen style schema. Note that this can increase the token count.                      | `false`                |
🧰 Tools
🪛 LanguageTool

[uncategorized] ~482-~482: You might be missing the article “the” here.
Context: ...yle schema. Note that this can increase token count. | false ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


637-637: Format bare URL as a proper link.

Convert the bare URL to a proper markdown link.

-https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips
+[Long Context Tips](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips)
🧰 Tools
🪛 Markdownlint (0.37.0)

637-637: null
Bare URL used

(MD034, no-bare-urls)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ebb27c and 785f7b6.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (14)
  • README.md (22 hunks)
  • package.json (1 hunks)
  • src/cli/actions/defaultAction.ts (1 hunks)
  • src/cli/cliRun.ts (2 hunks)
  • src/config/configSchema.ts (2 hunks)
  • src/core/output/outputGenerate.ts (4 hunks)
  • src/core/output/outputStyles/markdownStyle.ts (1 hunks)
  • tests/cli/actions/defaultAction.test.ts (2 hunks)
  • tests/cli/cliRun.test.ts (3 hunks)
  • tests/config/configSchema.test.ts (2 hunks)
  • tests/core/output/outputGenerate.test.ts (2 hunks)
  • website/server/src/remoteRepo.ts (1 hunks)
  • website/server/src/schemas/request.ts (1 hunks)
  • website/server/src/types.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • website/server/src/types.ts
  • website/server/src/remoteRepo.ts
  • src/cli/actions/defaultAction.ts
  • website/server/src/schemas/request.ts
  • package.json
  • src/core/output/outputStyles/markdownStyle.ts
  • tests/config/configSchema.test.ts
  • src/cli/cliRun.ts
  • src/config/configSchema.ts
  • tests/cli/actions/defaultAction.test.ts
  • tests/cli/cliRun.test.ts
🧰 Additional context used
🪛 LanguageTool
README.md

[uncategorized] ~114-~114: You might be missing the article “the” here.
Context: ...Gitingest, which is better suited for Python ecosystem and data science workflows: h...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[style] ~246-~246: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...tting the most out of AI code analysis Feel free to join the discussion and share your own ...

(FEEL_FREE_TO_STYLE_ME)


[uncategorized] ~482-~482: You might be missing the article “the” here.
Context: ...yle schema. Note that this can increase token count. | false ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 Markdownlint (0.37.0)
README.md

200-200: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


208-208: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


216-216: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


224-224: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


637-637: null
Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Test (windows-latest, 23.x)
  • GitHub Check: Test (windows-latest, 22.x)
  • GitHub Check: Test (windows-latest, 21.x)
  • GitHub Check: Test (windows-latest, 20.x)
  • GitHub Check: Test (windows-latest, 19.x)
  • GitHub Check: Test (windows-latest, 18.x)
  • GitHub Check: Test (windows-latest, 18.0.0)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (3)
tests/core/output/outputGenerate.test.ts (1)

34-61: LGTM! Well-structured XML test case.

The test case thoroughly validates the XML output by:

  1. Using proper XML content in test data
  2. Parsing the output with XMLParser
  3. Verifying both structure and content
src/core/output/outputGenerate.ts (2)

66-104: LGTM! Well-implemented XML generation with proper error handling.

The XML generation is well-structured with:

  1. Clear document hierarchy
  2. Proper error handling
  3. Conditional sections based on flags

139-147: LGTM! Clean and maintainable output routing.

The output routing logic is clear and follows good practices:

  1. Early return for non-parsable styles
  2. Clear switch statement for parsable styles
  3. Default case handled properly

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
README.md (5)

50-51: Consider adding more context about security checks.

While the description is accurate, it would be helpful to briefly mention what kind of sensitive information Secretlint can detect (e.g., API keys, passwords, tokens).

- **Security-Focused**: Incorporates [Secretlint](https://github.com/secretlint/secretlint) for robust security checks
-   to detect and prevent inclusion of sensitive information.
+ **Security-Focused**: Incorporates [Secretlint](https://github.com/secretlint/secretlint) for robust security checks
+   to detect and prevent inclusion of sensitive information like API keys, passwords, and access tokens.

314-314: Add indentation to improve XML readability.

The directory structure within the XML example would be more readable with proper indentation.

 <directory_structure>
-src/
-cli/
-cliOutput.ts
-index.ts
+  src/
+    cli/
+      cliOutput.ts
+      index.ts

Also applies to: 319-321


519-529: Add example patterns in the configuration example.

The include and ignore patterns could be more illustrative with real-world examples.

   "include": [
-    "**/*"
+    "**/*",
+    "src/**/*.{js,ts}",
+    "docs/**/*.md"
   ],
   "ignore": {
     "useGitignore": true,
     "useDefaultPatterns": true,
     // Patterns can also be specified in .repomixignore
     "customPatterns": [
-      "additional-folder",
-      "**/*.log"
+      "temp/",
+      "**/*.log",
+      "**/*.{png,jpg,gif}",
+      "coverage/"
     ]

200-200: Add language specifiers to fenced code blocks.

Some code blocks are missing language specifiers, which help with syntax highlighting.

- ```
+ ```text
Your prompt here

Also applies to: 208-208, 216-216, 224-224

🧰 Tools
🪛 Markdownlint (0.37.0)

200-200: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


638-638: Wrap bare URL in angle brackets.

For better markdown formatting, wrap the bare URL in angle brackets.

- https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips
+ <https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips>
🧰 Tools
🪛 Markdownlint (0.37.0)

638-638: null
Bare URL used

(MD034, no-bare-urls)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 785f7b6 and 84644cd.

📒 Files selected for processing (2)
  • README.md (23 hunks)
  • src/cli/cliRun.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/cli/cliRun.ts
🧰 Additional context used
🪛 Markdownlint (0.37.0)
README.md

200-200: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


208-208: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


216-216: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


224-224: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


638-638: null
Bare URL used

(MD034, no-bare-urls)

🪛 LanguageTool
README.md

[style] ~246-~246: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...tting the most out of AI code analysis Feel free to join the discussion and share your own ...

(FEEL_FREE_TO_STYLE_ME)

🔇 Additional comments (7)
README.md (7)

33-34: LGTM! Clear and concise feature description.

The split into two lines improves readability while maintaining clarity about the tool's purpose and compatibility with various AI tools.


342-346: LGTM! Excellent explanation of XML benefits.

The explanation effectively communicates the advantages of using XML tags with AI systems, supported by a relevant quote from Anthropic's documentation.


479-499: LGTM! Well-structured configuration table.

The configuration table is comprehensive and well-organized, with clear descriptions for each option, including the new parsableStyle option.


570-581: LGTM! Clear explanation of ignore patterns.

The documentation effectively explains the multiple methods available for ignoring files and their priority order.


594-596: LGTM! Important clarification about binary files.

The note about binary files provides crucial information about how they are handled in the output.


600-602: LGTM! Clear explanation of custom instructions.

The documentation effectively explains the purpose and benefits of using custom instructions.


658-660: LGTM! Comprehensive security check documentation.

The security check feature is well-documented, with clear explanations of its purpose and functionality.

@yamadashy
Copy link
Owner

@atollk
Thank you for the implementation!
I made a few minor tweaks, but the implementation is solid and I've tested it locally with no issues.
Thanks for the refactoring as well!

I'll be releasing a new version soon, followed by server deployment and UI updates. Thank you!

@yamadashy yamadashy merged commit c774b8e into yamadashy:main Jan 19, 2025
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants