Adding CSV Parser #1996

Merged
merged 14 commits into from
Oct 2, 2024

Conversation

saravana87
Contributor

Purpose

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[x] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings, and usage instructions. If text or screenshots need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[x] No

Type of change

[ ] Bugfix
[x] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

saravana87 and others added 7 commits September 25, 2024 01:57
I am adding the CSV parser python code.
it works with basic CSV files.
updating the csv parser code and importing the CsvParser class
Added CSV Test file
Formatted the file
Formatted the file
@pamelafox
Collaborator

Looks like there are still a few ruff issues.

@pamelafox
Collaborator

You need to run ruff check --fix . on the codebase. I'd be happy to make the change myself, but since your pull request is from the main branch, GitHub does not allow me to push changes to your branch, so you will need to make all the necessary changes to pass the CI checks. Thanks!

@pamelafox
Collaborator

The CI now fails on the mypy step. Here's what I suggest as the change to csvparser:

    async def parse(self, content: IO) -> AsyncGenerator[Page, None]:
        # Check if content is in bytes (binary file) and decode to string
        content_str: str
        if isinstance(content, (bytes, bytearray)):
            content_str = content.decode("utf-8")
        elif hasattr(content, "read"):  # Handle BufferedReader
            content_str = content.read().decode("utf-8")
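
For readers who want to try the idea end to end, here is a minimal, self-contained sketch of the same decoding step feeding a CSV reader. It is not the merged implementation: the `Page` dataclass is a hypothetical stand-in for the project's own type, `to_text`, `parse_csv`, and `collect` are illustrative names, and skipping the header row and emitting one page per row are assumptions of this sketch.

```python
import asyncio
import csv
import io
from dataclasses import dataclass
from typing import IO, AsyncGenerator, Union


@dataclass
class Page:
    """Hypothetical stand-in for the project's Page type."""

    page_num: int
    offset: int
    text: str


def to_text(content: Union[bytes, bytearray, str, IO]) -> str:
    """Normalize raw bytes, a str, or a file-like object to a str."""
    if isinstance(content, (bytes, bytearray)):
        return bytes(content).decode("utf-8")
    if isinstance(content, str):
        return content
    if hasattr(content, "read"):  # e.g. BufferedReader, BytesIO, StringIO
        data = content.read()
        if isinstance(data, (bytes, bytearray)):
            return bytes(data).decode("utf-8")
        return data
    # An explicit failure keeps the result from ever being unbound.
    raise TypeError(f"Unsupported content type: {type(content).__name__}")


async def parse_csv(content: Union[bytes, bytearray, str, IO]) -> AsyncGenerator[Page, None]:
    """Yield one Page per data row (assumption: the first row is a header)."""
    reader = csv.reader(io.StringIO(to_text(content)))
    next(reader, None)  # skip the header row, if any
    offset = 0
    for page_num, row in enumerate(reader):
        text = ",".join(row)
        yield Page(page_num=page_num, offset=offset, text=text)
        offset += len(text) + 1  # +1 accounts for the newline


async def collect(content: Union[bytes, bytearray, str, IO]) -> list:
    """Helper that drains the async generator into a list."""
    return [page async for page in parse_csv(content)]


if __name__ == "__main__":
    sample = io.BytesIO(b"name,age\nAlice,30\nBob,25\n")
    for page in asyncio.run(collect(sample)):
        print(page)
```

Raising `TypeError` in the fallthrough case is one way to guarantee the decoded string is always assigned before use; whether the real parser does this is not stated in this thread.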

The next CI step is black. You can run black . locally to fix the formatting.

The step after that is running tests. I think those are currently failing because you haven't brought in the latest changes from the main branch. There should only be a single-line difference in prepdocs.py, the line that adds CSVParser, but it's instead showing many changed lines.

The CONTRIBUTING.md has details about how to run these steps locally, by the way:
https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/CONTRIBUTING.md#submit-pr

@saravana87
Contributor Author

Thanks

Thanks for the code. I added it and formatted the files with black.

@pamelafox
Collaborator

Have you made a commit with the changes to run black and to update prepdocs.py to match main? I only see the mypy commit, and CI is still failing.

@pamelafox
Collaborator

I've made a few changes to merge this with main, and it's now looking good. I've also tested it locally. It has the same drawback as other file types, where the browser downloads the CSV instead of rendering it, but that's not unique to the CSV file format. Merging!

@pamelafox pamelafox merged commit 2dd7ba9 into Azure-Samples:main Oct 2, 2024
9 checks passed
@saravana87
Contributor Author

Thank you for helping me get the files into main.
