Make CsvFileReader.readNext() private #100
Comments
Would be nice to have this (or an equivalent) to be able to just read the header row of a CSV file |
I think it would be good to have access to basic CSV parser functionality, without the line length checks or header parsing logic, as a fallback in case you need to parse very unusual CSVs, or to parse malformed CSVs for error reporting or other purposes. If a file contains blank rows, or rows with differing numbers of fields, it is still possible to load it into Excel. Spreadsheet or generic CSV parsing software does not care about the number of fields or about headers; it only cares about the most basic structure of a CSV. Even a malformed CSV with blank rows or rows with a different number of fields will be laid out in table form with row and column numbers. This type of software consistently puts cells in the same row and column numbers, regardless of the vendor. I have tested this with Excel, LibreOffice and Modern CSV, and they all give consistent row numbers for files with blank rows, differing numbers of fields and fields containing newlines. Using
With I do agree that the behavior of |
Thank you for your very useful input. I completely agree with your opinion. |
I'm currently using
If For my second use case, there currently are only utility methods to |
It seems readNext() reads line by line and can work with very little memory, whereas readAllSequence() forces the whole CSV to be loaded into memory first. What is the alternative if memory is constrained or we need to process very large CSVs? |
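For reference, here is a minimal sketch of line-by-line processing using only the Kotlin standard library. It does not use this library's API: readHeader and sumColumn are hypothetical helpers, and the naive split(',') assumes no quoted fields and no newlines inside fields. It only shows that a Sequence built on a reader pulls one line at a time, so the header can be read on its own and the rest streamed without holding the file in memory.

```kotlin
import java.io.BufferedReader
import java.io.StringReader

// Hypothetical helper: reads only the first line and splits it naively.
// Assumption: the header contains no quoted delimiters.
fun readHeader(reader: BufferedReader): List<String>? =
    reader.readLine()?.split(',')

// Hypothetical helper: streams the remaining lines lazily and sums one column.
// Rows that lack the column or hold a non-numeric value are ignored.
fun sumColumn(reader: BufferedReader, column: Int): Long =
    reader.lineSequence()
        .map { it.split(',') }
        .mapNotNull { it.getOrNull(column)?.trim()?.toLongOrNull() }
        .sum()

fun main() {
    val csv = "id,amount\n1,10\n2,oops\n3,32\n"
    BufferedReader(StringReader(csv)).use { reader ->
        println(readHeader(reader))             // header row only
        println(sumColumn(reader, column = 1))  // remaining rows, one at a time
    }
}
```

Whether a particular library method that returns a Sequence buffers the whole file is a separate question that has to be checked against its implementation; the Sequence type itself is lazy.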
I am using readNext() because I want to skip rows that fail (due to escapeChar or delimiter issues, or anything else) and continue reading the rest of the lines. I do so by wrapping the read of each line in a try/catch. With readAllSequence() I get an error for the whole file and can't just skip the failed rows and read the others. If this is possible some other way than with readNext(), please let me know. |
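To make the pattern concrete, here is a rough sketch of "catch per row and keep going". Everything in it is hypothetical and stands in for the real reader: parseRow is a toy single-line parser, MalformedRowException and readSkippingBadRows are made-up names, and it assumes each record sits on one physical line (so fields containing newlines are out of scope).

```kotlin
// Hypothetical exception type for illustration; not part of any library.
class MalformedRowException(message: String) : Exception(message)

// Toy parser for one single-line record. Throws on a stray or unterminated quote.
fun parseRow(line: String, delimiter: Char = ',', quote: Char = '"'): List<String> {
    val fields = mutableListOf<String>()
    val field = StringBuilder()
    var inQuotes = false
    var i = 0
    while (i < line.length) {
        val c = line[i]
        when {
            inQuotes && c == quote && i + 1 < line.length && line[i + 1] == quote -> {
                field.append(quote) // doubled quote inside a quoted field
                i++
            }
            inQuotes && c == quote -> inQuotes = false
            inQuotes -> field.append(c)
            c == quote && field.isEmpty() -> inQuotes = true
            c == quote -> throw MalformedRowException("Unexpected quote at index $i")
            c == delimiter -> {
                fields.add(field.toString())
                field.setLength(0)
            }
            else -> field.append(c)
        }
        i++
    }
    if (inQuotes) throw MalformedRowException("Unterminated quoted field")
    fields.add(field.toString())
    return fields
}

data class ReadResult(
    val rows: List<List<String>>,
    val skipped: List<Pair<Int, String>> // 1-based line number and reason
)

// Parses each line independently so one bad row does not abort the whole read.
fun readSkippingBadRows(lines: Sequence<String>): ReadResult {
    val rows = mutableListOf<List<String>>()
    val skipped = mutableListOf<Pair<Int, String>>()
    lines.forEachIndexed { index, line ->
        try {
            rows.add(parseRow(line))
        } catch (e: MalformedRowException) {
            skipped.add(index + 1 to (e.message ?: "unknown error"))
        }
    }
    return ReadResult(rows, skipped)
}

fun main() {
    val input = sequenceOf("a,b,c", "1,\"unterminated,3", "4,5,6")
    val result = readSkippingBadRows(input)
    println(result.rows)
    println(result.skipped)
}
```

The point of the sketch is that error handling sits around each individual row read, which is what a row-at-a-time method allows and a whole-file call that throws on the first error does not.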
Problem
If we use the readNext method, some options (e.g. the excessFieldsRowBehaviour option) are bypassed and have no effect. There does not appear to be a use case for this method, so we are considering making it private.
If you have feedback, please comment.