-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fuzzy matching against csv header #62
base: master
Are you sure you want to change the base?
Conversation
To test run ./fuzzy_load.sh |
catch (IllegalArgumentException e){ | ||
System.err.println("Could not connect to the cluster, check your hosts"); | ||
//e.printStackTrace(); | ||
System.exit(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System.exit should really be avoided, why not throw an IOException like on line 461?
Same thing below
@@ -100,7 +98,7 @@ public Boolean parse(String toparse) throws ParseException { | |||
return new Boolean("TRUE"); | |||
if (boolFalse.equalsIgnoreCase(toparse)) | |||
return new Boolean("FALSE"); | |||
throw new ParseException("Boolean was not TRUE (" + boolTrue + ") or FALSE (" + boolFalse + ")", 0); | |||
throw new ParseException("Boolean was not TRUE (" + boolTrue + ") or FALSE (" + boolFalse + ")" + " it was "+ toparse, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would read better with some punctuation, ex:
+ ". It was "+ toparse
List<String> columnNames = cdp.getColumnNames(); | ||
try { | ||
String header = reader.readLine(); | ||
//this didn' work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove or revisit?
} | ||
resultColumns[i]=currentColumn; | ||
i++; | ||
columnNames.remove(columnNames.contains(currentColumn)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
columnNames.contains(...) returns a boolean; passing any boolean to columnNames.remove(...) will have no effect
Perhaps this was meant to remove currentColumn from columnNames if it exists in the list. In this case it is fine to just call columnNames.remove(currentColumn)
if (delimiter == null){ | ||
delimiter = ","; | ||
} | ||
List<String> headerList = Arrays.asList(header.split(delimiter)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When performing a basic split, it won't account for the various CSV edge cases. Would be more robust to use a CSV parser to do the split (like Univocity, used elsewhere in this project)
No description provided.