
All 3 examples in the tutorial do not work, for different reasons #8

Open
brian-arnold opened this issue Nov 7, 2023 · 0 comments

@brian-arnold

Hello! As a first test of whether your software works, I went through each example in your tutorial, and each failed for a different reason; one of these is the same error previously reported in another issue. These failures could be due to errors unique to the data processing in the tutorial examples or to issues with the software itself, but I didn't probe further. Have you run these examples on your end? I copied and pasted your commands from the tutorial and double-checked that everything was right, but I suppose I could have missed something.

Errors:

In example 1, during the train/test split, I get the same error that was previously posted:

File "../../scripts/parsers/fasta2explainn.py", line 147, in _to_ExplaiNN
df2 = pd.DataFrame(data, columns=list(range(len(data[0]))))
IndexError: list index out of range
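
For what it's worth, this error means `data` is empty by the time `fasta2explainn.py` builds the DataFrame, so `data[0]` has nothing to index; it looks like no sequences survive the upstream parsing/splitting. Here is a minimal sketch of that failure mode (the function name is mine for illustration, not the script's actual code, and it just wraps the DataFrame call from the traceback with a guard):

```python
# Minimal sketch, not the ExplaiNN code itself: if the parsed record list is
# empty, indexing data[0] raises the IndexError shown above. A guard like this
# would surface the real problem (no sequences parsed) instead.
import pandas as pd

def records_to_dataframe(data):
    """data is assumed to be a list of equal-length rows parsed from the FASTA."""
    if not data:
        raise ValueError("No sequences were parsed; check the input FASTA and "
                         "any upstream filtering/splitting steps.")
    return pd.DataFrame(data, columns=list(range(len(data[0]))))

# An empty parse now fails with a clear message instead of an IndexError.
try:
    records_to_dataframe([])
except ValueError as e:
    print(e)
```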

In example 2, during the step to subsample 100k sequences:

File "/Users/bjarnold/Princeton_EEB/Kocher/test/ExplaiNN/scripts/utils/subsample-seqs-by-gc.py", line 95, in _subsample_seqs_by_GC
norm_factor = subsample / sum([len(v) for v in gc_regroups.values()])
ZeroDivisionError: division by zero
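
This one looks like the same kind of problem: `gc_regroups` ends up empty, so the sum of group sizes is 0 and the normalization divides by zero. A small sketch of what I suspect is happening (my own helper name, assuming `gc_regroups` maps a GC bin to a list of sequences):

```python
# Minimal sketch, not the ExplaiNN script itself: if no sequences fall into any
# GC bin, the total group size is zero and the normalization factor blows up.
def gc_norm_factor(subsample, gc_regroups):
    total = sum(len(v) for v in gc_regroups.values())
    if total == 0:
        raise ValueError("No sequences were grouped by GC content; check that "
                         "the input from the previous step is non-empty.")
    return subsample / total

# With an empty grouping, this now points at the likely upstream cause.
try:
    gc_norm_factor(100_000, {})
except ValueError as e:
    print(e)
```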

In example 3, during model training:

File "/Users/bjarnold/miniforge3/envs/explainn/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 234, in read
chunks = self._reader.read_low_memory(nrows)
File "parsers.pyx", line 843, in pandas._libs.parsers.TextReader.read_low_memory
File "parsers.pyx", line 904, in pandas._libs.parsers.TextReader._read_rows
File "parsers.pyx", line 879, in pandas._libs.parsers.TextReader._tokenize_rows
File "parsers.pyx", line 890, in pandas._libs.parsers.TextReader._check_tokenize_status
File "parsers.pyx", line 2058, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 4
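
This last one suggests the training-data file passed to pandas is ragged: the C parser infers the number of columns from the first rows, so a line with an extra field raises exactly this error. A small self-contained reproduction (the data here is made up, not your file):

```python
# Minimal sketch reproducing the failure mode: line 3 has 4 tab-separated
# fields where the first rows have 3, which triggers the ParserError above.
import io
import pandas as pd

ragged_tsv = "seq1\tACGT\t1\nseq2\tTTGA\t0\nseq3\tGGCC\t1\textra\n"

try:
    pd.read_csv(io.StringIO(ragged_tsv), sep="\t", header=None)
except pd.errors.ParserError as e:
    print(e)  # Error tokenizing data. C error: Expected 3 fields in line 3, saw 4

# Counting fields per line helps locate the offending row in the real input:
for i, line in enumerate(ragged_tsv.splitlines(), start=1):
    print(i, len(line.split("\t")))
```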

On top of these errors, there are several discrepancies between the tutorial PDF you uploaded and the slides you make available on Google Docs, including slides that are completely missing (e.g. for example 3) and typos in commands (where the input and output files for a script are the same).
