[BUG] Not all granular features are getting generated #78

neomatrix369 · 2023-03-13T12:11:08Z

Describe the bug

After running the notebook(s) on Kaggle or on a local machine, we can see that not all granular features are being generated. For example, the fields 'repeated_letters_count', 'repeated_digits_count', 'repeated_spaces_count', 'repeated_whitespaces_count', 'repeated_punctuations_count', 'english_characters_count' and 'non_english_characters_count', among others, are not part of the resulting dataframe. Either they are not being detected or something else is amiss.
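A quick way to confirm which of the fields listed above are absent is to compare the dataframe's column names against the expected ones. The snippet below is only a minimal sketch: `find_missing_columns` and `example_columns` are hypothetical names introduced for illustration and are not part of NLP Profiler, and the expected list contains only the fields named in this report, not the full set of granular features.

```python
# Column names quoted in this report as missing from the profiled dataframe.
EXPECTED_GRANULAR_COLUMNS = [
    "repeated_letters_count",
    "repeated_digits_count",
    "repeated_spaces_count",
    "repeated_whitespaces_count",
    "repeated_punctuations_count",
    "english_characters_count",
    "non_english_characters_count",
]


def find_missing_columns(actual_columns, expected_columns=EXPECTED_GRANULAR_COLUMNS):
    """Return the expected column names that are absent, in their original order.

    Hypothetical helper for illustration; not part of NLP Profiler.
    """
    present = set(actual_columns)
    return [name for name in expected_columns if name not in present]


# Made-up example: a dataframe that only received some of the features.
example_columns = ["text", "repeated_letters_count", "english_characters_count"]
missing = find_missing_columns(example_columns)
print(missing)
```

With a real profiled dataframe, passing its `columns` attribute in place of `example_columns` should work the same way, since the helper only needs an iterable of names.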

To Reproduce

Run the notebook on Kaggle (https://www.kaggle.com/code/neomatrix369/nlp-profiler-simple-dataset). It fails at the cell that looks for repeated characters, etc.

Version information:

NLP Profiler version 0.0.3. The issue does not depend on the environment or any other technical parameter.
The version on the master branch also behaves in the same manner.

Additional context

According to the logs at https://www.kaggle.com/code/neomatrix369/nlp-profiler-simple-dataset#Installation-and-import-libraries/packages, the 0.0.3 version on PyPI worked in the past but has not been working for some time.

neomatrix369 added the bug label Mar 13, 2023

neomatrix369 self-assigned this Mar 13, 2023

neomatrix369 mentioned this issue Mar 13, 2023

Refactor: reformatting python code across all the source files #73

Merged

7 tasks

neomatrix369 added the granular feature(s) and 1. high-priority labels Mar 13, 2023
