Error in specific sentence #79

Open
Yongwoo-Eg-Kim opened this issue Nov 9, 2024 · 0 comments

Yongwoo-Eg-Kim commented Nov 9, 2024

Error in specific sentence

The call below raises a ValueError. After removing "50lbs" from the text, it works fine.

skill_extractor.annotate("""Manage multiple tasks associated with each project required for Workforce Planning Strong organizational skills and to communicate information gathered Analytical experience with large amounts of data and understanding of long term care terminology Creative problem solving based on information from PBJ and Scheduling Insights Develop programs and processes to improve overall PBJ and Scheduling insights Qualifications Senior Level Requires years experience and associates degree in related field Bachelors degree preferred Physical Requirements Sitting standing bending reaching stretching stooping walking and moving intermittently during working hours Must be to lift at least 50lbs Must be to maintain verbal and written communication with co workers supervisors residents family members visitors vendors and all business associates outside of the health campus""")


ValueError Traceback (most recent call last)
Cell In[4], line 1
----> 1 skill_extractor.annotate("""Manage multiple tasks associated with each project required for Workforce Planning Strong organizational skills and to communicate information gathered Analytical experience with large amounts of data and understanding of long term care terminology Creative problem solving based on information from PBJ and Scheduling Insights Develop programs and processes to improve overall PBJ and Scheduling insights Qualifications Senior Level Requires years experience and associates degree in related field Bachelors degree preferred Physical Requirements Sitting standing bending reaching stretching stooping walking and moving intermittently during working hours Must be to lift at least 50lbs Must be to maintain verbal and written communication with co workers supervisors residents family members visitors vendors and all business associates outside of the health campus """)

File ~.conda\envs\tft\lib\site-packages\skillNer\skill_extractor_class.py:137, in SkillExtractor.annotate(self, text, tresh)
135 # process pseudo submatchers output conflicts
136 to_process = skills_on_token + skills_low_form + skills_uni_full
--> 137 process_n_gram = self.utils.process_n_gram(to_process, text_obj)
139 return {
140 'text': text_obj.transformed_text,
141 'results': {
(...)
145 }
146 }

File ~.conda\envs\tft\lib\site-packages\skillNer\utils.py:212, in Utils.process_n_gram(self, matches, text_obj)
209 lens = []
210 for sk_id in skill_ids:
211 # score skill given span
--> 212 scored_sk_obj = self.retain(
213 text_obj, span, sk_id, look_up, corpus)
214 span_scored_skills.append(scored_sk_obj)
215 types.append(scored_sk_obj['type'])

File ~.conda\envs\tft\lib\site-packages\skillNer\utils.py:137, in Utils.retain(self, text_obj, span, skill_id, sk_look, corpus)
133 # end
135 if type_ == 'oneToken':
136 # if skill is n_gram (n>2)
--> 137 score = self.compute_w_ratio(
138 real_id, [text_obj[ind].lemmed for ind in s_gr_n])
140 if type_ == 'fullUni':
141 score = 1

File ~.conda\envs\tft\lib\site-packages\skillNer\utils.py:113, in Utils.compute_w_ratio(self, skill_id, matched_tokens)
111 # favorize the matched tokens uphead
112 late_match_penalty_coef = 0.1
--> 113 token_ids = sum([(1-late_match_penalty_coef*skill_name.index(token))
114 for token in matched_tokens])
116 return token_ids/skill_len

File ~.conda\envs\tft\lib\site-packages\skillNer\utils.py:113, in <listcomp>(.0)
111 # favorize the matched tokens uphead
112 late_match_penalty_coef = 0.1
--> 113 token_ids = sum([(1-late_match_penalty_coef*skill_name.index(token))
114 for token in matched_tokens])
116 return token_ids/skill_len

ValueError: 'maintain' is not in list
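
The last frame of the traceback shows the failure coming from `skill_name.index(token)`: Python's `list.index` raises `ValueError` when the element is absent, so a matched token (`'maintain'` here) that does not appear verbatim in the skill's token list aborts the whole `annotate` call. The sketch below is a standalone illustration of that scoring step with a guard added. It is not the library's code or an official fix. The function name `compute_w_ratio_safe`, the `skill_tokens` argument, and the choice of `len(skill_tokens)` as the denominator are assumptions made for the example. Which skill entry triggered the mismatch, and why "50lbs" affects it, cannot be determined from the traceback alone.

```python
def compute_w_ratio_safe(skill_tokens, matched_tokens, late_match_penalty_coef=0.1):
    """Score matched tokens against a skill's tokens, favouring early positions.

    Hypothetical, guarded variant of the scoring step shown in the traceback.
    Tokens that are not present in skill_tokens are skipped instead of
    raising ValueError from list.index.
    """
    if not skill_tokens:
        # Assumption: an empty skill name scores zero rather than dividing by zero.
        return 0.0

    total = 0.0
    for token in matched_tokens:
        if token not in skill_tokens:
            # list.index(token) would raise "ValueError: ... is not in list" here.
            continue
        total += 1 - late_match_penalty_coef * skill_tokens.index(token)

    # Assumption: the score is normalised by the number of tokens in the skill.
    return total / len(skill_tokens)
```

With this guard, a token such as `'maintain'` that is missing from a skill's token list contributes nothing to the score instead of raising. Whether skipping is the right behaviour, as opposed to fixing the upstream token or lemma mismatch, is a design decision for the maintainers.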
