Commit

Yiddish: Document tweak to avoid empty stem
ojwb committed Jan 22, 2025
1 parent d79c3fc commit 73d8d5b
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion algorithms/yiddish/stemmer.tt
@@ -112,7 +112,7 @@
</p>

<ul>
-<li>If the word begins with גע (except for געלט and געבן) it is replaced with "GE" and the cursor is advanced.</li>
+<li>If the word begins with גע (unless it begins געלט or געבן or is exactly גע) it is replaced with "GE" and the cursor is advanced.</li>
<li>
Next, if the word begins with any verbal prefix, the cursor is advanced past this prefix.
Prefixes include (niked added for clarity, not included in algorithm):
@@ -195,6 +195,14 @@

Finally, all remaining GE and TSU are deleted.

+<h2>History of functional changes to the algorithm</h2>
+
+<ul>
+<li>Snowball 2.3.0: Added an exception to the check to replace prefix גע with
+GE to not replace when גע is the entire input word. This avoids generating an
+empty stem in this case.
+</ul>
+
<h2>The same algorithm in Snowball</h2>

[% highlight_file('yiddish') %]
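
The revised prefix check described in the first hunk can be sketched in Python as follows. This is only an illustration of the documented rule, not the Snowball implementation: the function name `mark_ge_prefix` and the constants are hypothetical, and the real stemmer performs further steps (verbal prefixes, suffix removal, final deletion of GE and TSU) that are not modelled here.

```python
# Hypothetical helper illustrating the documented rule; not part of Snowball.
GE_PREFIX = "גע"
# Words beginning with these sequences keep their leading גע.
GE_EXCEPTIONS = ("געלט", "געבן")


def mark_ge_prefix(word: str) -> str:
    """Replace a leading גע with the marker "GE", subject to the exceptions."""
    if not word.startswith(GE_PREFIX):
        return word  # rule does not apply
    if word == GE_PREFIX:
        return word  # new exception: the prefix is the entire word
    if word.startswith(GE_EXCEPTIONS):
        return word  # existing exceptions: געלט and געבן
    # Swap the two-letter prefix for the ASCII marker.
    return "GE" + word[len(GE_PREFIX):]
```

Under the earlier wording, an input consisting solely of גע would have been rewritten to GE, and the final step that deletes all remaining GE markers would then have left an empty stem. With the added exception the word passes through this step unchanged.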