Describe the bug
The segmenter removes the space between English words in a code-mixed sentence. For example, take this sentence:
這是Career Centre
To reproduce
Here is the code:
import pycantonese
from pycantonese.word_segmentation import Segmenter
segmenter = Segmenter()
pyseg = pycantonese.segment("這是Career Centre", cls=segmenter)
for word in pyseg:
    print(word)
The output is:
這是
CareerCentre
Expected behavior
The expected output is:
這是
Career Centre
or
這是
Career
Centre
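Until this is fixed, one possible workaround is to pull the English runs out before segmentation and pass only the remaining text to the segmenter. The sketch below is an assumption-laden illustration, not part of PyCantonese: `segment_code_mixed`, `segment_cjk`, and `split_latin` are hypothetical names, and the regex only treats ASCII letters and digits separated by single spaces as English.

```python
import re

# Runs of ASCII letters/digits joined by single spaces, e.g. "Career Centre".
# Assumption: this is a good-enough definition of "English run" for the example.
_LATIN_RUN = re.compile(r"[A-Za-z0-9]+(?: [A-Za-z0-9]+)*")


def segment_code_mixed(text, segment_cjk, split_latin=True):
    """Segment text, passing only the non-Latin chunks to segment_cjk.

    segment_cjk is any callable that takes a string and returns a list of
    words (hypothetical parameter; pycantonese.segment has that shape).
    """
    words = []
    pos = 0
    for match in _LATIN_RUN.finditer(text):
        # Text before the English run goes through the real segmenter.
        chunk = text[pos:match.start()].strip()
        if chunk:
            words.extend(segment_cjk(chunk))
        # The English run bypasses the segmenter so its spaces are preserved.
        latin = match.group()
        words.extend(latin.split(" ") if split_latin else [latin])
        pos = match.end()
    # Any trailing non-Latin text after the last English run.
    tail = text[pos:].strip()
    if tail:
        words.extend(segment_cjk(tail))
    return words
```

With PyCantonese installed, this could be called as `segment_code_mixed("這是Career Centre", pycantonese.segment)`. Setting `split_latin=False` keeps `Career Centre` as a single token, which corresponds to the first expected output above.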
System (please complete the following information):
Operating System: macOS Sonoma 14.0 (23A344)
PyCantonese version: 3.4.0