When dealing with a long statement of facts quoted from legal text, the text is not split into sentences when it sits between left and right double quotation marks (“ and ”). These are different from the straight " character. I cannot share the text here as it deals with sensitive content.
import pysbd
seg = pysbd.Segmenter(language='en')
sentences = seg.segment(above_text)
This returns a list of length 1 and does not split the text into sentences. The expected behavior is to split the text inside the quotation marks into sentences.
I can confirm the... bug? I'm not sure whether it is a bug or intentional, but:
import pysbd
text = 'He said "hello. And then world."'
seg = pysbd.segmenter.Segmenter(language='en', clean=True)
print(seg.segment(text))
['He said "hello. And then world."']
I was expecting
[
'He said',
'"hello.',
'And then world."'
]
It almost does the right thing when using single quotes. Almost. The sentence is split correctly, but it treats the terminating single quote as a sentence of its own:
import pysbd
text = "He said 'hello. And then world.'"
seg = pysbd.segmenter.Segmenter(language='en', clean=True)
print(seg.segment(text))
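Until this is settled, one possible workaround is to segment the quoted spans separately and re-attach the quote marks afterwards. The sketch below is only an illustration of that idea, not pysbd's behavior or an endorsed fix: `segment_with_quotes` and `naive_split` are hypothetical helpers, and `naive_split` is a deliberately simple stand-in so the example runs without pysbd. It assumes quotations are not nested and that every opening quote has a matching closing one.

```python
import re
from typing import Callable, List

# Matches a span enclosed in straight ("...") or curly (\u201c...\u201d) double quotes.
QUOTED = re.compile(r'"[^"]*"|\u201c[^\u201d]*\u201d')


def naive_split(text: str) -> List[str]:
    """Stand-in splitter (assumption): break after . ! or ? followed by whitespace."""
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]


def segment_with_quotes(
    text: str, segment: Callable[[str], List[str]] = naive_split
) -> List[str]:
    """Segment text, also splitting sentences that sit inside double quotes.

    Hypothetical helper, not part of pysbd.
    """
    result: List[str] = []
    pos = 0
    for match in QUOTED.finditer(text):
        # Text before the quotation goes through the segmenter as-is.
        before = text[pos:match.start()].strip()
        if before:
            result.extend(s.strip() for s in segment(before) if s.strip())
        quoted = match.group(0)
        open_q, inner, close_q = quoted[0], quoted[1:-1], quoted[-1]
        # Segment only the text between the quote marks.
        parts = [s.strip() for s in segment(inner) if s.strip()]
        if parts:
            # Re-attach the quote marks to the first and last inner sentence.
            parts[0] = open_q + parts[0]
            parts[-1] = parts[-1] + close_q
            result.extend(parts)
        else:
            # Empty quotation: keep it untouched.
            result.append(quoted)
        pos = match.end()
    # Whatever follows the last quotation.
    tail = text[pos:].strip()
    if tail:
        result.extend(s.strip() for s in segment(tail) if s.strip())
    return result


print(segment_with_quotes('He said "hello. And then world."'))
# ['He said', '"hello.', 'And then world."']
```

With pysbd installed, the `seg.segment` method from the snippets above could be passed as the `segment` argument in place of `naive_split`; I have not verified how that combination behaves on real legal text.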