Multi-level headings #148
Comments
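@jefferyvvv Because of limited manpower at the moment, first-level headings have not been implemented. The implementation approach is as follows: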
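It would be better if multi-level headings were supported.
Is there a plan to support multi-level headings later on?
@drunkpig The heading hierarchy is very important. Could someone be assigned to prioritize this?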
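For text-based PDFs, reading the font size directly and comparing the sizes of the different headings against the body text seems more accurate.
It's not that simple. Font sizes in a document are completely unconstrained: a single document can use many different sizes, and the body text can easily be the same size as the headings or even larger.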
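To make the font-size idea concrete, here is a minimal sketch of it, not the project's implementation. It assumes the spans have already been extracted as `(text, font_size)` pairs; the function name `assign_heading_levels` and the `max_levels` parameter are hypothetical. It treats the size that covers the most characters as body text and ranks every strictly larger size as a heading level, so it fails in exactly the cases raised above, where headings are the same size as the body or smaller.

```python
from collections import Counter


def assign_heading_levels(spans, max_levels=3):
    """Guess heading levels from font sizes alone (hypothetical helper).

    spans: list of (text, font_size) pairs, assumed to be extracted already.
    Returns a list of (text, level); level 0 means body text.
    """
    if not spans:
        return []

    # Weight each font size by how many characters use it, so the
    # dominant size is taken to be the body text.
    weight = Counter()
    for text, size in spans:
        weight[round(size, 1)] += len(text)
    body_size = weight.most_common(1)[0][0]

    # Sizes strictly larger than the body become heading levels,
    # largest first; extra sizes are clamped to max_levels.
    larger = sorted((s for s in weight if s > body_size), reverse=True)
    level_of = {s: min(i + 1, max_levels) for i, s in enumerate(larger)}

    # Anything not larger than the body size is left as body text.
    return [(text, level_of.get(round(size, 1), 0)) for text, size in spans]
```

Calling it on a list such as `[("Title", 24.0), ("Some long body paragraph", 11.0)]` gives one `(text, level)` pair per input span. Signals like weight, color, or spacing are ignored here on purpose, which is part of why a size-only rule does not generalize.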
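In the development work I'm doing locally, I can only extract heading levels with rules written for a specific document format; I haven't found an extraction approach that works universally.
@shibainu-gbq Headings come in too many forms. Paragraph spacing, font, color, weight, and background can all determine whether something is a heading, so a universal method is hard to come by.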
Following this.
A heading-level feature is now available for preview testing in the online demos on huggingface and modelscope. Feel free to try it yourself.