When _get_all_page parses each section's content and a section spans pages, the for-loop range should be (star… #253

Open
wants to merge 1 commit into base: main


@thomasdongcn commented Jul 16, 2023

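1. When _get_all_page parses the content of each section and a section spans several pages, the for loop should iterate over range(start_page, end_page+1). In Python's range(start, stop[, step]), counting ends at stop but does not include it. If a section's content is on the first and second pages, range(0, 1) yields only index 0, so only the first page's content is included.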

2. In addition, the loop body should use the current page index page_i rather than start_page.

Debug arguments: "args": [
"--query",
"chatgpt robot",
"--page_num",
"2",
"--max_results",
"3",
"--days",
"40",
"--save_image",
"true"
]

For the first paper, "RM-PRT: Realistic Robotic Manipulation Simulator and Benchmark with Progressive Reasoning Tasks", the Introduction content spans pages [0, 2] when the Introduction is parsed.

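3. With the same test arguments as in item 2, the first paper, "RM-PRT: Realistic Robotic Manipulation Simulator and Benchmark with Progressive Reasoning Tasks", shows that matching section content by section keywords is quite unreliable: some of the matched sections are not sections that actually exist in the paper.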
image
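
The off-by-one and the fixed-index problem from items 1 and 2 can be reproduced in isolation. The following is only a minimal sketch, not the project's actual code: the function names and the text_list argument (one string per page) are hypothetical, and it assumes end_page is meant to be inclusive, as the description above suggests.

```python
def join_section_pages(text_list, start_page, end_page):
    """Concatenate the text of pages start_page..end_page, both inclusive.

    text_list is a hypothetical list holding one string per PDF page.
    """
    parts = []
    # range() excludes its stop value, so add 1 to reach end_page.
    for page_i in range(start_page, end_page + 1):
        # Index with the loop variable so every page is read exactly once.
        parts.append(text_list[page_i])
    return "".join(parts)


def join_section_pages_buggy(text_list, start_page, end_page):
    """The pattern the PR describes: exclusive stop and a fixed index."""
    parts = []
    for page_i in range(start_page, end_page):
        # Always reads start_page, regardless of page_i.
        parts.append(text_list[start_page])
    return "".join(parts)


pages = ["page0 ", "page1 ", "page2 "]
fixed = join_section_pages(pages, 0, 2)
buggy = join_section_pages_buggy(pages, 0, 2)
print(repr(fixed))  # 'page0 page1 page2 '
print(repr(buggy))  # 'page0 page0 '
```

With a section spanning pages [0, 2], the buggy variant drops the last page and repeats the first one, while the corrected loop returns all three pages in order.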

…t_page, end_page+1). Inside the loop, use the current page index page_i rather than start_page