
Tokens #13

Open
Hodge931 opened this issue Jul 15, 2024 · 1 comment
Labels
question Further information is requested

Comments

@Hodge931

In the paper, the average token consumption is reported as 42,376 per example. Table 2 says that 246 lines of code remain after localizing the edit locations and are therefore processed during the repair phase. If each line averages 5 tokens and 42 patches are generated per bug, then 246 × 5 × 42 = 51,660 tokens would be included in the prompts during repair alone, and the total would be even higher once the generated tokens and the token consumption during localization are counted. Am I missing something? Thanks a lot!
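A quick sanity check of the arithmetic above (a minimal sketch; the 246 lines and 42 patches come from the question, and 5 tokens per line is the question's own assumption):

```python
lines = 246          # lines of code left after localization (Table 2)
tokens_per_line = 5  # assumed average tokens per line
patches = 42         # patches generated per bug

# If the prompt were billed once per patch, the prompt-token total would be:
naive_prompt_total = lines * tokens_per_line * patches
print(naive_prompt_total)  # 51660
```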

@brutalsavage
Contributor

brutalsavage commented Jul 15, 2024

Hi, when we sample multiple patches per bug, we do not pay for the input tokens (i.e., prompt tokens) multiple times; we only pay for generating each patch (i.e., completion tokens). You can verify this with the OpenAI API by requesting multiple samples: the cost and token count grow linearly with the number of completion tokens, not with the number of prompt tokens, which are consumed only once for the same set of samples.
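The accounting described above can be sketched as follows (a hedged illustration, not the paper's actual numbers: the 246 lines, 5 tokens/line, and 42 patches come from the question, while the average completion length per patch is an invented placeholder):

```python
prompt_tokens = 246 * 5        # prompt is billed once per request, not per sample
samples = 42                   # number of patches sampled in one request (n parameter)
avg_completion_tokens = 100    # assumed average tokens generated per patch (placeholder)

# With n samples in a single API call, prompt tokens are counted once,
# while completion tokens scale with the number of samples:
total_tokens = prompt_tokens + samples * avg_completion_tokens
print(total_tokens)  # 5430, far below the 51,660 from per-sample prompt accounting
```

This is why the per-example total in the paper can be much lower than multiplying the prompt size by the number of patches.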

@brutalsavage brutalsavage added the question Further information is requested label Jul 15, 2024