
Tokens #13

Open
Hodge931 opened this issue Jul 15, 2024 · 1 comment
Labels
question Further information is requested

Comments

@Hodge931

In the paper, the average token consumption is reported as 42,376 per example. Table 2 says that 246 lines of code remain after localizing the edit locations and are therefore processed during the repair phase. If each line averages 5 tokens and 42 patches are generated per bug, then 246 × 5 × 42 = 51,660 tokens would be included in the prompts during repair alone, and the total would be even higher once the generated tokens and the token consumption during localization are counted. Am I missing something? Thanks a lot!
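A quick sanity check of the arithmetic above (a minimal sketch; the 246 lines and 42 patches come from the question, and 5 tokens per line is the question's own assumption):

```python
lines = 246          # lines of code left after localization (Table 2)
tokens_per_line = 5  # assumed average tokens per line
patches = 42         # patches generated per bug

# If the prompt were billed once per patch, the prompt-token total would be:
naive_prompt_total = lines * tokens_per_line * patches
print(naive_prompt_total)  # 51660
```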

@brutalsavage
Contributor

brutalsavage commented Jul 15, 2024

Hi, when we sample multiple patches per bug, we do not pay for the input tokens (i.e., prompt tokens) multiple times; we only pay for generating each patch (i.e., completion tokens). You can verify this with the OpenAI API by requesting multiple samples: the cost and token count grow linearly with the number of completion tokens, not with the number of prompt tokens, which are consumed only once for the same set of samples.
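The accounting described above can be sketched as follows (a hedged illustration, not the paper's actual numbers: the 246 lines, 5 tokens/line, and 42 patches come from the question, while the average completion length per patch is an invented placeholder):

```python
prompt_tokens = 246 * 5        # prompt is billed once per request, not per sample
samples = 42                   # number of patches sampled in one request (n parameter)
avg_completion_tokens = 100    # assumed average tokens generated per patch (placeholder)

# With n samples in a single API call, prompt tokens are counted once,
# while completion tokens scale with the number of samples:
total_tokens = prompt_tokens + samples * avg_completion_tokens
print(total_tokens)  # 5430, far below the 51,660 from per-sample prompt accounting
```

This is why the per-example total in the paper can be much lower than multiplying the prompt size by the number of patches.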

@brutalsavage brutalsavage added the question Further information is requested label Jul 15, 2024