improve feed converters to normalized form #148
Replies: 2 comments 1 reply
-
I'm unsure whether it can be done concurrently, due to the row dependencies needed to resolve the correct event order. In the past, a different logic was used; it may help in finding a better approach: https://github.com/nkaz001/hftbacktest/blob/599a4e0e3aca91e98c8207be638a216ebd232dbe/hftbacktest/data/validation.py. I also think there are areas we can optimize even within the same logic: the overhead likely comes from parsing the CSV, creating the structured array, and possibly the merge sort. In any case, please feel free to submit a PR that includes a performance comparison and ensures the outcome remains the same.
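As a rough illustration of where that parsing overhead could be cut, here is a minimal sketch (not the project's actual converter; the column layout, field names, and `EVENT_DTYPE` below are assumptions) that parses the whole CSV in one vectorized call and builds the structured array in a single allocation, instead of line by line:

```python
import numpy as np

# Assumed record layout; the real converter's dtype may differ.
EVENT_DTYPE = np.dtype([
    ('ev', 'i8'),        # event type flag
    ('exch_ts', 'i8'),   # exchange timestamp (ns)
    ('local_ts', 'i8'),  # local receipt timestamp (ns)
    ('px', 'f8'),        # price
    ('qty', 'f8'),       # quantity
])

def convert_batch(csv_path):
    # np.loadtxt parses the entire file at once, avoiding per-line
    # Python overhead; assumes a headerless CSV with five numeric columns.
    raw = np.loadtxt(csv_path, delimiter=',', dtype='f8', ndmin=2)
    out = np.empty(len(raw), dtype=EVENT_DTYPE)
    out['ev'] = raw[:, 0].astype('i8')
    out['exch_ts'] = raw[:, 1].astype('i8')
    out['local_ts'] = raw[:, 2].astype('i8')
    out['px'] = raw[:, 3]
    out['qty'] = raw[:, 4]
    # The order-dependent validation/merge step still runs sequentially
    # afterwards; only the parse and array construction are batched here.
    return out
```

With the parse batched, the remaining sequential pass only has to walk an already-typed array, which is typically much cheaper than per-line string handling.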
-
There is the Discord server: https://discord.gg/K9pDsWrPEe
-
Firstly, wonderful work! I'm hoping to contribute to the repository and was wondering whether you have any thoughts on optimizing the data normalization utils. Currently, it seems that the data util `binancefutures.convert` reads and processes one line at a time. Is it possible to optimize this with batch reads, and/or by reading a batch and processing each line concurrently? I understand that you can only know the full state at the next timestamp by knowing the state at the previous timestamp, but could we not do a partial evaluation first and then perform an aggregation step that updates the state of the whole batch, along with outlier handling? (A sketch of this idea follows below.) I ask because the convert function currently takes a long time, and I'd like to invest my time in the right place. Please let me know if there's a possibility of optimization in this context. Thank you!
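A hypothetical sketch of that partial-then-aggregate idea, assuming the per-row state update composes associatively (a cumulative sum stands in here for the real per-level update; `chunked_running_state` is an invented name, not an hftbacktest API):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partial_cumsum(chunk):
    # Per-chunk work can run concurrently: each chunk computes its
    # local running state independently of the other chunks.
    return np.cumsum(chunk)

def chunked_running_state(values, n_chunks=4):
    values = np.asarray(values, dtype='f8')
    chunks = np.array_split(values, n_chunks)
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_cumsum, chunks))
    # Cheap sequential aggregation step: carry each chunk's final
    # state forward as an offset into the next chunk.
    offset = 0.0
    for p in partials:
        p += offset
        offset = p[-1]
    return np.concatenate(partials)
```

Note that this only decomposes cleanly when the update is associative; order-dependent logic such as outlier handling against the previous valid row may still force a sequential pass over the chunk boundaries.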