ACH file stream #1119
Comments
How large are the files we're talking about? With Nacha's limit of 10k lines in each file, that should fit into memory pretty easily. We've heard of people parsing files and emitting an event for each EntryDetail record (with the BatchHeader on the event), which can produce a few thousand events.
I'm open to adding a |
I don't have a precise file size in mind. I am most motivated to minimize the memory requirement for processing and to safeguard against OOM crashes.
Where by chance does this information come from? I have not found a good source for this. The moov library itself states:
At 102_000_010 lines of 94 single-byte UTF-8 chars, the file size would be around 9.6 GB (about 76 Gb, counting in gigabits) if I didn't mess up my math. I don't see any applications with anywhere near that many lines. It does make the memory usage effectively unbounded if that assumption is right, though. The application I have in mind cannot accommodate an arbitrary memory footprint.
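To make the arithmetic easy to check, here is a small sketch. `FileSizeBytes` is a hypothetical helper, and it ignores line terminators, which would add one more byte per line (about 102 MB in total) for `\n` endings.

```go
package main

// FileSizeBytes returns the size in bytes of a file made of fixed-width,
// single-byte records, ignoring line terminators.
func FileSizeBytes(lines, recordLen int64) int64 {
	return lines * recordLen
}

// For the numbers quoted above:
//   FileSizeBytes(102_000_010, 94) = 9_588_000_940 bytes
//   that is about 9.6 GB, or about 76.7 gigabits (bytes * 8).
```

So the 76 figure matches the size in gigabits, while the size in gigabytes is closer to 9.6. Either way it is far more than a reader should buffer in memory.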
I think this is fine too. I would want it to be more general than just entries, though, since I do need the file header and batch headers. As another alternative |
I believe the 10k line limit comes from some ODFIs and vendors we were talking with when designing this library. I'm sure that varies, and I wasn't able to find a limit in our Nacha rules handbook. You bring up a good point: a file of that size is probably too much to consume inside of a reader. We can make the limit configurable and lower the default.
Next Record/Batch/Entry make sense, but I'd like to look at what those entail a bit more before adding them.
We've had an |
Off the cuff, it looks like it would. I want to see the implementation details, but the documentation suggests it does what I hope. The only issue I see is that the docs state: "IAT entries are not currently supported." on the
In practice, we just scaled up the memory available to the application that reads the file, and that worked really well. We see spikes near 500 MB during peak file processing, so that is manageable by scaling alone. We could save that memory by using the iterator.
Is there a workaround for IATs? What is the behavior? Will the iterator skip them or return an error?
ACH Version
1.18
What were you trying to do?
I am trying to adapt the reader implementation to allow a streamed approach that avoids holding a whole file in memory at once. The use case is large-volume applications with a service-based architecture, where reading a whole file into memory at once may not be possible. This would of course make some validations impossible, and that is a trade-off consumers would need to make.
What did you expect to see?