Cursor API loads the entire XML document into RAM #38

Open
sajidanower23 opened this issue Dec 18, 2018 · 2 comments

Comments

@sajidanower23
Contributor

We currently parse XML with the Cursor API of the xml-conduit package.

Cursor parses the entire document in one go and holds all of it in memory, which means that large documents (on the order of gigabytes) might cause a problem. If this leads to operating system errors in a production environment (for example, the process running out of memory), we may have to move to a streaming API, which reads and parses the document incrementally instead of all at once.

It is worth refactoring most of the code in the Parser module so that making that shift is easier.
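
For illustration, here is a minimal sketch of what the streaming side could look like with xml-conduit's `Text.XML.Stream.Parse` module. The `<events>`/`<event>` element names, `parseEvent`, `streamEvents`, and the inline `sample` document are hypothetical and not taken from the actual Parser module; a real version would presumably read from a file with `parseFile` rather than from an in-memory string. It assumes the `xml-conduit`, `conduit`, `resourcet`, `xml-types`, `text`, and `bytestring` packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Control.Monad.Trans.Resource (MonadThrow)
import qualified Data.ByteString.Lazy         as L
import           Data.Conduit                 (ConduitT, runConduit, (.|))
import qualified Data.Conduit.List            as CL
import           Data.Text                    (Text)
import           Data.XML.Types               (Event)
import           Text.XML.Stream.Parse        (content, def, force, manyYield,
                                               parseLBS, tagNoAttr)

-- Hypothetical input; stands in for a much larger document.
sample :: L.ByteString
sample = "<events><event>a</event><event>b</event></events>"

-- Parse a single <event>text</event> element (hypothetical schema).
parseEvent :: MonadThrow m => ConduitT Event o m (Maybe Text)
parseEvent = tagNoAttr "event" content

-- Yield each parsed event downstream as soon as it is complete,
-- instead of collecting all of them into a list first.
streamEvents :: MonadThrow m => ConduitT Event Text m ()
streamEvents =
  force "expected an <events> root element" $
    tagNoAttr "events" (manyYield parseEvent)

main :: IO ()
main = runConduit $ parseLBS def sample .| streamEvents .| CL.mapM_ print
```

If the element-level parsers in the Parser module were written against an interface like `parseEvent`, only the outer driver would need to change when switching from Cursor to streaming.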

Original Author: ano002

(Moved with github-migration-0.1.0.0 (package github-migration-0.1.0.0 revision df9f38b))

@sajidanower23 sajidanower23 self-assigned this Dec 18, 2018
@sajidanower23 sajidanower23 added the enhancement New feature or request label Dec 18, 2018
@axman6
axman6 commented Dec 18, 2018

I can't remember which libraries do this, but it might be possible to use a SAX-style parser which streams the data in. That is perhaps a little more difficult to work with, but probably not by much.
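
As a rough sketch of the SAX-style approach, xml-conduit itself exposes the raw event stream through `parseLBS` (and `parseFile`) in `Text.XML.Stream.Parse`, with the event type coming from `Data.XML.Types`. The example below folds over those events to count opening `<event>` tags without building a document tree. The element name, `countOpening`, and `sample` are assumptions made for illustration only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Lazy  as L
import           Data.Conduit          (runConduit, (.|))
import qualified Data.Conduit.List     as CL
import           Data.XML.Types        (Event (..))
import           Text.XML.Stream.Parse (def, parseLBS)

-- Hypothetical input; stands in for a much larger document.
sample :: L.ByteString
sample = "<events><event>a</event><event>b</event></events>"

-- SAX-style handler: inspect one low-level event at a time and
-- bump the counter on every opening <event> tag.
countOpening :: Int -> Event -> Int
countOpening n (EventBeginElement "event" _) = n + 1
countOpening n _                             = n

main :: IO ()
main = do
  n <- runConduit $ parseLBS def sample .| CL.fold countOpening 0
  print n
```

Working at this level means tracking nesting and state by hand, which is the extra difficulty mentioned above.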

Original Author: mas17k

@sarafalamaki
Contributor

I doubt we'd ever have to deal with data that big. Events are tiny, and our system is event based. If we have to parse millions of them from one XML file for some reason, then it might be an issue. An easier fix would be to just split the file up in that case.

Original Author: fal05c

3 participants