Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change service working_directory to not slurp file into memory #1215

Closed
wants to merge 1 commit into from
Closed

Change service working_directory to not slurp file into memory #1215

wants to merge 1 commit into from

Conversation

billdueber
Copy link
Contributor

Fixes #1128

Method #copy_stream_to_working_directory previously used ruby's
copy_stream, which reads the whole source stream into memory.

New #copy_stream_to_file uses IO#readpartial to ferry it around in
4k chunks.

IO#readpartial is preferred to #read because it doesn't assume the stream
is complete when first referenced. 4k is a common heap page size and
should be fairly efficient.

@samvera/hyrax-code-reviewers

Fixes #1128

Method `#copy_stream_to_working_directory` previously used ruby's
`copy_stream`, which reads the whole source stream into memory.

New `#copy_stream_to_file` uses `IO#readpartial` to ferry it around in
4k chunks.

`IO#readpartial` is preferred to `#read` because it doesn't assume the stream
is complete when first referenced. 4k is a common heap page size and
should be fairly efficient.
@mjgiarlo
Copy link
Member

@atz mind reviewing?

@jcoyne
Copy link
Member

jcoyne commented Jun 14, 2017

My understanding was that the problem was on this line

copy_stream_to_working_directory(id, file.original_name, StringIO.new(file.content))

specifically: StringIO.new(file.content) I wonder if we could stop using Hydra::PCDM::File#content and instead create a Hydra::PCDM::File#content_as_stream

@jcoyne
Copy link
Member

jcoyne commented Jun 14, 2017

@billdueber Could we do something like:

fs.original_file.stream.each { |chunk| f.write(chunk) }

@atz
Copy link
Contributor

atz commented Jun 14, 2017

@jcoyne: cannot assume original_file is what matters.

My first reaction is that I don't care, I plan on removing Hyrax::WorkingDirectory anyway as it becomes unnecessary with carrierwave. But secondly, why is the exception catching inside the loop? Don't you keep looping after the rescue?

@billdueber
Copy link
Contributor Author

billdueber commented Jun 14, 2017

The exception should not be inside the loop. The code doesn't do what I set out to address. I feel like I'm in one of those "wanna get away" SouthWest commercials.

Sooo....closing.

@billdueber billdueber closed this Jun 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants