Commit

Remove Files.exist from the .of(Path) method in SamInputResource (#1128)
* According to @drozen: "This operation is expensive over certain NIO filesystems (e.g., takes on the order of seconds over GCS), so forcing it at reader creation time could potentially add hours to the wall-clock time of a task that needed to access many bams over GCS."
Yossi Farjoun authored and lbergelson committed May 31, 2018
1 parent a4ea695 commit 8b942f3
Showing 1 changed file with 15 additions and 9 deletions.
24 changes: 15 additions & 9 deletions src/main/java/htsjdk/samtools/SamInputResource.java
@@ -38,6 +38,7 @@
import java.net.URL;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;
@@ -93,17 +94,22 @@ public static SamInputResource of(final File file) {
/** Creates a {@link SamInputResource} reading from the provided resource, with no index. */
public static SamInputResource of(final Path path) {

if (Files.isRegularFile(path) && Files.exists(path)) {
return new SamInputResource(new PathInputResource(path));
} else {
// in the case of named pipes and other non-seekable paths there's a bug in the implementation of
// java's GZIPInputStream which inappropriately uses .available() and then gets confused with the result
// of 0. For reference see:
// https://bugs.java.com/view_bug.do?bug_id=7036144
// https://github.com/samtools/htsjdk/pull/1077
// https://github.com/samtools/htsjdk/issues/898
// in the case of named pipes and other non-seekable paths there's a bug in the implementation of
// java's GZIPInputStream which inappropriately uses .available() and then gets confused with the result
// of 0. For reference see:
// https://bugs.java.com/view_bug.do?bug_id=7036144
// https://github.com/samtools/htsjdk/pull/1077
// https://github.com/samtools/htsjdk/issues/898

// This still doesn't support the case where someone is creating a named pipe in a non-default
// file system and then using it as input and passing a GZIPed into the other end of the pipe.

// To work around this bug, we fall back to using a FileInputResource rather than a PathInputResource
// when we encounter a non-regular file using the default NIO filesystem (file://)
if (path.getFileSystem() == FileSystems.getDefault() && !Files.isRegularFile(path)) {
return of(path.toFile());
} else {
return new SamInputResource(new PathInputResource(path));
}
}
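
The routing condition introduced above can be tried in isolation. The following is a minimal sketch, not htsjdk code: the class name `PathRoutingSketch` and the helper `needsFileFallback` are hypothetical, and only the boolean expression is copied from the new `of(Path)`. Because `&&` short-circuits, `Files.isRegularFile` is only evaluated for paths on the default filesystem, so a path on a remote NIO filesystem should never trigger the file-attribute lookup here.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class for illustration only; it is not part of htsjdk.
final class PathRoutingSketch {

    /**
     * Same condition as the new of(Path): true when the path lives on the
     * default filesystem and is not a regular file, which is the case that
     * of(Path) hands off to of(path.toFile()).
     */
    static boolean needsFileFallback(final Path path) {
        // The identity check comes first, so Files.isRegularFile is skipped
        // entirely for paths on any non-default filesystem.
        return path.getFileSystem() == FileSystems.getDefault() && !Files.isRegularFile(path);
    }

    public static void main(final String[] args) throws IOException {
        final Path regular = Files.createTempFile("sketch", ".bam");
        final Path directory = Files.createTempDirectory("sketch");
        final Path missing = directory.resolve("does-not-exist.bam");
        try {
            System.out.println("regular file: " + needsFileFallback(regular));   // false
            System.out.println("directory:    " + needsFileFallback(directory)); // true
            System.out.println("missing path: " + needsFileFallback(missing));   // true
        } finally {
            Files.deleteIfExists(regular);
            Files.deleteIfExists(directory);
        }
    }
}
```

Note that a path that does not exist on the default filesystem is also "not a regular file", so under this condition it takes the `File` route as well. Any existence error would then surface when the resource is opened rather than when it is created.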
