Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement io.IOBase.seekable method #50

Merged
merged 1 commit into from
Oct 3, 2024
Merged

Conversation

mxmlnkn
Copy link
Contributor

@mxmlnkn mxmlnkn commented Oct 2, 2024

SSHFile inherits from io.IOBase, but even though seek can be called on it, the seekable method is not overwritten and therefore returns False via the default implementation in io.IOBase.

sshfs/file.py Outdated
@@ -81,6 +81,9 @@ async def _open_file(self):
def readable(self):
return "r" in self.mode

def seekable(self):
return "r" in self.mode or "w" in self.mode
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check: can it be also '+' (update)?

Also, is it actually enforced? How do we actually open files, write to them? E.g do we always seek to the end in the append mode on each write?

Copy link
Contributor Author

@mxmlnkn mxmlnkn Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check: can it be also '+' (update)?

Also, is it actually enforced? How do we actually open files, write to them? E.g do we always seek to the end in the append mode on each write?

Note that you have this code on line 21:

        # TODO: support r+ / w+ / a+
        if mode not in {"rb", "wb", "ab"}:

Which makes the questions for + currently moot and untestable, but I'd also like to see support for +-mode. I'd also like to see support for open via flags like in asnycssh, but this is not that important. Coming back to your question, yes + mode should also be seekable. I mean, in the end, it only means that it can read and write instead of only doing one of both.

Append mode is more difficult for me to decide on because I have never used it. It seems to be enforced inside asyncssh.sftp.SFTPFile and it seems to actually allow for seeking in general, but it only has an effect for read. I am not sure where it is enforced exactly, but there is this comment in the write method.

Some code for testing assumptions:

import os
import sshfs

path = "foo"

if os.path.exists(path):
    os.remove(path)
with open(path, "wb") as file:
    assert file.seekable()
    file.write(b"123456")
with open(path, "ab") as file:
    assert file.seekable()
    file.write(b"789")
    assert file.tell() == 9
    file.seek(3)
    assert file.tell() == 3
    # file.read(1)  # io.UnsupportedOperation: read
    file.write(b"xyz")
    # Note that tell does NOT return 12 here as one might be inclined to think!
    # Instead of doing implicit seeks before each write, append simply always appends.
    # https://docs.python.org/release/3.7.2/library/functions.html#open
    # > 'a' for appending (which on some Unix systems, means that all writes
    # > append to the end of the file regardless of the current seek position).
    assert file.tell() == 6
with open(path, "ab+") as file:
    assert file.seekable()
    file.seek(6)
    assert file.tell() == 6
    assert file.read() == b"789xyz"
with open(path, "rb") as file:
    assert file.read() == b"123456789xyz"

fs = sshfs.SSHFileSystem("127.0.0.1", port=2022)

if fs.exists(path):
    fs.delete(path)
with fs.open(path, "wb") as file:
    file.write(b"123456")
with fs.open(path, "ab") as file:
    file.write(b"789")
    assert file.tell() == 9
    file.seek(3)
    assert file.tell() == 3
    file.write(b"xyz")
    # This is different from the built-in open file.
    assert file.tell() == 12
with fs.open(path, "ab+") as file:
    file.seek(6)
    assert file.tell() == 6
    assert file.read() == b"789xyz"
with fs.open(path, "rb") as file:
    assert file.read() == b"123456789xyz"

For the test with a+ via sshfs, I simply commented out the mode check mentioned above.

Note that I would have delegated to asyncssh.sftp.SFTPFile.seekable if there were such a method in the first place ...

I think it might be more correct to return true for seekable in the case of a+ and all other modes except for simply a, for which I am conflicted. seek theoretically works in the way that it does not raise an exception and calling tell after it will return the seeked-to offset, but read is not allowed in pure a mode and writing will always be done to the end...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But as a+ is not yet supported, the current commit should be sufficiently correct in my opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added asserts for seekable to the Python file object and it returns true even for a mode. Therefore, I have pushed the change to simply always return True from inside seekable.

@shcheklein shcheklein merged commit 5b29464 into fsspec:main Oct 3, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants