Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to be consistent with GenericFileSystem #46

Open
ryaminal opened this issue Apr 22, 2024 · 0 comments
Open

Changes to be consistent with GenericFileSystem #46

ryaminal opened this issue Apr 22, 2024 · 0 comments

Comments

@ryaminal
Copy link
Contributor

Hi All.

While looking at using the GenericFileSystem and its rsync function, I've noticed a few inconsistencies that sshfs has with handling paths with the protocol(sftp://).

One fix was done in #43 but there appear to be a few more. One proposed solution was to take this upstream to the implementation of GenericFileSystem(see this discussion). But it is becoming apparent that this should be handled in the filesystem implementation. In this case, sshfs.

I was hoping it would be as easy as adding self._strip_protocol(lpath) to the references in _get_file and _put_file` and related methods, but i think it's a bit more nuanced than a first glance.

I've written a test case or two to illustrate the problem:

def test_path_with_protocol(fs: SSHFileSystem, remote_dir):
    # this would have failed before PR 43
    assert fs.isdir("/.") == fs.isdir("sftp:///.")

    # absolute path
    assert fs._strip_protocol("sftp:///.") == fs._strip_protocol("/.")
    # blah is detected as host and removed, even though a relative path may have been intended?
    assert fs._strip_protocol("sftp://blah") == fs._strip_protocol("")
    # another example of a potentially intended relative path but parsed as absolute
    assert fs._strip_protocol("sftp:///.") == fs._strip_protocol("/.")

The largest problem is with relative paths being passed in. The current _strip_protocol doesn't seem to be relative path friendly. I imagine this is mostly a non-issue for folks not using the GenericFileSystem or anything else that passes in protocol-aware paths. But, according to other FS implementations and the recommendation in this PR to fsspec, filesystem implementations should be able to handle a path with or without a protocol.

Curios on everyone's thoughts on doing this and if there are ideas on the best way to implement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant