Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of st_ino #25

Closed
GoogleCodeExporter opened this issue Mar 16, 2015 · 3 comments · Fixed by #107
Closed

Improve handling of st_ino #25

GoogleCodeExporter opened this issue Mar 16, 2015 · 3 comments · Fixed by #107

Comments

@GoogleCodeExporter
Copy link

Currently pyfakefs handles st_ino as something optional, which is probably OK 
for many scenarios.

But it would be really nice, if st_ino would be handled properly. 

I'm working on a sort of file syncing application, which needs to detect all FS 
modifications. Comparing st_ino of two files/dirs (e.g. a file from an old FS 
snapshot and a file from a new FS snapshot) is a typical and reliable way to 
detect that both files/dirs correspond to the same object on the file system. 
If st_ino values are the same, then it is the same object (at least on most 
UNIX file systems). 

Therefore, I'd like to propose the following:
1) When a new file or directory is created, it should get a new unique st_ino 
value

2) When objects (files or dirs) are moved inside the same file system, their 
st_ino fields should be preserved.
    (BTW, fake_filesystem_shutil currently tries to preserve most st_* stats for 
files, but st_ino is not included into this set of preserved fields. And for
directory moves, no st_* stats are preserved at all for some reason)

3) Optional: currently fake_filesystem_shutil.move uses copying to move files 
and directories. This does not scale for big nested directories, because it 
needs to move each nested sub-component separately. May be it could use 
"rename" whenever possible? It could provide significant speed-ups.

Original issue reported on code.google.com by romixlev on 6 Jan 2014 at 12:15

@mrbean-bremen
Copy link
Member

I have been thinking about this in connection with the file system space and the newly introduced hard link support. I propose to introduce support for both st_ino and st_dev to cover this:

  • add st_ino as proposed
  • add a default st_dev (instead of None), set st_dev according to the parent directory on file/directory creation
  • under Windows, instead of st_dev the result of splitdrive() shall be used (maybe st_dev can be used internally); not sure yet about automatic creation of a new file system if adding a file on a new drive
  • add something like AddRoot() to FakeFilesystem to be able to add a new file system root with a new st_dev, with an optional total_space parameter to set the available space per file system
  • adapt the space handling for newly created files accordingly, adapt disk_usage to account for the path parameter
  • change shutil.move() to use os.rename() if on the same file system, otherwise copy2, as proposed in point 3 (will work with os.rename if PR Rename fix #101 is merged)
  • adapt os.rename() to throw if on different file systems (seems to be the default on most Unix systems, have to check under Windows)
  • please add whatever I forgot

@jmcgeheeiv: If this is ok, I may have a go at this somewhere in the next weeks (after the open PRs are merged, to have a consistent start).
If you are going to make a new pypi release, as you proposed, I think it would be a good time before this is added.

@jmcgeheeiv
Copy link
Contributor

How ambitious and impressive.

We will need to make sure that AddRoot() is properly documented with an example.

In the meantime, I'll get on those other PRs and make a release.

@mrbean-bremen
Copy link
Member

Ok, thanks. I won't comment on "ambitious and impressive" :p
I have a question regarding st_ino:
Both CreateFile() and CreateDirectory() have an optional inode argument. What is the use case for these? Will this still be needed, if we automatically give each file/directory a unique inode number? If we allow to set set an arbitrary number, this may clash with an existing number (which generally will just increase for each new object) and I'm not sure how to handle this. I see these possibilities:

  • remove the argument and document it in the release notes
  • make the argument deprecated and issue a warning if using it
  • add a backwards compatibility option to support the argument instead of the automatic assignment
    What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants