Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Faulty detection logic if both file creation and deletion happens between detection calls #15

Open
Hiroko103 opened this issue Sep 29, 2023 · 2 comments

Comments

@Hiroko103
Copy link

Hi!

The logic behind change detection is flawed.
It becomes an issue if both file deletion and creation happens between two 'findChanges' call.
(Theoretically it can be prevented by polling for changes fast enough, but I can't rely on it, since many file operations can happen even in a few milliseconds, and polling can be expensive.)

Problematic source part:

if (count($finderFileHashes) > count($cacheFileHashes)) {
foreach ($finderFileHashes as $file => $hash) {
$this->processFileFromFilesystem($file, $hash);
}
} else {
foreach ($cacheFileHashes as $file => $hash) {
$this->processFileFromCache($file, $hash);
}
}

Based just on the file count, it is not possible to detect all 3 cases (addition, deletion, updates) at once.
The code should compare differences from the perspective of both of them.
- Cache compared to actual files
- Actual files compared to cache

The library wrongly assumes that if:
"Current count of files" > "file count in cache", then only these two could have happened:

  • Any/some file was updated
  • New file has been added

AND IF

"Current count of files" <= "file count in cache", then only these two could have happened:

  • Any/some file was updated
  • Some files were deleted

Examples:

1. Library initialized
2. Files in folder: ['a.txt', 'b.txt', 'c.txt']
3. Delete: 'c.txt'
4. Add: 'd.txt'
5. Detect changes

In this case, the library will report that 'c.txt' has been deleted, but won't report that 'd.txt' have been added.

Another similar example:

1. Library initialized
2. Files in folder: ['a.txt', 'b.txt', 'c.txt']
3. Delete: 'c.txt'
4. Add: 'd.txt'
5. Add: 'e.txt'
6. Detect changes

Now the library will report that 'd.txt' and 'e.txt' have been added, but won't report that 'c.txt' have been deleted.

I'm attaching a test script with the example scenario.

index.zip

@Hiroko103 Hiroko103 changed the title [Bug] Faulty detection logics if both file creation and deletion happens at the same time [Bug] Faulty detection logic if both file creation and deletion happens between detection calls Sep 29, 2023
@ruudk
Copy link

ruudk commented Oct 6, 2023

Noticed the same, this fixes the problem it seems:

private function findChangesAgainstCache() : void
    {
        $this->calculateHashOfFilesFromFinder();

        $finderFileHashes = $this->fileHashesFromFinder;
        $cacheFileHashes = $this->cache->getAll();

        foreach (array_diff_key($finderFileHashes, $cacheFileHashes) as $file => $hash) {
            $this->cache->write($file, $hash);
            $this->newFiles[] = $file;
        }

        foreach (array_diff_key($cacheFileHashes, $finderFileHashes) as $file => $hash) {
            $this->cache->delete($file);
            $this->deletedFiles[] = $file;
        }

        foreach (array_intersect_key($finderFileHashes, $cacheFileHashes) as $file => $hash) {
            if ($hash === $cacheFileHashes[$file]) {
                continue;
            }

            $this->cache->write($file, $hash);
            $this->updatedFiles[] = $file;
        }
    }

Just sharing it here. Don't want to create a PR because I already copied this code into my project. Don't want to have a dependency.

@Hiroko103
Copy link
Author

Thank you for sharing it. I really appreciate it. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants