A single invocation of pkgfile -b takes around 230 ms on my laptop (Ryzen 5 4650U, 40GB DDR4, 512GB NVMe SSD).
I tried searching for different binary names, some located at the start of the data files and some at the end, and the invocation time barely changes. So I'm guessing that most of the time is spent loading the cache files rather than searching them.
I would like to propose a faster storage format, maybe using a hash index, to bring down the search time.
$ time pkgfile -b abiword
extra/abiword
real 0m0.237s
user 0m0.335s
sys 0m0.036s
$ time for i in {1..10}; do pkgfile -b abiword >/dev/null; done
extra/abiword
extra/abiword
extra/abiword
extra/abiword
extra/abiword
extra/abiword
extra/abiword
extra/abiword
extra/abiword
extra/abiword
real 0m2.280s
user 0m3.185s
sys 0m0.294s
I should do some proper profiling here, but what you're asking for only benefits a small fraction of potential queries. Yes, a hash table could benefit queries like 'pacman' or '/usr/bin/pacman', but you're back to a full scan for anything involving globbing or regex. It does, admittedly, also help with things like #27.
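To make that trade-off concrete, here is a minimal Python sketch, not pkgfile's actual code or cache format. The `ENTRIES` list, `build_index`, and `search` are hypothetical names invented for illustration. An exact name or full path is answered from a hash table, while anything containing glob characters still walks every entry.

```python
import fnmatch
from collections import defaultdict

# Hypothetical stand-in for the cache contents: (repo/package, file path) pairs.
ENTRIES = [
    ("core/pacman", "/usr/bin/pacman"),
    ("core/pacman", "/usr/bin/makepkg"),
    ("extra/abiword", "/usr/bin/abiword"),
    ("extra/abiword", "/usr/share/abiword/readme.txt"),
]


def basename(path):
    """Return the last path component."""
    return path.rsplit("/", 1)[-1]


def build_index(entries):
    """Map both the full path and the basename of each file to its packages."""
    index = defaultdict(set)
    for pkg, path in entries:
        index[path].add(pkg)
        index[basename(path)].add(pkg)
    return index


def search(index, entries, query):
    """Use the hash index for exact queries; fall back to a full scan for globs."""
    if not any(ch in query for ch in "*?["):
        # Exact name or exact path: a single hash lookup.
        return sorted(index.get(query, ()))
    # Glob query: the index does not help, so every entry is examined.
    return sorted(
        {pkg for pkg, path in entries if fnmatch.fnmatchcase(basename(path), query)}
    )
```

With this layout, `search(index, ENTRIES, "pacman")` and `search(index, ENTRIES, "/usr/bin/pacman")` each cost one lookup, whereas `search(index, ENTRIES, "abi*")` costs the same as it would without an index.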
An embedded DB like SQLite might work here to provide a more flexible query language with indexing capabilities. That would be generally useful, but it also means a pretty substantial rewrite of the existing code. I guess that's sort of true of any storage format change...
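As a rough illustration of the SQLite idea, here is a Python sketch using the standard `sqlite3` module. The `files` table, its columns, the `files_basename` index, and the sample rows are all assumptions made up for this example, not an existing or proposed pkgfile schema. Exact matches go through an index on the basename, and glob-style queries use SQLite's `GLOB` operator, which may still have to scan the table depending on the pattern.

```python
import sqlite3

# Hypothetical sample data: (repo/package, file path) pairs.
ENTRIES = [
    ("core/pacman", "/usr/bin/pacman"),
    ("core/pacman", "/usr/bin/makepkg"),
    ("extra/abiword", "/usr/bin/abiword"),
]


def build_db(entries):
    """Create an in-memory database with an index on the file basename."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        "pkg TEXT NOT NULL, path TEXT NOT NULL, basename TEXT NOT NULL)"
    )
    db.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [(pkg, path, path.rsplit("/", 1)[-1]) for pkg, path in entries],
    )
    db.execute("CREATE INDEX files_basename ON files (basename)")
    db.commit()
    return db


def find_exact(db, name):
    """Exact basename match; this equality test can be served by the index."""
    rows = db.execute(
        "SELECT DISTINCT pkg FROM files WHERE basename = ? ORDER BY pkg", (name,)
    )
    return [row[0] for row in rows]


def find_glob(db, pattern):
    """Glob match on the basename using SQLite's case-sensitive GLOB operator."""
    rows = db.execute(
        "SELECT DISTINCT pkg FROM files WHERE basename GLOB ? ORDER BY pkg",
        (pattern,),
    )
    return [row[0] for row in rows]
```

A real cache would live in a file on disk rather than `:memory:`, so that the index is built once when the cache is updated and not on every query.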
This is my installation:
These are the pkgfile cache files: