Could ack speed up with a per-directory cache of some kind? #333

qpwo · 2021-01-29T21:04:04Z

I hope this isn't too naive but I couldn't find anything on it. I have a directory with hundreds of MBs of source code and every search takes a long time. Would it be possible to make a saved cache index for a directory and update it for updated files when you make a new search?
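For illustration, the kind of cache described here could be sketched roughly as follows. This is not how ack works and not a proposed patch. The cache file name `.search_index.json`, the helpers `update_index` and `candidates`, and the word-level tokenization are all assumptions made for the sketch. It records each file's mtime and the set of words it contains, re-reads only files whose mtime changed since the last run, and answers a literal word query with the files that could match.

```python
import json
import os
import re

INDEX_NAME = ".search_index.json"  # hypothetical cache file name
WORD_RE = re.compile(r"\w+")


def update_index(root):
    """Refresh the on-disk index for `root`, re-reading only changed files."""
    index_path = os.path.join(root, INDEX_NAME)
    try:
        with open(index_path, encoding="utf-8") as fh:
            index = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        index = {}  # no usable cache yet: start from scratch

    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name == INDEX_NAME:
                continue  # never index the cache file itself
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            try:
                mtime = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished or is unreadable
            seen.add(rel)
            entry = index.get(rel)
            if entry is not None and entry["mtime"] == mtime:
                continue  # unchanged since last run: keep cached words
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    words = sorted(set(WORD_RE.findall(fh.read())))
            except OSError:
                continue
            index[rel] = {"mtime": mtime, "words": words}

    # Drop entries for files that no longer exist.
    for rel in set(index) - seen:
        del index[rel]

    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)
    return index


def candidates(index, word):
    """Return the relative paths of files that contain `word` exactly."""
    return sorted(rel for rel, entry in index.items() if word in entry["words"])
```

A sketch like this only narrows the file list for literal whole-word queries. A regex search would still have to scan the candidate files, and relying on mtime alone will miss edits that preserve the timestamp.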

petdance · 2021-01-29T21:06:45Z

I've played with that in my head for quite a while, but never actually tried it.

The indexing tools at https://beyondgrep.com/more-tools/ may give you ideas about what you could use. Also, if you're looking for functions and variables a lot, using ctags may do 90% of what you're looking for.

petdance · 2021-02-03T14:53:40Z

Aside from the cache question: How many hundreds of MBs do you have, and how long are searches taking? One thing we've had trouble with over the years is that some folks have systems where ack takes far longer than we would expect it to, and we haven't been able to figure out why. I'm wondering if you might be in that situation as well. See #194 for example.
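In case it helps with gathering those numbers, here is a small sketch for measuring them. It is only a guess at what would be useful to report. `tree_size_bytes` and `time_command` are made-up helper names, and the command to time is whatever search you normally run.

```python
import os
import subprocess
import time


def tree_size_bytes(root):
    """Sum the sizes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # skip files that vanish or cannot be read
    return total


def time_command(argv):
    """Run `argv`, discard its output, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return time.perf_counter() - start
```

For example, `tree_size_bytes(".") / 1e6` gives the size in MB, and calling `time_command(["ack", "pattern", "."])` twice in a row shows how much of the time is a cold file cache versus the search itself.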

n1vux · 2021-09-28T05:32:29Z

Tuning the OS filecache reservation and/or switching from spinning iron-oxide to SSD can greatly improve read speed.

I have doubts about one tool having both cached-index mode and grep mode, and i wouldn't want to give up extemporaneous usage of ack.
Alas, the existing tools that build inverted indexes in Perl do not appear to be under maintenance (plucene and relatives).
(There's a newer one that uses BDB or PG, uh, no.)

I've installed swish-e for both cached-index search and natural-language spanning-lines use cases.
While the name expands to "Simple Web Indexing System for Humans - Enhanced" and it rather expects you'll expose it locally with CGI and Nginx or Apache, it has a command-line interface and an API too. (It's in the Ubuntu package manager and probably available for any platform of interest. It has a Perl API that seems newer than any P?Luc(y|ene).) While it's designed for natural language, its recommended self-demo is searching its own source code.

Before i can make full use of it i need to figure out how to capture metadata about a document along with its contents, and what metadata schema i need ... ugh. I should remember from 20+ years ago in late Web 1.0, when i was buying a bleeding-edge indexing engine, that Information Retrieval is NOT as easy as it looks!

petdance added the feature label Mar 13, 2021

n1vux mentioned this issue Oct 15, 2021

Paragraph mode (aka paragrep, multiline) #171

Open

petdance changed the title ~~Cache index for directory?~~ Could ack speed up with a per-directory cache of some kind? Jun 6, 2022
