diff --git a/README.md b/README.md
index 2cdaf11..0f3f6aa 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,7 @@ This addon provides a [Meilisearch](https://www.meilisearch.com/) search driver
 * PHP 8.1+
 * Laravel 9+
 * Statamic 4
+* Meilisearch 1.0+
 
 ### Installation
 
@@ -146,143 +147,3 @@ http {
 ```
 
 Then restart the server, or run `sudo service nginx restart`.
-
-### Quirks
-
-meilisearch can only index 1000 words... which isn't so great for long markdown articles.
-
-> [!NOTE]
-> As of version 0.24.0 the 1000 word limit [no longer exists](https://github.com/meilisearch/meilisearch/issues/1770) on documents, which makes the driver a lot more suited for longer markdown files you may use on Statamic.
-
-#### Solution 1
-On earlier versions, you can overcome this by breaking the content into smaller chunks:
-
-```php
-'articles' => [
-    'driver' => 'meilisearch',
-    'searchables' => ['collection:articles'],
-    'fields' => ['id', 'title', 'locale', 'url', 'date', 'type', 'content'],
-    'transformers' => [
-        'date' => function ($date) {
-            return $date->format('F jS, Y'); // February 22nd, 2020
-        },
-        'content' => function ($content) {
-            // determine the number of 900 word sections to break the content field into
-            $segments = ceil(Str::wordCount($content) / 900);
-
-            // count the total content length and figure out how many characters each segment needs
-            $charactersLimit = ceil(Str::length($content) / $segments);
-
-            // now create X segments of Y characters
-            // the goal is to create many segments with only ~900 words each
-            $output = str_split($content, $charactersLimit);
-
-            $response = [];
-            foreach ($output as $key => $segment) {
-                $response["content_{$key}"] = utf8_encode($segment);
-            }
-
-            return $response;
-        }
-    ],
-],
-```
-
-This will create a few extra fields like `content_1`, `content_2`, ... `content_12`, etc. When you perform a search it'll still search through all of them and return the most relevant result, but it's not possible to show highlights anymore for matching words on the javascript client. You'll have trouble figuring out if you should show `content_1` or `content_8` highlights. So if you go this route, make sure each entry has a synopsis you could show instead of highlights. I wouldn't recommend it at the moment.
-
-
-#### Solution 2
-If you need a lot more fine-grained control, and need to break content down into paragraphs or even sentences. You could use a artisan command to parse the entries in a Statamic collection, split the content and store it in a database. Then sync the individual items to meilisearch using the `php artisan scout:import` command.
-
-1. Create a new database migration (make sure the migration has an origin UUID so you can link them to the parent entry)
-2. Create a new Model and add the `searchables` trait from Scout.
-3. Create an artisan command to parse all the entries and bulk import existing ones
-
-```php
-private function parseAll()
-{
-    // disable search
-    Articles::withoutSyncingToSearch(function () {
-        // process all
-        $transcripts = Entry::query()
-            ->where('collection', 'articles')
-            ->where('published', true)
-            ->get()
-            ->each(function ($entry) {
-                // push individual paragraphs or sentences to a collection
-                $segments = $entries->customSplitMethod();
-
-                $segments->each(function ($data) {
-                    try {
-                        $article = new Article($data);
-                        $article->save();
-                    } catch (\Illuminate\Database\QueryException $e) {
-                        dd($e);
-                    }
-                });
-            });
-    });
-
-    $total = Article::all()->count();
-    $this->info("Imported {$total} entries into the articles index.");
-
-    $this->info("Bulk import the records with: ");
-    $this->info("php artisan scout:import \"App\Models\Article\" --chunk=100");
-}
-```
-
-4. Add some Listeners to the EventServiceProvider to watch for update or delete events on the collection (to keep it in sync)
-
-```php
-protected $listen = [
-    'Statamic\Events\EntrySaved' => [
-        'App\Listeners\ScoutArticleUpdated',
-    ],
-    'Statamic\Events\EntryDeleted' => [
-        'App\Listeners\ScoutArticleDeleted',
-    ],
-];
-```
-
-4. Create the Event Listeners, for example:
-
-```php
-public function handle(EntryDeleted $event)
-{
-    if ($event->entry->collectionHandle() !== 'articles') return;
-
-    // get the ID of the original transcript
-    $id = $event->entry->id();
-
-    // delete all from Scout with this origin ID
-    $paragraphs = Article::where('origin', $id);
-    $paragraphs->unsearchable();
-    $paragraphs->delete();
-}
-
-public function handle(EntrySaved $event)
-{
-    // ... same as above ...
-
-    // if state:published
-    if (!$event->entry->published()) return;
-
-    // TODO: split $event->entry into paragraphs again and save them to the database,
-    // they will re-sync automatically with the Searchables Trait.
-}
-```
-
-5. Create a placeholder, or empty index into the search config so you can create the index on meilisearch before importing the existing entries
-
-```php
-// required as a placeholder where we store the paragraphs later
-'articles' => [
-    'driver' => 'meilisearch',
-    'searchables' => [], // empty
-    'settings' => [
-        'filterableAttributes' => ['type', 'entity', 'locale'],
-        'distinctAttribute' => 'origin', // if you only want to return one result per entry
-        // any search settings
-    ],
-],
-```