Release V6.0.1 · mikegoatly/lifti

Note: v6.0.0 was only available for a few minutes due to a NuGet publishing error. v6.0.1 should be considered the first official v6 release.

There are several breaking changes in this release, most of which are due to types being renamed. Guidance on dealing with them can be found below.

New features

Score boosting!
- Score boosting as part of a query - grand^3 will boost the score of words matching "grand".
- Boosting of object fields - .WithField("Name", c => c.Name, scoreBoost: 1.5D).
- Boosting object scores based on a freshness date, e.g. the date it was last updated.
- Boosting object scores based on a magnitude value, e.g. a star rating.
Custom stemmers
Characters can now be escaped in LIFTI queries and field names in LIFTI queries can contain spaces.
Enhanced query execution logic
Removed dependency on System.Collections.Immutable - only the netstandard2 version of the library now pulls in any dependencies. For net6 to net8, only built-in types are used.
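As a rough illustration of how the score boosting features above might be combined, here is a minimal sketch. The `Hotel` type and its data are hypothetical. `WithField(..., scoreBoost:)` and the `grand^3` query syntax come from the notes above; the `WithScoreBoosting`, `Freshness` and `Magnitude` builder calls are written from memory of the LIFTI documentation and should be treated as assumptions to check against the version you install.

```csharp
using System;
using Lifti;

var index = new FullTextIndexBuilder<int>()
    .WithObjectTokenization<Hotel>(o => o
        .WithKey(h => h.Id)
        // Matches in the Name field count 1.5x as much as matches elsewhere.
        .WithField("Name", h => h.Name, scoreBoost: 1.5D)
        .WithField("Description", h => h.Description)
        // Assumed API: boost by how recently the object was updated and by its star rating.
        .WithScoreBoosting(b => b
            .Freshness(h => h.LastUpdated, 2D)
            .Magnitude(h => h.StarRating, 3D)))
    .Build();

// Hypothetical sample data.
await index.AddRangeAsync(new[]
{
    new Hotel(1, "Grand Plaza", "A grand hotel in the city centre", new DateTime(2024, 1, 10), 4.5),
    new Hotel(2, "Harbour Inn", "A small hotel near the grand canal", new DateTime(2021, 6, 1), 3.0),
});

// grand^3 boosts the score contribution of words matching "grand".
var results = index.Search("grand^3 hotel");
foreach (var result in results)
{
    Console.WriteLine($"{result.Key}: {result.Score:F2}");
}

record Hotel(int Id, string Name, string Description, DateTime? LastUpdated, double? StarRating);
```

The exact scores depend on the scorer and the boost multipliers, so the sketch prints them rather than assuming a particular ordering.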

Performance increases

A significant amount of work went into improving the performance and memory usage of index construction, index (de)serialization and searching.

All tests were run with BenchmarkDotNet:
BenchmarkDotNet=v0.13.5, OS=Windows 11 (10.0.22631.3007)
Intel Core i7-1065G7 CPU 1.30GHz, 1 CPU, 8 logical and 4 physical cores
The results below are a comparison of the previous v5 version of LIFTI against the code in the v6.0.0 branch, running on .NET 8.

Index construction

Populating an index with 200 Wikipedia entries in a single batch

v5 Mean (μs)	v5 Allocated (KB)	v6 Mean (μs)	v6 Allocated (KB)
1,134.2	567,623.8	952.6	286,617.6

Populating each of the 200 Wikipedia entries one at a time (i.e. a new snapshot created after each document)

v5 Mean (μs)	v5 Allocated (KB)	v6 Mean (μs)	v6 Allocated (KB)
4,284.4	1,370,649.9	1,212.4	613,540.2

Searching

Lots of individual optimisations including:

Merge sorting results during unions and intersections for queries containing more than one part
Optimised collection of affected results during wildcard and fuzzy match query parts
Early application of field filters when matching results
Weighting of query parts to analyse optimal execution order so that documents can be eliminated from collection in other parts of the query.

make for some nice gains for various query types.

Query	v5 Mean (μs)	v5 Allocated (KB)	v6 Mean (μs)	v6 Allocated (KB)
"also has a"	169.74	379.19	52.71	122.97
(confiscation & th*) | "and they"	1,203.69	1,557.29	105.23	185.02
*	193,333.07	103,612.99	62,298.80	13,152.30
?and ?they ?also	1,725.66	1,658.12	439.60	243.45
and | they	417.70	819.98	104.23	
and ~ they	132.89	294.22	42.20	95.61
and ~10> they	132.64	297.67	43.34	97.04
and > they	214.03	455.75	106.16	169.17
and they also	283.82	565.34	56.02	109.51
co*on	445.27	798.77	180.04	263.47
con??*	2.21	2.30	1.96	1.97
confiscation	4.03	2.70	3.66	2.29
th*	2,277.00	2,914.76	569.76	412.60
Title=?great	416.08	399.17	108.86	34.50
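To give a feel for why merging sorted results helps with unions and intersections, here is a minimal, self-contained sketch. It is not LIFTI's actual implementation; `SortedMerge` and the sample document IDs are hypothetical. It only shows the general technique of walking two ascending lists of document IDs in a single pass instead of using hash lookups or nested loops.

```csharp
using System;
using System.Collections.Generic;

// Two lists of matching document IDs, each already sorted ascending.
var andMatches = new List<int> { 1, 3, 5, 8, 13 };
var theyMatches = new List<int> { 2, 3, 8, 9, 13, 21 };

var intersection = SortedMerge.Intersect(andMatches, theyMatches);
var union = SortedMerge.Union(andMatches, theyMatches);

Console.WriteLine(string.Join(", ", intersection)); // 3, 8, 13
Console.WriteLine(string.Join(", ", union));        // 1, 2, 3, 5, 8, 9, 13, 21

static class SortedMerge
{
    // Keeps only IDs present in both lists (an "and" of two query parts).
    public static List<int> Intersect(IReadOnlyList<int> left, IReadOnlyList<int> right)
    {
        var result = new List<int>();
        int i = 0, j = 0;
        while (i < left.Count && j < right.Count)
        {
            if (left[i] == right[j]) { result.Add(left[i]); i++; j++; }
            else if (left[i] < right[j]) { i++; }
            else { j++; }
        }
        return result;
    }

    // Keeps IDs present in either list (an "or" of two query parts), without duplicates.
    public static List<int> Union(IReadOnlyList<int> left, IReadOnlyList<int> right)
    {
        var result = new List<int>(left.Count + right.Count);
        int i = 0, j = 0;
        while (i < left.Count && j < right.Count)
        {
            if (left[i] == right[j]) { result.Add(left[i]); i++; j++; }
            else if (left[i] < right[j]) { result.Add(left[i++]); }
            else { result.Add(right[j++]); }
        }
        while (i < left.Count) result.Add(left[i++]);
        while (j < right.Count) result.Add(right[j++]);
        return result;
    }
}
```

Both operations touch each element at most once, and the output stays sorted, so it can feed directly into the next merge in a longer query.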

Deprecated:

ItemMetadata.Item/DocumentMetadata.Item -> use Key property
IFullTextIndex.Items -> use Metadata property
FullTextIndexBuilder.WithDuplicateItemBehavior -> use WithDuplicateKeyBehavior method
IndexOptions.DuplicateItemBehavior -> use DuplicateKeyBehavior property
ScoredToken.ItemId -> use DocumentId property
QueryTokenMatch.ItemId -> use DocumentId property
ItemMetadata.Count -> IndexMetadata.DocumentCount
ItemMetadata.GetMetadata -> IndexMetadata.GetDocumentMetadata
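A short sketch of what moving off some of these deprecated members might look like. The keys and text are hypothetical, and the commented-out v5 lines are included only for comparison. The v6 names are taken from the list above, and the sketch assumes `Metadata` exposes `DocumentCount` directly on the index.

```csharp
using System;
using Lifti;

var index = new FullTextIndexBuilder<string>()
    // v5: .WithDuplicateItemBehavior(DuplicateItemBehavior.ReplaceItem)
    .WithDuplicateKeyBehavior(DuplicateKeyBehavior.Replace)
    .Build();

await index.AddAsync("doc-1", "first version");
await index.AddAsync("doc-1", "second version"); // replaces the earlier text for doc-1
await index.AddAsync("doc-2", "another document");

// v5: index.Items.Count
int documentCount = index.Metadata.DocumentCount;
Console.WriteLine(documentCount);

foreach (var result in index.Search("version"))
{
    // Results are identified by the key the document was indexed under.
    Console.WriteLine(result.Key);
}
```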

Technically breaking

IdPool and IIdPool are now internal - These weren't really exposed before anyway
Removed interface IItemMetadata - just using DocumentMetadata going forwards
QueryContext no longer has ApplyTo method
IIndexNavigator: added Snapshot property
IIndexNavigator: added overloads for GetExactMatches and GetExactAndChildMatches that allow for the current QueryContext to be passed in so unnecessary results are not collected.
IIndexNavigator: new additional methods AddExactMatches and AddExactAndChildMatches that allow you to efficiently collect matches using a DocumentMatchCollector before converting it to an IntermediateQueryResult.
IQueryPart now has a double CalculateWeighting(Func<IIndexNavigator> navigatorCreator) method to help the query processing logic evaluate the most efficient order of execution.
TItem generic type parameter name has been renamed to TObject.
All query part types are now sealed
New method IIndexNavigator.ExactMatchCount()
IntermediateQueryResult constructors are no longer public
Index serialization interfaces have been reworked. This shouldn't affect anyone because it was technically impossible to write your own serializers based upon them due to a lack of publicly accessible methods for rehydrating an index.
IIndexNavigatorBookmark now implements IDisposable - you don't technically have to dispose it, but doing so will return it to a pool and allow it to be reused.
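For anyone navigating the index manually, disposing a bookmark might look like the sketch below. The indexed text is hypothetical, and the navigator calls (`Snapshot.CreateNavigator`, `Process`, `CreateBookmark`, `Apply`, `HasExactMatches`) are written from memory of the LIFTI navigation API, so treat them as assumptions. The text is processed in upper case on the assumption that the default tokenizer normalises indexed text that way.

```csharp
using System;
using Lifti;

var index = new FullTextIndexBuilder<int>().Build();
await index.AddAsync(1, "test testing tested");
await index.AddAsync(2, "team");

using var navigator = index.Snapshot.CreateNavigator();
navigator.Process("TE".AsSpan());

bool matchedTest;
bool matchedTeam;

// Disposing the bookmark (here via a using block) returns it to the pool for reuse.
using (var bookmark = navigator.CreateBookmark())
{
    navigator.Process("ST".AsSpan());
    matchedTest = navigator.HasExactMatches; // navigator is positioned at "TEST"

    bookmark.Apply();                        // rewind to the bookmarked position, "TE"
    navigator.Process("AM".AsSpan());
    matchedTeam = navigator.HasExactMatches; // navigator is positioned at "TEAM"
}

Console.WriteLine($"{matchedTest}, {matchedTeam}");
```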

Querying changes

ScoredFieldMatch is now quite different and no longer publicly constructable. The only place you would have encountered this is in a custom scorer, and that's no longer necessary.

Several types that are only likely to have been used internally are gone:

FieldMatch
QueryTokenMatch
CompositeTokenMatchLocation
SingleTokenMatchLocation
ITokenLocationMatch
TokenLocationMatch

Breaking

DuplicateItemBehavior enum -> renamed to DuplicateKeyBehavior
DuplicateItemBehavior.ReplaceItem -> use DuplicateKeyBehavior.Replace instead
IQueryContext -> removed; use the concrete QueryContext type instead. This affects IQueryPart.Evaluate, which now takes a QueryContext.
IIndexNodeFactory.CreateNode now takes concrete types ChildNodeMap and DocumentTokenMatchMap instead of ImmutableDictionary and ImmutableList respectively.
A maximum of 31 different object types can now be configured against a single FullTextIndexBuilder (i.e. 31 distinct calls to WithObjectTokenization) - if anyone is actually indexing more than 31 object types, I'd be very interested to understand your scenario!

The rest of these will only affect you if you are explicitly referencing the type names in your code:

ItemPhrases -> renamed to DocumentPhrases
ItemMetadata -> renamed to DocumentMetadata
IItemStore -> renamed to IIndexMetadata
