Releases: mikegoatly/lifti

v3.4.0

08 Oct 10:59
  • The default value for fuzzy matching maximum sequential edits has changed from termLength / 4 to termLength < 4 ? 1 : termLength / 4. With the old default, fuzzy matching was effectively disabled for search terms shorter than 4 characters, which was unintended.
  • When explicitly configuring parameters for the fuzzy search operator in a LIFTI query, the comma can now be omitted if you don't want to override the max sequential edits value.
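
The effect of the new default can be checked with a little arithmetic. The following is a minimal sketch, not LIFTI's own code: the class and method names are hypothetical, and it assumes the division is integer division, which is consistent with the old default yielding zero allowed sequential edits for terms shorter than 4 characters.

```csharp
using System;

// Hypothetical helper names, used only to illustrate the two formulas above.
public static class FuzzyDefaults
{
    // Default before v3.4.0: integer division gives 0 for terms under 4 characters.
    public static int OldMaxSequentialEdits(int termLength) => termLength / 4;

    // Default from v3.4.0: short terms are allowed a single sequential edit.
    public static int NewMaxSequentialEdits(int termLength) =>
        termLength < 4 ? 1 : termLength / 4;

    public static void Main()
    {
        foreach (var length in new[] { 1, 3, 4, 8, 12 })
        {
            Console.WriteLine(
                $"length {length}: old = {OldMaxSequentialEdits(length)}, new = {NewMaxSequentialEdits(length)}");
        }
    }
}
```

For terms of 4 or more characters the two formulas agree; only the shorter terms change, from 0 to 1.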

v3.3.0

03 Oct 22:24
  • Added support for customization of fuzzy matching parameters at an index and query level #49
  • Added an alternative simple query parser for when the full LIFTI query syntax isn't needed
  • The IQueryParser for an index is now accessible as IFullTextIndex.QueryParser
  • New configuration point FullTextQueryBuilder.WithDefaultJoiningOperator allows for control over whether an "and" or "or" operator is used to combine search terms without an explicit operator between them.

v3.2.0

30 Sep 20:33
  • Added convenience method CombineAll to OrQueryOperator and AndQueryOperator allowing for the easy combining of a series of query parts.
  • Added Query.Empty as a static instance of an empty query.

v3.1.0

25 Jul 18:26
9fb00c3

Fixes #46 - This release allows multiple indexes to be serialized into, and deserialized from, the same file, one after another, e.g.:

var serializer = new BinarySerializer<string>();
using (var stream = File.Open(fileName, FileMode.CreateNew))
{
    // Pass false to keep the stream open so the second index can be appended.
    await serializer.SerializeAsync(index1, stream, false);
    // Pass true to dispose the stream once the last index has been written.
    await serializer.SerializeAsync(index2, stream, true);
}

using (var stream = File.Open(fileName, FileMode.Open))
{
    var deserializedIndex1 = new FullTextIndexBuilder<string>().Build();
    var deserializedIndex2 = new FullTextIndexBuilder<string>().Build();
    // Indexes are read back in the same order they were written.
    await serializer.DeserializeAsync(deserializedIndex1, stream, false);
    await serializer.DeserializeAsync(deserializedIndex2, stream, true);

    deserializedIndex1.Search("Foo").Should().HaveCount(1);
    deserializedIndex2.Search("Bar").Should().HaveCount(1);
}

v3.0.1

11 Feb 22:51
16776df

This release contains a bug fix for adjacent token searching, e.g. "the quick" should return only documents with those words in that specific order. There was a bug whereby searching for "the the" would return documents that contained only a single "the", rather than just documents containing two "the"s. #45

v3.0.0

08 Feb 21:14
0ae4424

New features

Fuzzy matching

You can now query an index with a fuzzy match by prefixing a search
term with ? or by configuring the query parser to use fuzzy matching by default.

Wildcard searching

Wildcards offer a more controlled way of fuzzy matching text: you can now use * anywhere in a search term to match zero or more of any
character, and % to match exactly one character. For example:

  • searching for %%%ing returns all words that start with any 3 characters and end with ing.
  • searching for *ing returns all words that end with ing.
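
To make the two wildcard rules concrete, here is a small sketch that translates a wildcard pattern into a .NET regular expression and applies it to a few words. It only illustrates the matching semantics described above; it is not how LIFTI evaluates wildcards against its index, and the WildcardSketch class and ToRegex method are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical illustration of the wildcard semantics; not part of LIFTI.
public static class WildcardSketch
{
    public static Regex ToRegex(string pattern)
    {
        var builder = new StringBuilder("^");
        foreach (var c in pattern)
        {
            switch (c)
            {
                case '*': builder.Append(".*"); break; // zero or more of any character
                case '%': builder.Append('.'); break;  // exactly one character
                default: builder.Append(Regex.Escape(c.ToString())); break;
            }
        }

        builder.Append('$');
        return new Regex(builder.ToString(), RegexOptions.Singleline);
    }

    public static void Main()
    {
        var words = new[] { "sing", "bring", "string", "ringing" };

        // Only "string" has exactly three characters before "ing".
        Console.WriteLine(string.Join(", ", words.Where(w => ToRegex("%%%ing").IsMatch(w))));
        // prints "string"

        // Every word in the list ends with "ing".
        Console.WriteLine(string.Join(", ", words.Where(w => ToRegex("*ing").IsMatch(w))));
        // prints "sing, bring, string, ringing"
    }
}
```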

Other minor enhancements

  • Manually constructed IQuery instances can now be passed to IFullTextIndex.Search
  • New WithQueryParser overload on FullTextIndexBuilder allows for configuration of fuzzy matching by default when parsing queries.

Breaking changes

Most of these won't affect anyone using LIFTI under normal conditions - only indexes with custom implementations of certain interfaces will need modification.

  • IIndexNavigator.GetExactAndChildMatches, IIndexNavigator.GetExactMatches and IScorer.Score all have an additional parameter weighting that should be used as a multiplier for the scores of any matches.
  • New method IIndexNavigator.CreateBookmark used to capture a snapshot of the index navigator's state that can be reapplied by calling IIndexNavigatorBookmark.Apply. Once a bookmark has been applied, EnumerateIndexedWords
    cannot be used and will throw an exception.
  • New method IIndexNavigator.EnumerateNextCharacters - enumerates any reachable characters from the current location in the index
  • IQueryTokenizer and QueryParser are now internal; both are used as part of the LIFTI query parser implementation configured using the index builder, so shouldn't need to be used externally.
  • New method ITokenizer.Normalize used to normalize a fragment of text according to the tokenizer's rules without tokenizing.

v2.1.1

14 Mar 16:50

Fixed #31 - Intra-node text containing surrogate pair characters breaks serialization

v2.1.0

24 Oct 17:45

This release fixes #30

v2.0.0

30 Jul 08:43

You can now override the ITokenizer implementation used in an index if you need to.

v2.0.0-rc1

24 Jul 08:40
Pre-release

The first release candidate for V2. Hopefully it'll be out of RC over the next couple of weeks.

This is a fairly big refactoring. The key breaking changes are:

  • TokenizationOptionsBuilder.XmlContent has been removed and replaced with the concept of text extractors. This is a much better design and makes it easier to implement text extraction from other formats, e.g. JSON, RTF.

  • FullTextIndexBuilder.WithDefaultTokenizationOptions renamed to WithDefaultTokenization