gem install picky
Create a new project directory:
picky generate unicorn_server project_name
Enter the newly created project directory:
cd project_name
Edit the Gemfile
and
bundle install
In app/application.rb
you’ll find an example of an application configuration.
The first block
indexing do
...
end
is about … indexing, doh ;) Where is the data taken from and how is it processed. illegal_characters
, stopwords
, split_text_on
are all concerned about normalizing and handling the data.
The example that is generated is extremely simple. There’s lots of configuration options, and they are explained in more detail in the Wiki (TODO).
add_index
is all about defining an index.
indexes do
illegal_characters(/[^äöüa-zA-Z0-9\s\/\-\"\&\.]/)
stopwords(/\b(und|der|die|das|mit|im|ein|des|dem|the|of)\b/)
split_text_on(/[\s\/\-\"\&\.]/)
add_index :books,
Sources::DB.new('SELECT id, title, author, isbn13 as isbn FROM books', :file => 'app/db.yml'),
field(:title),
field(:author),
field(:isbn)
end
Let’s look at it in detail. Here’s the signature:
add_index(index_identifier, source, *fields, options = {})
The index identifier
:books
is used later, in querying:
Search.new(Indexes[:books], Indexes[:dvds])
Queries use such an index, or a combination thereof.
source
is either a DB Source, or a CSV source.
Sources::DB.new('SELECT id, some_field, some_other_field FROM some_table', :file => 'filename.yml')
The file contains the yml configuration. Or you can pass in a hash with the configuration: :host => 'localhost', :adapter => :mysql, etc.
.
Sources::CSV.new(:some_field, :unused, :some_other_field)
, then a :file => 'filename.csv'
with the CSV data.
The fields title
, author
, isbn
define the fields that are contained in the index.
They are pretty simple in the example:
field(:title, :qualifiers => [:t, :title, :titre]),
field(:author, :qualifiers => [:s, :author, :auteur]),
field(:isbn, :qualifiers => [:i, :isbn])
The qualifiers are used when searching, to identify the field. They are all about you already knowing what you are looking for.
As an example, if you knew you’d be searching an author, you’d query like this:
author:faulkner
But the qualifiers aren’t even necessary. Per default it uses the field name.
The second block
indexing do
...
end
defines how query data (the search text you enter) is processed, and also how it is routed.
There are pretty much the same options as in indexing, like illegal_characters
, stopwords
, split_text_on
.
But also
maximum_tokens 5
that defines how many tokens make it through.
If you queried for “The red fox jumps over the nice dog”, only the first 5 tokens would come through.
queries do
maximum_tokens 5
# Note that Picky needs the following characters to
# pass through, as they are control characters: *"~:
#
illegal_characters(/[^a-zA-Z0-9\s\/\-\,\&äöü\"\~\*\:]/)
stopwords(/\b(und|der|die|das|mit|ein|des|dem|the|of)\b/)
split_text_on(/[\s\/\-\,\&]+/)
route %r{^/books}, Search.new(Indexes[:books])
root 200
end
Routing is quite easy. Use
route(regexp_or_string, query)
to define a route and a searchs that is called.
Indexes are identified by their index_identifier (see above). Multiple indexes can be used per query, no problem. Just pass them to a Query
like this:
Search.new(Indexes[:books], Indexes[:dvds], Indexes[:music]) # a comprehensive media search
To access the search server, the picky-client
gem offers a few helpful methods.
gem install picky-client
In your Gemfile:
gem 'picky-client'
Books = Picky::Client.new :host => 'localhost', :port => 8080, :path => '/books'
I recommend to put this code in an environment specific file, like e.g. development.rb
in Rails.
The controller action must return json.
Picky::Convenience
offers a few methods that make handling
# The example uses Sinatra.
#
# For full results, you get the ids from the picky server
# and then populate the result with models.
#
get '/search/full' do
results = Books.search params[:query], :ids => params[:ids] :offset => params[:offset]
results.extend Picky::Convenience
results.populate_with Book do |book|
book.to_s
end
ActiveSupport::JSON.encode results
end
# For live results, can go directly to the search server.
#
# Or, as shown here, go
#
get '/search/live' do
Books.search_unparsed params[:query], :ids => 0, :offset => params[:offset]
end
= Picky::Helper.interface(options = {})
= Picky::Helper.cached_interface(options = {})
The options are:
:button # Default: 'search'
:no_results # Default: 'Sorry, no results found!'
:more # Default: 'more'
Generates a search interface structure (with results) with a div#picky around it all.
TODO Extract JS files, make them installable through picky-client install javascripts
The simplest setup we can think of:
new PickyClient({
live: '/books/live',
full: '/books/full'
});
But it can be customized quite a bit:
pickyClient = new PickyClient({
live: '/books/live',
full: '/books/full',
showResultsLimit: 10, // Optional. Default is 10.
before: function(params, query, offset) { }, // Optional. Before Picky sends any data. Return modified params.
success: function(data, query) { }, // Optional. Just after Picky receives data. (Gets a PickyData object)
after: function(data, query) { }, // Optional. After Picky has handled the data and updated the view.
// This is used to generate the correct query strings, localized. E.g. "subject:war".
// Optional. If you don't give these, the field identifier given in the Picky server is used.
//
qualifiers: {
en:{
subjects: 'subject'
}
},
// This is used to explain the preceding word in the suggestion text, localized. E.g. "Peter (author)".
// Optional. Default are the field identifiers from the Picky server.
//
explanations: {
en:{
title: 'titled',
author: 'written by',
isbn: 'ISBN-13',
year: 'published in',
publisher: 'published by',
subjects: 'topics'
}
}
});
pickyClient.insert('italy');