Added support for Dictionaries #10

Open: wants to merge 10 commits into base: master
39 changes: 39 additions & 0 deletions README.md
@@ -83,6 +83,45 @@ TextRazor.phrases('api_key', 'text')

```

### Dictionaries

You can manage dictionaries and their entries through the dictionary API.

#### Creating a dictionary

```
client = TextRazor::Client.new('api_key')

client.create_dictionary('my-dictionary', case_insensitive: true)
```

#### Adding entries to a dictionary

```
client.create_dictionary_entries('my-dictionary', [{id: 'my-entry', text: 'Text to be matched'}])
```

#### Getting entries from a dictionary

```
client.get_dictionary_entries('my-dictionary')

# Using pagination
client.get_dictionary_entries('my-dictionary', limit: 20, offset: 0)
```
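To read every entry rather than a single page, the paginated call can be wrapped in a loop. The sketch below is illustrative only: `all_dictionary_entries` is a hypothetical helper that is not part of this gem, and it assumes that a page shorter than the requested `limit` marks the end of the dictionary.

```ruby
# Hypothetical helper (not part of the gem): collects all entries of a
# dictionary by requesting one page at a time.
# Assumption: a page with fewer than `page_size` entries is the last page.
def all_dictionary_entries(client, dictionary_id, page_size: 20)
  entries = []
  offset = 0

  loop do
    page = client.get_dictionary_entries(dictionary_id, limit: page_size, offset: offset)
    entries.concat(page)
    break if page.size < page_size

    offset += page_size
  end

  entries
end

# Usage, with a client created as shown above:
#   all_dictionary_entries(client, 'my-dictionary', page_size: 50)
```

The helper only relies on the `get_dictionary_entries(id, limit:, offset:)` signature shown above, so any object that responds to it can be used.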

#### Deleting a dictionary entry

```
client.delete_dictionary_entry('my-dictionary', 'my-entry')
```

#### Deleting a dictionary

```
client.delete_dictionary('my-dictionary')
```

## Next steps

Only implemented this for topics, entities, words and phrases. Also, implement
3 changes: 3 additions & 0 deletions lib/textrazor.rb
@@ -1,8 +1,11 @@
require "textrazor/version"
require "textrazor/configuration"
require "textrazor/util"
require "textrazor/dictionary"
require "textrazor/dictionary_entry"
require "textrazor/client"
require "textrazor/request"
require "textrazor/api_response"
require "textrazor/response"
require "textrazor/category"
require "textrazor/topic"
54 changes: 54 additions & 0 deletions lib/textrazor/api_response.rb
@@ -0,0 +1,54 @@
require 'json'

module TextRazor

class ApiResponse
Owner

Similar to ApiRequest, the Response class was meant to take care of this responsibility. Is there any particular reason for doing it this way?

Contributor Author

The Response class is heavily oriented towards parsing responses to analysis requests.

The more general ApiResponse can handle any kind of response from TextRazor. (Side note: Response inherits from ApiResponse)
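
A minimal sketch of that split, using simplified stand-ins rather than the real classes from this PR: status-code checks are omitted, and `FakeHttpResponse` is a hypothetical substitute for the RestClient response object.

```ruby
require 'json'

# Hypothetical stand-in for the HTTP response object (assumption, not gem code).
FakeHttpResponse = Struct.new(:code, :body)

# Generic envelope: knows only about :ok, :time and :response.
class ApiResponse
  attr_reader :raw_response, :time

  def initialize(http_response)
    json_body = JSON.parse(http_response.body, symbolize_names: true)
    @time = json_body[:time].to_f
    @ok = json_body[:ok]
    @raw_response = json_body[:response]
  end

  def ok?
    @ok
  end
end

# Analysis-specific accessors stay in the subclass.
class Response < ApiResponse
  def custom_annotation_output
    @custom_annotation_output ||= raw_response[:customAnnotationOutput]
  end
end

# A dictionary call only needs the generic envelope.
dictionary_reply = ApiResponse.new(
  FakeHttpResponse.new(200, '{"ok": true, "time": 0.01, "response": {"entries": []}}')
)

# An analysis call gets the same envelope plus the analysis accessors.
analysis_reply = Response.new(
  FakeHttpResponse.new(200, '{"ok": true, "time": 0.02, "response": {"customAnnotationOutput": "x"}}')
)
```

With this layout, dictionary endpoints can reuse the parsing and `ok?` logic without pulling in the analysis parsers.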


BadRequest = Class.new(StandardError)
Unauthorised = Class.new(StandardError)
RequestEntityTooLong = Class.new(StandardError)

attr_reader :raw_response, :time

def initialize(http_response)
code = http_response.code
body = http_response.body

raise BadRequest.new(body) if bad_request?(code)
raise Unauthorised.new(body) if unauthorised?(code)
raise RequestEntityTooLong.new(body) if request_entity_too_long?(code)

json_body = ::JSON::parse(body, symbolize_names: true)

@time = json_body[:time].to_f
@ok = json_body[:ok]
@raw_response = json_body[:response]
end

def ok?
@ok
end

#TODO: Not in a successful response
#def error
#end

#def message
#end

private

def bad_request?(code)
code == 400
end

def unauthorised?(code)
code == 401
end

def request_entity_too_long?(code)
code == 413
end
end

end
44 changes: 40 additions & 4 deletions lib/textrazor/client.rb
@@ -8,6 +8,9 @@ class Client
UnsupportedExtractor = Class.new(StandardError)
UnsupportedCleanupMode = Class.new(StandardError)

InvalidDictionary = Class.new(StandardError)
InvalidDictionaryEntry = Class.new(StandardError)

DEFAULT_EXTRACTORS = ['entities', 'topics', 'words', 'phrases', 'dependency-trees',
'relations', 'entailments', 'senses']

@@ -30,11 +33,36 @@ def initialize(api_key, options = {})

def analyse(text)
assert_text(text)
options = {
api_key: api_key
}.merge(request_options)
Response.new(Request.post(api_key, text, **request_options))
end

def create_dictionary(id, **options)
dictionary = Dictionary.new(id: id, **options)
assert_dictionary(dictionary)
Request.create_dictionary(api_key, dictionary)
end

Response.new(Request.post(text, options))
def get_dictionary_entries(dictionary_id, limit: 0, offset: 0)
response = ApiResponse.new(
Request.get_dictionary_entries(api_key, dictionary_id, limit: limit, offset: offset)
)
return [] unless response.raw_response.key?(:entries)
response.raw_response[:entries].map { |hash| DictionaryEntry.create_from_hash(hash) }
end

def delete_dictionary(dictionary_id)
Request.delete_dictionary(api_key, dictionary_id)
end

def create_dictionary_entries(dictionary_id, dictionary_entry_hashes)
dictionary_entries = dictionary_entry_hashes.map do |entry_hash|
DictionaryEntry.new(entry_hash).tap { |e| assert_dictionary_entry(e) }
end
Request.create_dictionary_entries(api_key, dictionary_id, dictionary_entries)
end

def delete_dictionary_entry(dictionary_id, dictionary_entry_id)
Request.delete_dictionary_entry(api_key, dictionary_id, dictionary_entry_id)
end

def self.topics(api_key, text, options = {})
@@ -126,6 +154,14 @@ def is_text_bigger_than_200_kb?(text)
text.bytesize/1024.0 > 200
end

def assert_dictionary(dictionary)
raise InvalidDictionary, "Dictionary is invalid" unless dictionary.valid?
end

def assert_dictionary_entry(entry)
raise InvalidDictionaryEntry, "Entry is invalid" unless entry.valid?
end

end

end
27 changes: 27 additions & 0 deletions lib/textrazor/dictionary.rb
@@ -0,0 +1,27 @@
module TextRazor

class Dictionary

include Util

attr_reader :id, :match_type, :case_insensitive, :language

def initialize(params = {})
initialize_params params
end

def to_h
{
"matchType" => match_type,
"caseInsensitive" => case_insensitive,
"language" => language
}.reject { |_, v| v.nil? }
end

def valid?
!id.nil? && !id.empty?
end

end

end
27 changes: 27 additions & 0 deletions lib/textrazor/dictionary_entry.rb
@@ -0,0 +1,27 @@
module TextRazor

class DictionaryEntry

include Util

attr_reader :id, :text, :data

def initialize(params = {})
initialize_params params
end

def to_h
{
"id" => id,
"text" => text,
"data" => data
}.reject { |_, v| v.nil? || v.empty? }
end

def valid?
!text.nil? && !text.empty?
end

end
end

2 changes: 1 addition & 1 deletion lib/textrazor/entity.rb
@@ -6,7 +6,7 @@ class Entity

attr_reader :id, :type, :matching_tokens, :entity_id, :freebase_types, :confidence_score,
:wiki_link, :matched_text, :freebase_id, :relevance_score, :entity_english_id,
:starting_pos, :ending_pos, :data, :wikidata_id
:starting_pos, :ending_pos, :data, :wikidata_id, :source_id

def initialize(params = {})
@type = []
55 changes: 52 additions & 3 deletions lib/textrazor/request.rb
@@ -18,18 +18,63 @@ class Request
classifiers: 'classifiers'
}

def self.post(text, options)
def self.post(api_key, text, **options)
::RestClient.post(
TextRazor.configuration.url,
build_query(text, options),
accept_encoding: 'gzip'
build_headers(api_key)
)
end

def self.create_dictionary(api_key, dictionary, **options)
::RestClient.put(
url("/entities/#{dictionary.id}"),
dictionary.to_h.to_json,
build_headers(api_key)
)
dictionary
end

def self.get_dictionary_entries(api_key, dictionary_id, limit:, offset:)
::RestClient.get(
url("entities/#{dictionary_id}/_all?limit=#{limit}&offset=#{offset}"),
build_headers(api_key)
)
end

def self.delete_dictionary(api_key, dictionary_id)
::RestClient.delete(
url("entities/#{dictionary_id}"),
build_headers(api_key)
)
true
end

def self.create_dictionary_entries(api_key, dictionary_id, dictionary_entries)
::RestClient.post(
url("entities/#{dictionary_id}/"),
dictionary_entries.map(&:to_h).to_json,
build_headers(api_key)
)
dictionary_entries
end

def self.delete_dictionary_entry(api_key, dictionary_id, dictionary_entry_id)
::RestClient.delete(
url("entities/#{dictionary_id}/#{dictionary_entry_id}"),
build_headers(api_key)
)
true
end

def self.url(path = '/')
File.join(TextRazor.configuration.url, path)
end

private

def self.build_query(text, options)
query = {"text" => text, "apiKey" => options.delete(:api_key)}
query = { 'text' => text }

options.each do |key, value|
value = value.join(",") if value.is_a?(Array)
@@ -39,6 +84,10 @@ def self.build_query(text, options)
query
end

def self.build_headers(api_key)
{ x_textrazor_key: api_key }
end

end

end
46 changes: 1 addition & 45 deletions lib/textrazor/response.rb
@@ -2,39 +2,7 @@

module TextRazor

class Response

BadRequest = Class.new(StandardError)
Unauthorised = Class.new(StandardError)
RequestEntityTooLong = Class.new(StandardError)

attr_reader :raw_response, :time

def initialize(http_response)
code = http_response.code
body = http_response.body

raise BadRequest.new(body) if bad_request?(code)
raise Unauthorised.new(body) if unauthorised?(code)
raise RequestEntityTooLong.new(body) if request_entity_too_long?(code)

json_body = ::JSON::parse(body, symbolize_names: true)

@time = json_body[:time].to_f
@ok = json_body[:ok]
@raw_response = json_body[:response]
end

def ok?
@ok
end

#TODO: Not in a successful response
#def error
#end

#def message
#end
class Response < ApiResponse

def custom_annotation_output
@custom_annotation_output ||= raw_response[:customAnnotationOutput]
@@ -98,18 +66,6 @@ def language_is_reliable?

private

def bad_request?(code)
code == 400
end

def unauthorised?(code)
code == 401
end

def request_entity_too_long?(code)
code == 413
end

def parse_entailments
parse(:entailment, raw_response[:entailments])
end