Documentation lacking for SearchRequest parameters #7

Closed
narayanacharya6 opened this issue Feb 21, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@narayanacharya6

narayanacharya6 commented Feb 21, 2022

I'm new to manticoresearch and manticoresearch-python and I am looking for more detailed documentation for the arguments in the SearchRequest constructor.

I noticed that "match" in the HTTP query is not equivalent to MATCH in the SQL syntax.

For example, consider running the query Do Cholesterol Statin Drugs Cause Breast Cancer? against an index. I use the statements below to see the plan used to execute the query:

mysql> SET profiling=1;  
mysql> SELECT *, WEIGHT(), PACKEDFACTORS({json=1}) FROM my_index WHERE MATCH('Do Cholesterol Statin Drugs Cause Breast Cancer?') OPTION ranker=expr('bm25');  
mysql> SHOW PLAN \G;  
*************************** 1. row ***************************  
Variable: transformed_tree  
   Value: AND(  
  AND(KEYWORD(do, querypos=1)),   
  AND(KEYWORD(cholesterol, querypos=2)),   
  AND(KEYWORD(statin, querypos=3)),   
  AND(KEYWORD(drugs, querypos=4)),   
  AND(KEYWORD(cause, querypos=5)),   
  AND(KEYWORD(breast, querypos=6)),   
  AND(KEYWORD(cancer, querypos=7)))  
ERROR:   
No query specified  

Notice the AND operator used at the start and since none of the documents in my index have all the tokens from this query, no documents are returned.

When I instead break the tokens manually in my query, separating each with a | delimiter, the plan is different (with the OR operator between the tokens) and I get back results.

mysql> SELECT *, WEIGHT(), PACKEDFACTORS({json=1}) FROM my_index WHERE MATCH('Do|Cholesterol|Statin|Drugs|Cause|Breast |Cancer?') OPTION ranker=expr('bm25');  
  
...<RESULTS OMITTED FOR SAKE OF BREVITY>...  
  
mysql> SHOW PLAN \G;
*************************** 1. row ***************************
Variable: transformed_tree  
   Value: OR(  
  AND(KEYWORD(do, querypos=1)),   
  AND(KEYWORD(cholesterol, querypos=2)),   
  AND(KEYWORD(statin, querypos=3)),   
  AND(KEYWORD(drugs, querypos=4)),   
  AND(KEYWORD(cause, querypos=5)),   
  AND(KEYWORD(breast, querypos=6)),   
  AND(KEYWORD(cancer, querypos=7)))  
ERROR:   
No query specified  
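
The manual tokenization described above could be automated with a small helper. This is only a sketch: the function name to_or_query is hypothetical, and it assumes that dropping punctuation (such as the trailing "?") and joining the remaining word tokens with the | operator is acceptable for the index in question.

```python
import re


def to_or_query(text: str) -> str:
    """Join the word tokens of `text` with the full-text OR operator (|)."""
    # \w+ keeps letters, digits and underscores and drops punctuation such as "?"
    tokens = re.findall(r"\w+", text)
    return " | ".join(tokens)


or_query = to_or_query("Do Cholesterol Statin Drugs Cause Breast Cancer?")
print(or_query)
# Do | Cholesterol | Statin | Drugs | Cause | Breast | Cancer
```

The resulting string can be placed inside MATCH('...') in the SQL statement shown earlier.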

But the HTTP syntax (as used below) returns results without having to break the query into tokens. This has me confused.

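import manticoresearch
# Import paths below are assumed; they can differ between client versions.
from manticoresearch import ApiException
from manticoresearch.model.search_request import SearchRequest
from manticoresearch.model.search_response import SearchResponse

configuration = manticoresearch.Configuration(
    host="http://127.0.0.1:9308"
)
api_client = manticoresearch.ApiClient(configuration)
query = "Do Cholesterol Statin Drugs Cause Breast Cancer?"
api_instance = manticoresearch.SearchApi(api_client)
search_request = SearchRequest(
    index="my_index",
    query={
        "match": {
            "*": query
        }
    },  # this comma was missing in the original snippet
    profile=True
)

try:
    api_response: SearchResponse = api_instance.search(search_request)
except ApiException as e:
    print("Exception when calling SearchApi->search: %s\n" % e)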

Another question, related to the arguments of the SearchRequest class: how do I pass options such as OPTION ranker=expr('bm25') as part of the SearchRequest object? Or is this not possible at the moment unless I run the query in SQL syntax through the UtilsApi's sql method?

@sanikolaev
Collaborator

This Python client is based on the Manticore HTTP JSON protocol. It doesn't natively support ranker=... or other search options, so you are right: you need to use the sql method for that.

We'll look into the other issues
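
As a rough sketch of the SQL route, the statement from the original post can be assembled in Python before being handed to the sql method. The helper names escape_sql_string and build_ranked_select are hypothetical, and the backslash escaping only covers SQL string quoting under the assumption that single quotes and backslashes are backslash-escaped; it does not escape full-text operators inside MATCH. The body format that the sql method expects may differ between client versions, so the call itself is left as a comment.

```python
def escape_sql_string(value: str) -> str:
    """Backslash-escape backslashes and single quotes for a single-quoted SQL literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_ranked_select(index: str, query: str, ranker_expr: str = "bm25") -> str:
    """Build a SELECT with MATCH() and an OPTION ranker=expr(...) clause."""
    return (
        f"SELECT *, WEIGHT() FROM {index} "
        f"WHERE MATCH('{escape_sql_string(query)}') "
        f"OPTION ranker=expr('{escape_sql_string(ranker_expr)}')"
    )


statement = build_ranked_select("my_index", "Cholesterol | Statin")
print(statement)
# SELECT *, WEIGHT() FROM my_index WHERE MATCH('Cholesterol | Statin') OPTION ranker=expr('bm25')

# Assumed usage; the exact body format depends on the client version:
# utils_api = manticoresearch.UtilsApi(api_client)
# utils_api.sql(statement)
```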

githubmanticore added the bug label Feb 23, 2022

@narayanacharya6
Author

narayanacharya6 commented Feb 23, 2022

Is adding support for OPTIONS over HTTP in the pipeline? If not, can it be added? :)

@sanikolaev
Collaborator

Yes. We had it in our internal ticketing system. I've just exposed it on GitHub: manticoresoftware/manticoresearch#715

@githubmanticore

➤ Nick Sergeev commented:

Done in 29e878b0

@narayanacharya6
Author

Unfortunately, I raised two questions in one ticket (sorry for this):

  1. "match" in the HTTP query is not equivalent to MATCH in the SQL syntax.
  2. Support for OPTIONS over HTTP in SearchRequest.

Can you please clarify what was done?

@sanikolaev
Collaborator

The issue about OPTIONS is here manticoresoftware/manticoresearch#715.

@bZichett

bZichett commented Mar 3, 2023

Does this mean the following is not possible with the generated client APIs (I'm using Python)?

https://manual.manticoresearch.com/Searching/Sorting_and_ranking#Extended-mode:~:text=When%20sorting%20on,property%20to%20true%3A

When sorting on an attribute, match weight (score) calculation is disabled by default (no ranker is used). You can enable weight calculation by setting the track_scores property to true:

{
  "index":"test",
  "track_scores":true,
  "query": { "match": { "title": "what was" } },
  "sort": [ { "gid": { "order":"desc" } } ]
}

I don't see any of the OpenAPI-generated libraries (except PHP?) supporting this flag. (Since it requires a ranker?)

It might be worthwhile to have a single place (in the documentation?) describing which SQL features the JSON API does not yet support. I still haven't decided whether I should use the SQL path until the two reach parity (if ever).
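
One possible way to send the track_scores request quoted above without relying on a generated client is to POST the JSON body directly with the Python standard library. This is a sketch under assumptions: the helper names build_search_body and post_search are hypothetical, and it assumes the JSON search endpoint is reachable at /search on the HTTP port used earlier in this thread (http://127.0.0.1:9308).

```python
import json
import urllib.request


def build_search_body(index, field, text, sort_attr, track_scores=True):
    """Build the JSON search body from the manual excerpt quoted above."""
    return {
        "index": index,
        "track_scores": track_scores,
        "query": {"match": {field: text}},
        "sort": [{sort_attr: {"order": "desc"}}],
    }


def post_search(body, base_url="http://127.0.0.1:9308"):
    """POST the body to the JSON search endpoint (path /search is an assumption)."""
    request = urllib.request.Request(
        base_url + "/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


body = build_search_body("test", "title", "what was", "gid")
print(json.dumps(body, indent=2))

# Requires a running server, so the call is left commented out:
# result = post_search(body)
```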

@githubmanticore

➤ Nick Sergeev commented:

Done in 723cea2
