Porter Stem Filter not working in some cases #954

Open
nileshshroff opened this issue Apr 20, 2024 · 2 comments
Labels
bug Something isn't working

Comments

nileshshroff commented Apr 20, 2024

What is the bug?

In some cases the stemmed search is not returning any results.

For example, take the following text, indexed with the porter_stem filter:
"price increases daily for the last month or so and this is not even the breaking news"

Searching for "price increas daili" returns 0 hits
Searching for "break new" returns 1 hit

The following steps show how it can be reproduced.

How can one reproduce the bug?

Create an index with a field that uses the porter_stem filter

PUT /index_with_stemissue
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "analysis": {
      "analyzer": {
        "stemanalyzer": {
          "type": "custom",
          "tokenizer": "letter",
          "filter": [
            "lowercase",
            "porter_stem"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "message_text": {
        "type": "text",
        "analyzer": "stemanalyzer",
        "store": true
      }
    }
  }
}

Run the text through _analyze on the field to see the stemmed tokens

GET /index_with_stemissue/_analyze
{
  "field": "message_text",
  "text": "price increases daily for the last month or so and this is not even the breaking news"
}

Add the document to the index

POST /index_with_stemissue/_doc/
{
  "message_text": "price increases daily for the last month or so and this is not even the breaking news"
}

This query_string search, which uses the stemmed words, does not work: no results are returned

GET index_with_stemissue/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message_text:\"price increas daili\""
          }
        }
      ]
    }
  }
}

However, another example that uses stemmed words works fine

GET index_with_stemissue/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message_text:\"break new\""
          }
        }
      ]
    }
  }
}
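
One possible explanation, offered as an assumption and not confirmed by anyone in this thread: the phrase in a query_string query is analyzed with the field's analyzer, so words that are already stems get stemmed a second time. Step 1a of the Porter algorithm strips a trailing "s" unless the word ends in "ss", which would turn "increas" into "increa" and no longer match the indexed token. The sketch below implements only that one step (the function name is made up for illustration, and this is not the Lucene implementation); running _analyze on the query text itself would confirm or refute this.

```python
def porter_step_1a(word: str) -> str:
    """Apply only step 1a of the Porter algorithm (plural-suffix handling)."""
    if word.endswith("sses"):
        return word[:-2]  # sses -> ss
    if word.endswith("ies"):
        return word[:-2]  # ies -> i
    if word.endswith("ss"):
        return word       # ss -> ss (unchanged)
    if word.endswith("s"):
        return word[:-1]  # s -> (removed)
    return word


# Tokens exactly as typed in the two phrase queries from the report.
for token in ["price", "increas", "daili", "break", "new"]:
    print(token, "->", porter_step_1a(token))
```

Under this assumption, "break new" matches because neither word is altered by this step, while "increas" is shortened before the phrase is looked up.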

What is the expected behavior?

When searching for "price increas daili", 1 result should be returned

What is your host/environment?

AWS Service OpenSearch version 1.1

nileshshroff added the bug and untriaged labels Apr 20, 2024

wbeckler commented May 6, 2024

Is this due to an issue with the opensearch-java client, or should this issue be moved to https://github.com/opensearch-project/OpenSearch?

wbeckler removed the untriaged label May 6, 2024
dblock (Member) commented May 7, 2024

@nileshshroff Can you reproduce this with curl? Which version of OpenSearch and of the client are you using? Otherwise, can you please post your Java code?
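
For a reproduction that does not involve the Java client, the two searches from the report can be sent as plain HTTP requests. The following is a minimal sketch using only the Python standard library. The base URL is an assumption (a local, unauthenticated cluster), and the helper names are made up for illustration. It only builds and prints the requests; sending them is left as a commented-out step.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local cluster, no auth
INDEX = "index_with_stemissue"      # index name from the report


def phrase_query(phrase: str) -> dict:
    """Build the same bool/query_string phrase query used in the report."""
    query = f'message_text:"{phrase}"'
    return {"query": {"bool": {"must": [{"query_string": {"query": query}}]}}}


def build_search(phrase: str) -> urllib.request.Request:
    """Build (but do not send) a _search request for the given phrase."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{INDEX}/_search",
        data=json.dumps(phrase_query(phrase)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


requests_to_send = [build_search("price increas daili"), build_search("break new")]

for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To send it against a real cluster:
    # response = json.load(urllib.request.urlopen(req))
```

If both requests behave the same way here as described in the report, the behavior comes from the server-side analysis and not from the client.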
