Inconsistency between shown frequencies and number of kwic lines displayed, from R client to web gui, but also inside web gui #218

Open
perkuhn opened this issue Sep 23, 2024 · 7 comments

Comments

@perkuhn

perkuhn commented Sep 23, 2024

The following KorAP query:

https://korap.ids-mannheim.de/?q=%5borth%3dpopulistisches%5d&cq=availability%21%3dQAO-NC-LOC%3aids%20%26%20corpusSigle%20%3d%20%2fSOL%7c%5bSUTZ%5d%5b0-9%5d%5b0-9%5d%2f%20%26%20pubDate%20in%201975&ql=poliqarp

This query was generated by the R client (while working from home via VPN; the URL was taken from the clickable data point). The client yields a value (10) that is also displayed in the chart. The KorAP web GUI also shows this value, but instead of KWIC lines it reports that there are no results at all (in other cases it shows fewer lines than the value indicates).
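To make the percent-encoded URL easier to compare with the query and vc used in the R client, it can be decoded with base R. This is only a minimal sketch: `decode_korap_url` is a hypothetical helper name, not part of RKorAPClient, and it assumes the URL has the simple `?q=...&cq=...&ql=...` shape seen in the link above.

```r
# Hypothetical helper (not part of RKorAPClient): split a KorAP web UI URL
# into its decoded query parameters, using base R only.
decode_korap_url <- function(url) {
  # Drop everything up to and including the "?".
  query_string <- sub("^[^?]*\\?", "", url)
  # Raw "&" separates parameters; encoded "%26" stays inside the values.
  pairs <- strsplit(query_string, "&", fixed = TRUE)[[1]]
  keys <- sub("=.*$", "", pairs)
  values <- sub("^[^=]*=", "", pairs)
  values <- vapply(values, utils::URLdecode, character(1), USE.NAMES = FALSE)
  stats::setNames(values, keys)
}

# The first URL from this report, split only for readability.
url <- paste0(
  "https://korap.ids-mannheim.de/?q=%5borth%3dpopulistisches%5d",
  "&cq=availability%21%3dQAO-NC-LOC%3aids%20%26%20corpusSigle%20%3d%20",
  "%2fSOL%7c%5bSUTZ%5d%5b0-9%5d%5b0-9%5d%2f%20%26%20pubDate%20in%201975",
  "&ql=poliqarp"
)

params <- decode_korap_url(url)
print(params)
```

The `q`, `cq` and `ql` entries correspond to the query, the virtual corpus and the query language, which makes it easier to check that the web GUI and the client are really asked the same thing.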

Another example:

https://korap.ids-mannheim.de/?q=%5borth%3dpopulistisches%5d&cq=availability%21%3dQAO-NC-LOC%3aids%20%26%20corpusSigle%20%3d%20%2fSOL%7c%5bSUTZ%5d%5b0-9%5d%5b0-9%5d%2f%20%26%20pubDate%20in%202017&ql=poliqarp

The client reports 108 hits and the KorAP web GUI also shows the frequency 108, but it displays only 9 KWIC lines (it offers five more pages of KWICs, but these have no content, only "no results" messages).

Starting the query inside the KorAP GUI with the same settings (query and vc) yields 9 hits and 9 KWIC lines.

But if you remove the licence restrictions (which should have no effect with this vc definition) and run the query again in the KorAP GUI, you get the same effect as above: it claims 108 hits but displays only 9 KWIC lines.

Presumably the smaller numbers inside the KorAP web GUI (and thus the number of KWIC lines) are correct. The wrong results might therefore be calculated from another vc, and/or the licence restrictions might be ignored for the displayed count but applied when displaying the KWICs.

@Akron
Member

Akron commented Sep 25, 2024

I don't get the wrong numbers, so I can't replicate this.

@Akron Akron transferred this issue from KorAP/RKorAPClient Sep 25, 2024
@Akron
Member

Akron commented Sep 25, 2024

It was identified as probably a caching issue in Kalamar.

@perkuhn
Author

perkuhn commented Sep 25, 2024

Working without caching seems to solve the problem inside the web GUI, but the RKorAPClient script call of frequencyQuery still returns wrong numbers, e.g.

library(RKorAPClient)

startYear <- 1962
endYear <- 2023

query <- c("[tt/l=populistisch]",
           "[orth=populistisch]",
           "[orth=populistischer]",
           "[orth=populistisches]")

ql <- "poliqarp"
# Virtual corpus prefix; the year is appended below.
vc <- "availability!=QAO-NC-LOC:ids & corpusSigle = /SOL|[UTSZ][0-9][0-9]/ & pubDate in "
year <- startYear:endYear
kco1 <- new("KorAPConnection", accessToken = NULL, verbose = TRUE)

# expand = TRUE runs every query against every yearly virtual corpus.
df <- frequencyQuery(kco1, query = query, vc = paste0(vc, year), expand = TRUE, ql = ql)
hc <- hc_freq_by_year_ci(df)
print(hc)

---->
...
Searching "[orth=populistisches]" in "availability!=QAO-NC-LOC:ids & corpusSigle = /SOL|[UTSZ][0-9][0-9]/ & pubDate in 1988": 23 hits, took 3.155619155 s

The actual count should be 0 hits.
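To cross-check a single suspicious data point in the web GUI, the per-year virtual corpus and a matching web UI URL can be rebuilt without contacting the server. This is a rough sketch in base R: `web_ui_url` is a hypothetical helper, not an RKorAPClient function, and the URL layout (`q`, `cq`, `ql` parameters) is assumed from the links earlier in this issue.

```r
# Hypothetical helper (not part of RKorAPClient): build a KorAP web UI URL
# for one query / virtual corpus pair. The parameter names q, cq and ql are
# taken from the URLs quoted in this issue.
web_ui_url <- function(query, vc, ql = "poliqarp",
                       base = "https://korap.ids-mannheim.de/") {
  enc <- function(x) utils::URLencode(x, reserved = TRUE)
  paste0(base, "?q=", enc(query), "&cq=", enc(vc), "&ql=", enc(ql))
}

vc_prefix <- "availability!=QAO-NC-LOC:ids & corpusSigle = /SOL|[UTSZ][0-9][0-9]/ & pubDate in "
years <- 1962:2023

# Same expansion as paste0(vc, year) in the script: one vc string per year.
vcs <- paste0(vc_prefix, years)

# URL for the data point reported with 23 hits (query and year from the log line).
check_url <- web_ui_url("[orth=populistisches]", vcs[years == 1988])
print(check_url)
```

Opening the printed URL in a browser, ideally with caching disabled, shows whether the web GUI reports the same count as the client for that year.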

@Akron
Member

Akron commented Sep 26, 2024

We expect the problem to be associated with #219.

@margaretha
Contributor

I have investigated the problem and managed to reproduce it. It seems to be a problem with the cache in Kustvakt. I will investigate further and fix it.

@margaretha
Contributor

I have made a fix in Kustvakt and it has already been run on the test instance.

@Akron
Member

Akron commented Oct 7, 2024

Thank you very much. To be closed when @perkuhn confirms the fix; #219 is still open for discussion.
