Inconsistent search results #41

Open
pYassine opened this issue Mar 7, 2024 · 1 comment
Labels
broadcast (About a broadcast service), bug (Something isn't working)

Comments


pYassine commented Mar 7, 2024

Affected broadcast type

Geocoding service

Affected access

Open

Error description

Hello,

I have recently started using the geocoding API.
Several users have reported that some results are not being returned.

When typing "20 boulevard Mat",
no relevant result comes back. The users are looking for the address "20 boulevard matabiau".

However, typing "20 boulevard mata" does return it.

[Screenshot 2024-03-07 at 12 51 32]

pYassine added the broadcast and bug labels Mar 7, 2024
jdesboeufs commented

Hello @pYassine, and thank you for reporting this.

I can see that the same behavior also occurs on the API Adresse:
https://api-adresse.data.gouv.fr/search?q=20%20boulevard%20Mat
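
To compare the two queries side by side, a small script along these lines may help. It is only a sketch: the helper names `build_url`, `top_labels` and `compare` are made up for illustration, and it assumes the API Adresse `search` endpoint returns a GeoJSON FeatureCollection whose features carry a `properties.label` field and accepts a `limit` parameter.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api-adresse.data.gouv.fr/search"


def build_url(query, limit=5):
    """Build the search URL, encoding spaces as %20 like the link above."""
    params = urllib.parse.urlencode(
        {"q": query, "limit": limit}, quote_via=urllib.parse.quote
    )
    return f"{BASE_URL}?{params}"


def top_labels(query, limit=5):
    """Return the labels of the first results (performs a network call)."""
    with urllib.request.urlopen(build_url(query, limit), timeout=10) as resp:
        data = json.load(resp)
    # Assumption: each feature exposes its display string in properties.label.
    return [f["properties"]["label"] for f in data.get("features", [])]


def compare(queries=("20 boulevard Mat", "20 boulevard Mata")):
    """Print the top labels for each query so the two can be compared."""
    for q in queries:
        print(q, "->", top_labels(q))
```

Calling `compare()` prints the top labels for "20 boulevard Mat" and "20 boulevard Mata". The actual results depend on the live index, so none are shown here.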

This is related to the autocomplete mechanism in the addok engine. It appears that the term "mat" is reduced to "ma" at the tokenization stage, leaving a term too short for autocomplete, which is only triggered from 3 characters.
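
The interaction described above can be illustrated with a toy model. This is not addok's code: `toy_normalize` is a hypothetical stand-in whose single rule (drop a trailing "t") was chosen only because it reproduces the `TOKENIZE mat` output shown further down, and `INDEX` is a made-up token set. The 3-character threshold comes from the explanation above.

```python
MIN_AUTOCOMPLETE_LENGTH = 3  # per the comment: autocomplete triggers at 3 characters

# Hypothetical indexed tokens, for illustration only.
INDEX = {"matabiau", "matalon", "matha"}


def toy_normalize(word):
    """Hypothetical normalisation step, NOT the engine's real processors.

    Assumption: a trailing "t" is dropped, which yields "ma" for "mat".
    """
    word = word.lower()
    if len(word) > 2 and word.endswith("t"):
        return word[:-1]
    return word


def autocomplete(prefix, indexed_tokens):
    """Return indexed tokens starting with prefix, if the prefix is long enough."""
    if len(prefix) < MIN_AUTOCOMPLETE_LENGTH:
        return []  # token too short: autocomplete is skipped entirely
    return sorted(t for t in indexed_tokens if t.startswith(prefix))


def candidates_for(user_input):
    """Normalise first, then autocomplete on the normalised token."""
    return autocomplete(toy_normalize(user_input), INDEX)


print(candidates_for("Mat"))   # [] because "mat" was shortened to "ma"
print(candidates_for("Mata"))  # ['matabiau', 'matalon']
```

In this model the length check runs on the normalised token, not on what the user typed, so a 3-character input can still fall below the threshold.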

I will open a ticket on the engine in question.


For tracking:

> EXPLAIN 20 boulevard Mat
[123.9] Taken tokens: [<Token ma>]
[123.9] Common tokens: [<Token boulevar>]
[123.9] Housenumbers token: [<Token ving>]
[123.9] Not found tokens: []
[123.9] Filters: []
[124.0] ** ONLY_COMMONS_BUT_GEOHASH_TRY_AUTOCOMPLETE_COLLECTOR **
[124.0] ** NO_TOKENS_BUT_HOUSENUMBERS_AND_GEOHASH **
[124.0] ** NO_AVAILABLE_TOKENS_ABORT **
[124.0] ** ONLY_COMMONS **
[124.0] ** NO_MEANINGFUL_BUT_COMMON_TRY_AUTOCOMPLETE_COLLECTOR **
[124.0] ** ONLY_COMMONS_TRY_AUTOCOMPLETE_COLLECTOR **
[124.0] ** BUCKET_WITH_MEANINGFUL **
[124.0] New bucket with keys ['w|ma', 'w|boulevar'] and limit 10
[124.7] 4 ids in bucket so far
[124.7] ** REDUCE_WITH_OTHER_COMMONS **
[124.7] ** ENSURE_GEOHASH_RESULTS_ARE_INCLUDED_IF_CENTER_IS_GIVEN **
[124.7] ** AUTOCOMPLETE_MEANINGFUL_COLLECTOR **
[124.8] Autocompleting ma
[124.8] No candidates. Aborting.
[124.8] ** FUZZY_COLLECTOR **
[124.8] Checking cream.
[124.8] Computing results
[124.9] Done getting results data
[126.4] Done computing results
[126.4] Checking cream.
[126.4] Computing results
[126.4] Done computing results
[126.4] Fuzzy on. Trying with [<Token ma>, <Token boulevar>].
[126.4] Going fuzzy with boulevar and ['w|ma']
[129.2] Going fuzzy with ma and ['w|boulevar']
[129.6] Found fuzzy candidates ['la', 'pa', 'me', 'mz', 'mai', 'mak', 'man', 'mar', 'mas', 'mat', 'mau', 'max']
[129.6] Adding to bucket with keys ['w|boulevar', 'w|la']
[133.0] 100 ids in bucket so far
[133.0] ** EXTEND_RESULTS_EXTRAPOLING_RELATIONS **
[133.0] ** EXTEND_RESULTS_REDUCING_TOKENS **
[133.0] Computing results
[133.0] Done getting results data
[162.6] Done computing results
20 Boulevard Foch 62120 Aire-sur-la-Lys (97jWY | importance: 0.0633/0.1, str_distance: 0.5684/1.0)
20 Boulevard Doret 97400 Saint-Denis (ormyL | importance: 0.0674/0.1, str_distance: 0.54/1.0)
20 Boulevard Brune 19100 Brive-la-Gaillarde (ZYqov | importance: 0.0667/0.1, str_distance: 0.54/1.0)
20 Boulevard sully 85000 La Roche-sur-Yon (nWBAP | importance: 0.0611/0.1, str_distance: 0.54/1.0)
20 Boulevard National 92250 La Garenne-Colombes (4j3rg | importance: 0.065/0.1, str_distance: 0.5318/1.0)
20 Boulevard  Latouche 72200 La Flèche (kNNZK | importance: 0.0629/0.1, str_distance: 0.5318/1.0)
20 Boulevard Carnot 78200 Mantes-la-Jolie (zwPDq | importance: 0.0678/0.1, str_distance: 0.5143/1.0)
Boulevard du Mas 83700 Saint-Raphaël (jWBwl | importance: 0.0601/0.1, str_distance: 0.5211/1.0)
20 Boulevard edison 85000 La Roche-sur-Yon (jwR5v | importance: 0.0601/0.1, str_distance: 0.5143/1.0)
20 Boulevard Mestadier 23300 La Souterraine (J8v72 | importance: 0.0541/0.1, str_distance: 0.5087/1.0)
162.8 ms — 1 run(s) — 10 results
> EXPLAIN 20 boulevard Mata
[1254.] Taken tokens: [<Token mata>]
[1254.] Common tokens: [<Token boulevar>]
[1254.] Housenumbers token: [<Token ving>]
[1254.] Not found tokens: []
[1254.] Filters: []
[1254.] ** ONLY_COMMONS_BUT_GEOHASH_TRY_AUTOCOMPLETE_COLLECTOR **
[1254.] ** NO_TOKENS_BUT_HOUSENUMBERS_AND_GEOHASH **
[1254.] ** NO_AVAILABLE_TOKENS_ABORT **
[1254.] ** ONLY_COMMONS **
[1254.] ** NO_MEANINGFUL_BUT_COMMON_TRY_AUTOCOMPLETE_COLLECTOR **
[1254.] ** ONLY_COMMONS_TRY_AUTOCOMPLETE_COLLECTOR **
[1254.] ** BUCKET_WITH_MEANINGFUL **
[1254.] New bucket with keys ['w|mata', 'w|boulevar'] and limit 10
[1254.] 2 ids in bucket so far
[1254.] ** REDUCE_WITH_OTHER_COMMONS **
[1254.] ** ENSURE_GEOHASH_RESULTS_ARE_INCLUDED_IF_CENTER_IS_GIVEN **
[1254.] ** AUTOCOMPLETE_MEANINGFUL_COLLECTOR **
[1254.] Autocompleting mata
[1255.] Ordering candidates by frequency
[1255.] Found tokens to autocomplete [b'w|matalon, w|matabiau', …]
[1255.] Trying to extend bucket. Autocomplete w|matalon
[1255.] Adding to bucket with keys ['w|boulevar', 'w|matalon']
[1259.] 3 ids in bucket so far
[1259.] Trying to extend bucket. Autocomplete w|matabiau
[1259.] Adding to bucket with keys ['w|boulevar', 'w|matabiau']
[1259.] 4 ids in bucket so far
[1259.] ** FUZZY_COLLECTOR **
[1259.] Checking cream.
[1259.] Computing results
[1259.] Done getting results data
[1261.] Done computing results
[1261.] ** EXTEND_RESULTS_EXTRAPOLING_RELATIONS **
[1261.] No relation extrapolated.
[1261.] ** EXTEND_RESULTS_REDUCING_TOKENS **
[1261.] Checking cream.
[1261.] Computing results
[1261.] Done computing results
[1261.] Computing results
[1261.] Done computing results
20 Boulevard Matabiau 31000 Toulouse (AQPpp | importance: 0.0816/0.1, str_distance: 0.9/1.0)
Boulevard Bossais 17160 Matha (jZ60y | importance: 0.0454/0.1, str_distance: 0.468/1.0)
20 Boulevard de Saint Hérié 17160 Matha (r8VXK | importance: 0.0484/0.1, str_distance: 0.4091/1.0)
Boulevard rabatau daniel matalon 13010 Marseille (3lWyx | importance: 0.0681/0.1, str_distance: 0.3441/1.0)
1261.8 ms — 1 run(s) — 4 results
> TOKENIZE mat
ma

@jdesboeufs jdesboeufs removed their assignment Mar 28, 2024