Rules to normalize:
same name: same ASCII representation, as returned by the unaccent(name) function.
similar name: same Metaphone-pt code.
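The "same name" test can be sketched outside the database as follows. This is only an approximation of what unaccent(name) does: it strips combining marks after Unicode NFD decomposition, which covers the pt-BR accents in this data but is not guaranteed to match the database function for every character. The helper `same_name` is a hypothetical name introduced here; the "similar name" test is left out because it needs a Metaphone-pt implementation.

```python
import unicodedata


def unaccent(name: str) -> str:
    """Approximate unaccent(): decompose to NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: two names are the 'same name' when their
    unaccented forms are identical."""
    return unaccent(a) == unaccent(b)


print(unaccent("ANTÔNIO SETÚBAL SILVESTRE"))  # ANTONIO SETUBAL SILVESTRE
```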
Rules to canonicalize (choice of the official name):
1. WHEN same CPF and similar name (or same first names)
1.1. check the official name with a CPF resolver
1.2. (if not possible) use the most recent
2. WHEN no CPF and same name (and same birthDate)
2.1. use the most recent when the records are more than 6 years apart
2.2. (otherwise) use the "most accented" version or the "most standard pt-BR" one (e.g. prefer i instead of y)
... Log a message for each conflict resolution ...
Example:
               name               | birthdate  |          source
----------------------------------+------------+---------------------------
 ANTONIO SETUBAL SILVESTRE        | 1963-01-04 | br:tse;ce:candidatos:2016
 ANTÔNIO SETUBAL SILVESTRE        | 1963-01-04 | br:tse;ce:candidatos:2008
 ANTONIO SETÚBAL SILVESTRE        | 1963-01-04 | br:tse;ce:candidatos:2004
 ANTÔNIO SETÚBAL SILVESTRE        | 1963-01-04 | br:tse;ce:candidatos:2012
 FABRICIO JOSE SATIRO DE OLIVEIRA | 1975-07-01 | br:tse;sc:candidatos:2010
 FABRICIO JOSÉ SATIRO DE OLIVEIRA | 1975-07-01 | br:tse;sc:candidatos:2004
 FABRÍCIO JOSÉ SATIRO DE OLIVEIRA | 1975-07-01 | br:tse;sc:candidatos:2000
 FABRICIO JOSÉ SÁTIRO DE OLIVEIRA | 1975-07-01 | br:tse;sc:candidatos:2008
 FABRÍCIO JOSÉ SÁTIRO DE OLIVEIRA | 1975-07-01 | br:tse;sc:candidatos:2012
The most accented version is "ANTÔNIO SETÚBAL SILVESTRE" (2012); the most recent is "ANTONIO SETUBAL SILVESTRE" (2016)...
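A minimal sketch of the no-CPF case on the first group of the example, under stated assumptions. The issue does not say which two records the "6 years" are measured between; here the gap is taken between the most recent record and the most accented one, since those are the two competing choices. `Record`, `accent_count` and `choose_canonical` are hypothetical names, the year is read from the source tag, and ties on accent count are broken by recency (also an assumption). The "most standard pt-BR" preference is not modelled.

```python
import unicodedata
from typing import NamedTuple, Sequence


class Record(NamedTuple):
    name: str
    year: int  # election year taken from the source tag


def accent_count(name: str) -> int:
    """Count combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", name)
    return sum(1 for ch in decomposed if unicodedata.combining(ch))


def choose_canonical(records: Sequence[Record], max_gap_years: int = 6) -> Record:
    most_recent = max(records, key=lambda r: r.year)
    # Assumption: among equally accented names, prefer the newer record.
    most_accented = max(records, key=lambda r: (accent_count(r.name), r.year))
    # Assumed reading of the 6-year rule: compare the two candidates.
    if most_recent.year - most_accented.year > max_gap_years:
        return most_recent
    return most_accented


records = [
    Record("ANTONIO SETUBAL SILVESTRE", 2016),
    Record("ANTÔNIO SETUBAL SILVESTRE", 2008),
    Record("ANTONIO SETÚBAL SILVESTRE", 2004),
    Record("ANTÔNIO SETÚBAL SILVESTRE", 2012),
]
print(choose_canonical(records).name)  # ANTÔNIO SETÚBAL SILVESTRE
```

With this reading the 2012 spelling wins because it is only 4 years older than the 2016 record. A different reading of the 6-year rule (for example, measuring against the oldest record) would pick the 2016 spelling instead, so the threshold semantics should be settled before implementing.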
After canonicalization, delete the duplicate records and register all variants in the info JSON.
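One possible shape for that final step, as a sketch only: the group collapses into a single record whose info JSON lists every spelling together with its source. The key `name_variants`, the function `merge_variants` and the layout of the returned record are hypothetical; the issue does not define the structure of the info JSON.

```python
import json
from typing import Dict, List, Tuple


def merge_variants(canonical_name: str, rows: List[Tuple[str, str]]) -> Dict[str, str]:
    """Collapse (name, source) rows into one record.

    The 'name_variants' key is an assumed layout for the info JSON.
    """
    variants = [{"name": name, "source": source} for name, source in rows]
    info = json.dumps({"name_variants": variants}, ensure_ascii=False)
    return {"name": canonical_name, "info": info}


rows = [
    ("ANTONIO SETUBAL SILVESTRE", "br:tse;ce:candidatos:2016"),
    ("ANTÔNIO SETUBAL SILVESTRE", "br:tse;ce:candidatos:2008"),
    ("ANTONIO SETÚBAL SILVESTRE", "br:tse;ce:candidatos:2004"),
    ("ANTÔNIO SETÚBAL SILVESTRE", "br:tse;ce:candidatos:2012"),
]
merged = merge_variants("ANTÔNIO SETÚBAL SILVESTRE", rows)
print(merged["info"])
```

Keeping the source next to each variant preserves the provenance that would otherwise be lost when the original rows are deleted.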