fix: update normalize regex to support arabic + hebrew (#74)
theRealPadster authored Jan 2, 2024
1 parent c856d0e commit 9202f04
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions src/logic.ts
@@ -28,8 +28,9 @@ const normalize = (str: string | undefined) => {
// Convert & to 'and'
cleaned = cleaned.replace(/&/g, 'and');

- // Remove everything (including spaces) that is not a number, letter, Cyrylic alphabet, Polish alphabet
- cleaned = cleaned.replace(/[^\wа-яА-ЯіїІЇ\dąćęłńóśźż\d]/g, '');
+ // Remove everything (including spaces) that is not a number, letter, or from Cyrylic/Polish/Arabic/Hebrew alphabet
+ // (Github Copilot says Arabic letters range from \u0621 to \u064A and Hebrew letters range from \u05D0 to \u05EA)
+ cleaned = cleaned.replace(/[^\wа-яА-ЯіїІЇ\dąćęłńóśźż\u0621-\u064A\u05D0-\u05EA\d]/g, '');

// TODO: add any other logic?
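
A minimal sketch of what the change does in practice, assuming only the two steps visible in this hunk (the `&` replacement and the character-class strip). The rest of `normalize` is not shown in the diff, so `stripWith`, `OLD_PATTERN`, and `NEW_PATTERN` are hypothetical names used for illustration, not identifiers from `src/logic.ts`.

```typescript
// Hypothetical constants: the two character classes copied from the diff.
const OLD_PATTERN = /[^\wа-яА-ЯіїІЇ\dąćęłńóśźż\d]/g;
const NEW_PATTERN = /[^\wа-яА-ЯіїІЇ\dąćęłńóśźż\u0621-\u064A\u05D0-\u05EA\d]/g;

// Hypothetical helper: applies only the two steps shown in the hunk.
// Any earlier or later processing in the real normalize() is omitted.
function stripWith(pattern: RegExp, input: string): string {
  // Convert & to 'and', then drop every character outside the allowed class.
  return input.replace(/&/g, 'and').replace(pattern, '');
}

const hebrew = 'שלום עולם';
const arabic = 'مرحبا بالعالم';

// Before the change, Hebrew and Arabic input collapses to an empty string.
console.log(stripWith(OLD_PATTERN, hebrew)); // ''
console.log(stripWith(OLD_PATTERN, arabic)); // ''

// After the change, the letters survive and only the space is removed.
console.log(stripWith(NEW_PATTERN, hebrew)); // 'שלוםעולם'
console.log(stripWith(NEW_PATTERN, arabic)); // 'مرحبابالعالم'

// Latin and Cyrillic input behaves the same under both patterns.
console.log(stripWith(NEW_PATTERN, 'Rock & Roll')); // 'RockandRoll'
console.log(stripWith(NEW_PATTERN, 'Привіт, світ!')); // 'Привітсвіт'
```

Note that the added Arabic range stops at `\u064A`, so Arabic vowel marks (which start at `\u064B`) are still stripped by the new pattern.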
