WIP: Reimplementing search_dates #945

Open
wants to merge 44 commits into
base: master
Changes from 1 commit
Commits
44 commits
02220da
Implimenting new search_dates
gavishpoddar Jul 16, 2021
f933d3a
Fixing DATE_ORDER, implimenting deep_search, tests
gavishpoddar Jul 21, 2021
77727b5
Unproving _joint_parse with data_carry accurate_return_text, deep_se…
gavishpoddar Jul 21, 2021
e7f38e8
implementing _final_text_clean()
gavishpoddar Jul 22, 2021
962066c
Simplifying text_clean and modifying tests
gavishpoddar Jul 25, 2021
624ac8e
Implementing relative date
gavishpoddar Jul 28, 2021
42ca6f6
Fixing tests
gavishpoddar Jul 28, 2021
51749a2
secondary_split_implimentation
gavishpoddar Aug 3, 2021
f5e4635
positional args to keyword argument
gavishpoddar Aug 3, 2021
121b15f
Micro fixes
gavishpoddar Aug 3, 2021
2cd93f0
Removing codes now part of #953
gavishpoddar Aug 3, 2021
006d2a5
adding check_settings
gavishpoddar Aug 4, 2021
10404c9
implimenting double_punctuation_split
gavishpoddar Aug 4, 2021
22596e0
Updating docs and removing test (TMP)
gavishpoddar Aug 6, 2021
b799dfb
cleaning code, adding tests, improving coverage
gavishpoddar Aug 6, 2021
42c984a
Merge branch 'scrapinghub:master' into search_dates
gavishpoddar Aug 9, 2021
8fc5e0d
Improving codecov
gavishpoddar Aug 11, 2021
74b6ec4
temporary commit to get diff
gavishpoddar Aug 16, 2021
56e0505
Merge branch 'search_dates' of https://github.com/gavishpoddar/datepa…
gavishpoddar Aug 16, 2021
5a1b1c5
temporary file change for review
gavishpoddar Aug 16, 2021
aa2aa8f
reverting the previous commit
gavishpoddar Aug 16, 2021
41eff6a
improvements
gavishpoddar Aug 16, 2021
f65531b
formatting code
gavishpoddar Aug 17, 2021
982fc08
formatting code
gavishpoddar Aug 17, 2021
3621b2d
improvements in text filter
gavishpoddar Aug 18, 2021
8a9496b
Merge branch 'scrapinghub:master' into search_dates
gavishpoddar Aug 19, 2021
45996b4
removing previous search_dates
gavishpoddar Aug 23, 2021
2ac88c6
Merge branch 'search_dates' of https://github.com/gavishpoddar/datepa…
gavishpoddar Aug 23, 2021
5dabc62
adding test
gavishpoddar Aug 23, 2021
ab1778d
fixing doc string
gavishpoddar Aug 27, 2021
14adf89
fixing doc string
gavishpoddar Aug 27, 2021
d57223a
Merge branch 'search_dates' of https://github.com/gavishpoddar/datepa…
gavishpoddar Aug 27, 2021
88afa30
updating xfail
gavishpoddar Aug 28, 2021
9209f3d
updating tests
gavishpoddar Aug 28, 2021
85254e0
Apply suggestions from code review
gavishpoddar Sep 1, 2021
e4604e6
Merge branch 'master' into search_dates
gavishpoddar Sep 7, 2021
4f119dd
Updates
gavishpoddar Sep 7, 2021
e6da4be
Fixing upstraem merges
gavishpoddar Sep 7, 2021
f6116bf
DateSearch -> DateSearchWithDetection
gavishpoddar Sep 9, 2021
0525cdc
Merge branch 'scrapinghub:master' into search_dates
gavishpoddar Oct 7, 2021
96b91c0
updating test with xfail
gavishpoddar Oct 7, 2021
b9d12f3
Merge branch 'search_dates' of https://github.com/gavishpoddar/datepa…
gavishpoddar Oct 7, 2021
99e66c6
minor fixes
gavishpoddar Oct 7, 2021
2935aae
Merge branch 'master' into search_dates
serhii73 Dec 28, 2022
22 changes: 0 additions & 22 deletions dateparser/languages/locale.py
@@ -176,7 +176,6 @@ def _generate_relative_translations(self, normalize=False):

     def translate_search(self, search_string, settings=None):
         dashes = ['-', '——', '—', '~']
-        word_joint_unsupported_laguage = ["zh", "ja"]
         sentences = self._sentence_split(search_string, settings=settings)
         dictionary = self._get_dictionary(settings=settings)
         translated = []
@@ -185,31 +184,10 @@ def translate_search(self, search_string, settings=None):
             original_tokens, simplified_tokens = self._simplify_split_align(sentence, settings=settings)
             translated_chunk = []
             original_chunk = []
-            simplified_tokens_length = len(simplified_tokens)
-            skip_next_token = False
             for i, word in enumerate(simplified_tokens):
-
-                next_word = simplified_tokens[i + 1] if (simplified_tokens_length - 1) > i else ""
-                current_and_next_joined = self._join_chunk([word, next_word], settings=settings)
-
-                if skip_next_token:
-                    skip_next_token = False
-                    continue
-
                 if word == '' or word == ' ':
                     translated_chunk.append(word)
                     original_chunk.append(original_tokens[i])
-                elif (
-                    current_and_next_joined in dictionary
-                    and word not in dashes
-                    and self.shortname not in word_joint_unsupported_laguage
-                ):
-                    translated_chunk.append(dictionary[current_and_next_joined])
-                    original_chunk.append(
-                        self._join_chunk([original_tokens[i], original_tokens[i + 1]], settings=settings)
-                    )
-                    skip_next_token = True
-
                 elif word in dictionary and word not in dashes:
                     translated_chunk.append(dictionary[word])
                     original_chunk.append(original_tokens[i])

Review conversation on the line "if word == '' or word == ' ':": noviluni marked this conversation as resolved.
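The 22 deletions above remove the branch of translate_search that looked up the current token joined with the next one before falling back to a single-token lookup. The sketch below shows that kind of lookup in isolation, so the behavioural difference is easier to see. It is not the dateparser implementation: translate_tokens, its join_pairs flag, the space-joined key and the toy dictionary are all hypothetical names and data made up for illustration, and tokens missing from the dictionary are simply skipped.

```python
from typing import Dict, List, Tuple

# Same dash tokens as in the diff; they are never translated.
DASHES = {"-", "——", "—", "~"}


def translate_tokens(
    tokens: List[str],
    dictionary: Dict[str, str],
    join_pairs: bool,
) -> List[Tuple[str, str]]:
    """Return (original, translated) pairs for tokens found in dictionary.

    Hypothetical helper: with join_pairs=True the current and next token are
    tried as one key first (the behaviour the diff deletes); otherwise only
    single tokens are looked up.
    """
    pairs = []
    skip_next = False
    for i, word in enumerate(tokens):
        if skip_next:
            # This token was already consumed as the second half of a pair.
            skip_next = False
            continue
        next_word = tokens[i + 1] if i + 1 < len(tokens) else ""
        joined = f"{word} {next_word}" if next_word else word
        if join_pairs and next_word and joined in dictionary and word not in DASHES:
            pairs.append((joined, dictionary[joined]))
            skip_next = True
        elif word in dictionary and word not in DASHES:
            pairs.append((word, dictionary[word]))
    return pairs


# Toy data, invented for this example only.
toy_dictionary = {"dopo": "after", "domani": "tomorrow", "dopo domani": "in 2 day"}
tokens = ["dopo", "domani", "ciao"]

with_join = translate_tokens(tokens, toy_dictionary, join_pairs=True)
without_join = translate_tokens(tokens, toy_dictionary, join_pairs=False)
print(with_join)
print(without_join)
```

With the pair lookup enabled, the two tokens are consumed as a single dictionary entry; without it, each token is translated on its own, which matches the single-token path that remains after this change.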
2 changes: 1 addition & 1 deletion test.py
@@ -2,7 +2,7 @@

 # THIS IS TEMPORARY FILE FOR TESTS

-text = """of 629"""
+text = """10 Febbraio 2020 15:00 ciao moka"""

 out1 = search_dates(text)
 print(out1)