IMPALA-10994: Normalize the pip package name part of download URL.
According to PEP 503, clients must request the normalized project URL
and must not rely on the repository server redirecting unnormalized
ones. Some package names in 'infra/python/deps/*requirements.txt' are
unnormalized, e.g. 'Cython', and pip_download.py concatenates
$PYPI_MIRROR and the package name directly to build the download URL,
which may therefore be unnormalized.

Fix this by normalizing the package name in the download URL using the
method recommended in PEP 503.

Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Reviewed-on: http://gerrit.cloudera.org:8080/17987
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
yx91490 authored and Impala Public Jenkins committed Nov 26, 2021
1 parent da53428 commit f566e7d
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion infra/python/deps/pip_download.py
@@ -82,7 +82,8 @@ def get_package_info(pkg_name, pkg_version):
   # to sort them and return the first value in alphabetical order. This ensures that the
   # same result is always returned even if the ordering changed on the server.
   candidates = []
-  url = '{0}/simple/{1}/'.format(PYPI_MIRROR, pkg_name)
+  normalized_name = re.sub(r"[-_.]+", "-", pkg_name).lower()
+  url = '{0}/simple/{1}/'.format(PYPI_MIRROR, normalized_name)
   print('Getting package info from {0}'.format(url))
   # The web page should be in PEP 503 format (https://www.python.org/dev/peps/pep-0503/).
   # We parse the page with regex instead of an html parser because that requires
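
For illustration, the normalization applied in this change can be tried on its own. The following is a minimal sketch, not part of the commit: the helper names normalize and simple_index_url and the mirror URL are hypothetical, and only the regular expression and the URL template are taken from the diff.

```python
import re


def normalize(name):
    """Normalize a project name as recommended in PEP 503.

    Runs of '-', '_' and '.' collapse to a single '-', and the result
    is lowercased. (Hypothetical helper name; the commit inlines this.)
    """
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_index_url(mirror, pkg_name):
    """Build the simple-index URL for a package (hypothetical helper).

    Uses the same '{0}/simple/{1}/' template as pip_download.py.
    """
    return "{0}/simple/{1}/".format(mirror, normalize(pkg_name))


if __name__ == "__main__":
    # Placeholder mirror, assumed for this example only.
    mirror = "https://example.invalid"
    for raw in ("Cython", "Flask_SQLAlchemy", "zope.interface"):
        print("{0} -> {1}".format(raw, simple_index_url(mirror, raw)))
```

With this sketch, 'Cython' maps to 'cython', so the requested URL ends in /simple/cython/ rather than /simple/Cython/. Names that are already normalized pass through unchanged, so applying it to every entry in the requirements files should be harmless.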
