-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parallel download support to BatchDownloader #12388
Changes from all commits
c3a5c73
ed13cea
2334399
a39cbc5
8b41c67
ec43625
3008a61
fa268ae
e754fd6
233ae10
e91234c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Add parallel download support to BatchDownloader |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,7 @@ | |
from pip._internal.cli.cmdoptions import make_target_python | ||
from pip._internal.cli.req_command import RequirementCommand, with_cleanup | ||
from pip._internal.cli.status_codes import SUCCESS | ||
from pip._internal.exceptions import CommandError | ||
from pip._internal.operations.build.build_tracker import get_build_tracker | ||
from pip._internal.req.req_install import check_legacy_setup_py_options | ||
from pip._internal.utils.misc import ensure_dir, normalize_path, write_output | ||
|
@@ -52,6 +53,7 @@ def add_options(self) -> None: | |
self.cmd_opts.add_option(cmdoptions.no_use_pep517()) | ||
self.cmd_opts.add_option(cmdoptions.check_build_deps()) | ||
self.cmd_opts.add_option(cmdoptions.ignore_requires_python()) | ||
self.cmd_opts.add_option(cmdoptions.parallel_downloads()) | ||
|
||
self.cmd_opts.add_option( | ||
"-d", | ||
|
@@ -76,6 +78,9 @@ def add_options(self) -> None: | |
|
||
@with_cleanup | ||
def run(self, options: Values, args: List[str]) -> int: | ||
if options.parallel_downloads < 1: | ||
raise CommandError("Value of '--parallel-downloads' must be greater than 0") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this be use friendly, by instead of raising CommandError, surfacing the error to the user, but defaulting to 1, even here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean raise a warning if the user sets |
||
|
||
options.ignore_installed = True | ||
# editable doesn't really make sense for `pip download`, but the bowels | ||
# of the RequirementSet code require that property. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -326,6 +326,7 @@ def __init__( | |
trusted_hosts: Sequence[str] = (), | ||
index_urls: Optional[List[str]] = None, | ||
ssl_context: Optional["SSLContext"] = None, | ||
parallel_downloads: int = 1, | ||
**kwargs: Any, | ||
) -> None: | ||
""" | ||
|
@@ -362,12 +363,22 @@ def __init__( | |
backoff_factor=0.25, | ||
) # type: ignore | ||
|
||
# Used to set numbers of parallel downloads in | ||
# pip._internal.network.BatchDownloader and to set pool_connection in | ||
# the HTTPAdapter to prevent connection pool from hitting the default(10) | ||
# limit and throwing 'Connection pool is full' warnings | ||
self.parallel_downloads = parallel_downloads | ||
pool_maxsize = max(self.parallel_downloads, 10) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this not a configurable limit? It seems reasonable to allow more than 10 threads for downloading as each package should be able to be downloading in parallel. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Edit: I see now that we actually do set it below, so I think this is just unnecessary now. Please consider removing/changing to just There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The default There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks Neil, you're right. On a second review that seems a wise decision. I actually misread it as |
||
# Our Insecure HTTPAdapter disables HTTPS validation. It does not | ||
# support caching so we'll use it for all http:// URLs. | ||
# If caching is disabled, we will also use it for | ||
# https:// hosts that we've marked as ignoring | ||
# TLS errors for (trusted-hosts). | ||
insecure_adapter = InsecureHTTPAdapter(max_retries=retries) | ||
insecure_adapter = InsecureHTTPAdapter( | ||
max_retries=retries, | ||
pool_connections=pool_maxsize, | ||
pool_maxsize=pool_maxsize, | ||
) | ||
|
||
# We want to _only_ cache responses on securely fetched origins or when | ||
# the host is specified as trusted. We do this because | ||
|
@@ -385,7 +396,12 @@ def __init__( | |
max_retries=retries, | ||
) | ||
else: | ||
secure_adapter = HTTPAdapter(max_retries=retries, ssl_context=ssl_context) | ||
secure_adapter = HTTPAdapter( | ||
max_retries=retries, | ||
ssl_context=ssl_context, | ||
pool_connections=pool_maxsize, | ||
pool_maxsize=pool_maxsize, | ||
) | ||
self._trusted_host_adapter = insecure_adapter | ||
|
||
self.mount("https://", secure_adapter) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this branch ever get used? I thought
default
would cover the case where the user does not supply a value.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the
parallel_downloads
option is only added to theinstall
anddownload
sub-commands, so in other sub-commands (index, list etc.) there is nooptions.parallel_downloads
. So in those cases I'm settingparallel_downloads
to 1.