-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement RemoteDownloader w/ --experimental_remote_downloader
#10622
Implement RemoteDownloader w/ --experimental_remote_downloader
#10622
Conversation
Thanks for the PR. I think the code should rather live in the downloader package next to the http implementation. I don't really see a good reason for it being in the remote package. WDYT? |
src/main/java/com/google/devtools/build/lib/bazel/repository/downloader/HashOutputStream.java
Outdated
Show resolved
Hide resolved
When I tried to do this I hit circular dependency problems -- quite a lot of dependency problems, actually, because of how large the I could try moving it back but it might require creating shim interfaces for most of the remote execution utilities (e.g. the tracing metadata injector). I could also try creating a |
It's hard for me to say how bad the circular dependency problem is. I'd much prefer if this functionality was implemented outside of the remote module. If that means splitting up large java_library targets into smaller ones then that's fine with me. I think we should treat the Fetcher service and remote execution/caching as separate. Both make sense to be used without the other. For the flag naming how about |
The remote downloader has a hard dependency on the remote CAS. It's not possible to use
I think that would be confusing, given that this is not a proxy. It's a service that is told to download some external resource (from |
8a3f5a8
to
33bb551
Compare
Done -- Rebased to master, un-WIP'd the remote-apis vendor commit, and relocated the Does this look better? If so, I'll start on the tests. |
--experimental_remote_downloader
--experimental_remote_downloader
Timed out waiting for response. I've added a unit test Also removing [WIP] since the TODOs are resolved. |
return out; | ||
} | ||
|
||
private static String authHeadersJson(Map<URI, Map<String, String>> authHeaders) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this interact with the authentication methods from the remote module? Why do we want this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx.download()
allows arbitrary HTTP headers to be passed to the HTTP library for authenticating with the remote endpoint. This is part of the netrc
support added in ... v2.0? v2.1? Fairly recently.
This code implements that by passing it along to the remote download server, which is free to use (or ignore) that data. The download server would use it when the place being downloaded from (e.g. example.com) needs special headers such as HTTP Basic Auth.
"ReferenceCountedChannel.java", | ||
"RemoteRetrier.java", | ||
"RemoteRetrierUtils.java", | ||
"Retrier.java", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's surprising. Why do you need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RemoteRetrier
class implements the logic of --remote_retries
, which causes Bazel to retry failed gRPC calls sent to the remote executor/storage/downloader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me be more clear: This target looks odd. Why do you need to exclude the sources? Is there a way to have more meaningful targets rather than one large one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would personally prefer smaller targets, and I also think it looks weird, but this is all pre-existing weirdness in Bazel.
//src/main/java/com/google/devtools/build/lib/remote
is a glob of every Java file in the directory.- Since there is now a dependency of
RemoteModule
->downloader
->RemoteRetrier
, we need to have at least some of the Java files inremote/
be a separate target. - Other most-but-not-all-srcs style targets handle this case with globs that exclude some source files.
If you feel like adjusting the remote/
directory to have finer-grained targets that sounds like a good idea to me. I would prefer to avoid doing any major surgery to the Bazel build graph as part of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is some unfortunate tech debt I'm willing to punt on temporarily. If you have any ideas on how to improve the situation/would like to clean up this package (ie, enumerating all files in the srcs list) it'd be much appreciated.
b9d6fd7
to
d81b062
Compare
(rebased for #10679) |
@buchgr, do you think this PR can be merged in time for Bazel 2.2? (@jmillikin-stripe asked for it in #10133 (comment)) |
gentle ping @buchgr Please let me know if this PR is too large to easily review, I can split some parts (like the |
Sorry I haven't forgotten about it. I am currently overloaded with an internal project. We'll ship it in 2.2. |
I'm going to try splitting this up so it's easier to land incrementally, with the hopes of getting it into v2.3. First subset is vendoring the updated remote-apis protos: #10818 |
src/main/java/com/google/devtools/build/lib/bazel/repository/downloader/BUILD
Outdated
Show resolved
Hide resolved
import java.util.Map; | ||
import java.util.Optional; | ||
|
||
public class DownloaderDelegate implements Downloader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class needs documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still open...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 - this makes it clear what's so special about this implementation. IIUC it's a wrapper around downloader that lets the implementation be swapped out?
...ain/java/com/google/devtools/build/lib/bazel/repository/downloader/GrpcRemoteDownloader.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
@Override | ||
public void download( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's not enough documentation here. Minimum:
- thread-safety (although that could also be covered in the interface)
- deduplication
- exceptions
return null; | ||
})); | ||
} catch (StatusRuntimeException e) { | ||
throw new IOException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really really dislike the decision by upstream grpc-java to use runtime exceptions. It makes it much harder to make the error handling correct.
However, we also need a clear policy for Bazel on how to handle these exceptions. In many cases, StatusRuntimeException is already wrapping an IOException, so this ends up with an IOException -> StatusRuntimeException -> IOException chain, which makes it hard to debug. My preference would be to rethrow the inner IOException if there is one, and only wrap if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this exception handling from GrpcCacheClient
, and I don't know enough about grpc-java to understand the exact issue you describe. Would it be possible for a Bazel dev to make the existing code act as you describe? Then I could copy it, and/or use a shared helper that does the right thing w.r.t. error handling.
|
||
public final class RemoteDownloaderFactory implements DownloaderFactory { | ||
|
||
private volatile Downloader remoteDownloader; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer synchronized over volatile. You almost never need it from a performance point of view, and it's much harder to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is copied from the existing delegate/factory stacks, such as RepositoryRemoteExecutorFactoryDelegate
. To be honest I have no idea why this level of indirection is necessary, and would be happy to have some guidance (or, like, direct commits to this PR) from someone who writes Java regularly.
I tried using synchronized
instead, but the build fails:
src/main/java/com/google/devtools/build/lib/remote/RemoteDownloaderFactory.java:24: error: modifier synchronized not allowed here
private synchronized Downloader remoteDownloader;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. The synchronized goes on all methods that access the field, like so:
private Downloader remoteDownloader;
public synchronized void init(Downloader remoteDownloader) {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, you can do something like...
@GuardedBy("this")
private Downloader remoteDownloader;
Then add "synchronized" to all methods using it. This'll also let you get rid of the whole "remoteDownloader0" thing.
The volatile pattern in other files was likely an oversight.
src/main/java/com/google/devtools/build/lib/remote/RemoteDownloaderFactory.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/remote/RemoteModule.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/remote/RemoteRetrierUtils.java
Outdated
Show resolved
Hide resolved
@@ -1589,7 +1597,8 @@ public BlazeRuntime build() throws AbruptExitException { | |||
productName, | |||
serverBuilder.getBuildEventArtifactUploaderMap(), | |||
serverBuilder.getAuthHeadersProvidersMap(), | |||
serverBuilder.getRepositoryRemoteExecutorFactory()); | |||
serverBuilder.getRepositoryRemoteExecutorFactory(), | |||
serverBuilder.getDownloaderFactory()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks wrong. This means it's not possible to have different Downloader factories for different builds in the same client. Shouldn't this be part of the CommandEnvironment instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the question, but handling of the remote downloader should match that of the other remote capabilities (caching and execution). The calls to getRepositoryRemoteExecutorFactory()
and getDownloaderFactory()
belong together.
If they belong together somewhere else, could we do that separately? I'm very out of my depth when it comes to this level of the Bazel client architecture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what ulf means is, assuming you're familiar with the client-server setup, the DownloaderFactory instance (and thus implementation) is tied to the lifetime of the server. It cannot be changed across different commands to the same server. I think this is fine since it's no worse than the current setup for other remote stuff, and there's only one impl.
0878448
to
0caf2c4
Compare
@ulfjack Thank you so much for the review! I've addressed most comments, but there are a few places where I either didn't understand the request (I don't write much Java), or where the issue was not introduced by my change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I don't work at Google anymore, and I am spending most of my time starting my own company... providing a remote execution service and will also implement the remote asset API. However, it means that I can't currently promise any SLAs on reviews.
I think this is fine to merge. There are a few minor nits left, but they could also be addressed after the initial commit.
import java.util.Map; | ||
import java.util.Optional; | ||
|
||
public class DownloaderDelegate implements Downloader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still open...
|
||
public final class RemoteDownloaderFactory implements DownloaderFactory { | ||
|
||
private volatile Downloader remoteDownloader; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. The synchronized goes on all methods that access the field, like so:
private Downloader remoteDownloader;
public synchronized void init(Downloader remoteDownloader) {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think conceptually everything seems fine - I trust that you understand the end-to-end goal here better than I and that you've been testing it locally. Most comments are around style and organization.
@@ -43,6 +43,7 @@ void download( | |||
List<URL> urls, | |||
Map<URI, Map<String, String>> authHeaders, | |||
Optional<Checksum> checksum, | |||
String canonicalId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you document what this is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose and semantics of canonical_id
is documented at https://docs.bazel.build/versions/master/skylark/lib/repository_ctx.html#download
Note that this field was originally present in the PR that added the Downloader
interface. It got removed as part of the import into Google, and I just need back, because we have propagate this data to the backend somehow.
import java.util.Map; | ||
import java.util.Optional; | ||
|
||
public class DownloaderDelegate implements Downloader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 - this makes it clear what's so special about this implementation. IIUC it's a wrapper around downloader that lets the implementation be swapped out?
Map<String, String> clientEnv) | ||
throws IOException, InterruptedException { | ||
Downloader delegate = defaultDelegate; | ||
Optional<DownloaderFactory> factory = delegateFactory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm assuming this is in case delegateFactory changes mid-method? Is that actually expected? If so seems dubious... If not then just inline the call, which should simplify most of this method to...
Downloader delegate = delegateFactory.isPresent()
? delegateFactory.get().create().orElse(defaultDelegate)
: defaultDelegate
"ReferenceCountedChannel.java", | ||
"RemoteRetrier.java", | ||
"RemoteRetrierUtils.java", | ||
"Retrier.java", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is some unfortunate tech debt I'm willing to punt on temporarily. If you have any ideas on how to improve the situation/would like to clean up this package (ie, enumerating all files in the srcs list) it'd be much appreciated.
import com.google.devtools.build.lib.bazel.repository.downloader.DownloaderFactory; | ||
import java.util.Optional; | ||
|
||
public final class RemoteDownloaderFactory implements DownloaderFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add some quick javadoc explaining what's so interesting about this impl.
actual = hasher.hash(); | ||
} | ||
if (!code.equals(actual)) { | ||
throw new UnrecoverableHttpException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This side-effect is unfortunately hidden, but I guess there's prior art for this.
|
||
@BeforeClass | ||
public static void beforeEverything() { | ||
retryService = MoreExecutors.listeningDecorator(Executors.newScheduledThreadPool(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not put this in the before and after? Otherwise we can leak threads across tests within this test class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copy-pasted this from the existing downloader test case, and don't have enough context in Bazel to know whether there's a pre-existing bug here.
private static byte[] downloadBlob( | ||
GrpcRemoteDownloader downloader, URL url, Optional<Checksum> checksum) | ||
throws IOException, InterruptedException { | ||
final List<URL> urls = ImmutableList.<URL>of(url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modern java should be smart enough that you can leave the "" out of "ImmutableList.of(...)", so rm it. Same for other occurrences where the compiler is fine with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in forked PR #10914
} | ||
|
||
private GrpcRemoteDownloader newDownloader(RemoteCacheClient cacheClient) throws IOException { | ||
final RemoteOptions remoteOptions = Options.getDefaults(RemoteOptions.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, final shouldn't be necessary on most local variables, but it's not necessary to remove
|
||
getFromFuture(cacheClient.uploadBlob(contentDigest, ByteString.copyFromUtf8("wrong content"))); | ||
|
||
IOException e = assertThrows(UnrecoverableHttpException.class, () -> downloadBlob( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix formatting to improve readability:
IOException e = assertThrows(
UnrecoverableHttpException.class,
() -> downloadBlob(
downloader,
...,
...));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in forked PR #10914
cf9a90e
to
b3d1ddb
Compare
Left comments on #10914. What's remaining in this PR after that? Can you give a high-level description? |
After that PR is merged, this one would be just the plumbing of the new downloader into |
Extracted from #10622 Per discussion on that PR, there's still some unanswered questions about how exactly we plumb the new `Downloader` type into `RemoteModule`. And per #10742 (comment), it is unlikely that even heroic effort from me will get the full end-to-end functionality into v3.0. Given this, to simplify the review, I'm taking some of the bits the reviewer is happy with and moving them to a separate PR. After merger, `GrpcRemoteDownloader` and its tests will exist in the source tree, but will not yet be available as CLI options. R: @michajlo CC: @adunham-stripe @dslomov @EricBurnett @philwo @sstriker Closes #10914. PiperOrigin-RevId: 299908615
#10914 is now merged, do you want to rebase this and I will review the remainder? |
b3d1ddb
to
5f8553e
Compare
Rebased to master. Per earlier comments, I've rewritten this final commit. Instead of using the existing design of I don't know if the new version is as idiomatic for Java, but I find it easier to follow. Please let me know if you'd prefer the previous way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine overall, just one question.
repoContext, | ||
cacheClient, | ||
remoteOptions)); | ||
downloaderChannel.release(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the immediate release after the retain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to match the existing code. Ctrl+F new ByteStreamUploader(
or new GrpcRemoteExecutor(
for where I'm looking as examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. I'll import this today and try to cut a new RC of 3.0.0.
Extracted from #10622 Per discussion on that PR, there's still some unanswered questions about how exactly we plumb the new `Downloader` type into `RemoteModule`. And per #10742 (comment), it is unlikely that even heroic effort from me will get the full end-to-end functionality into v3.0. Given this, to simplify the review, I'm taking some of the bits the reviewer is happy with and moving them to a separate PR. After merger, `GrpcRemoteDownloader` and its tests will exist in the source tree, but will not yet be available as CLI options. R: @michajlo CC: @adunham-stripe @dslomov @EricBurnett @philwo @sstriker Closes #10914. PiperOrigin-RevId: 299908615
This is the Bazel client implementation of bazelbuild/proposals#160. It allows downloading of external dependencies to be delegated to a remote service. TODOs: - [x] Once bazelbuild/remote-apis#112 is merged, the vendored copy of `bazelbuild/remote-apis` should be updated. I've used a [WIP] placeholder for now. - [x] If the general approach looks reasonable then I'll add tests. Currently I've been testing with an in-house implementation of the downloader server. R: @buchgr @dslomov CC: @EricBurnett @sstriker @ulfjack Closes #10622. PiperOrigin-RevId: 300116716
@katre Thank you so much for your help! Both for the review, and for handling the cherry-pick. I no longer have Peer Bonus access, but if you're ever in Tokyo I'll buy you lunch or something :P |
@@ -160,6 +172,13 @@ public void beforeCommand(CommandEnvironment env) throws AbruptExitException { | |||
boolean enableHttpCache = RemoteCacheClientFactory.isHttpCache(remoteOptions); | |||
boolean enableGrpcCache = GrpcCacheClient.isRemoteCacheOptions(remoteOptions); | |||
boolean enableRemoteExecution = shouldEnableRemoteExecution(remoteOptions); | |||
boolean enableRemoteDownloader = shouldEnableRemoteDownloader(remoteOptions); | |||
|
|||
if (enableRemoteDownloader && !enableGrpcCache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to require --remote_cache
be set alongside --remote_executor
are there any side effects which could arise from changing this to the below snippet?
if (enableRemoteDownloader && (!enableGrpcCache || !enableRemoteExecution)) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remote execution is not required to use remote downloads. Remote caching (specifically gRPC remote caching) is required, because that's where the downloads get placed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However --remote_cache
is not usually required for the use of --remote_executor
. I assume bazel has some defaulting behaviour for the cache endpoint to be the executor endpoint. This enforces a flag requirement for the combination of these features.
Remote execution with the use of the remote downloader requires all 3 flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #11913 to track this issue.
@jmillikin Is there any way to invalidate the Remote Downloader cache? I've tried using |
I think that would be a question for the author of your remote downloader implementation. The Bazel client only uses one RPC, which is approximately "download file and return the CAS checksum + size". Other RPCs for cache management and invalidation are implementation-specific. |
@jmillikin Ok, but I wonder if it might be worth a feature request to make the Remote Assets API sensitive to the |
This is the Bazel client implementation of bazelbuild/proposals#160. It allows downloading of external dependencies to be delegated to a remote service.
TODOs:
bazelbuild/remote-apis
should be updated. I've used a [WIP] placeholder for now.R: @buchgr @dslomov
CC: @EricBurnett @sstriker @ulfjack