-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: PA Migration From Client #7449
Changes from all commits
7451622
97fe6f5
929a067
0477142
7e65b83
4d55a99
2e1fdbb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,8 +32,10 @@ | |
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:24.07-py3-min | ||
|
||
ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo | ||
ARG TRITON_PA_REPO_SUBDIR=perfanalyzerrepo | ||
ARG TRITON_COMMON_REPO_TAG=main | ||
ARG TRITON_CORE_REPO_TAG=main | ||
ARG TRITON_CLIENT_REPO_TAG=main | ||
ARG TRITON_THIRD_PARTY_REPO_TAG=main | ||
ARG TRITON_MODEL_ANALYZER_REPO_TAG=main | ||
ARG TRITON_ENABLE_GPU=ON | ||
|
@@ -103,8 +105,10 @@ RUN rm -f /usr/bin/python && \ | |
# Build the client library and examples | ||
ARG TRITON_REPO_ORGANIZATION | ||
ARG TRITON_CLIENT_REPO_SUBDIR | ||
ARG TRITON_PA_REPO_SUBDIR | ||
ARG TRITON_COMMON_REPO_TAG | ||
ARG TRITON_CORE_REPO_TAG | ||
ARG TRITON_CLIENT_REPO_TAG | ||
ARG TRITON_THIRD_PARTY_REPO_TAG | ||
ARG TRITON_ENABLE_GPU | ||
ARG JAVA_BINDINGS_MAVEN_VERSION | ||
|
@@ -114,26 +118,53 @@ ARG TARGETPLATFORM | |
WORKDIR /workspace | ||
COPY TRITON_VERSION . | ||
COPY ${TRITON_CLIENT_REPO_SUBDIR} client | ||
COPY ${TRITON_PA_REPO_SUBDIR} perf_analyzer | ||
|
||
WORKDIR /workspace/build | ||
WORKDIR /workspace/client_build | ||
RUN cmake -DCMAKE_INSTALL_PREFIX=/workspace/install \ | ||
-DTRITON_VERSION=`cat /workspace/TRITON_VERSION` \ | ||
-DTRITON_REPO_ORGANIZATION=${TRITON_REPO_ORGANIZATION} \ | ||
-DTRITON_COMMON_REPO_TAG=${TRITON_COMMON_REPO_TAG} \ | ||
-DTRITON_CORE_REPO_TAG=${TRITON_CORE_REPO_TAG} \ | ||
-DTRITON_THIRD_PARTY_REPO_TAG=${TRITON_THIRD_PARTY_REPO_TAG} \ | ||
-DTRITON_ENABLE_PERF_ANALYZER=OFF \ | ||
-DTRITON_ENABLE_CC_HTTP=ON -DTRITON_ENABLE_CC_GRPC=ON \ | ||
-DTRITON_ENABLE_PYTHON_HTTP=ON -DTRITON_ENABLE_PYTHON_GRPC=ON \ | ||
-DTRITON_ENABLE_PYTHON_HTTP=OFF -DTRITON_ENABLE_PYTHON_GRPC=OFF \ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you can just turn it on and build it as that is what the client stage will look like at the end, avoiding rebuild can leave to the PA stage There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can build the python client here, however, we need to rebuild it as part of the PA build in order for the pip wheel to contain the PA-associated binaries. For this reason, I figured it was best to skip the python client build at this stage. Am I misunderstanding? |
||
-DTRITON_ENABLE_JAVA_HTTP=ON \ | ||
-DTRITON_ENABLE_PERF_ANALYZER=ON \ | ||
-DTRITON_ENABLE_EXAMPLES=ON -DTRITON_ENABLE_TESTS=ON \ | ||
-DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} /workspace/client | ||
RUN make -j16 cc-clients java-clients && \ | ||
rm -fr ~/.m2 | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looks like a multi-stage build to me: 1. client libs; 2. PA / genai-perf There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I general, yes, however, it's complicated by two things:
|
||
# TODO: PA will rebuild the CC clients since it depends on it. | ||
# This should be optimized so that we do not have to build | ||
# the CC clients twice. Similarly, because the SDK expectation is | ||
# that PA is packaged with the python client, we hold off on building | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why it is packaged with Python client? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure what the original motivation was (likely convenience). However, this design of packaging PA has been baked into the python client AFAIK starting back in 2022. If we want to break this expectation, that is a separate discussion. |
||
# the python client until now. Post-migration we should focus | ||
# effort on de-tangling these flows. | ||
WORKDIR /workspace/pa_build | ||
RUN cmake -DCMAKE_INSTALL_PREFIX=/workspace/install \ | ||
-DTRITON_VERSION=`cat /workspace/TRITON_VERSION` \ | ||
-DTRITON_REPO_ORGANIZATION=${TRITON_REPO_ORGANIZATION} \ | ||
-DTRITON_COMMON_REPO_TAG=${TRITON_COMMON_REPO_TAG} \ | ||
-DTRITON_CORE_REPO_TAG=${TRITON_CORE_REPO_TAG} \ | ||
-DTRITON_CLIENT_REPO_TAG=${TRITON_CLIENT_REPO_TAG} \ | ||
-DTRITON_ENABLE_PERF_ANALYZER_C_API=ON \ | ||
-DTRITON_ENABLE_PERF_ANALYZER_TFS=ON \ | ||
-DTRITON_ENABLE_PERF_ANALYZER_TS=ON \ | ||
-DTRITON_ENABLE_PERF_ANALYZER_OPENAI=ON \ | ||
-DTRITON_ENABLE_EXAMPLES=ON -DTRITON_ENABLE_TESTS=ON \ | ||
-DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} /workspace/client | ||
RUN make -j16 cc-clients python-clients java-clients && \ | ||
rm -fr ~/.m2 | ||
-DTRITON_ENABLE_CC_HTTP=ON \ | ||
-DTRITON_ENABLE_CC_GRPC=ON \ | ||
-DTRITON_ENABLE_PYTHON_HTTP=ON \ | ||
-DTRITON_ENABLE_PYTHON_GRPC=ON \ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should add some TODOs, I don't think these flags will be removed in the final version. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you clarify what TODOs are needed here? You're correct in that these flags will not be removed in the final version. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I meant "will be".. PA doesn't have dependency on Python client library, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are correct. PA does not depend on the Python client. However, as per current user expectation, the Python client does depend on PA, since it is generally packaged into the python client pip wheel during the build process (see comment above). We may need to discuss as an org if we want to break this expectation now that PA will be its own repo. |
||
-DTRITON_PACKAGE_PERF_ANALYZER=ON \ | ||
-DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} \ | ||
/workspace/perf_analyzer | ||
RUN make -j16 perf-analyzer python-clients | ||
|
||
RUN pip3 install build \ | ||
&& cd /workspace/perf_analyzer/genai-perf \ | ||
&& python3 -m build --wheel --outdir /workspace/install/python | ||
|
||
# Install Java API Bindings | ||
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; then \ | ||
|
@@ -144,9 +175,6 @@ RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; then \ | |
--jar-install-path /workspace/install/java-api-bindings; \ | ||
fi | ||
|
||
RUN pip3 install build \ | ||
&& cd /workspace/client/src/c++/perf_analyzer/genai-perf \ | ||
&& python3 -m build --wheel --outdir /workspace/install/python | ||
############################################################################ | ||
## Create sdk container | ||
############################################################################ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can remove, this is OFF by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw it was off by default, but wanted to explicitly mark it as off for expectation clarity and JIC it changed. Ultimately, this option will be removed entirely from the client CMakeLists.txt. However, I can remove the option as well.