Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: PA Migration From Client #7449

Merged
merged 7 commits into from
Jul 29, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 38 additions & 10 deletions Dockerfile.sdk
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,10 @@
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:24.07-py3-min

ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo
ARG TRITON_PA_REPO_SUBDIR=perfanalyzerrepo
ARG TRITON_COMMON_REPO_TAG=main
ARG TRITON_CORE_REPO_TAG=main
ARG TRITON_CLIENT_REPO_TAG=main
ARG TRITON_THIRD_PARTY_REPO_TAG=main
ARG TRITON_MODEL_ANALYZER_REPO_TAG=main
ARG TRITON_ENABLE_GPU=ON
Expand Down Expand Up @@ -103,8 +105,10 @@ RUN rm -f /usr/bin/python && \
# Build the client library and examples
ARG TRITON_REPO_ORGANIZATION
ARG TRITON_CLIENT_REPO_SUBDIR
ARG TRITON_PA_REPO_SUBDIR
ARG TRITON_COMMON_REPO_TAG
ARG TRITON_CORE_REPO_TAG
ARG TRITON_CLIENT_REPO_TAG
ARG TRITON_THIRD_PARTY_REPO_TAG
ARG TRITON_ENABLE_GPU
ARG JAVA_BINDINGS_MAVEN_VERSION
Expand All @@ -114,26 +118,53 @@ ARG TARGETPLATFORM
WORKDIR /workspace
COPY TRITON_VERSION .
COPY ${TRITON_CLIENT_REPO_SUBDIR} client
COPY ${TRITON_PA_REPO_SUBDIR} perf_analyzer

WORKDIR /workspace/build
WORKDIR /workspace/client_build
RUN cmake -DCMAKE_INSTALL_PREFIX=/workspace/install \
-DTRITON_VERSION=`cat /workspace/TRITON_VERSION` \
-DTRITON_REPO_ORGANIZATION=${TRITON_REPO_ORGANIZATION} \
-DTRITON_COMMON_REPO_TAG=${TRITON_COMMON_REPO_TAG} \
-DTRITON_CORE_REPO_TAG=${TRITON_CORE_REPO_TAG} \
-DTRITON_THIRD_PARTY_REPO_TAG=${TRITON_THIRD_PARTY_REPO_TAG} \
-DTRITON_ENABLE_PERF_ANALYZER=OFF \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can remove, this is OFF by default.

Copy link
Contributor Author

@fpetrini15 fpetrini15 Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw it was off by default, but wanted to explicitly mark it as off for expectation clarity and JIC it changed. Ultimately, this option will be removed entirely from the client CMakeLists.txt. However, I can remove the option as well.

-DTRITON_ENABLE_CC_HTTP=ON -DTRITON_ENABLE_CC_GRPC=ON \
-DTRITON_ENABLE_PYTHON_HTTP=ON -DTRITON_ENABLE_PYTHON_GRPC=ON \
-DTRITON_ENABLE_PYTHON_HTTP=OFF -DTRITON_ENABLE_PYTHON_GRPC=OFF \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just turn it on and build it as that is what the client stage will look like at the end, avoiding rebuild can leave to the PA stage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can build the python client here, however, we need to rebuild it as part of the PA build in order for the pip wheel to contain the PA-associated binaries. For this reason, I figured it was best to skip the python client build at this stage. Am I misunderstanding?

-DTRITON_ENABLE_JAVA_HTTP=ON \
-DTRITON_ENABLE_PERF_ANALYZER=ON \
-DTRITON_ENABLE_EXAMPLES=ON -DTRITON_ENABLE_TESTS=ON \
-DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} /workspace/client
RUN make -j16 cc-clients java-clients && \
rm -fr ~/.m2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a multi-stage build to me: 1. client libs; 2. PA / genai-perf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I general, yes, however, it's complicated by two things:

  1. The python client, as discussed above, has functionality to package PA in its pip wheel. Therefore, to guarantee its availability, the python client must be built after PA.
  2. PA does (in the current PA repo PR) and should be able to build itself (including its client dependencies) without needing the client libs prebuilt.

# TODO: PA will rebuild the CC clients since it depends on it.
# This should be optimized so that we do not have to build
# the CC clients twice. Similarly, because the SDK expectation is
# that PA is packaged with the python client, we hold off on building
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it is packaged with Python client?

Copy link
Contributor Author

@fpetrini15 fpetrini15 Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what the original motivation was (likely convenience). However, this design of packaging PA has been baked into the python client AFAIK starting back in 2022. If we want to break this expectation, that is a separate discussion.

# the python client until now. Post-migration we should focus
# effort on de-tangling these flows.
WORKDIR /workspace/pa_build
RUN cmake -DCMAKE_INSTALL_PREFIX=/workspace/install \
-DTRITON_VERSION=`cat /workspace/TRITON_VERSION` \
-DTRITON_REPO_ORGANIZATION=${TRITON_REPO_ORGANIZATION} \
-DTRITON_COMMON_REPO_TAG=${TRITON_COMMON_REPO_TAG} \
-DTRITON_CORE_REPO_TAG=${TRITON_CORE_REPO_TAG} \
-DTRITON_CLIENT_REPO_TAG=${TRITON_CLIENT_REPO_TAG} \
-DTRITON_ENABLE_PERF_ANALYZER_C_API=ON \
-DTRITON_ENABLE_PERF_ANALYZER_TFS=ON \
-DTRITON_ENABLE_PERF_ANALYZER_TS=ON \
-DTRITON_ENABLE_PERF_ANALYZER_OPENAI=ON \
-DTRITON_ENABLE_EXAMPLES=ON -DTRITON_ENABLE_TESTS=ON \
-DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} /workspace/client
RUN make -j16 cc-clients python-clients java-clients && \
rm -fr ~/.m2
-DTRITON_ENABLE_CC_HTTP=ON \
-DTRITON_ENABLE_CC_GRPC=ON \
-DTRITON_ENABLE_PYTHON_HTTP=ON \
-DTRITON_ENABLE_PYTHON_GRPC=ON \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add some TODOs, I don't think these flags will be removed in the final version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what TODOs are needed here? You're correct in that these flags will not be removed in the final version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant "will be".. PA doesn't have dependency on Python client library, right?

Copy link
Contributor Author

@fpetrini15 fpetrini15 Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. PA does not depend on the Python client. However, as per current user expectation, the Python client does depend on PA, since it is generally packaged into the python client pip wheel during the build process (see comment above).

We may need to discuss as an org if we want to break this expectation now that PA will be its own repo.

-DTRITON_PACKAGE_PERF_ANALYZER=ON \
-DTRITON_ENABLE_GPU=${TRITON_ENABLE_GPU} \
/workspace/perf_analyzer
RUN make -j16 perf-analyzer python-clients

RUN pip3 install build \
&& cd /workspace/perf_analyzer/genai-perf \
&& python3 -m build --wheel --outdir /workspace/install/python

# Install Java API Bindings
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
Expand All @@ -144,9 +175,6 @@ RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
--jar-install-path /workspace/install/java-api-bindings; \
fi

RUN pip3 install build \
&& cd /workspace/client/src/c++/perf_analyzer/genai-perf \
&& python3 -m build --wheel --outdir /workspace/install/python
############################################################################
## Create sdk container
############################################################################
Expand Down
Loading