[Do not merge!] Build: Remove TRT model generation for V100 #7712
base: r24.10
Conversation
* name changes
* updated names
* Add documentation on generative sequence
* Address comment
* Reflect the "iterative" change
Co-authored-by: Ryan McCormick <[email protected]>
…generate_stream endpoint.
…#6620)
* Extend request objects lifetime
* Remove explicit TRITONSERVER_InferenceRequestDelete
* Format fix
* Include the inference_request_ initialization to cover RequestNew
Co-authored-by: Neelay Shah <[email protected]>
This fixes the issue where the Python client raises `AttributeError: 'NoneType' object has no attribute 'enum_types_by_name'` errors after the Python version is updated (see the diagnostic sketch after this commit list).
* Update README and versions for 2.40.0 / 23.11 (#6544)
* Removing path construction to use SymLink alternatives
* Update version for PyTorch
* Update windows Dockerfile configuration
* Update triton version to 23.11
* Update README and versions for 2.40.0 / 23.11
* Fix typo
* Adding 'ldconfig' to configure dynamic linking in container (#6602)
* Point to tekit_backend (#6616)
* Point to tekit_backend
* Update version
* Revert tekit changes (#6640)
Co-authored-by: Kris Hung <[email protected]>
* New testing to confirm large request timeout values can be passed and retrieved within Python BLS models.
* Add test for optional internal tensor within an ensemble
* Fix up
* Set CMake version to 3.27.7
* Set CMake version to 3.27.7
* Fix double slash typo
* Mlflow plugin fix
* Unify iGPU test build with x86 ARM
* adding TRITON_IGPU_BUILD to core build definition; adding logic to skip caffe2plan test if TRITON_IGPU_BUILD=1
* re-organizing some copies in Dockerfile.QA to fix igpu devel build
* Pre-commit fix
Co-authored-by: kyle <[email protected]>
* adding default value for TRITON_IGPU_BUILD=OFF
* fix newline
Co-authored-by: kyle <[email protected]>
* Add test case for decoupled model raising exception
* Remove unused import
* Address comment
Don't Build `tritonfrontend` for Windows.
Co-authored-by: GuanLuo <[email protected]>
…revent race condition (#7617)
Co-authored-by: Sai Kiran Polisetty <[email protected]>
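For context on the `enum_types_by_name` fix in the commit list above: that AttributeError usually means the generated `*_pb2.py` protobuf stubs no longer match the installed protobuf runtime. A minimal diagnostic sketch, assuming a standard Python environment (the package names `protobuf` and `grpcio` are the usual suspects, not taken from this PR):

```bash
# Print the protobuf runtime version the Python client actually imports...
python3 -c "import google.protobuf; print(google.protobuf.__version__)"
# ...then compare against the installed package versions to spot a mismatch.
pip3 show protobuf grpcio | grep -E '^(Name|Version):'
```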
```diff
@@ -512,7 +512,7 @@ chmod -R 777 $FORMATDESTDIR
 python3 $SRCDIR/gen_qa_trt_data_dependent_shape.py --models_dir=$DATADEPENDENTDIR
 chmod -R 777 $DATADEPENDENTDIR
 # Make shared library for custom Hardmax plugin.
-(git clone -b release/10.0 https://github.com/NVIDIA/TensorRT.git && \
+(git clone -b release/10.5 https://github.com/NVIDIA/TensorRT.git && \
```
Just to avoid having to manage the version manually:
```diff
-(git clone -b release/10.5 https://github.com/NVIDIA/TensorRT.git && \
+docker pull ${TENSORRT_IMAGE}
+TENSORRT_VERSION="$(docker inspect ${TENSORRT_IMAGE} --format '{{index .Config.Labels "com.nvidia.tensorrt.version"}}' | cut -d . -f -2)"
+(git clone -b release/${TENSORRT_VERSION} https://github.com/NVIDIA/TensorRT.git && \
```
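For illustration, here is that suggestion expanded end to end (a sketch, not part of the PR: the image name `nvcr.io/nvidia/tensorrt:24.10-py3` and the label value `10.5.0.26` are assumed examples; in the test script `TENSORRT_IMAGE` is set elsewhere):

```bash
# Assumed example image; the real TENSORRT_IMAGE comes from the test environment.
TENSORRT_IMAGE="nvcr.io/nvidia/tensorrt:24.10-py3"
docker pull "${TENSORRT_IMAGE}"

# Read the TensorRT version label from the image metadata, e.g. "10.5.0.26".
FULL_VERSION="$(docker inspect "${TENSORRT_IMAGE}" \
  --format '{{index .Config.Labels "com.nvidia.tensorrt.version"}}')"

# "cut -d . -f -2" keeps fields 1-2 of the dot-separated version,
# so "10.5.0.26" becomes "10.5", matching TensorRT's release branch naming.
TENSORRT_VERSION="$(echo "${FULL_VERSION}" | cut -d . -f -2)"

git clone -b "release/${TENSORRT_VERSION}" https://github.com/NVIDIA/TensorRT.git
```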
Thanks @mc-nv! Ack the suggestions. This PR is not to be merged; it is still under development and is only a workaround for generating models on V100 to unblock testing.
Force-pushed from 68cf1fe to 6835fe7
What does the PR do?
Remove TRT model generation on V100.
Checklist
* PR title follows the <commit_type>: <Title> convention.
* Commit Type: check the conventional commit type box and add the matching label to the GitHub PR.