[Do not merge!] Build: Remove TRT model generation for V100 #7712

Draft
wants to merge 3,421 commits into
base: r24.10
ea1d87d
Fix L0_sagemaker (#6587)
rmccorm4 Nov 16, 2023
0b1694e
Add C++ server wrapper to the doc (#6592)
krishung5 Nov 16, 2023
624f24b
Add timeout to client apis and tests (#6546)
jbkyang-nvi Nov 16, 2023
68d5bb4
Change name generative -> iterative (#6601)
nnshah1 Nov 19, 2023
926330e
Add documentation on generative sequence (#6595)
GuanLuo Nov 20, 2023
65bc558
Updated description of iterative sequences
nnshah1 Nov 20, 2023
96933b1
Restricted HTTP API documentation
nnshah1 Nov 20, 2023
fe10097
Add request cancellation and debugging guide to generated docs (#6617)
rmccorm4 Nov 21, 2023
7c93b8e
Support for http request cancellation. Includes fix for seg fault in …
nnshah1 Nov 21, 2023
2bb61e9
Bumped vLLM version to v0.2.2 (#6623)
oandreeva-nv Nov 21, 2023
da3ecc1
Upgrade ORT version (#6618)
tanmayv25 Nov 21, 2023
17d2de7
Use compliant preprocessor (#6626)
tanmayv25 Nov 22, 2023
e0d00e6
Update README.md (#6627)
dzier Nov 22, 2023
0556b7c
Extend request objects lifetime and fixes possible segmentation fault…
tanmayv25 Nov 22, 2023
b2dd98b
Update protobuf after python update for testing (#6638)
jbkyang-nvi Nov 29, 2023
7c0ed09
Update post-23.11 release (#6653)
mc-nv Dec 4, 2023
0e721e4
PYBE Timeout Tests (#6483)
fpetrini15 Dec 4, 2023
9e7f237
Add note on lack of ensemble support (#6648)
rmccorm4 Dec 5, 2023
df9d6e8
Added request id to span attributes (#6667)
oandreeva-nv Dec 6, 2023
88b8843
Add test for optional internal tensor within an ensemble (#6663)
GuanLuo Dec 8, 2023
afaf6b0
Set CMake version to 3.27.7 (#6675)
mc-nv Dec 10, 2023
c329f48
restore typo (#6680)
mc-nv Dec 11, 2023
d496f95
Update 'main' to track development of 2.42.0 / 24.01 (#6673)
mc-nv Dec 12, 2023
d191ef9
iGPU build refactor (#6684) (#6691)
kthui Dec 12, 2023
0b63e0d
Mlflow Plugin Fix (#6685)
fpetrini15 Dec 13, 2023
ebd6a3e
Fix extra content-type headers in HTTP server (#6678)
rmccorm4 Dec 13, 2023
6ca4764
Fix iGPU CMakeFile tags (#6695)
kthui Dec 13, 2023
57b49b4
adding default value for TRITON_IGPU_BUILD=OFF (#6705)
nv-kmcgill53 Dec 14, 2023
b0bbabb
Add test case for decoupled model raising exception (#6686)
krishung5 Dec 15, 2023
8f5f515
Escape special characters in general docs (#6697)
nealvaidya Dec 15, 2023
5b46d0e
vLLM Benchmarking Test (#6631)
pskiran1 Dec 15, 2023
e2e3837
Allow configuring GRPC max connection age and max connection age grac…
siweili11 Dec 15, 2023
a4b8162
Modify HTTP frontend to return error code reflecting Triton error. Ad…
GuanLuo Dec 19, 2023
06df7a8
Update `main` branch post-23.12 release (#6726)
mc-nv Dec 26, 2023
45278d7
trtllm CI fixes (#6687)
mc-nv Dec 28, 2023
cb0c2e5
Remove double unit test (#6714)
kthui Jan 2, 2024
14f70b6
Update RE2 package location (#6750)
Tabrizian Jan 3, 2024
6bc5625
Add testing for zero tensors in PyTorch backend (#6760)
Tabrizian Jan 3, 2024
fb3747a
Fix L0_lifecycle on insufficient hardware concurrency (#6762)
kthui Jan 4, 2024
1ea633a
Simplify cmake install command (#6725)
Tabrizian Jan 5, 2024
b48aa57
Add cmdline option to set model load retry. Add test (#6764)
GuanLuo Jan 5, 2024
8af13e9
Increase timeout (#6774)
fpetrini15 Jan 8, 2024
b5f1f7d
Move from jfrog artifactory to archives.boost.io to fix boost downloa…
rmccorm4 Jan 8, 2024
310c38c
Add Triton Inference Server In-Process Python API Tests
nnshah1 Jan 11, 2024
fa72cf2
Bring back Python backend based PyTorch backend (#6518)
kthui Jan 11, 2024
2782d30
Bump min cxx standard to 17 (#6742)
oandreeva-nv Jan 11, 2024
cf5fe35
Add tests/docs for Pinned Memory metrics (#6754)
pskiran1 Jan 12, 2024
c205451
Update 'main' to track development of 2.42.0 / 24.02 (#6786)
mc-nv Jan 12, 2024
4a719e4
Support for Context Propagation for OTel trace mode (#6785)
oandreeva-nv Jan 17, 2024
87165b2
Use current time when overwriting model configuration. (#6727)
whoisj Jan 17, 2024
7b06a37
Added docs for otel context propagation (#6804)
oandreeva-nv Jan 18, 2024
b6e017e
Fix typos in trace.md (#6808)
rmccorm4 Jan 18, 2024
3e79b2a
Fix test_model_config_overwite in L0_lifecycle (#6818)
GuanLuo Jan 19, 2024
7edeb9f
Improve L0_backend_python on shm reliability (#6803)
kthui Jan 19, 2024
3bff367
Remove boost::filesystem (#6810)
rmccorm4 Jan 22, 2024
bc71da0
Generate unittest xml reports from L0_python_api (#6822)
rmccorm4 Jan 23, 2024
6192c6e
Add unit test reports to L0_json, L0_metrics, L0_response_cache, L0_b…
rmccorm4 Jan 25, 2024
a514a05
Update trace summary script (#6758)
pskiran1 Jan 25, 2024
28f497c
Add gsutil upload retry helper function (#6817)
kthui Jan 25, 2024
ddfdb2a
Add test for shutdown while unloading in background (#6835)
kthui Jan 27, 2024
56e4232
Handle 0 dimension output for generate endpoint (#6833)
krishung5 Jan 29, 2024
d98a59c
tensorrt-llm benchmarking test (#6771)
pskiran1 Jan 29, 2024
2309bce
Update README.md and versions post-24.01 (#6847)
mc-nv Jan 30, 2024
d0e2653
Use libmamba solver for L0_backend_python env test. Fix pytest not fo…
krishung5 Jan 30, 2024
f92732d
Add test for shutdown while loading model (#6837)
kthui Jan 31, 2024
776e641
Adding OpenTelemetry Batch Span Processor (#6842)
oandreeva-nv Feb 1, 2024
b0a495a
Support Double-Type Inference Request/Response Parameters (#6755)
fpetrini15 Feb 1, 2024
508929a
Updating vllm version to 0.3.0 (#6858)
oandreeva-nv Feb 7, 2024
738c98f
Python Backend Windows Support (#6830)
fpetrini15 Feb 8, 2024
3d79568
Add support for Oracle Cloud in deploy (#6850)
bruno-garbaccio Feb 9, 2024
1df73dc
Add link to TRTLLM metrics docs (#6874)
rmccorm4 Feb 13, 2024
4294cc6
Add unit test reports to L0_dlpack_multi_gpu and L0_warmup (#6873)
krishung5 Feb 14, 2024
f078bfb
Set OV version to 2023.3.0 (#6880)
kthui Feb 14, 2024
80fc56c
Fixing StringTo uint32_t used only by tracing (#6883)
oandreeva-nv Feb 14, 2024
8a2a229
Update 'main' to track development of 2.44.0 / 24.03 (#6892)
mc-nv Feb 16, 2024
59e267f
Add response statistics (#6869)
kthui Feb 17, 2024
21a7fc5
Fix busyop test for L0_memory_growth (#6900)
krishung5 Feb 22, 2024
60872b9
Add cancellation into response statistics (#6904)
kthui Feb 23, 2024
8d8b607
Install required pip pkgs (#6906)
krishung5 Feb 24, 2024
adafa4f
Match forward headers case insensitively. (#6889)
yinggeh Feb 27, 2024
551978b
Add note on --cache-config spacing and fix typos (#6929)
rmccorm4 Mar 1, 2024
246f46c
Remove ignore files that are not in use by repository (#6893)
mc-nv Mar 2, 2024
1dcf2cf
Update README and versions for 2.43.0 / 24.02 (#6886)
mc-nv Feb 15, 2024
9be77f1
Set ONNX Runtime version 1.17.2
mc-nv Mar 1, 2024
19b02a2
Expose tritonserver args in values.yaml (#5582)
okyspace Mar 4, 2024
d0f332b
Parameterize git repository (#6934)
nv-kmcgill53 Mar 6, 2024
c2299d5
Enhance bound check for shm offset (#6914)
kthui Mar 8, 2024
110251b
Allow non-decoupled model to send response and FINAL flag separately …
GuanLuo Mar 8, 2024
25266a5
Add test for max queue delay timeout prompt response (#6938)
kthui Mar 8, 2024
b012bd0
Test improved input validation errors (#6933)
indrajit96 Mar 9, 2024
52a1cd2
Update Dockerfile.sdk with OpenAI support (#6941)
tgerdesnv Mar 11, 2024
b2e6e7e
Test Correlation Id string support for BLS (#6963)
pskiran1 Mar 11, 2024
9786e40
Update 'main' to track development of 2.45.0 / 24.04 (#6974)
mc-nv Mar 11, 2024
e92abf2
Add AsyncIO HTTP compression test (#6975)
kthui Mar 13, 2024
8139431
Install `genai-pa` into SDK container (#6942)
mc-nv Mar 13, 2024
5c6e487
extend existing tests with more parameters (#6951)
yf711 Mar 15, 2024
9f16eef
Exposing trace context to python backend (#6985)
oandreeva-nv Mar 15, 2024
8b36aa8
Add documentation for mapping between Triton Errors and HTTP status c…
Tabrizian Mar 19, 2024
afaa6f4
Remove hatch version (#7009)
tgerdesnv Mar 21, 2024
fdbfb27
Update vLLM to 0.3.2 for gemma support (#6918)
kebe7jun Mar 21, 2024
2be127b
Add missing copyright for L0_trace (#6996)
oandreeva-nv Mar 25, 2024
df753d7
fix sphinx warnings (#7030)
yinggeh Mar 25, 2024
a844eda
Add meetup invite banner (#7049)
rmccorm4 Mar 27, 2024
8a208d7
Update 'main' post-24.03 (#7051)
mc-nv Apr 1, 2024
1dfa33d
Fix incorrect version updates (#7073)
Tabrizian Apr 4, 2024
879a505
Update compose.py and remove mention of tensorflow1 in documentation …
jbkyang-nvi Apr 4, 2024
e9e3648
Add testing for iterative scheduler backlogged requests (#7059)
Tabrizian Apr 5, 2024
dbeb198
Remove conda package manager (#7069)
mc-nv Apr 5, 2024
e1d58c7
fix link (#7044)
yinggeh Apr 5, 2024
74660f1
Add Documentation from Additional Repositories to nvidia.docs.com (#7…
yinggeh Apr 5, 2024
2150fc2
Fix html image rendering in sphinx documentation (#7084)
tanmayv25 Apr 8, 2024
cbd6967
Remove obsolete mention of image tags (#7085)
tanmayv25 Apr 9, 2024
aff4b93
HTTP live connections on server shutdown (#6986)
kthui Apr 9, 2024
10f1c8d
Enable autodocs for python client library API documentation (#7082)
tanmayv25 Apr 9, 2024
5e20ef6
Updated vllm version (#7095)
oandreeva-nv Apr 10, 2024
52f97b5
Disable Dynamic Log File (#7092)
yinggeh Apr 11, 2024
159b060
Validate system shared memory region size when registering a region (…
rmccorm4 Apr 11, 2024
196caf0
Decoupled Async Execute (#7062)
kthui Apr 11, 2024
5b739db
Add trace mode and trace config entries in trace settings API (#7050)
indrajit96 Apr 11, 2024
0a4c87b
Update 'main' to track development of 2.46.0 / 24.05 (#7105)
mc-nv Apr 11, 2024
3b6c6f9
Validate the memory requested for the infer request is not out of bou…
jbkyang-nvi Apr 12, 2024
b889687
Add copyright for tritonclient_api (#7109)
Tabrizian Apr 12, 2024
7529f0e
Disable dynamic trace file (#7106)
yinggeh Apr 13, 2024
e116a2a
Update L0_logging to reflect error when trying to update log_file (#7…
yinggeh Apr 13, 2024
8e88f2c
Add new cached channel test (#7123)
jbkyang-nvi Apr 17, 2024
e965287
Fix gRPC frontend race condition (#7110)
kthui Apr 17, 2024
233c4b2
Remove client testing of server trace to match discontinued support f…
matthewkotila Apr 17, 2024
2de09ee
Re-enable PA trace testing but remove setting trace file (#7131)
matthewkotila Apr 19, 2024
dba31c2
Fix windows build for shared memory bound checking(#7137)
jbkyang-nvi Apr 19, 2024
09b34be
Fix test for cached channels (#7130)
jbkyang-nvi Apr 19, 2024
1da454c
Use a lower concurrency with more repetition for L0_memory_growth (#7…
krishung5 Apr 23, 2024
f243276
Replace deprecated tritongrpcclient package (#7061)
Tabrizian Apr 24, 2024
365b86a
Avoid the HTTP Error 403: rate limit exceeded error (#7155)
krishung5 Apr 25, 2024
987deaa
Clarify instance group documentation for ensemble (#7162)
Tabrizian Apr 25, 2024
d432266
Add extra footer to documentation (#7163)
mc-nv Apr 26, 2024
5239ff0
Add metrics model namespacing label test (#7141)
kthui Apr 26, 2024
16e5470
Update `main` post-24.04 (#7160)
mc-nv Apr 30, 2024
3c99c95
Remove meetup note now that the event has completed (#7179)
Tabrizian May 3, 2024
a9d3dac
Validate CUDA SHM region registration size (#7178)
krishung5 May 7, 2024
ee6d238
Fix python client Shm Leak (#7172)
fpetrini15 May 7, 2024
c724193
Add test for sequence state after cancellation (#7167)
kthui May 7, 2024
27c2142
Rename triton_tensorrtllm_worker -> trtllmExecutorWorker (#7194)
krishung5 May 8, 2024
884ca4e
Tests for Top Level Request Caching for Ensemble Models (#7074)
lkomali May 9, 2024
6694b74
Test cuda shared memory offset and byte size out of bounds(#7202)
jbkyang-nvi May 10, 2024
dd71d3b
Upgrade the golang version to 1.22.3 (#7208)
tanmayv25 May 13, 2024
a669145
Update 'Dockerfile' Python path to include DALI (#7216)
mc-nv May 14, 2024
4dcda7f
Remove the dependency on CUDA driver (#7224)
krishung5 May 15, 2024
d6fe6e6
Multiple Model Configurations (#7185)
yinggeh May 16, 2024
d356d6e
Fix L0_backend_python iGPU PyTorch installation (#7231)
kthui May 16, 2024
747f5d4
Fix the L0_simple_go_client (#7239)
tanmayv25 May 17, 2024
0370485
Add section on ensemble model caching (#7234)
rmccorm4 May 18, 2024
3e97828
Add testing for escaped log messages
nnshah1 May 20, 2024
9faf444
updating log parsing in test
nnshah1 May 21, 2024
620f095
Add documentation on logging formats
nnshah1 May 21, 2024
0c4228c
Return an error if --load-model is specified without explicit model c…
rmccorm4 May 22, 2024
1322225
Exclude Jax example from Python 3.8 (#7260)
krishung5 May 23, 2024
2d2c0b5
add test for shape validation (#7195)
jbkyang-nvi May 24, 2024
9cfc53a
Enhance OTEL testing to capture and verify Cancellation Requests and …
indrajit96 May 24, 2024
60a06bf
Fix Python 3.11 env (#7274)
krishung5 May 28, 2024
729b677
Bump vllm to v0.4.2 (#7198)
kebe7jun May 29, 2024
ea095c9
Update main to track development for 2.47.0 / r24.06 (#7291)
tanmayv25 May 29, 2024
20f3487
Update 'main' post 24.05 release (#7298)
tanmayv25 May 29, 2024
c907231
Update openvino to 2024.0.0 (#7299)
krishung5 May 30, 2024
c3eb5ca
docs: Update PR templates (#7290)
jbkyang-nvi May 30, 2024
13f819b
docs: Add default template that diverts to sub templates (#7306)
jbkyang-nvi May 30, 2024
d189a87
Added new flag for GPU peer access API control (#7261)
indrajit96 Jun 3, 2024
4d113dc
build: Update vllm version to v0.4.3 (latest) (#7309)
oandreeva-nv Jun 3, 2024
6a303f8
fix: Fix L0_input_validation--base (#7304)
yinggeh Jun 4, 2024
34390d7
fix: Remove onnxruntime libraries from system path (#7323)
tanmayv25 Jun 5, 2024
b0ea306
Change TensorRT-LLM (#7143)
mc-nv Jun 5, 2024
b6734dd
Add testing for libtorch cudnn (#7286)
Tabrizian Jun 5, 2024
31f00b6
Fix gRPC streaming non-decoupled segfault if sending response and fin…
kthui Jun 6, 2024
497475e
Add support for response sender in the default mode (#7311)
kthui Jun 6, 2024
8ce3890
fix: Handling grpc cancellation edge-case:: Cancelling at step START …
oandreeva-nv Jun 6, 2024
8745160
test: Add testing for CUDA EP options (#7328)
krishung5 Jun 6, 2024
03ca720
ci: Support BF16 data type in TensorRT backend (#7310)
pskiran1 Jun 7, 2024
c0e4c81
test: Update error messages to comply with core change (#7326)
yinggeh Jun 7, 2024
7236796
ci: Restrict numpy to version 1.x (#7327)
KrishnanPrash Jun 7, 2024
3135eb5
test: Fix the test to expect updated error messages (#7340)
tanmayv25 Jun 12, 2024
fe63eba
test: Python models filtering outputs based on requested outputs (#7338)
kthui Jun 12, 2024
5f8497f
test: Add test for sequence flags in ensemble streaming inference (#7…
indrajit96 Jun 12, 2024
fd1d9c4
fix: Fix version for setuptools and grpcio-tools. Remove cudnn 8 inst…
krishung5 Jun 18, 2024
f326993
ci: Add INT64 Datatype Support for Shape Tensors in TensorRT Backend …
pskiran1 Jun 20, 2024
9e55dab
Update 15-container-copyright.txt (#7375)
Tabrizian Jun 26, 2024
0f4c9d3
Update `main` post -24.06 (#7380)
mc-nv Jun 28, 2024
686cf1a
test: Add input byte size tests using C APIs (#7372)
yinggeh Jul 3, 2024
33d7e7e
[refactor]: Refactor Frontend Trace OpenTelemetry Implementation (#7390)
oandreeva-nv Jul 5, 2024
65a9140
[fix]: grpc state cleanup fix (#7409)
oandreeva-nv Jul 5, 2024
4415430
[build]: vllm version update (#7405)
oandreeva-nv Jul 5, 2024
8c5b94c
[feat]:Custom Backend Tracing (#7403)
oandreeva-nv Jul 5, 2024
66e4fff
build: Reduce intermediate layers (#7408)
krishung5 Jul 8, 2024
e9b811c
test: Remove AWS bucket on test failure (#7342)
kthui Jul 8, 2024
dabb7cb
fix: Fix error message for L0_trt_compat (#7432)
krishung5 Jul 10, 2024
2f299d1
feat: Support for request id field in generate API (#7392)
shreyas-samsung Jul 10, 2024
22d9261
perf: Improve response throughput of a single gRPC stream (#7404)
kthui Jul 12, 2024
b263bfc
test: Tests for Metrics API enhancement to include error counters (#7…
indrajit96 Jul 12, 2024
3421429
Update NGC versions post-24.07 release (#7469)
pvijayakrish Jul 25, 2024
96ef8a7
[build]: Bumping vllm version to v0.5.3.post1 (#7453)
oandreeva-nv Jul 25, 2024
f151f8a
ci: Fix shape and reformat free tensor handling in the input byte siz…
pskiran1 Jul 27, 2024
b8a3629
chore: PA Migration From Client (#7449)
fpetrini15 Jul 29, 2024
5e61a01
test: Refactor cpu metrics tests to make L0_metrics more stable (#7476)
rmccorm4 Jul 29, 2024
e713208
test: Add BF16 test for python backend (#7483)
rmccorm4 Jul 30, 2024
3443dd6
test: Improve L0_logging stability (#7486)
rmccorm4 Jul 31, 2024
839faf7
ci: Return custom exit code to indicate known shm leak failure in L0_…
krishung5 Jul 31, 2024
d4b585d
Including 'tritonserver.lib' into final package (#7491)
mc-nv Aug 2, 2024
327ee02
build: Add default value for argument 'TRITON_REPO_ORGANIZATION' from…
zhanga5 Aug 5, 2024
5b33a25
chore:Purge PA from Client Repo (#7488)
fpetrini15 Aug 6, 2024
04e0d85
PA Migration: Update L0_client_build_variants (#7505)
fpetrini15 Aug 7, 2024
3c7263f
test: Add test for sending response after sending complete final flag…
kthui Aug 7, 2024
ea3ebca
Add vLLM x Triton user meetup announcement (#7509)
harryskim Aug 8, 2024
a5ad309
Fix benchmarking tests (#7461)
pskiran1 Aug 10, 2024
61466d4
feat: Add vLLM counter metrics access through Triton (#7493)
yinggeh Aug 16, 2024
cadd112
build: RHEL 8 Compatibility (#7519)
nv-kmcgill53 Aug 16, 2024
5611ca1
feat: Add GRPC error codes to GRPC streaming if enabled by user. (#7499)
indrajit96 Aug 16, 2024
6857dc3
test: Add python backend tests for the new histogram metric (#7540)
yinggeh Aug 17, 2024
c91d1e5
test: Load new model version should not reload loaded existing model …
kthui Aug 20, 2024
a7a43a2
Intermittent `L0_decoupled_grpc_error` crash fixed. (#7552)
indrajit96 Aug 20, 2024
3735d99
ci: Raise Documentation Generation Errors (#7559)
fpetrini15 Aug 22, 2024
8e56e30
docs: Add tensorrtllm_backend into doc generation (#7563)
krishung5 Aug 23, 2024
be1a0a5
build: RHEL8 EA2 Backends (#7568)
fpetrini15 Aug 27, 2024
ef6afcd
Release: Update NGC versions post-24.08 release (#7565)
pvijayakrish Aug 27, 2024
c88aec5
docs: Add python backend to windows build command (#7572)
krishung5 Aug 27, 2024
3ea493f
docs: Triton TRT-LLM user guide (#7529)
krishung5 Aug 27, 2024
01438d8
Build: Updating to allow passing DOCKER_GPU_ARGS at model generation …
pvijayakrish Aug 27, 2024
5104900
feat: Python Deployment of Triton Inference Server (#7501)
KrishnanPrash Aug 30, 2024
89a9038
fix: Adding copyright info (#7591)
KrishnanPrash Sep 3, 2024
cb1204d
test: Refactor core input size checks (#7592)
yinggeh Sep 4, 2024
8da14cc
Don't Build `tritonfrontend` for Windows. (#7599)
fpetrini15 Sep 7, 2024
9076d2c
fix: Add reference count tracking for shared memory regions (#7567)
pskiran1 Sep 11, 2024
3eab666
build/test: RHEL8 EA3 (#7595)
fpetrini15 Sep 11, 2024
e452b58
Fix: Add mutex lock for state completion check in gRPC streaming to p…
pskiran1 Sep 17, 2024
a93de16
Update fetch_models.sh (#7621)
vd-nv Sep 19, 2024
b4525aa
ci: Set stability factor to a higher value (#7634)
lkomali Sep 20, 2024
e44cf29
[docs] Removed vLLM meetup announcement (#7673)
oandreeva-nv Oct 1, 2024
fe0e41e
Update the versions post 24.09 release.
pvijayakrish Sep 25, 2024
c2fa60c
Build: Update triton version in Map (#7610)
pvijayakrish Sep 11, 2024
26a05ed
Update versions post 24.09
fpetrini15 Sep 7, 2024
b0adf31
Dockerfile.win10.min - Update dependency versions (#7633)
mc-nv Sep 24, 2024
86dbef3
Update server versions post 24.09
pvijayakrish Sep 26, 2024
1fa799e
ci: Reducing flakiness of `L0_python_api` (#7674)
KrishnanPrash Oct 2, 2024
3a21f61
[doc]Adjusted formatting of the warning (#7675)
oandreeva-nv Oct 3, 2024
1df30ed
fix: usage of ReadDataFromJson in array tensors (#7624)
v-shobhit Oct 7, 2024
9bbee48
fix: `tritonfrontend` gRPC Streaming Segmentation Fault (#7671)
KrishnanPrash Oct 7, 2024
71a285a
test: Enhance Python gRPC streaming test to send multiple requests (#…
kthui Oct 7, 2024
e8d58b2
Updtae README and versions for 24.10
pvijayakrish Oct 8, 2024
33c6830
Updtae version as 24.10
pvijayakrish Oct 8, 2024
c24a935
Remove TRT model generation for V100
pvijayakrish Oct 15, 2024
cb7fd80
Correcting the TRT branch name
pvijayakrish Oct 15, 2024
49f986d
Remove TRT model generation for V100
pvijayakrish Oct 16, 2024
6835fe7
Remove TRT model generation for V100
pvijayakrish Oct 16, 2024
6 changes: 4 additions & 2 deletions .clang-format
@@ -2,7 +2,8 @@
BasedOnStyle: Google

IndentWidth: 2
-ContinuationIndentWidth: 2
+ColumnLimit: 80
+ContinuationIndentWidth: 4
UseTab: Never
MaxEmptyLinesToKeep: 2

@@ -34,4 +35,5 @@ BinPackArguments: true
BinPackParameters: true
ConstructorInitializerAllOnOneLineOrOnePerLine: false

-IndentCaseLabels: true
+IndentCaseLabels: true

24 changes: 24 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,24 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Description**
A clear and concise description of what the bug is.

**Triton Information**
What version of Triton are you using?

Are you using the Triton container or did you build it yourself?

**To Reproduce**
Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

**Expected behavior**
A clear and concise description of what you expected to happen.
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
50 changes: 50 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE/pull_request_template_external_contrib.md
@@ -0,0 +1,50 @@
#### What does the PR do?
<!-- Describe your pull request here. Please read the text below the line, and make sure you follow the checklist.-->

#### Checklist
- [ ] I have read the [Contribution guidelines](#../../CONTRIBUTING.md) and signed the [Contributor License
Agreement](https://github.com/NVIDIA/triton-inference-server/blob/master/Triton-CCLA-v1.pdf)
- [ ] PR title reflects the change and is of format `<commit_type>: <Title>`
- [ ] Changes are described in the pull request.
- [ ] Related issues are referenced.
- [ ] Populated [github labels](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) field
- [ ] Added [test plan](#test-plan) and verified test passes.
- [ ] Verified that the PR passes existing CI.
- [ ] I ran pre-commit locally (`pre-commit install, pre-commit run --all`)
- [ ] Verified copyright is correct on all changed files.
- [ ] Added _succinct_ git squash message before merging [ref](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
- [ ] All template sections are filled out.
- [ ] Optional: Additional screenshots for behavior/output changes with before/after.

#### Commit Type:
Check the [conventional commit type](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type)
box here and add the label to the github PR.
- [ ] build
- [ ] ci
- [ ] docs
- [ ] feat
- [ ] fix
- [ ] perf
- [ ] refactor
- [ ] revert
- [ ] style
- [ ] test

#### Related PRs:
<!-- Related PRs from other Repositories -->

#### Where should the reviewer start?
<!-- call out specific files that should be looked at closely -->

#### Test plan:
<!-- list steps to verify feature works -->
<!-- were e2e tests added?-->

#### Caveats:
<!-- any limitations or possible things missing from this PR -->

#### Background
<!-- e.g. what led to this change being made. this is optional extra information to help the reviewer -->

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- closes GitHub issue: #xxx
50 changes: 50 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE/pull_request_template_internal_contrib.md
@@ -0,0 +1,50 @@
#### What does the PR do?
<!-- Describe your pull request here. Please read the text below the line, and make sure you follow the checklist.-->

#### Checklist
- [ ] PR title reflects the change and is of format `<commit_type>: <Title>`
- [ ] Changes are described in the pull request.
- [ ] Related issues are referenced.
- [ ] Populated [github labels](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) field
- [ ] Added [test plan](#test-plan) and verified test passes.
- [ ] Verified that the PR passes existing CI.
- [ ] Verified copyright is correct on all changed files.
- [ ] Added _succinct_ git squash message before merging [ref](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
- [ ] All template sections are filled out.
- [ ] Optional: Additional screenshots for behavior/output changes with before/after.

#### Commit Type:
Check the [conventional commit type](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type)
box here and add the label to the github PR.
- [ ] build
- [ ] ci
- [ ] docs
- [ ] feat
- [ ] fix
- [ ] perf
- [ ] refactor
- [ ] revert
- [ ] style
- [ ] test

#### Related PRs:
<!-- Related PRs from other Repositories -->

#### Where should the reviewer start?
<!-- call out specific files that should be looked at closely -->

#### Test plan:
<!-- list steps to verify -->
<!-- were e2e tests added?-->

- CI Pipeline ID:
<!-- Only Pipeline ID and no direct link here -->

#### Caveats:
<!-- any limitations or possible things missing from this PR -->

#### Background
<!-- e.g. what led to this change being made. this is optional extra information to help the reviewer -->

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- closes GitHub issue: #xxx
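
Both templates ask for a PR title of the form `<commit_type>: <Title>`, with the type taken from the "Commit Type" checklist. The following is a minimal sketch of how such a title could be checked. It assumes a strict reading of the format (lowercase type, a colon, one space, then a non-empty title); the function name `parse_pr_title` and the regular expression are illustrative and are not part of the repository or its CI.

```python
import re

# Commit types copied from the "Commit Type" checklist in the templates.
COMMIT_TYPES = (
    "build", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# Assumption: the type is lowercase and followed by ": " and a non-empty title.
TITLE_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r"): (?P<title>\S.*)$"
)


def parse_pr_title(title: str):
    """Return (commit_type, title_text) if the title matches, else None."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return match.group("type"), match.group("title")


if __name__ == "__main__":
    for candidate in (
        "fix: Add reference count tracking for shared memory regions (#7567)",
        "[fix]: grpc state cleanup fix (#7409)",
        "Update fetch_models.sh (#7621)",
    ):
        print(candidate, "->", parse_pr_title(candidate))
```

Under this strict reading, bracketed prefixes such as `[fix]:` and titles without a type are rejected. A looser check would need a different pattern.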
13 changes: 13 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,13 @@
Thanks for submitting a PR to Triton!
Please go to the `Preview` tab above this description box and select the appropriate sub-template:

* [PR description template for Triton Engineers](?expand=1&template=pull_request_template_internal_contrib.md)
* [PR description template for External Contributors](?expand=1&template=pull_request_template_external_contrib.md)

If you already created the PR, please replace this message with one of
* [External contribution template](https://raw.githubusercontent.com/triton-inference-server/server/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_external_contrib.md)
* [Internal contribution template](https://raw.githubusercontent.com/triton-inference-server/server/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_internal_contrib.md)

and fill it out.
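
The sub-template links above rely on the `expand=1` and `template=<file>` query parameters. As a hedged sketch, a link that opens a new PR with one of the sub-templates pre-filled could be built as follows. The repository URL is inferred from the raw links in the template, the `/compare/<base>...<head>` path follows GitHub's documented compare-URL form, and the branch names and the helper `pr_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: repository location inferred from the raw.githubusercontent.com links.
REPO_URL = "https://github.com/triton-inference-server/server"

# Template file names as they appear in the default template's links.
TEMPLATES = {
    "internal": "pull_request_template_internal_contrib.md",
    "external": "pull_request_template_external_contrib.md",
}


def pr_url(base: str, head: str, kind: str) -> str:
    """Build a compare URL that opens the PR form with a sub-template applied."""
    query = urlencode({"expand": 1, "template": TEMPLATES[kind]})
    return f"{REPO_URL}/compare/{base}...{head}?{query}"


if __name__ == "__main__":
    # "my-feature" is a placeholder branch name.
    print(pr_url("main", "my-feature", "external"))
```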


84 changes: 84 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,84 @@
# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: "CodeQL"

on:
  pull_request:

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [ 'python' ]
        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
        # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support

    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v2
      with:
        languages: ${{ matrix.language }}
        # If you wish to specify custom queries, you can do so here or in a config file.
        # By default, queries listed here will override any specified in a config file.
        # Prefix the list here with "+" to use these queries and those in the config file.

        # Details on CodeQL's query packs refer to:
        # https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
        queries: +security-and-quality


    # Autobuild attempts to build any compiled languages (C/C++, C#, Go, or Java).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
      uses: github/codeql-action/autobuild@v2

    # Command-line programs to run using the OS shell.
    # See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

    # If the Autobuild fails above, remove it and uncomment the following three lines.
    # modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance.

    # - run: |
    #     echo "Run, Build Application using script"
    #     ./location_of_script_within_repo/buildscript.sh

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v2
      with:
        category: "/language:${{matrix.language}}"
39 changes: 39 additions & 0 deletions .github/workflows/pre-commit.yaml
@@ -0,0 +1,39 @@
# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: pre-commit

on:
  pull_request:

jobs:
  pre-commit:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
      - uses: pre-commit/action@v3.0.0

19 changes: 8 additions & 11 deletions .gitignore
@@ -1,11 +1,8 @@
/bazel-bin
/bazel-ci_build-cache
/bazel-genfiles
/bazel-trtserver
/bazel-out
/bazel-serving
/bazel-tensorflow
/bazel-tensorflow_serving
/bazel-testlogs
/bazel-tf
/bazel-workspace
/build
/builddir
/.vscode
*.so
__pycache__
tmp
*.log
test_results.txt
74 changes: 74 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,74 @@
# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

repos:
- repo: https://github.com/timothycrosley/isort
  rev: 5.12.0
  hooks:
  - id: isort
    additional_dependencies: [toml]
- repo: https://github.com/psf/black
  rev: 23.1.0
  hooks:
  - id: black
    types_or: [python, cython]
- repo: https://github.com/PyCQA/flake8
  rev: 5.0.4
  hooks:
  - id: flake8
    args: [--max-line-length=88, --select=C,E,F,W,B,B950, --extend-ignore = E203,E501]
    types_or: [python, cython]
- repo: https://github.com/pre-commit/mirrors-clang-format
  rev: v16.0.5
  hooks:
  - id: clang-format
    types_or: [c, c++, cuda, proto, textproto, java]
    args: ["-fallback-style=none", "-style=file", "-i"]
- repo: https://github.com/codespell-project/codespell
  rev: v2.2.4
  hooks:
  - id: codespell
    additional_dependencies: [tomli]
    args: ["--toml", "pyproject.toml"]
    exclude: (?x)^(.*stemmer.*|.*stop_words.*|^CHANGELOG.md$)
# More details about these pre-commit hooks here:
# https://pre-commit.com/hooks.html
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.4.0
  hooks:
  - id: check-case-conflict
  - id: check-executables-have-shebangs
  - id: check-merge-conflict
  - id: check-json
  - id: check-toml
  - id: check-yaml
    exclude: ^deploy(\/[^\/]+)*\/templates\/.*$
  - id: check-shebang-scripts-are-executable
  - id: end-of-file-fixer
    types_or: [c, c++, cuda, proto, textproto, java, python]
  - id: mixed-line-ending
  - id: requirements-txt-fixer
  - id: trailing-whitespace