Adding new options for LLM #768

nv-braf · 2023-09-29T21:45:59Z

Added the following new options to support LLM:

--llm-search-enable
--run-config-search-min/max-prompt-length
--run-config-search-min/max-token-count
--prompt-length
--token-count
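
For reviewers who want to see the shape of these options at a glance, here is a minimal sketch using plain `argparse`. It is not how Model Analyzer registers options (that goes through its own config classes in `config_command_profile.py`). The expanded min/max flag names, the list-valued `--prompt-length` / `--token-count`, and every default other than `DEFAULT_RUN_CONFIG_MIN_PROMPT_LENGTH = 1` are assumptions made for illustration.

```python
import argparse

# Placeholder value shown in this PR's diff of config_defaults.py.
DEFAULT_RUN_CONFIG_MIN_PROMPT_LENGTH = 1
# Hypothetical values, invented only so the sketch runs.
SKETCH_MAX_PROMPT_LENGTH = 1000
SKETCH_MIN_TOKEN_COUNT = 1
SKETCH_MAX_TOKEN_COUNT = 256


def build_llm_parser() -> argparse.ArgumentParser:
    """Build a stand-alone parser exposing the LLM options listed above."""
    parser = argparse.ArgumentParser(prog="llm-options-sketch")
    parser.add_argument("--llm-search-enable", action="store_true")
    parser.add_argument("--run-config-search-min-prompt-length", type=int,
                        default=DEFAULT_RUN_CONFIG_MIN_PROMPT_LENGTH)
    parser.add_argument("--run-config-search-max-prompt-length", type=int,
                        default=SKETCH_MAX_PROMPT_LENGTH)
    parser.add_argument("--run-config-search-min-token-count", type=int,
                        default=SKETCH_MIN_TOKEN_COUNT)
    parser.add_argument("--run-config-search-max-token-count", type=int,
                        default=SKETCH_MAX_TOKEN_COUNT)
    # Assumed to accept one or more explicit values to profile.
    parser.add_argument("--prompt-length", type=int, nargs="+", default=[])
    parser.add_argument("--token-count", type=int, nargs="+", default=[])
    return parser


def check_ranges(args: argparse.Namespace) -> None:
    """Reject search ranges whose minimum exceeds their maximum."""
    if args.run_config_search_min_prompt_length > args.run_config_search_max_prompt_length:
        raise ValueError("min prompt length exceeds max prompt length")
    if args.run_config_search_min_token_count > args.run_config_search_max_token_count:
        raise ValueError("min token count exceeds max token count")


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_llm_parser().parse_args(
    ["--llm-search-enable", "--prompt-length", "10", "20"]
)
check_ranges(args)
```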

tests/test_model_config_measurement.py

model_analyzer/config/input/config_command_profile.py

tests/test_cli.py

nv-braf · 2023-10-02T14:23:52Z

model_analyzer/config/input/config_defaults.py

@@ -51,10 +51,15 @@
 DEFAULT_RUN_CONFIG_MIN_MODEL_BATCH_SIZE = 1
 DEFAULT_RUN_CONFIG_MAX_MODEL_BATCH_SIZE = 128
 DEFAULT_RUN_CONFIG_MAX_BINARY_SEARCH_STEPS = 5
+DEFAULT_RUN_CONFIG_MIN_PROMPT_LENGTH = 1


These are just placeholder values. We need input from LLM stakeholders to pick real values.

nv-braf · 2023-10-02T14:24:48Z

model_analyzer/config/input/config_command_profile.py

@@ -36,62 +37,6 @@
 from model_analyzer.triton.server.server_config import TritonServerConfig

 from .config_command import ConfigCommand
-from .config_defaults import (


It was getting annoying (and ugly) to keep adding more defaults, so I switched this to import all the defaults.

nv-braf · 2023-10-02T14:25:38Z

tests/test_cli.py

@@ -71,23 +72,27 @@ def get_test_options():
        # The following options can be None:
        #   short_option
        #   expected_default_value
-        OptionStruct("int", "profile", "--client-max-retries", "-r", "125", "50"),


Boy-scouted the removal of magic numbers.
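
To illustrate the idea behind this cleanup, here is a rough sketch of replacing a literal expected default with a named constant. The replacement line is not shown in the hunk above, so the `after` entry is a guess at the pattern rather than the actual change. The `OptionStruct` stand-in, its field names, and the constant name `DEFAULT_CLIENT_MAX_RETRIES` are assumptions for illustration.

```python
from collections import namedtuple

# Stand-in for the OptionStruct helper in tests/test_cli.py; field names are hypothetical.
OptionStruct = namedtuple(
    "OptionStruct",
    ["type", "stage", "long_flag", "short_flag", "expected_value", "expected_default_value"],
)

# Assumed named default; 50 matches the literal in the removed line of the diff.
DEFAULT_CLIENT_MAX_RETRIES = 50

# Before: the expected default is a magic number repeated in the test.
before = OptionStruct("int", "profile", "--client-max-retries", "-r", "125", "50")

# After: the expected default is derived from the named constant, so the test
# follows the default automatically if it ever changes.
after = OptionStruct(
    "int", "profile", "--client-max-retries", "-r", "125",
    str(DEFAULT_CLIENT_MAX_RETRIES),
)
```

The two entries compare equal today; the difference is only in where the value comes from.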

nv-braf · 2023-10-02T15:26:04Z

Dockerfile

@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.08-py3


I don't know why these changes are showing up after I rebased, but these and the docs changes are safe to ignore.

Maybe you can try this to remove the unwanted commit from the PR: https://stackoverflow.com/a/51400593

I'm not too concerned. They match what is on the add-llm-mode branch and will go away when this subbranch is pushed. Just wanted to call it out so you wouldn't review it.

model_analyzer/config/input/config_defaults.py

* Adding new options for LLM (#768)
* Update README and versions for 23.09 branch (#761) (#767)
* Adding new options for LLM
* Fixing codeQL issues
* Fixing codeQL issue
---------
Co-authored-by: Misha Chornyi <[email protected]>
* Add LLM support to Brute Search (#769)
* Initial coding complete
* First unit test passing
* Adding test for prompt length
* Refactor PACG methods
* Further refactoring
* Ensure early exit isn't enabled for LLM models
* Fix type checking errors
* Attempt at fixing codeql issue
* Revert "Attempt at fixing codeql issue" This reverts commit 2619b83.
* Attempt at codeQL fix
* Adding deepcopy back in
* Removing deepcopy in an attempt to fix codeQL errors
* Update model_analyzer/config/input/config_command_profile.py Co-authored-by: Hyunjae Woo <[email protected]>
* Update model_analyzer/config/generate/perf_analyzer_config_generator.py Co-authored-by: Hyunjae Woo <[email protected]>
* Update model_analyzer/config/generate/perf_analyzer_config_generator.py Co-authored-by: Hyunjae Woo <[email protected]>
* Update model_analyzer/config/generate/perf_analyzer_config_generator.py Co-authored-by: Hyunjae Woo <[email protected]>
* Moving location of method
* Changing parameter to inference load
* Changing parameter to inference load
* Changing prompt length to text input length
* Changing max_tokens to use request-parameter
* Fix input-data typo
* Changing non-parameter to parameter
---------
Co-authored-by: Hyunjae Woo <[email protected]>
* New LLM record types (#770)
* New measurement fields created.
* Fixing omission in llm_metric_table
* Changing name to be avg_token_to_token...
* New config options based on live run (#775)
* Added new config options and modified existing options
* Refactoring model parameter setting
* Removing magic numbers
* Capture LLM metrics from PA (#774)
* Initial code for aggregation of new LLM metrics
* New measurement fields created.
* Fixing PA unit tests
* Adding hooks in metrics to capture new LLM fields
* Fixing codeQL errors
* Fixing type checking errors
* Changes needed post-merge from other branches
* Revert naming mistake (due to merge).
* Changes uncovered during live testing
* Fixes based on hwoo review
* Fixing typo
* Change to use lists and mean()
* Changes based on hwoo review
* Correct how periodic concurrency works in PACG (#777)
* Created a new class ConfigRangeNumeric and using it for periodic-concurrency
* Fixes and defaults for periodic concurrency
* First unit test passing
* PACG chagnes complete. Unit tests updated and passing
* Removing uneeded class
* Fixing codeQL and hwoo's review suggestions
* Adding missing else
* Llm testing live run (#778)
* Created a new class ConfigRangeNumeric and using it for periodic-concurrency
* Fixes and defaults for periodic concurrency
* First unit test passing
* PACG chagnes complete. Unit tests updated and passing
* Removing uneeded class
* Changes to fix live run
* Minor refactor and cleanup
* Removing json files
* Changing to use f-string
* More cleanup from hwoo CR
* Removing stale code for request period
* Fix nit
* Changes to get LLM summary reports working (#779)
* Changes to get LLM summary reports working
* Addressing hwoo's CR
* Adding illegal LLM checks w/ unit testing + some minor cleanup (#781)
* Adding illegal LLM checks w/ unit testing + some minor cleanup
* Updated with TMA
* Misc LLM cleanup (#782)
* General cleanup
* Add ticket nums to todos
* Fix for non-LLM breaking bug introduced.
* summary table in progress
---------
Co-authored-by: Misha Chornyi <[email protected]>
Co-authored-by: Hyunjae Woo <[email protected]>

github-advanced-security bot found potential problems Sep 29, 2023


tests/test_model_config_measurement.py (fixed)

model_analyzer/config/input/config_command_profile.py (fixed)

tests/test_cli.py (fixed)

Update README and versions for 23.09 branch (#761) (#767)

a387c11

github-advanced-security bot found potential problems Sep 29, 2023


tests/test_cli.py (fixed)

nv-braf commented Oct 2, 2023


nv-braf marked this pull request as ready for review October 2, 2023 14:31

nv-braf requested review from nv-hwoo and matthewkotila October 2, 2023 14:31

nv-braf added 3 commits October 2, 2023 15:18

Adding new options for LLM

e000bb4

Fixing codeQL issues

793de71

Fixing codeQL issue

81d7e23

nv-braf force-pushed the new-cli-config-options-for-llm branch from 187c1c4 to 81d7e23 Compare October 2, 2023 15:19

nv-braf commented Oct 2, 2023


nv-hwoo reviewed Oct 2, 2023


model_analyzer/config/input/config_defaults.py (resolved)

nv-hwoo approved these changes Oct 3, 2023


nv-braf merged commit c9d467f into add-llm-mode Oct 3, 2023
3 checks passed
