Merge LM-Eval dev branch #337
Commits on Jul 24, 2024
-
Add lm-eval-service controller (#258)
* feat: Initial database support (#246)
* Initial database support
  - Add status checking
  - Add better storage flags
  - Add spec.storage.format validation
  - Add DDL
  - Add HIBERNATE format to DB (test)
  - Update service image
  - Revert identifier to DATABASE
  - Update CR options (remove mandatory data)
* Remove default DDL generation env var
* Update service image to latest tag
* Add migration awareness
* Add updating pods for migration
* Change JDBC url from mysql to mariadb
* Fix TLS mount
* Revert images
* Remove redundant logic
* Fix comments
* feat: Add TLS certificate mount on ModelMesh (#255)
* feat: Add TLS certificate mount on ModelMesh
* Revert from http to https until kserve/modelmesh#147 is merged
* Add lm-eval-service controller
  refactor the existing TrustyAIService controller and add LMEvalService controller
  Signed-off-by: Yihong Wang

Signed-off-by: Yihong Wang
Co-authored-by: Rui Vieira
Commit 7e1a712
Commits on Jul 26, 2024
-
fix: Fix typo in operator's arguments (#261)
Operator's arguments changed from `--eanble-services` to `--enable-services`. trustyai.opendatahub.io_lmevaljobs.yaml and zz_generated.deepcopy.go regenerated.
Commit 5e853a1
Commits on Aug 5, 2024
-
Commit 2173aae
-
sync: sync dev/lm-eval with main branch (#271)
* feat: Initial database support (#246)
* Initial database support
  - Add status checking
  - Add better storage flags
  - Add spec.storage.format validation
  - Add DDL
  - Add HIBERNATE format to DB (test)
  - Update service image
  - Revert identifier to DATABASE
  - Update CR options (remove mandatory data)
* Remove default DDL generation env var
* Update service image to latest tag
* Add migration awareness
* Add updating pods for migration
* Change JDBC url from mysql to mariadb
* Fix TLS mount
* Revert images
* Remove redundant logic
* Fix comments
* feat: Add TLS certificate mount on ModelMesh (#255)
* feat: Add TLS certificate mount on ModelMesh
* Revert from http to https until kserve/modelmesh#147 is merged
* Pin oc version, ubi version (#263)
* Restore checkout of trustyai-exp (#265)
* Add operator installation robustness (#266)
* fix: Skip InferenceService patching for KServe RawDeployment (#262)
* feat: ConfigMap key to disable KServe Serverless configuration (#267)
* feat: Add support for custom certificates in database connection (#259)
* Add TLS endpoint for ModelMesh payload processors. (#268)
  Keep non-TLS endpoint for KServe Serverless (disabled by default)

Signed-off-by: Yihong Wang
Co-authored-by: Rui Vieira
Co-authored-by: Rob Geada
Commit 427d102
Commits on Aug 23, 2024
-
Weekly sync up of dev/lm-eval branch (#278)
* feat: Initial database support (#246)
* Initial database support
  - Add status checking
  - Add better storage flags
  - Add spec.storage.format validation
  - Add DDL
  - Add HIBERNATE format to DB (test)
  - Update service image
  - Revert identifier to DATABASE
  - Update CR options (remove mandatory data)
* Remove default DDL generation env var
* Update service image to latest tag
* Add migration awareness
* Add updating pods for migration
* Change JDBC url from mysql to mariadb
* Fix TLS mount
* Revert images
* Remove redundant logic
* Fix comments
* feat: Add TLS certificate mount on ModelMesh (#255)
* feat: Add TLS certificate mount on ModelMesh
* Revert from http to https until kserve/modelmesh#147 is merged
* Pin oc version, ubi version (#263)
* Restore checkout of trustyai-exp (#265)
* Add operator installation robustness (#266)
* fix: Skip InferenceService patching for KServe RawDeployment (#262)
* feat: ConfigMap key to disable KServe Serverless configuration (#267)
* feat: Add support for custom certificates in database connection (#259)
* Add TLS endpoint for ModelMesh payload processors. (#268)
  Keep non-TLS endpoint for KServe Serverless (disabled by default)
* fix: Correct maxSurge and maxUnavailable (#275)
* feat: Add support for custom DB names (#257)
* feat: Add support for custom DB names
* fix: Correct custom DB name

Signed-off-by: Yihong Wang
Co-authored-by: Rui Vieira
Co-authored-by: Rob Geada
Commit 342d1e2
Commits on Aug 27, 2024
-
Driver updates job's status periodically (#280)
The driver periodically updates the LMEvalJob.Status.Message field with output from lm-eval. The driver captures messages matching a pattern like `Running text generation: 81%|`, so users can read this field to check the job's progress.

Signed-off-by: Yihong Wang
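To illustrate the kind of pattern matching described above, here is a minimal Python sketch that pulls the most recent progress marker out of captured lm-eval output. The real driver lives in the operator's own codebase and its matching logic is not shown in this commit list; the names `PROGRESS_RE` and `latest_progress` and the exact regular expression are assumptions made for this example.

```python
import re

# Assumed pattern: a stage label, a colon, a percentage, then the "|" that
# starts a tqdm-style progress bar, e.g. "Running text generation: 81%|".
PROGRESS_RE = re.compile(r"(?P<stage>[A-Za-z][\w ]*?):\s+(?P<percent>\d{1,3})%\|")


def latest_progress(output: str):
    """Return (stage, percent) for the last progress marker in output, or None."""
    last = None
    for match in PROGRESS_RE.finditer(output):
        last = (match.group("stage"), int(match.group("percent")))
    return last
```

A caller could pass the accumulated output of the job and write the returned stage and percentage into a status message.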
Commit f6d37ea
-
Add Dockerfile for LMES job image (#276)
Add Dockerfile for LMES job image and the needed files

Signed-off-by: Yihong Wang
Commit 2767641
Commits on Aug 29, 2024
-
* feat: Add overlays
* Remove redundant lmes-tas overlay. Change job image name.
Commit f9c1284
-
Commit df87ea2
Commits on Aug 30, 2024
-
Commit 0d2393d
Commits on Sep 12, 2024
-
feat: support batch size (#290)
Add batch size support to the LMEvalJob, leveraging the `--batch_size` option of `lm-evaluation-harness`. This only affects local models: `--batch_size` doesn't work for remote inference APIs.

Signed-off-by: Yihong Wang
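As a rough illustration of how a batch size might be forwarded to the harness, the Python sketch below assembles an `lm_eval` command line and appends `--batch_size` only for a local model type. The helper `build_lm_eval_args`, the `LOCAL_MODEL_TYPES` set, and the decision to drop the flag for remote backends are assumptions for this example; the commit does not say how the operator handles the remote case.

```python
from typing import List, Optional, Sequence

# Assumption for illustration: "hf" (Hugging Face) models run locally,
# everything else is treated as a remote inference API.
LOCAL_MODEL_TYPES = {"hf"}


def build_lm_eval_args(
    model: str,
    model_args: str,
    tasks: Sequence[str],
    batch_size: Optional[int] = None,
) -> List[str]:
    """Build an lm_eval argument list, adding --batch_size for local models only."""
    args = [
        "lm_eval",
        "--model", model,
        "--model_args", model_args,
        "--tasks", ",".join(tasks),
    ]
    if batch_size is not None and model in LOCAL_MODEL_TYPES:
        args += ["--batch_size", str(batch_size)]
    return args
```

For example, `build_lm_eval_args("hf", "pretrained=gpt2", ["arc_easy"], batch_size=8)` ends with `--batch_size 8`, while the same call with a remote model type leaves the flag out.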
Commit d2b9b2f
-
Add the `openai` package into the lmes job image (#292)
Update the LMES job's Dockerfile to include the `openai` package.

Signed-off-by: Yihong Wang
Commit db7ae08
Commits on Sep 17, 2024
-
fix: fix dependency error in the job image (#296)
Split up the unitxt and openai dependencies to avoid the conflict.

Signed-off-by: Yihong Wang
Commit d9b5684
Commits on Sep 20, 2024
-
feat: add device detection in lmes driver (#298)
Add a new feature to the LMES driver that detects the available devices using the PyTorch API. The feature can be disabled by passing the `--detect-device false` option.

Signed-off-by: Yihong Wang
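For readers unfamiliar with PyTorch device probing, a minimal sketch of such a check follows. It only distinguishes CUDA from CPU and falls back to CPU when PyTorch is not installed; the function name `detect_device` and the fallback behaviour are assumptions for this example, not the driver's actual logic.

```python
def detect_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The returned string is the kind of value that could then be handed to the evaluation harness as its device setting.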
Commit a626cf8
Commits on Sep 24, 2024
-
feat: support unitxt recipes (#301)
Add new fields in the CRD to support unitxt recipes, and have the driver create the corresponding YAML files for those recipes.

Signed-off-by: Yihong Wang
Commit 159842f
Commits on Oct 9, 2024
-
feat: support custom dataset (#309)
Updated the CRD data struct to allow users to specify a custom Unitxt card in JSON format. The custom Unitxt card is equivalent to a custom dataset definition. Also restructured and updated the CRD to support Volumes, VolumeMounts, Env, Resources, Labels, and Annotations.

Signed-off-by: Yihong Wang
Commit b2bec12
Commits on Oct 13, 2024
-
feat: new pulling mechanism for job statuses (#314)
Update the driver to keep running even after the user program finishes. The driver provides two APIs:
- GetStatus(): retrieve the job status
- Shutdown(): properly tear down the driver

On the controller side, it uses the `pod/exec` resource to run the driver command, which invokes the driver APIs to retrieve the job status and to shut down the driver when the job is done.

Signed-off-by: Yihong Wang
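The lifecycle described above (the driver outlives the user program, answers status queries, and exits only when told to) can be sketched in a few lines of Python. This is an in-process illustration only: the real driver is invoked by the controller through `pod/exec`, and the class name `Driver`, its method names, and the state strings "Pending", "Running", "Complete", and "Failed" are hypothetical.

```python
import subprocess
import sys
import threading


class Driver:
    """Runs a user program and stays available until shutdown() is called."""

    def __init__(self, command):
        self._command = command
        self._state = "Pending"  # hypothetical state names
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._worker.start()

    def _run(self):
        with self._lock:
            self._state = "Running"
        result = subprocess.run(self._command, capture_output=True, text=True)
        with self._lock:
            self._state = "Complete" if result.returncode == 0 else "Failed"

    def get_status(self) -> str:
        """Counterpart of GetStatus(): report the current job state."""
        with self._lock:
            return self._state

    def shutdown(self):
        """Counterpart of Shutdown(): signal that the driver may exit."""
        self._stop.set()

    def wait_for_program(self, timeout=None):
        self._worker.join(timeout)

    def wait_for_shutdown(self, timeout=None) -> bool:
        """Block until shutdown() is called; True if it was, False on timeout."""
        return self._stop.wait(timeout)


def demo() -> str:
    driver = Driver([sys.executable, "-c", "print('evaluation finished')"])
    driver.start()
    driver.wait_for_program(timeout=30)
    status = driver.get_status()  # still answerable after the program has exited
    driver.shutdown()
    return status
```

The point of the design is that the final status remains readable after the user program has finished, so a controller can poll at its own pace and then trigger the teardown explicitly.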
Commit ab6bc98
Commits on Oct 14, 2024
-
Move operator's cmd/operator/main.go to cmd/main.go to keep operator-sdk compatibility (#295)
Commit 36c035a
-
Commit 1d3e882
Commits on Oct 18, 2024
-
Commit fe7c0bf
Commits on Oct 19, 2024
-
Refactor some lmesreconcile methods (#323)
* Refactor lmes reconcile options
  Signed-off-by: ted chang
* Update controllers/lmes/lmevaljob_controller.go
  Co-authored-by: Yihong Wang
* Update controllers/lmes/lmevaljob_controller.go
  Co-authored-by: Yihong Wang
  Signed-off-by: ted chang

Signed-off-by: ted chang
Co-authored-by: Yihong Wang
Commit 61744ff
-
tidy: clean up lmes-job image (#333)
Remove BAM-related packages and patch.

Signed-off-by: Yihong Wang
Commit dc03620
Commits on Oct 21, 2024
-
Enable job suspend for Kueue (#317)
* Refactor lmes reconcile options
  Signed-off-by: ted chang
* Update controllers/lmes/lmevaljob_controller.go
  Co-authored-by: Yihong Wang
* Update controllers/lmes/lmevaljob_controller.go
  Co-authored-by: Yihong Wang
  Signed-off-by: ted chang
* Enable job suspend for Kueue
  Signed-off-by: ted chang

Signed-off-by: ted chang
Co-authored-by: Yihong Wang
Commit b54e222
-
Commit faf468b
-
sync: sync up dev/lm-eval branch with main branch (#336)
* [CI] Run tests from trustyai-tests (#279)
* Change Dockerfile to clone trustyai-tests
* Add PYTEST_MARKERS env and remove TESTS_REGEX
* RHOAIENG-12274: Update operator's overlays (#287)
* Update operator's overlays
* Update kustomization.yaml
* Add devflag printout to GH Action comment (#289)
* Add timeout loop to DSC install (#305)
* RHOAIENG-13625: Add DBAvailable status to CR (#304)
* Add DBAvailable status to CR
* Remove probes
* Add KServe destination rule for Inference Services in the ServiceMesh (#315)
* Add DestinationRule creation for KServe serverless
* Add permissions for destination rules
* Add role for destination rules
* Add missing role for creating destination rules
* Fix spacing in DestinationRule template
* Add check if DestinationRule CRD is present before creating it (#316)
* Add check for DestinationRule CRD
* Add API extensions to operator's scheme
* Add permission for CRD resource
* Fix operator metrics service target port (#320)
* Add readiness probes (#312)
* Enable KServe serverless in the rhoai overlay (#321)
* Update overlay images (#331)
* Add correct CA cert to JDBC (#324)
* Add correct CA cert to JDBC
* Add require SSL
* Support for VirtualServices for InferenceLogger traffic (#332)
* Generate KServe Inference Logger in conformance with DestinationRule and VirtualService
* Add VirtualService creation for models in the mesh
* Add permissions for VirtualServices
* Update manifests for VirtualServices
* Fix VirtualServiceName variable
* fix yaml linter after the sync
  Signed-off-by: Yihong Wang
* tidy the go.mod and go.sum as well
  Signed-off-by: Yihong Wang

Signed-off-by: Yihong Wang
Co-authored-by: Adolfo Aguirrezabal
Co-authored-by: Rui Vieira
Co-authored-by: Rob Geada
Commit 471738b
Commits on Oct 22, 2024
-
Commit 834829b