Electriclizard solution #14

electriclizard · 2023-05-15T13:06:03Z

Hi 🤚!
Here is my simple solution for an NLP model inference server.
It works on both CPU and GPU, and I've tested it with the wrk benchmarking tool.
Documentation will be available at the /documentation/swagger-ui endpoint.
There is also some future work left, such as a process for updating model weights.
I also think that serving the models with NVIDIA Triton Inference Server would be more efficient, but I'm not sure about deploying it with the Helm chart.
I've also tried threading to call the models in parallel, but I still get the same RPS on benchmarks. That was expected, but I tried it anyway; I'm now working on increasing RPS.
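
For reference, the threading approach mentioned above could look roughly like the sketch below. It is only an illustration, not the code from this PR: `sentiment_model`, `language_model`, `MODELS`, and `predict_all` are hypothetical names, and the two "models" are trivial stand-ins for real inference calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical stand-ins for real models: each maps a batch of texts to labels.
def sentiment_model(texts: List[str]) -> List[str]:
    return ["positive" if "good" in t else "negative" for t in texts]

def language_model(texts: List[str]) -> List[str]:
    return ["en" if t.isascii() else "other" for t in texts]

# Assumed registry of model name -> callable; the real service may differ.
MODELS: Dict[str, Callable[[List[str]], List[str]]] = {
    "sentiment": sentiment_model,
    "language": language_model,
}

def predict_all(
    texts: List[str],
    models: Dict[str, Callable[[List[str]], List[str]]] = MODELS,
) -> Dict[str, List[str]]:
    """Submit the same batch to every model on its own thread and collect results."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, texts) for name, fn in models.items()}
        # result() blocks until that model has finished (and re-raises its errors).
        return {name: future.result() for name, future in futures.items()}

if __name__ == "__main__":
    print(predict_all(["good day", "bad day"]))
```

Threads only overlap work while a call releases Python's global interpreter lock, so one possible reason for the unchanged RPS is that the model calls end up running mostly one after another anyway.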

Dev

rsolovev · 2023-05-15T15:00:27Z

Hey @electriclizard, thank you for the great solution. Here are our test results from the Grafana dashboard.

If you would like to work on your solution further, you can continue optimizing/improving it and re-request our review once done. Any contribution during the challenge period will be taken into account while choosing a winner. Many thanks!

Artur and others added 7 commits May 14, 2023 00:39

infrastructure layer

b10375a

service layer

558714a

handlers, configuration and transport app layer

5923b45

Containerization

a9f6005

helm chart updates

a9659ca

entrypoint fix

3bae4aa

Merge pull request #13 from electriclizard/dev

8b893f4

Dev

electriclizard requested review from darknessest and rsolovev as code owners May 15, 2023 13:06
