Commit

Update results
capjamesg committed Jan 22, 2025
1 parent 78d310b commit 4c8655b
Showing 2 changed files with 116 additions and 10 deletions.
20 changes: 10 additions & 10 deletions index.html
@@ -40,7 +40,7 @@ <h1>How's GPT-4o Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
-<p>Tests are run every day at 1am PT. Last updated January 21, 2025.</p>
+<p>Tests are run every day at 1am PT. Last updated January 22, 2025.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
@@ -122,7 +122,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>8</pre>
+<pre>9</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -230,7 +230,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/fruit.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>{'x': 0.5, 'y': 0.4, 'width': 0.3, 'height': 0.2}</pre>
+<pre>{'x': 0.5, 'y': 0.36, 'width': 0.24, 'height': 0.28}</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -359,9 +359,9 @@ <h3><span class="explainer_icon far fa-image"></span>Image</h3>
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
<pre>```json
{
-"R": 82,
+"R": 79,
"G": 0,
-"B": 144
+"B": 128
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -403,7 +403,7 @@ <h2>Annotation Quality Assurance</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>0%</b> of the time.</p>
-<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.016</p>
+<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.017</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
@@ -417,13 +417,13 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/annotationqa.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>Based on the given image, there is one car visible in the foreground on the right side which is not annotated with a red bounding box. All other visible cars in the image appear to be labeled correctly.
+<pre>To determine if there are any missing annotations, I can visually inspect the image for vehicles that are not surrounded by bounding boxes.

-Here's the JSON response:
+Here, the vehicles visible in the scene all appear to be labeled with red bounding boxes. There do not seem to be any cars without bounding boxes.

```json
{
-"missing": 1
+"missing": 0
}
```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
@@ -538,7 +538,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/easy_captcha.jpeg" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>```charybdis in-dubitable```</pre>
+<pre>```charybdis indubitable```</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://charlesfrye.github.io/" target="_blank">Charles Frye</a></p>
</div>
</div>
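
Several results in the hunks above arrive wrapped in Markdown code fences (the color recognition and CAPTCHA outputs, for example), and the annotation result puts explanatory prose before the fence. A minimal Python sketch of pulling the payload out of such a response follows. `extract_fenced` and `FENCE_RE` are hypothetical names, not functions from the gpt-checkup repository, and this commit does not show how the project actually scores responses.

```python
import json
import re

# Matches the first fenced block; a language tag such as "json" is optional.
FENCE_RE = re.compile(r"```(?:[\w-]*\n)?(.*?)```", re.DOTALL)


def extract_fenced(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none."""
    match = FENCE_RE.search(text)
    return match.group(1).strip() if match else text.strip()


# Response strings copied from the results in this commit.
color = extract_fenced('```json\n{\n "R": 79,\n "G": 0,\n "B": 128\n}\n```')
print(json.loads(color))
print(extract_fenced("```charybdis indubitable```"))
```

Whether fence stripping would change any pass/fail outcome here is an open question, since the expected answers are not part of this diff.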
106 changes: 106 additions & 0 deletions results/2025-01-22.json
@@ -0,0 +1,106 @@
{
"zero_shot_classification": {
"score": 1,
"success": true,
"price": 0.006400000000000001,
"pass_fail": "Pass",
"response_time": 1.8943216800689697,
"result": "Toyota Camry"
},
"count_fruit": {
"score": 0,
"success": false,
"price": 0.00882,
"pass_fail": "Fail",
"response_time": 2.194544792175293,
"result": "9"
},
"document_ocr": {
"score": 0,
"success": false,
"price": 0.00988,
"pass_fail": "Fail",
"response_time": 3.2228505611419678,
"result": "I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the *Midnights* album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times."
},
"handwriting_ocr": {
"score": 1,
"success": true,
"price": 0.00974,
"pass_fail": "Pass",
"response_time": 8.246692180633545,
"result": "The words of songs on the album have been echoing in my head all week. \"Fades into the grey of my day old tea.\""
},
"extraction_ocr": {
"score": 1.0,
"success": true,
"price": 0.00876,
"pass_fail": "Pass",
"response_time": 3.243197441101074,
"result": "[{'name': 'Mary Thomas', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]"
},
"math_ocr": {
"score": 1.0,
"success": true,
"price": 0.015070000000000002,
"pass_fail": "Pass",
"response_time": 3.0749704837799072,
"result": "3x^2-6x+2"
},
"object_detection": {
"score": 0.45314685314685305,
"success": false,
"price": 0.01044,
"pass_fail": "Fail",
"response_time": 2.63169527053833,
"result": "{'x': 0.5, 'y': 0.36, 'width': 0.24, 'height': 0.28}"
},
"graph_understanding": {
"score": 0.99,
"success": false,
"price": 0.01174,
"pass_fail": "Fail",
"response_time": 2.6393797397613525,
"result": "```json\n{\n \"A\": {\n \"quantity\": 20,\n \"price\": 10\n },\n \"B\": {\n \"quantity\": 25,\n \"price\": 20\n },\n \"C\": {\n \"quantity\": 30,\n \"price\": 30\n },\n \"D\": {\n \"quantity\": 35,\n \"price\": 40\n }\n}\n```"
},
"color_recognition": {
"score": 0.9581699346405229,
"success": false,
"price": 0.009850000000000001,
"pass_fail": "Fail",
"response_time": 2.7236459255218506,
"result": "```json\n{\n \"R\": 79,\n \"G\": 0,\n \"B\": 128\n}\n```"
},
"annotation_qa": {
"score": 0.0,
"success": false,
"price": 0.01676,
"pass_fail": "Fail",
"response_time": 6.7763776779174805,
"result": "To determine if there are any missing annotations, I can visually inspect the image for vehicles that are not surrounded by bounding boxes. \n\nHere, the vehicles visible in the scene all appear to be labeled with red bounding boxes. There do not seem to be any cars without bounding boxes.\n\n```json\n{\n \"missing\": 0\n}\n```"
},
"measurement": {
"score": 0.8571428571428572,
"success": false,
"price": 0.009720000000000001,
"pass_fail": "Fail",
"response_time": 3.904247522354126,
"result": "```json\n{\n \"length\": 3.0,\n \"width\": 3.0\n}\n```"
},
"easy_captcha": {
"score": 0,
"success": false,
"price": 0.00642,
"pass_fail": "Fail",
"response_time": 2.3312668800354004,
"result": "```charybdis indubitable```"
},
"easy_captcha_persuade": {
"score": 1,
"success": true,
"price": 0.006860000000000001,
"pass_fail": "Pass",
"response_time": 1.3775343894958496,
"result": "charybdis indubitable"
}
}
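
Each entry in the results file carries a `pass_fail` label and a `price`, so a day's run can be summarised with a few lines of Python. The sketch below is an illustration only: `summarize` is a hypothetical helper rather than code from the repository, and `sample` is a trimmed excerpt of three entries from the file above, keeping just the two fields it needs.

```python
import json


def summarize(results: dict) -> dict:
    """Count passing tests and total request cost for one day's results."""
    total = len(results)
    passed = sum(1 for r in results.values() if r["pass_fail"] == "Pass")
    cost = sum(r["price"] for r in results.values())
    return {
        "tests": total,
        "passed": passed,
        "pass_rate": passed / total if total else 0.0,
        "total_cost": cost,
    }


# Trimmed excerpt of results/2025-01-22.json (three of the thirteen entries).
sample = json.loads("""
{
  "zero_shot_classification": {"price": 0.006400000000000001, "pass_fail": "Pass"},
  "count_fruit": {"price": 0.00882, "pass_fail": "Fail"},
  "easy_captcha_persuade": {"price": 0.006860000000000001, "pass_fail": "Pass"}
}
""")

summary = summarize(sample)
print(summary)
```

Run against the full file (loaded with `json.load`), the same tally gives 5 passing tests out of 13 and a total request cost of about $0.13.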
