Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: bias-toxicity
ledong0110 committed Sep 4, 2024

1 parent 1ed8925 commit 86c74f3
Showing 9 changed files with 1,647 additions and 742 deletions.
2 changes: 1 addition & 1 deletion _config.yml
@@ -2,7 +2,7 @@ title: MELT
description: "Multilingual Evaluation Toolkits"

# disabled because we are using a custom domain
-baseurl: https://ai.stanford.edu/~sttruong/melt
+# baseurl: https://ai.stanford.edu/~sttruong/melt

color-primary: "#B1040E"
color-light: "#E50808"
146 changes: 146 additions & 0 deletions _data/leaderboard/vi/bias_toxicity/summarization.yml
@@ -0,0 +1,146 @@
VietNews:
  URA-LLaMa 70B:
    DRR: null
    DRG: 0.21
    DRG_std: 0.01
    SAR: null
    SAG: 0.31
    SAG_std: 0.01
    Tox: 0.05
    Tox_std: 0.00
  URA-LLaMa 13B:
    DRR: null
    DRG: 0.20
    DRG_std: 0.01
    SAR: null
    SAG: 0.29
    SAG_std: 0.01
    Tox: 0.04
    Tox_std: 0.00
  URA-LLaMa 7B:
    DRR: null
    DRG: 0.24
    DRG_std: 0.02
    SAR: null
    SAG: 0.33
    SAG_std: 0.01
    Tox: 0.04
    Tox_std: 0.00
  LLaMa-2 13B:
    DRR: null
    DRG: 0.26
    DRG_std: 0.01
    SAR: null
    SAG: 0.38
    SAG_std: 0.01
    Tox: 0.01
    Tox_std: 0.00
  LLaMa-2 7B:
    DRR: null
    DRG: 0.28
    DRG_std: 0.02
    SAR: null
    SAG: 0.39
    SAG_std: 0.01
    Tox: 0.01
    Tox_std: 0.00
  Vietcuna 7B:
    DRR: null
    DRG: 0.21
    DRG_std: 0.02
    SAR: null
    SAG: 0.32
    SAG_std: 0.02
    Tox: 0.04
    Tox_std: 0.00
  GPT-3.5:
    DRR: null
    DRG: 0.22
    DRG_std: 0.01
    SAR: null
    SAG: 0.29
    SAG_std: 0.01
    Tox: 0.04
    Tox_std: 0.00
  GPT-4:
    DRR: null
    DRG: 0.19
    DRG_std: 0.01
    SAR: null
    SAG: 0.28
    SAG_std: 0.01
    Tox: 0.06
    Tox_std: 0.00
WikiLingua:
  URA-LLaMa 70B:
    DRR: null
    DRG: 0.03
    DRG_std: 0.02
    SAR: null
    SAG: 0.25
    SAG_std: 0.02
    Tox: 0.03
    Tox_std: 0.00
  URA-LLaMa 13B:
    DRR: null
    DRG: 0.07
    DRG_std: 0.04
    SAR: null
    SAG: 0.31
    SAG_std: 0.03
    Tox: 0.02
    Tox_std: 0.00
  URA-LLaMa 7B:
    DRR: null
    DRG: 0.07
    DRG_std: 0.02
    SAR: null
    SAG: 0.38
    SAG_std: 0.02
    Tox: 0.03
    Tox_std: 0.00
  LLaMa-2 13B:
    DRR: null
    DRG: 0.17
    DRG_std: 0.08
    SAR: null
    SAG: 0.50
    SAG_std: 0.02
    Tox: 0.01
    Tox_std: 0.00
  LLaMa-2 7B:
    DRR: null
    DRG: 0.39
    DRG_std: 0.05
    SAR: null
    SAG: 0.50
    SAG_std: 0.02
    Tox: 0.01
    Tox_std: 0.00
  Vietcuna 7B:
    DRR: null
    DRG: 0.17
    DRG_std: 0.04
    SAR: null
    SAG: 0.39
    SAG_std: 0.03
    Tox: 0.03
    Tox_std: 0.00
  GPT-3.5:
    DRR: null
    DRG: 0.03
    DRG_std: 0.02
    SAR: null
    SAG: 0.28
    SAG_std: 0.01
    Tox: 0.02
    Tox_std: 0.00
  GPT-4:
    DRR: null
    DRG: 0.09
    DRG_std: 0.02
    SAR: null
    SAG: 0.28
    SAG_std: 0.01
    Tox: 0.02
    Tox_std: 0.00
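Review note: the data file is a three-level mapping (dataset, then model, then metric), with `null` for metrics that were not measured (`DRR` and `SAR` here) and a `_std` companion for each measured score. The sketch below shows one way a consumer might turn an entry into a table cell. The helper name `format_cell`, the `-` placeholder, and the `value ± std` layout are assumptions made for illustration; the summarization page template is not part of the visible diff.

```python
from typing import Mapping, Optional

# Hypothetical in-memory copy of one VietNews entry (values taken from the diff).
VIETNEWS: Mapping[str, Mapping[str, Optional[float]]] = {
    "GPT-4": {
        "DRR": None, "DRG": 0.19, "DRG_std": 0.01,
        "SAR": None, "SAG": 0.28, "SAG_std": 0.01,
        "Tox": 0.06, "Tox_std": 0.00,
    },
}


def format_cell(scores: Mapping[str, Optional[float]], metric: str) -> str:
    """Render 'value ± std', or '-' when the metric is null or absent."""
    value = scores.get(metric)
    if value is None:  # YAML null loads as None
        return "-"
    std = scores.get(f"{metric}_std")
    if std is None:  # no companion std key for this metric
        return f"{value:.2f}"
    return f"{value:.2f} ± {std:.2f}"


if __name__ == "__main__":
    for metric in ("DRR", "DRG", "SAR", "SAG", "Tox"):
        print(metric, format_cell(VIETNEWS["GPT-4"], metric))
```

Checking `is None` rather than truthiness matters here because a legitimate score of `0.00` must still be displayed.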
146 changes: 146 additions & 0 deletions _data/leaderboard/vi/bias_toxicity/translation.yml
@@ -0,0 +1,146 @@
PhoMT (En - Vi):
  URA-LLaMa 70B:
    DRR: null
    DRG: 0.03
    DRG_std: 0.01
    SAR: null
    SAG: 0.30
    SAG_std: 0.01
    Tox: 0.05
    Tox_std: 0.00
  URA-LLaMa 13B:
    DRR: null
    DRG: 0.09
    DRG_std: 0.00
    SAR: null
    SAG: 0.33
    SAG_std: 0.01
    Tox: 0.05
    Tox_std: 0.00
  URA-LLaMa 7B:
    DRR: null
    DRG: 0.13
    DRG_std: 0.00
    SAR: null
    SAG: 0.33
    SAG_std: 0.01
    Tox: 0.05
    Tox_std: 0.00
  LLaMa-2 13B:
    DRR: null
    DRG: 0.08
    DRG_std: 0.00
    SAR: null
    SAG: 0.33
    SAG_std: 0.02
    Tox: 0.05
    Tox_std: 0.00
  LLaMa-2 7B:
    DRR: null
    DRG: 0.17
    DRG_std: 0.01
    SAR: null
    SAG: 0.29
    SAG_std: 0.01
    Tox: 0.04
    Tox_std: 0.00
  Vietcuna 7B:
    DRR: null
    DRG: 0.18
    DRG_std: 0.01
    SAR: null
    SAG: 0.36
    SAG_std: 0.01
    Tox: 0.04
    Tox_std: 0.00
  GPT-3.5:
    DRR: null
    DRG: 0.11
    DRG_std: 0.01
    SAR: null
    SAG: 0.34
    SAG_std: 0.01
    Tox: 0.05
    Tox_std: 0.00
  GPT-4:
    DRR: null
    DRG: 0.09
    DRG_std: 0.01
    SAR: null
    SAG: 0.34
    SAG_std: 0.01
    Tox: 0.05
    Tox_std: 0.00
OPUS100 (En - Vi):
  URA-LLaMa 70B:
    DRR: null
    DRG: 0.27
    DRG_std: 0.01
    SAR: null
    SAG: 0.47
    SAG_std: 0.01
    Tox: 0.06
    Tox_std: 0.00
  URA-LLaMa 13B:
    DRR: null
    DRG: 0.27
    DRG_std: 0.01
    SAR: null
    SAG: 0.43
    SAG_std: 0.02
    Tox: 0.07
    Tox_std: 0.00
  URA-LLaMa 7B:
    DRR: null
    DRG: 0.18
    DRG_std: 0.03
    SAR: null
    SAG: 0.47
    SAG_std: 0.01
    Tox: 0.07
    Tox_std: 0.00
  LLaMa-2 13B:
    DRR: null
    DRG: 0.31
    DRG_std: 0.02
    SAR: null
    SAG: 0.47
    SAG_std: 0.01
    Tox: 0.06
    Tox_std: 0.00
  LLaMa-2 7B:
    DRR: null
    DRG: 0.21
    DRG_std: 0.02
    SAR: null
    SAG: 0.45
    SAG_std: 0.02
    Tox: 0.05
    Tox_std: 0.00
  Vietcuna 7B:
    DRR: null
    DRG: 0.16
    DRG_std: 0.03
    SAR: null
    SAG: 0.43
    SAG_std: 0.02
    Tox: 0.07
    Tox_std: 0.00
  GPT-3.5:
    DRR: null
    DRG: 0.16
    DRG_std: 0.03
    SAR: null
    SAG: 0.43
    SAG_std: 0.03
    Tox: 0.07
    Tox_std: 0.00
  GPT-4:
    DRR: null
    DRG: 0.14
    DRG_std: 0.03
    SAR: null
    SAG: 0.41
    SAG_std: 0.01
    Tox: 0.07
    Tox_std: 0.00
12 changes: 6 additions & 6 deletions _pages/vi/bias-toxicity/question-answering.md
@@ -3,22 +3,22 @@ layout: default
permalink: /leaderboard/vi/bias-toxicity/question-answering
---
# Bias-Toxicity Question Answering Leaderboard
-<div>{{page.lang}}</div>
+{% assign lang = 'vi' %}

<table class="table table-bordered table-sm w-100 dtHorizontalTable" cellspacing="0">
<thead>
<tr>
<th rowspan="2" class="text-center align-middle">
<b>Models</b>
</th>
-{% for dataset in site.data.leaderboard.vi.bias_toxicity.qa %}
+{% for dataset in site.data.leaderboard[lang].bias_toxicity.qa %}
<th colspan="5" class="text-center">
<b>{{ dataset[0] }}</b>
</th>
{% endfor %}
</tr>
<tr>
-{% for dataset in site.data.leaderboard.vi.bias_toxicity.qa %}
+{% for dataset in site.data.leaderboard[lang].bias_toxicity.qa %}
<th class="text-center"><b>DRR↓</b></th>
<th class="text-center"><b>DRG↓</b></th>
<th class="text-center"><b>SAR↓</b></th>
@@ -28,18 +28,18 @@ permalink: /leaderboard/vi/bias-toxicity/question-answering
</tr>
</thead>
<tbody>
-{% for model in site.data.leaderboard.vi.models.models %}
+{% for model in site.data.leaderboard[lang].models.models %}
<tr>
<td class="text-center">
<b>{{ model }}</b>
</td>
-{% for dataset in site.data.leaderboard.vi.bias_toxicity.qa %}
+{% for dataset in site.data.leaderboard[lang].bias_toxicity.qa %}
{% assign DRR_min = 1 %}
{% assign DRG_min = 1 %}
{% assign SAR_min = 1 %}
{% assign SAG_min = 1 %}
{% assign Tox_min = 1 %}
-{% for m in site.data.leaderboard.vi.models.models %}
+{% for m in site.data.leaderboard[lang].models.models %}
{% if dataset[1][m].DRR and dataset[1][m].DRR < DRR_min %}
{% assign DRR_min = dataset[1][m].DRR %}
{% endif %}
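Review note: the inner `{% for m in ... %}` loop in this hunk scans every model within one dataset, starting each `*_min` at 1 and lowering it whenever a model has a non-null, smaller value. The rest of the template is truncated here, so how the minimum is used afterwards (presumably to highlight the best, lowest score) is an assumption. A minimal Python sketch of the same scan follows; `column_min`, `MODELS`, and `PHOMT` are hypothetical names, and the sample values are borrowed from translation.yml purely for illustration.

```python
from typing import Mapping, Optional, Sequence

Scores = Mapping[str, Optional[float]]

# Hypothetical sample: two models from the PhoMT (En - Vi) block of translation.yml.
MODELS = ["URA-LLaMa 70B", "GPT-4"]
PHOMT: Mapping[str, Scores] = {
    "URA-LLaMa 70B": {"DRR": None, "DRG": 0.03, "SAG": 0.30, "Tox": 0.05},
    "GPT-4": {"DRR": None, "DRG": 0.09, "SAG": 0.34, "Tox": 0.05},
}


def column_min(
    dataset_scores: Mapping[str, Scores],
    models: Sequence[str],
    metric: str,
    start: float = 1.0,
) -> float:
    """Return the smallest non-null value of `metric` across `models`.

    Mirrors the template: begin at `start` (1 in the Liquid code) and only
    replace it when a model reports a value that is present and smaller.
    """
    best = start
    for model in models:
        value = dataset_scores.get(model, {}).get(metric)
        # Liquid treats only nil/false as falsy, so 0 still counts as a value;
        # `is not None` reproduces that, whereas `if value` would skip 0.0.
        if value is not None and value < best:
            best = value
    return best


if __name__ == "__main__":
    print(column_min(PHOMT, MODELS, "DRG"))
    print(column_min(PHOMT, MODELS, "DRR"))
```

When every model has `null` for a metric, as with `DRR` in this sample, the result stays at the starting value of 1, so any later comparison against the minimum needs to account for that case.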