Showing 1 changed file with 7 additions and 12 deletions.
@@ -16,6 +16,7 @@
- [📜 License](#-license)
- [📚 Citation](#-citation)
- [📮 Contact](#-contact)
- [Acknowledgement](#acknowledgement)

# 🚀 Introduction
Welcome to our GitHub repository for the "Evaluating Open-QA Evaluation" [paper](https://arxiv.org/abs/2305.12421), a comprehensive study of evaluation methods for Open Question Answering (Open-QA) systems.
@@ -33,11 +34,7 @@

Each data point in our dataset is represented as a dictionary with the following keys:
```
"question": The question asked in the Open-QA task.
"golden_answer": The gold-standard answer to the question.
"answer_fid", "answer_gpt35", "answer_chatgpt", "answer_gpt4", "answer_newbing": The answers generated by different models (FiD, GPT-3.5, ChatGPT-3.5, GPT-4, and New Bing, respectively).
"judge_fid", "judge_gpt35", "judge_chatgpt", "judge_gpt4", "judge_newbing": Boolean values indicating whether the corresponding model's answer was judged correct by a human (True for correct, False for incorrect).
"improper": Boolean flag indicating whether the question was inappropriate (True for inappropriate, False for proper).
```
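As a sketch of this schema, a data point is a plain dictionary; the snippet below builds one and checks it against the documented keys. All string values here are invented placeholders for illustration, not entries from the dataset:

```python
# Hypothetical data point following the documented schema.
# Every question/answer string is an invented placeholder, not real data.
data_point = {
    "question": "Example question?",
    "golden_answer": "Example gold answer",
    # One generated answer and one human judgement per model.
    "answer_fid": "Example FiD answer",
    "answer_gpt35": "Example GPT-3.5 answer",
    "answer_chatgpt": "Example ChatGPT answer",
    "answer_gpt4": "Example GPT-4 answer",
    "answer_newbing": "Example New Bing answer",
    "judge_fid": False,
    "judge_gpt35": True,
    "judge_chatgpt": True,
    "judge_gpt4": True,
    "judge_newbing": True,
    "improper": False,  # True would mark the question as inappropriate
}

MODELS = ("fid", "gpt35", "chatgpt", "gpt4", "newbing")

def validate(point: dict) -> bool:
    """Check that a data point carries every documented key with the right type."""
    if not isinstance(point.get("question"), str):
        return False
    if not isinstance(point.get("golden_answer"), str):
        return False
    for m in MODELS:
        if not isinstance(point.get(f"answer_{m}"), str):
            return False
        if not isinstance(point.get(f"judge_{m}"), bool):
            return False
    return isinstance(point.get("improper"), bool)

print(validate(data_point))  # → True
```

The `validate` helper is not part of the repository; it only restates the key/type contract described above.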
Here is an example of a data point:
```json
@@ -54,7 +51,6 @@
|ChatGPT-4 |3610|2000|
|Bing Chat |3610|2000|

# 🏆 Evaluation & Submission
@@ -64,14 +60,13 @@
graph LR
A[🤗 Huggingface] --(Input Data)--> B[🤖Your Model]
B --(Model output)--> C[⚖️Codabench]
C --(Accuracy Score)--> D[🗳️Google Form]
D ----> E[🏆Leaderboard Website]
A[[🤗 Huggingface]] --(Input Data)--> B[[🤖Your Model]]
B --(Model output)--> C[[⚖️Codabench]]
C --(Accuracy Score)--> D[[🗳️Google Form]]
D ----> E[[🏆Leaderboard Website]]
```
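Before submitting model output to Codabench, it can be useful to sanity-check predictions against the golden answers locally. The normalized exact-match below is a common Open-QA heuristic, not necessarily the leaderboard's official metric:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace.

    A common Open-QA answer normalization; assumed here, not taken from this repo.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, golden: str) -> bool:
    """True when the normalized prediction equals the normalized golden answer."""
    return normalize(prediction) == normalize(golden)

print(exact_match("The Eiffel Tower!", "eiffel tower"))  # → True
print(exact_match("Paris", "eiffel tower"))              # → False
```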

# 📜 License

This dataset is released under the [Apache-2.0 License](LICENSE).
@@ -86,7 +81,7 @@ If you use this dataset in your research, please cite it as follows:
We welcome contributions to improve this dataset!
If you have any questions or feedback, please feel free to reach out at [email protected].

## Acknowledgement

This leaderboard adopts the style of [bird-bench](https://github.com/bird-bench/bird-bench.github.io).