Fill README out a bit more

vemchance · Feb 8, 2024 · d451cb4
1 parent 8ac5c31
commit d451cb4
Showing 2 changed files with 30 additions and 6 deletions.
diff --git a/EDA/readme.md b/EDA/readme.md
@@ -2,5 +2,5 @@
 This is where all the Exploratory Data Analysis will live. The first notebook is a quick overview of the data, mostly focusing on NLP but will be updated to include vision. Output images/files may also be placed here. Some more complex analysis might be added.
 
 ## To add:
-- JRC helper notebook (JRC are the responisable org for the labels we're using)
+- JRC helper notebook (JRC are the responsible org for the labels we're using)
 - Entity analysis from GoogleVision
diff --git a/README.md b/README.md
@@ -4,24 +4,48 @@
 
 This is our entry to [SemEval2024 Task 4: Multilingual Detection of Persuasion Techniques in Memes](https://propaganda.math.unipd.it/semeval2024task4/index.html). Our plan is to tackle only tasks **1** and **2a**.
 
+**Paper:** \[LINK PENDING. TODO FILL IN HERE WHEN AVAILABLE.\]
+
 **This repository is a work in progress.**
 
 ## System Requirements
 - TODO detail system requirements here
 
 ## Getting started
-TODO Write this section of the README. We should include:
+The code in this repository is split into multiple subdirectories:
+
+- **`EDA`:** Exploratory Data Analysis. Extra experiments not required to utilise our approach.
+- **`GoogleVision`:** Generates entities from image files. This is used as an input to the vision stream.
+- **`LateFusionEngine`:** The late-fusion engine that merges the output of the NLP and Vision streams together using a per-label accuracy weighting system.
+- **`Multimodal Baselines`:** 
+- **`Predictions`:** The predictions we (presumably) submitted to the challenge for the `dev` dataset. TODO confirm if this is actually the case.
+- **`Test Prediction Files`:** The predictions we (presumably) submitted to the challenge for the `test` dataset. TODO confirm if this is actually the case.
+- **`Unimodal Baselines`:** 
+- **`scorer-baseline`:** 
+- **`word-embeddings`:** Some experiments with word embedding algorithms. These experiments informed the rest of the work done, but are not required to use the approach detailed in our paper.
+
+TODO Fill in the rest of the above descriptions.
+
+Please visit the README.md file in each subdirectory for specific instructions on each subproject.
+
+A common first step, though, is to clone this git repository:
+
+```bash
+git clone https://github.com/vemchance/BDA-SemEval4.git
+cd BDA-SemEval4
+```
+
+TODO Finish this section of the README. We should include a high-level overview of the project and how to use it.
 
-- What is in each directory
-- Where instructions are for each thing AND which is the main thing
-- MAYBE getting started instructions for the main thing
+## Architecture
+TODO fill this out.
 
 ## OneDrive Link
 Link to OneDrive where the big files live: [OneDrive](https://hullacuk-my.sharepoint.com/:f:/g/personal/v_sherratt-2020_hull_ac_uk/EpevevOycPdKppCMZaSyysgB-z2AeAiZ-2YtVN9tHKF-5Q?e=8Of06X)
 To be DELETED once the repo is public. E-mail Vic if you don't have access. Do not reshare link.
 
 ## Contributing
-TODO fill this out. I assume contributions are welcome after the challenge si finished. If so, we should say so here.
+TODO fill this out. I assume contributions are welcome after the challenge is finished. If so, we should say so here.
 
 ## Licence
 All code in this repository is licensed under the [GNU Affero General Public License 3.0](./LICENSE.md). choosealicense.com has a great summary of this license: <https://choosealicense.com/licenses/agpl-3.0/>
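
The `LateFusionEngine` entry in the README diff describes merging the NLP and Vision stream outputs with a per-label accuracy weighting system. The README does not spell out how that weighting works, so the following is only a minimal sketch of one plausible reading, not the repository's actual implementation. The function name `fuse_scores`, its parameters, and the example labels and numbers are all hypothetical, and it assumes each stream emits a probability per label and that a per-label accuracy for each stream has been measured on held-out data.

```python
from typing import Dict


def fuse_scores(
    nlp_probs: Dict[str, float],
    vision_probs: Dict[str, float],
    nlp_acc: Dict[str, float],
    vision_acc: Dict[str, float],
) -> Dict[str, float]:
    """Hypothetical late fusion: accuracy-weighted average per label.

    For each label, the two streams' probabilities are averaged with
    weights proportional to each stream's accuracy on that label.
    """
    fused: Dict[str, float] = {}
    for label, p_nlp in nlp_probs.items():
        p_vis = vision_probs[label]
        w_nlp = nlp_acc[label]
        w_vis = vision_acc[label]
        total = w_nlp + w_vis
        if total == 0:
            # No accuracy information for this label: fall back to a plain mean.
            fused[label] = (p_nlp + p_vis) / 2
        else:
            fused[label] = (w_nlp * p_nlp + w_vis * p_vis) / total
    return fused


# Illustrative values only; these are not results from the project.
example = fuse_scores(
    nlp_probs={"Smears": 0.9, "Slogans": 0.2},
    vision_probs={"Smears": 0.3, "Slogans": 0.6},
    nlp_acc={"Smears": 0.8, "Slogans": 0.5},
    vision_acc={"Smears": 0.2, "Slogans": 0.5},
)
predicted = {label for label, score in example.items() if score >= 0.5}
```

In this sketch a stream that is more accurate on a given label pulls the fused score towards its own prediction for that label, while labels with equal accuracies reduce to a simple average. The 0.5 decision threshold is likewise an assumption.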