Performance on 20 NewsGroups #2

Open
YFCodeDream opened this issue Oct 26, 2023 · 1 comment
YFCodeDream commented Oct 26, 2023

Hello, thanks for your excellent work! I tried to reproduce the performance on the 20 NewsGroups dataset. After fixing some small bugs, I got this result:

{'accuracy': 0.848380244291025,
 'micro_precision': 0.848380244291025,
 'micro_recall': 0.848380244291025,
 'micro_f1': 0.848380244291025,
 'micro_num_predicted': 7532,
 'micro_num_gold': 7532,
 'macro_precision': 0.8462746442882952,
 'macro_recall': 0.8409563907963044,
 'macro_f1': 0.8410184256685328}

I have tried several times with the default settings, but none of the runs exceeds 0.85 accuracy. Could you share the detailed experiment settings needed to reproduce the 0.863 accuracy reported in Table 3?

This is my shell script to run the experiment on 20 NewsGroups:

cd ../code
data_dir=/HDD-b/gyf/trldc
dataset=20newsbydate

seed=52
length=4096
output=${dataset}_length_${length}_seed_${seed}
output_dir=${data_dir}/TEMP/0330A_1_${output}_$(date +%F-%H-%M-%S-%N)

if ! test -f "../results/3/test/${output}.json"; then
  python3 train.py \
  --task_name singlelabel \
  --dataset_name $dataset \
  --output_metrics_filepath ../results/1/train/${output}.json \
  --model_dir $data_dir/Corpora/RoBERTa/mimic_roberta_base \
  --seed $seed \
  --train_filepath $data_dir/ProcessedData/20newsbydate/0/train.json \
  --dev_filepath $data_dir/ProcessedData/20newsbydate/0/dev.json \
  --output_dir $output_dir \
  --per_device_train_batch_size 2 \
  --gradient_accumulation_steps 8 \
  --learning_rate 2e-5 \
  --num_train_epochs 30.0 \
  --save_strategy epoch \
  --evaluation_strategy epoch \
  --metric_for_best_model accuracy \
  --greater_is_better \
  --max_seq_length $length \
  --segment_length 64 --do_use_stride --do_use_label_wise_attention

  python3 eval.py \
  --task_name singlelabel \
  --dataset_name $dataset \
  --output_metrics_filepath ../results/3/test/${output}.json \
  --model_dir $output_dir \
  --test_filepath $data_dir/ProcessedData/20newsbydate/0/test.json \
  --output_dir $output_dir \
  --max_seq_length $length \
  --segment_length 64 --do_use_stride --do_use_label_wise_attention

  rm -r $output_dir
fi

Looking forward to your reply!

dainlp (Collaborator) commented Nov 6, 2023

Hi, can you check the following settings? I think I used a larger segment length without label-wise attention. Also, I didn't use mimic_roberta.

#!/bin/bash
#SBATCH --time=5:00:00 --mail-type=END --mail-user=[email protected]
#SBATCH --ntasks=1 --cpus-per-task=4 --mem=8GB
#SBATCH -p gpu --gres=gpu:titanrtx:1
#SBATCH --job-name 0312B
#SBATCH --output=C-%x-%j-%a.out
#SBATCH --array=0-23%3

cd ../code
data_dir=/home/djk887
dataset=20news

SEEDs=(52 869 1001)
LENGTHs=(8192 7168 6144 5120 4096 3072 2048 1024)
seed=${SEEDs[$SLURM_ARRAY_TASK_ID/${#LENGTHs[@]}]}
length=${LENGTHs[$SLURM_ARRAY_TASK_ID%${#LENGTHs[@]}]}
output=${dataset}_length_${length}_seed_${seed}
output_dir=${data_dir}/TEMP/0312B_${output}_$(date +%F-%H-%M-%S-%N)

if ! test -f "../results/3/test/${output}.json"; then
  python train.py \
  --task_name singlelabel \
  --dataset_name $dataset \
  --max_seq_length $length \
  --train_filepath $data_dir/ProcessedData/20newsbydate/0/train.json \
  --dev_filepath $data_dir/ProcessedData/20newsbydate/0/dev.json \
  --model_dir $data_dir/Corpora/RoBERTa/roberta-base \
  --output_dir $output_dir \
  --output_metrics_filepath ../results/3/train/${output}.json \
  --per_device_train_batch_size 2 \
  --gradient_accumulation_steps 8 \
  --learning_rate 2e-5 \
  --num_train_epochs 30.0 \
  --save_strategy epoch \
  --evaluation_strategy epoch \
  --metric_for_best_model accuracy \
  --greater_is_better \
  --seed $seed \
  --segment_length 256 --do_use_stride

  python eval.py \
  --task_name singlelabel \
  --dataset_name $dataset \
  --max_seq_length $length \
  --test_filepath $data_dir/ProcessedData/20newsbydate/0/test.json \
  --model_dir $data_dir/Corpora/RoBERTa/roberta-base \
  --output_dir $output_dir \
  --output_metrics_filepath ../results/3/test/${output}.json \
  --segment_length 256 --do_use_stride

  rm -r $output_dir
fi
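For anyone checking which (seed, length) pair a given array task runs, here is a minimal stand-alone sketch of the fan-out used by `#SBATCH --array=0-23%3` above: the 24 task IDs cover 3 seeds × 8 lengths, with integer division selecting the seed and the remainder selecting the length. (This is just the indexing arithmetic pulled out of the script; `SLURM_ARRAY_TASK_ID` is simulated by a plain loop variable.)

```shell
#!/bin/bash
# Same arrays as in the job script above.
SEEDs=(52 869 1001)
LENGTHs=(8192 7168 6144 5120 4096 3072 2048 1024)

# id / 8 picks the seed, id % 8 picks the length, so IDs 0-7 all use
# seed 52, IDs 8-15 use seed 869, and IDs 16-23 use seed 1001.
for id in 0 7 8 23; do
  seed=${SEEDs[$id/${#LENGTHs[@]}]}
  length=${LENGTHs[$id%${#LENGTHs[@]}]}
  echo "task $id -> seed=$seed length=$length"
done
```

Task 0 therefore runs (52, 8192) and task 23 runs (1001, 1024), matching one full sweep of Table 3's length range per seed.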
