Empty Set Behaviour for Topics Without Retrieved Documents #5

Open
mam10eks opened this issue Apr 23, 2023 · 1 comment

Dear all,

Thank you very much for this nice tool!

At the moment, repro_eval crashes if a baseline run retrieves no documents for a topic for which an advanced run does retrieve documents.
In our situation, we compare a duoT5 run with a BM25 run, where duoT5 re-ranks the BM25 run.
There is a topic for which BM25 retrieves only a single document. duoT5 cannot build any pairs from a single document, so it retrieves no documents for any topic for which BM25 retrieves only one document.

The program fails when the baseline run has a missing topic: https://github.com/irgroup/repro_eval/blob/master/repro_eval/measure/overall_effects.py#L18

The opposite direction should also be handled, i.e., when the advanced run does not retrieve a document for a topic.
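
Both directions of the mismatch could be detected up front, before any per-topic comparison. The following is only a minimal sketch: it assumes runs are nested dicts of the shape that `pytrec_eval.parse_run` returns (topic id mapped to `{doc id: score}`), and `topic_mismatch` is a hypothetical helper, not part of repro_eval.

```python
def topic_mismatch(run_a, run_b):
    """Return (topics only in run_b, topics only in run_a).

    Both runs are expected as {topic_id: {doc_id: score}} dicts.
    """
    topics_a = set(run_a)
    topics_b = set(run_b)
    return topics_b - topics_a, topics_a - topics_b


# Toy data mirroring the situation above: BM25 retrieves a single
# document for topic "2", so the re-ranker has no entry for it.
bm25 = {"1": {"d1": 2.0, "d2": 1.0}, "2": {"d3": 1.5}}
duot5 = {"1": {"d2": 0.9, "d1": 0.1}}

missing_in_bm25, missing_in_duot5 = topic_mismatch(bm25, duot5)
print(missing_in_bm25, missing_in_duot5)
```

Checking both set differences means the result does not depend on which of the two runs is treated as the baseline.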

A possible solution would be for repro_eval to implement the empty set behaviour of ir_measures (citing from the linked page: "queries that appear in the qrels but not the run are given a score of 0. Queries that do not appear in the run may have returned no results, and therefore be scored as such.").
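
To illustrate what that behaviour means for an averaged measure, here is a small sketch on plain dicts. It does not call ir_measures or repro_eval; `fill_missing_topics` and the toy scores are assumptions made up for illustration.

```python
def fill_missing_topics(per_topic_scores, qrels, default=0.0):
    """Give every topic in the qrels a score, using `default` for topics
    that have no score because the run retrieved nothing for them.

    Note: this sketch only iterates over the qrels, so topics that are
    scored but absent from the qrels are dropped.
    """
    return {topic: per_topic_scores.get(topic, default) for topic in sorted(qrels)}


# Hypothetical per-topic scores; topic "3" is missing from the run.
qrels = {"1": {"d1": 1}, "2": {"d3": 1}, "3": {"d7": 1}}
scores = {"1": 0.5, "2": 1.0}

filled = fill_missing_topics(scores, qrels)
mean_without_fill = sum(scores.values()) / len(scores)
mean_with_fill = sum(filled.values()) / len(filled)
print(filled, mean_without_fill, mean_with_fill)
```

Without the fill, the mean is taken over two topics only; with it, the missing topic counts as 0 and lowers the average.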

What do you think?

mam10eks commented Apr 23, 2023

A method like the following might solve it by adding a non-relevant pseudo document to the run, provided that all run parsing goes through this method:

    import pytrec_eval

    # Intended as a static method of the class that parses the runs.
    @staticmethod
    def parse_run(run_file, qrels=None):
        """
        Parse the passed run_file with pytrec_eval.
        For each topic in the qrels that is not in the run, add a single
        non-relevant pseudo document to implement the empty set behaviour of ir_measures:
        queries that appear in the qrels but not in the run should receive a score of 0.
        """
        if not qrels:
            qrels = {}

        with open(run_file, 'r') as f_run:
            run = pytrec_eval.parse_run(f_run)

        # Pad every topic that is in the qrels but missing from the run.
        for topic_in_qrels in qrels:
            if topic_in_qrels not in run:
                run[topic_in_qrels] = {f'non-relevant-pseudo-document-for-topic-{topic_in_qrels}': 1}

        # Return the run with its topics in sorted order.
        return {t: run[t] for t in sorted(run)}

But I am also not too sure if one should implement this in repro_eval, or if one should rather fix the run.
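
For comparison, fixing the run itself could look roughly like the sketch below. It assumes the usual six-column TREC run format (`topic Q0 doc_id rank score tag`); `pad_run_lines` and the `padded` tag are hypothetical names, and the pseudo document id follows the naming used in the method above.

```python
def pad_run_lines(run_lines, qrels_topics, tag="padded"):
    """Append one pseudo document line for each topic that is in
    qrels_topics but has no line in the run."""
    kept = [line.rstrip("\n") for line in run_lines if line.strip()]
    seen = {line.split()[0] for line in kept}
    for topic in sorted(qrels_topics):
        if topic not in seen:
            doc_id = f"non-relevant-pseudo-document-for-topic-{topic}"
            kept.append(f"{topic} Q0 {doc_id} 1 1 {tag}")
    return kept


# Toy run with lines for topic "1" only; topic "2" gets padded.
run_lines = ["1 Q0 d1 1 2.5 bm25\n", "1 Q0 d2 2 1.0 bm25\n"]
padded_lines = pad_run_lines(run_lines, {"1", "2"})
print("\n".join(padded_lines))
```

With this variant the padded file can be passed to repro_eval unchanged, at the cost of every user having to pre-process their runs.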
