A Blueprint of IR Evaluation Integrating Task and User Characteristics: Test Collection and Evaluation Metrics

05/01/2023
by Kal Jarvelin, et al.

Relevance is generally understood as a multi-level and multi-dimensional relationship between an information need and an information object. However, traditional IR evaluation metrics naively assume mono-dimensionality. We ask: How should multidimensional and graded relevance assessments be handled in IR evaluation? Moreover, search result evaluation metrics neglect document overlaps and naively assume that gains keep piling up as the searcher examines the ranked list to greater length. Consequently, we examine: How should document overlap be handled in IR evaluation? The usability of a document for a person-in-need also depends on document usability attributes beyond relevance. Therefore, we ask: How should usability attributes be handled, and how can they be combined with multidimensional relevance assessments in IR evaluation? Finally, we ask how to define a formal model that deals with multidimensional graded relevance assessments, document overlaps, and document usability attributes in a coherent framework serving IR evaluation.
