Studying Retrievability of Publications and Datasets in an Integrated Retrieval System
In this paper, we investigate the retrievability of datasets and publications in a real-life Digital Library (DL). The measure of retrievability was originally developed to quantify the influence that a retrieval system has on the access to information. Retrievability can also enable DL engineers to evaluate their search engine to determine the ease with which the content in the collection can be accessed. Following this methodology, in our study, we propose a system-oriented approach for studying dataset and publication retrieval. A speciality of this paper is the focus on measuring the accessibility biases of various types of DL items and including a metric of usefulness. Among other metrics, we use Lorenz curves and Gini coefficients to visualize the differences of the two retrievable document types (specifically datasets and publications). Empirical results reported in the paper show a distinguishable diversity in the retrievability scores among the documents of different types.
READ FULL TEXT