HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data
Existing question answering datasets focus on homogeneous information, based either on text alone or on KB/table information alone. However, as human knowledge is distributed over heterogeneous forms, relying on homogeneous information can lead to severe coverage problems. To fill this gap, we present HybridQA, a new large-scale question-answering dataset that requires reasoning over heterogeneous information. Each question is aligned with a structured Wikipedia table and multiple free-form corpora linked to the entities in the table. The questions are designed to aggregate both tabular and textual information, i.e., lacking either form renders the question unanswerable. We test with three different models: 1) a table-only model, 2) a text-only model, and 3) a hybrid model that combines tabular and textual information to build a reasoning path toward the answer. The experimental results show that the first two baselines obtain compromised EM scores below 20%, while the hybrid model significantly boosts the EM score to over 50%, which demonstrates the necessity of aggregating both structured and unstructured information in HybridQA. However, the hybrid model's score is still far behind human performance, so we believe HybridQA to be an ideal and challenging benchmark for studying question answering over heterogeneous information. The dataset and code are available at <https://github.com/wenhuchen/HybridQA>.