Symlink: A New Dataset for Scientific Symbol-Description Linking

04/26/2022
by Viet Dac Lai, et al.

Mathematical symbols and their descriptions appear in various forms, often across section boundaries within a document, and without explicit markup. In this paper, we present Symlink, a new large-scale dataset focused on extracting symbols and their descriptions from scientific documents. Symlink annotates scientific papers from five domains: computer science, biology, physics, mathematics, and economics. Our experiments on Symlink show that the symbol-description linking task remains challenging for existing models and call for further research in this area. We will publicly release Symlink to facilitate future research.
