How do people internalize visualizations: as images or information? In t...
Cross-task generalization is a significant outcome that defines mastery ...
Large Language Models (LLMs) have gained widespread popularity due to th...
Recent research has shown that language models exploit `artifacts' in
be...
Evaluation of models on benchmarks is unreliable without knowing the deg...
Several benchmarks have been built with heavy investment in resources to...
With the increasing importance of safety requirements associated with th...
The electrical power grid is a critical infrastructure, with disruptions...
Even though deep neural models have achieved superhuman performance on m...
Alluvial diagrams are a popular technique for visualizing flow and relat...
Deep Learning's outstanding track record across several domains has stem...
Models that top leaderboards often perform unsatisfactorily when deploye...
A `state of the art' model A surpasses humans in a benchmark B, but fail...
Models that surpass human performance on several popular benchmarks disp...
Neural language models have achieved human level performance across seve...