This paper analyzes three formal models of Transformer encoders that dif...
With the proliferation of online misinformation, fake news detection has...
Interpretability methods for neural networks are difficult to evaluate b...
By positing a relationship between naturalistic reading times and inform...
LSTM language models have been shown to capture syntax-sensitive grammat...
This paper defines a subregular class of functions called the tier-based...
Neural network architectures have been augmented with differentiable sta...
This paper analyzes the behavior of stack-augmented recurrent neural net...