Sampling is a common strategy for generating text from probabilistic mod...
A fundamental result in psycholinguistics is that less predictable words...
Subword tokenization is a key part of many NLP pipelines. However, littl...
Byte-Pair Encoding (BPE) is a popular algorithm used for tokenizing data...
While natural languages differ widely in both canonical word order and w...
Language modeling, a central task in natural language processing, involv...
After just a few hundred training updates, a standard probabilistic mode...
Over the past two decades, numerous studies have demonstrated how less
p...
Despite significant progress in the quality of language generated from
a...
While probabilistic language generators have improved dramatically over ...
Probing has become a go-to methodology for interpreting and analyzing de...
Shannon entropy is often a quantity of interest to linguists studying th...
When generating natural language from neural probabilistic models, high
...
Numerous analyses of reading time (RT) data have been implemented – all ...
When generating text from probabilistic models, the chosen decoding stra...
Despite achieving incredibly low perplexities on myriad natural language...
While there exist scores of natural languages, each with its unique feat...
Homophony's widespread presence in natural languages is a controversial
...
The uniform information density (UID) hypothesis posits a preference amo...
Beam search is the default decoding strategy for many sequence generatio...
Large pre-trained language models have repeatedly shown their ability to...
Beam search is a go-to strategy for decoding neural sequence models. The...
Sparse attention has been claimed to increase model interpretability und...
We propose an alternate approach to quantifying how well language models...
The uniform information density (UID) hypothesis, which posits that spea...
Neural sequence-to-sequence models are currently the predominant choice ...
Quite surprisingly, exact maximum a posteriori (MAP) decoding of neural
...
Decoding for many NLP tasks requires a heuristic algorithm for approxima...
Prior work has explored directly regularizing the output distributions o...
Machine translation software has seen rapid progress in recent years due...
In recent years, machine translation software has increasingly been
inte...