Few-shot named entity recognition (NER) enables us to build an NER system...
Stochastic gradient descent (SGD) is the cornerstone of modern machine l...
Breakthroughs in unsupervised domain adaptation (uDA) can help in adapti...
The appeal of serverless (FaaS) has triggered a growing interest in how ...
Scalable training of large models (like BERT and GPT-3) requires careful...
Adam is an important optimization algorithm to guarantee efficiency and...
Optimizing distributed learning systems is an art of balancing between c...