Does Higher Order LSTM Have Better Accuracy in Chunking and Named Entity Recognition?
Current research usually employs a single-order setting by default when dealing with sequence labeling tasks. In our work, "order" means the number of tags that a prediction involves at each time step. High-order models tend to capture more dependency information among tags. We first propose a simple method by which low-order models can be easily extended to high-order models. To our surprise, the high-order models, which are supposed to capture more dependency information, perform worse as the order increases. We suppose that forcing neural networks to learn complex structure may lead to overfitting. To deal with this problem, we propose a method that combines low-order and high-order information to decode the tag sequence. The proposed method, multi-order decoding (MOD), remains scalable to high-order models through a pruning technique. MOD achieves higher accuracy than existing single-order methods. It results in an error reduction of over 21% in chunking and an error reduction of over 23% in named entity recognition. Our code is available at https://github.com/lancopku/Multi-Order-Decoding.
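To make the notion of "order" concrete, the sketch below shows one simple way a first-order tag sequence can be rewritten as a higher-order one, where each time step is labeled with the tuple of the current tag and its preceding tags. This is a minimal illustration under our own assumptions (including the hypothetical `<PAD>` tag and helper names), not the authors' released implementation.

```python
# Minimal sketch: expanding a first-order tag sequence into order-k composite tags,
# so a standard per-step classifier can be trained on the enlarged tag set.
from typing import List, Tuple

PAD_TAG = "<PAD>"  # hypothetical padding tag for positions before the sequence start


def to_higher_order(tags: List[str], order: int) -> List[Tuple[str, ...]]:
    """Convert a first-order tag sequence into order-`order` composite tags."""
    padded = [PAD_TAG] * (order - 1) + tags
    return [tuple(padded[i : i + order]) for i in range(len(tags))]


def to_first_order(high_order_tags: List[Tuple[str, ...]]) -> List[str]:
    """Recover the original tags by keeping the last element of each composite tag."""
    return [t[-1] for t in high_order_tags]


if __name__ == "__main__":
    chunk_tags = ["B-NP", "I-NP", "O", "B-VP"]
    second_order = to_higher_order(chunk_tags, order=2)
    # [('<PAD>', 'B-NP'), ('B-NP', 'I-NP'), ('I-NP', 'O'), ('O', 'B-VP')]
    print(second_order)
    assert to_first_order(second_order) == chunk_tags
```

With this expansion, the tag vocabulary grows roughly exponentially with the order, which is why a pruning technique (as in MOD) is needed to keep decoding over multiple orders tractable.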