A Nested HDP for Hierarchical Topic Models

01/16/2013
by   John Paisley, et al.
0

We develop a nested hierarchical Dirichlet process (nHDP) for hierarchical topic modeling. The nHDP is a generalization of the nested Chinese restaurant process (nCRP) that allows each word to follow its own path to a topic node according to a document-specific distribution on a shared tree. This alleviates the rigid, single-path formulation of the nCRP, allowing a document to more easily express thematic borrowings as a random effect. We demonstrate our algorithm on 1.8 million documents from The New York Times.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset