Characterizing the Language of Online Communities and its Relation to Community Reception

09/15/2016
by Trang Tran, et al.

This work investigates style and topic aspects of language in online communities, examining both their utility as identifiers of a community and their correlation with community reception of content. Style is characterized using a hybrid word and part-of-speech tag n-gram language model, while topic is represented using Latent Dirichlet Allocation. Experiments with several Reddit forums show that style is a better indicator of community identity than topic, even for communities organized around specific topics. Further, the community reception of a contribution correlates positively with its style similarity to that community, but not with its topic similarity.
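
To make the hybrid word and part-of-speech representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a chosen set of words is kept as-is and every other word is replaced by its POS tag before n-grams are counted; the abstract does not say which words are kept, so the `keep_words` set, the helper names, and the hand-tagged example sentence are all hypothetical.

```python
from collections import Counter

def hybrid_tokens(tagged, keep_words):
    """Keep words found in keep_words; replace every other word with its POS tag.

    `tagged` is a list of (word, pos_tag) pairs. Which words to keep is an
    assumption made for this sketch.
    """
    return [w.lower() if w.lower() in keep_words else tag for w, tag in tagged]

def ngram_counts(tokens, n=3):
    """Count n-grams over a token sequence padded with boundary symbols."""
    padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
    return Counter(tuple(padded[i:i + n]) for i in range(len(padded) - n + 1))

# Hypothetical, hand-tagged example sentence (Penn Treebank style tags).
tagged = [("The", "DT"), ("raptors", "NNS"), ("are", "VBP"),
          ("totally", "RB"), ("awesome", "JJ")]
keep = {"the", "are", "totally"}  # assumed set of words to keep

tokens = hybrid_tokens(tagged, keep)
counts = ngram_counts(tokens, n=2)
print(tokens)
```

Counts like these could then feed a smoothed n-gram language model per community, so that a post is scored by how likely its hybrid sequence is under each community's model.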
