Data Science at Home
Episode 64: Get the best shot at NLP sentiment analysis
The rapid spread of social media such as Facebook and Twitter, together with the massive use of forums like Reddit and Quora, is producing an impressive amount of text data every day.
One specific activity that many business owners have been considering over the last five years is identifying the social sentiment of their brand by analysing their users' conversations.
In this episode I explain how to get the best shot at classifying sentences with deep learning and word embeddings.
Additional material
Schematic representation of how to learn a word embedding matrix E by training a neural network that, given the previous M words, predicts the next word in a sentence.
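The idea in the caption above can be sketched in a few lines of plain Python. This is a minimal illustration and not the code discussed in the episode. It assumes a single linear layer plus softmax on top of the concatenated embeddings (the real architecture may include hidden layers), and the class name NextWordModel, the helper names, the toy sentence and all hyperparameters are made up for the example.

```python
import math
import random


def build_vocab(tokens):
    """Map each distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class NextWordModel:
    """Hypothetical toy model: embedding matrix E followed by a linear softmax layer."""

    def __init__(self, vocab_size, dim, context, seed=0):
        rng = random.Random(seed)
        self.V, self.D, self.M = vocab_size, dim, context
        # E has one row (embedding) per word in the vocabulary.
        self.E = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                  for _ in range(vocab_size)]
        # W maps the concatenated M embeddings to one score per word.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(dim * context)]
                  for _ in range(vocab_size)]
        self.b = [0.0] * vocab_size

    def forward(self, context_ids):
        # Look up and concatenate the embeddings of the previous M words.
        x = [v for i in context_ids for v in self.E[i]]
        logits = [sum(w * xi for w, xi in zip(row, x)) + bk
                  for row, bk in zip(self.W, self.b)]
        return x, softmax(logits)

    def train_step(self, context_ids, target_id, lr=0.1):
        """One SGD step on the cross-entropy loss; returns the loss before the update."""
        x, probs = self.forward(context_ids)
        loss = -math.log(probs[target_id])
        # Gradient of the loss with respect to the logits: p - one_hot(target).
        dlogits = list(probs)
        dlogits[target_id] -= 1.0
        # Gradient with respect to x, computed before W is modified.
        dx = [sum(dlogits[k] * self.W[k][j] for k in range(self.V))
              for j in range(len(x))]
        for k in range(self.V):
            for j in range(len(x)):
                self.W[k][j] -= lr * dlogits[k] * x[j]
            self.b[k] -= lr * dlogits[k]
        # Each context position updates the embedding row of its own word.
        for pos, word_id in enumerate(context_ids):
            for d in range(self.D):
                self.E[word_id][d] -= lr * dx[pos * self.D + d]
        return loss


def train(tokens, dim=8, context=2, epochs=200, lr=0.1):
    """Train on every (previous M words -> next word) pair; return per-epoch mean loss."""
    vocab = build_vocab(tokens)
    ids = [vocab[t] for t in tokens]
    model = NextWordModel(len(vocab), dim, context)
    losses = []
    for _ in range(epochs):
        total = 0.0
        for i in range(context, len(ids)):
            total += model.train_step(ids[i - context:i], ids[i], lr)
        losses.append(total / (len(ids) - context))
    return vocab, model, losses


# Toy corpus, invented for this example.
text = "the movie was great and the acting was great".split()
vocab, model, losses = train(text)
print("first epoch loss:", losses[0], "last epoch loss:", losses[-1])
print("embedding of 'great':", model.E[vocab["great"]])
```

After training, the rows of model.E are the learned word vectors. On a corpus this small they carry little meaning. The point is only to show that E is learned as a by-product of the next-word prediction task.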
Word2Vec example source code
https://gist.github.com/rlangone/ded90673f65e932fd14ae53a26e89eee#file-word2vec_example-py
References
[1] Mikolov, T. et al., "Distributed Representations of Words and Phrases and their Compositionality", Advances in Neural Information Processing Systems 26, pages 3111-3119, 2013.
[2] The Best Embedding Method for Sentiment Classification, https://medium.com/@bramblexu/blog-md-34c5d082a8c5
[3] The state of sentiment analysis: word, sub-word and character embedding, https://amethix.com/state-of-sentiment-analysis-embedding/