Lecture 2: Sentiment Analysis with a Simple Method

We work with a real sentiment analysis dataset (sentiment analysis is an instance of text classification), and I provide Python code for it (see the link below). The approach is very similar to what is commonly called a Naive Bayes classifier.
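To make the idea concrete, here is a minimal sketch of a textbook Naive Bayes classifier with add-one smoothing. It is not the code attached to this lecture and may differ from the method shown in the video. The function names (`tokenize`, `train`, `predict`) and the four toy sentences in `TOY_TRAIN` are made up for illustration and are not taken from the lecture's dataset.

```python
import math
from collections import Counter

# Hypothetical toy data, invented for this sketch (not the lecture's dataset).
TOY_TRAIN = [
    ("a wonderful moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("boring and far too long", "neg"),
    ("a terrible boring plot", "neg"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count words and documents per label from (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Start from the log prior: the fraction of training documents with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one (Laplace) smoothing keeps unseen word/label pairs above zero.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TOY_TRAIN)
    print(predict("great film", model))
    print(predict("boring plot", model))
```

Working with log probabilities instead of multiplying raw probabilities avoids numerical underflow on longer texts. The "naive" part is the assumption that words are independent given the label, which is why the per-word scores can simply be added up.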

Supplementary links and references

Good resources for learning Python:

A Byte of Python [free book]

Learn Python the Hard Way [free book, exercise oriented]

Google's Python Class [web, video]

Additional information

You might find it hard to do much better than 80% accuracy, as this is a difficult dataset. You might tweak and tweak and get up to 85% if you tweak enough, but then you'd just be overfitting the test set. We'll talk about what that means later, but basically, if you ran the same method on a different dataset, it would perform badly, because your algorithm was fine-tuned to handle this one specifically.

Attached files

Python Code (includes the dataset)

PowerPoint slideshow that I used for this video

Code