Lecture 2: Sentiment Analysis with a Simple Method
We work with a real sentiment analysis dataset (sentiment analysis is an instance of
text classification), for which I also provide Python code (see the link below). The approach
is very similar to what is commonly called a Naive Bayes classifier.
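To give a feel for what a Naive Bayes style classifier does, here is a minimal sketch. It is not the code attached to this lecture: the function names (train, predict), the toy sentences, and the choice of add-one smoothing are all assumptions made for illustration. It counts how often each word appears in positive and negative examples, then picks the label under which a new text is most probable.

```python
import math
from collections import Counter

def train(examples):
    """Count words per label. examples is a list of (text, label) pairs,
    where label is 'pos' or 'neg' (hypothetical label names)."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def predict(text, word_counts, doc_counts):
    """Return the label with the highest log-probability score.
    Assumes every label occurs at least once in the training data."""
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        # Start from the log prior: how common this label is overall.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy data, invented for this sketch (not the lecture's dataset).
toy_data = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("a dull and terrible film", "neg"),
]
model = train(toy_data)
print(predict("a great film", *model))         # pos
print(predict("terrible and boring", *model))  # neg
```

Working in log space avoids multiplying many tiny probabilities together, which would underflow on longer texts.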
Supplementary links and references
Good resources for learning Python:
A Byte of Python [free book]
Learn Python the Hard Way [free book, exercise oriented]
Google's Python Class [web, video]
Additional information
You might find it hard to do much better than 80% accuracy, as this is a difficult dataset.
With enough tweaking you might get up to 85%,
but then you would just be overfitting the test set.
We'll talk about what that means later,
but basically, if you ran the same method on a different dataset,
it would perform badly, because your algorithm was fine-tuned to handle this one specifically.
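One common way to reduce this risk is to do all the tweaking on a separate development set and score the test set only once at the end. The sketch below is an assumption about how that could look, not part of the attached code: split_dataset, accuracy, and the fractions used are hypothetical names and values.

```python
import random

def split_dataset(examples, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle once with a fixed seed, then cut into train/dev/test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test

def accuracy(predict_fn, examples):
    """Fraction of (text, label) pairs that predict_fn labels correctly."""
    correct = sum(1 for text, label in examples if predict_fn(text) == label)
    return correct / len(examples)

# Placeholder data, invented for this sketch.
data = [("text %d" % i, "pos" if i % 2 == 0 else "neg") for i in range(100)]
train_set, dev_set, test_set = split_dataset(data)
print(len(train_set), len(dev_set), len(test_set))  # 80 10 10

# Tweak the method while watching accuracy on dev_set;
# compute accuracy on test_set only once, when you are done.
always_pos = lambda text: "pos"
print(accuracy(always_pos, dev_set))
```

If the dev accuracy keeps climbing while the final test accuracy does not, the tweaks were probably fitted to quirks of the dev set rather than to sentiment in general.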
Attached files
Python Code (includes the dataset)
PowerPoint slideshow that I used for this video
Code