WHAT

Machine-learned prediction of review helpfulness score

WHEN

April 2014

WHO

Me, Stefano Fenu, Charles Wang

WHERE

Georgia Tech, CS 4650

WHY

We were told to do an NLP-based project, and predicting review helpfulness from a massive dataset seemed like an awesome fit for that.

HOW

The report covers the gruesome details, but really it was nothing too fancy - an online Passive-Aggressive classifier that used word counts as features and was trained on massive amounts of data, one batch at a time. I was very impressed that simply feeding in more data gradually got it to 90% accuracy, but disappointed that no other features seemed to help. This project also reinforced for me how awesome Python’s SciPy is - truly a great package.
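
For the curious, here is a minimal sketch of that kind of setup, not the project's actual code: a from-scratch binary Passive-Aggressive (PA-I) classifier over word counts, updated one batch at a time. The class name, the `C` value, the whitespace tokenizer and the toy reviews are all made up for illustration. A real run would more likely use a library implementation and stream batches from disk.

```python
from collections import Counter, defaultdict


def word_counts(text):
    """Bag-of-words features: lowercase token -> count (naive whitespace split)."""
    return Counter(text.lower().split())


class OnlinePassiveAggressive:
    """Hypothetical binary PA-I classifier over sparse word-count features."""

    def __init__(self, C=1.0):
        self.C = C  # cap on the step size (assumed value, not from the report)
        self.weights = defaultdict(float)

    def score(self, features):
        # Dot product between the weight vector and the sparse feature counts.
        return sum(self.weights.get(w, 0.0) * c for w, c in features.items())

    def predict(self, text):
        return 1 if self.score(word_counts(text)) >= 0 else -1

    def partial_fit(self, texts, labels):
        """Update on one batch; labels are +1 (helpful) or -1 (not helpful)."""
        for text, y in zip(texts, labels):
            x = word_counts(text)
            loss = max(0.0, 1.0 - y * self.score(x))  # hinge loss
            if loss == 0.0 or not x:
                continue  # "passive": margin already satisfied, no update
            norm_sq = sum(c * c for c in x.values())
            tau = min(self.C, loss / norm_sq)  # "aggressive": PA-I step size
            for w, c in x.items():
                self.weights[w] += tau * y * c


# Toy batches standing in for chunks of a much larger review dataset.
batches = [
    (["great detailed review with pros and cons", "bad"], [1, -1]),
    (["detailed comparison with photos", "meh bad product"], [1, -1]),
]

model = OnlinePassiveAggressive()
for texts, labels in batches:
    model.partial_fit(texts, labels)  # the model never sees all data at once

print(model.predict("detailed review"), model.predict("bad"))
```

Because each update only touches the current batch, memory use stays flat no matter how much data gets fed in, which is what makes the "just add more data" approach practical.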

LINKS

PICS

  • Graph
    Online classifier training proved very effective
  • Table
    We had fancy semantic features, but could not run them on enough data to get good results