Choosing the Right Model

Since the beginning, I’ve been trying various machine learning models, because it’s almost impossible to predict which one will give the best results. And since the beginning, Random Forest has been outperforming the other models. I started to like Random Forest and stopped paying much attention to the others. That’s generally a bad idea unless you’re 100% confident. I wasn’t.

Last time, I posted some results from Random Forest. Since then I’ve made a couple of changes to the data, so let’s look at the results on the updated version:

[Figure: reg3_rf]

Although Random Forest has been giving me the best numbers, it has a clear problem: it predicts roughly the same value (1-4) for the majority of games. Overall, the predictions don’t follow the diagonal as desired; most of them sit in a narrow column that crosses the diagonal.
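One way to put a number on that column effect is to check what share of the predictions lands in the narrow band and how much the predictions vary compared with the real values. The snippet below is only a minimal sketch of that idea, not the code behind the plot: the function name `spread_report`, its `low`/`high` parameters, and the sample numbers are all made up for illustration.

```python
from statistics import pstdev


def spread_report(actual, predicted, low=1, high=4):
    """Summarise how tightly predictions cluster compared with the actual values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    in_band = sum(1 for p in predicted if low <= p <= high)
    return {
        # Share of predictions that fall inside the band [low, high].
        "share_in_band": in_band / len(predicted),
        # A ratio well below 1 means the predictions vary far less than the targets.
        # (Assumes the actual values are not all identical.)
        "std_ratio": pstdev(predicted) / pstdev(actual),
    }


# Made-up numbers purely for illustration, not data from this project.
actual = [0, 2, 5, 9, 14, 20, 1, 3]
predicted = [2, 2, 3, 3, 4, 4, 2, 3]

report = spread_report(actual, predicted)
print(report)
```

A high share in the band together with a small spread ratio would match what the plot shows: the model hedges toward a typical value instead of tracking each game.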

I gave SVM (Support Vector Machines) another shot. This time I did a lot of tweaking and got this:

[Figure: reg4_svm]

The numbers are not really important here (although SVM scored better in every aspect). The overall distribution of predictions is the more noteworthy part, as they follow the diagonal much more closely. However, I’m still far from calling these “accurate predictions”; that will require a lot more work.
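For anyone who wants to try a similar comparison, here is a rough sketch using scikit-learn. Everything in it is an assumption made for illustration: the data is synthetic, the parameter grid is an arbitrary starting point, and it is not the actual tweaking I did. It fits a Random Forest and a grid-searched SVR on the same split, then reports the mean absolute error and the spread of the predictions for each.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data; the real features and target are not part of this sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# SVR is sensitive to feature scale, so scaling happens inside the pipeline.
svr_search = GridSearchCV(
    make_pipeline(StandardScaler(), SVR(kernel="rbf")),
    param_grid={
        "svr__C": [1, 10, 100],
        "svr__epsilon": [0.01, 0.1, 1.0],
        "svr__gamma": ["scale", 0.01, 0.1],
    },
    scoring="neg_mean_absolute_error",
    cv=5,
)
svr_search.fit(X_train, y_train)

results = {}
for name, model in {"random_forest": rf, "svr": svr_search}.items():
    pred = model.predict(X_test)
    results[name] = {
        "mae": float(mean_absolute_error(y_test, pred)),
        # Spread of the predictions; compare it with np.std(y_test).
        "pred_std": float(np.std(pred)),
    }
    print(name, results[name])

print("best SVR params:", svr_search.best_params_)
```

Comparing `pred_std` with the spread of the real targets gives a quick hint about whether a model's predictions are squeezed into a narrow range, which an error metric alone won't show.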
