Saturday, May 30, 2015

Part 4. Making a first model.

“In theory, theory and practice are the same. In practice, they are not”
Albert Einstein
 
 
 
 
    Greetings! Today we will finally start making some predictions and recommendations. In the previous part we converted our data set to numeric form, so it is ready for most models from the scikit-learn library. In this part we will do the following:
  1. Divide the data set into parts
  2. Try a few simple models from scikit-learn
  3. Try a neural network from the pybrain package
    I will use standard algorithms from the scikit-learn and pybrain libraries. If you feel you can implement your own model, feel free to share it in the comments. So let's do some practical machine learning.
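
    To give a feel for the first two steps, here is a minimal sketch, not the code from the full post. It assumes a recent scikit-learn (where train_test_split lives in sklearn.model_selection), and the data is a made-up random stand-in for the numeric movie data set. The choice of LogisticRegression as the "simple model" is also just an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the numeric movie data set: 200 rows, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Hypothetical binary label ("liked" / "not liked"), loosely tied to the first feature.
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Step 1: divide the data set into a training part (75%) and a test part (25%).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 2: fit a simple model on the training part only,
# then measure accuracy on the held-out test part.
model = LogisticRegression()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

    The important habit is that the model never sees the test part during fitting, so the score says something about unseen movies rather than memorized ones.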
 

Tuesday, May 12, 2015

Part 3. Data analysis.

"Not everything that can be counted counts,
and not everything that counts can be counted"
Albert Einstein
 
 
 
    Hi there! By now we have a database that contains information about movies, as well as a collection of watched films. In this part we will start to analyze the data. I will show how I chose the features that describe a movie, how I made them measurable, and what I got as a result. As always, I will use the "divide and conquer" tactic and split this article into the following sections:
  1. Converting words to numbers
  2. Converting numbers to features
  3. Normalizing and scaling features
    I will try to describe every step I took to reach the goal, though I mostly relied on my intuition. If your opinion differs, feel free to post it in the comments. So let the analytics begin!
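
    As a rough preview of the three sections, here is a small sketch under stated assumptions. The movie records, the field names (genre, year, runtime) and the choice of MinMaxScaler are all hypothetical and only illustrate the idea; the actual features in the post may differ.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical movie records; field names and values are made up for illustration.
movies = [
    {"genre": "drama", "year": 1994, "runtime": 142},
    {"genre": "comedy", "year": 2009, "runtime": 100},
    {"genre": "drama", "year": 2014, "runtime": 169},
]

# 1. Words to numbers: map each distinct genre to an integer code.
genres = sorted({m["genre"] for m in movies})
genre_code = {g: i for i, g in enumerate(genres)}

# 2. Numbers to features: one row per movie, one column per feature.
features = np.array(
    [[genre_code[m["genre"]], m["year"], m["runtime"]] for m in movies],
    dtype=float,
)

# 3. Scaling: rescale every column to the [0, 1] range so that
# large-valued columns (like year) do not dominate small ones.
scaled = MinMaxScaler().fit_transform(features)
print(scaled)
```

    Note that plain integer codes impose an artificial order on categories, so for many models a one-hot encoding is the safer option.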