Showing posts with label MongoDB. Show all posts

Tuesday, May 12, 2015

Part 3. Data analysis.

"Not everything that can be counted counts,
and not everything that counts can be counted"
Albert Einstein
 
 
 
    Hi there! We now have a database that contains information about movies, along with a collection of watched films. In this part we will start analyzing the data. I will show how I chose the features that describe a movie, how I made them measurable, and what I got as a result. As always, I will use a "divide and conquer" approach and split this article into the following sections:
  1. Converting words to numbers
  2. Converting numbers to features
  3. Normalizing and scaling features
    I will try to describe every step I took to reach the goal, though I mostly relied on my intuition. If your opinion differs, feel free to post it in the comments. So let the analytics begin!

Thursday, April 9, 2015

Part 2. Collect information about watched movies


"You can have data without information, but you cannot have information without data."
Daniel Keys Moran



    Hello again. In this part I will complete the movie database with watched movies. Later this information will be used to create a training set for a machine learning algorithm. This part is short but still important. We will cover the following topics:
  1. How to collect information about watched movies
  2. How to rate movies
    After finishing these two topics you can start analyzing the data and looking for dependencies among your favorite movies. If you find any interesting observations, post them in the comments and I will include them in my next part. Let's get to work.