Thursday, April 9, 2015

Part 2. Collect information about watched movies


"You can have data without information, but you cannot have information without data."
Daniel Keys Moran



    Hello again. In this part I will complete the movie database with the movies I have watched. Later this information will be used to create a training set for a machine learning algorithm. This part is short but still important. We will cover the following topics:
  1. How to collect information about watched movies
  2. How to rate movies
    After finishing these two topics you can start analyzing the data and looking for dependencies among your favorite movies. If you make any interesting observations, post them in the comments and I will include them in the next part. Let's get to work.

Sunday, April 5, 2015

Part 1. Create movie database.


“It is a capital mistake to theorize before one has data”
Sherlock Holmes

 
 
    Welcome back to my blog. In this part I will try to describe one of the more complicated tasks I encountered, one that is rarely mentioned in courses and literature: the process of creating the data warehouse that will be used for our analysis and recommendations.
 
    For myself, I divided this part into three sections. Each of them should answer one of three simple questions:
  1. What data do I need?
  2. Where can I find it?
  3. How shall I use it?
 
    Next I will try to answer these questions and mention the problems I faced while creating my movie database. I will describe my thoughts about data, movies and technologies, but your opinion may differ, so I will show several approaches and you can choose any of them or use your own. I will also be grateful if you share your thoughts in the comments. So let's start!
 

Thursday, April 2, 2015

Part 0. Introduction.



"Docendo discimus"
(by teaching, we learn)
 
 
    Hello to everybody reading this blog. I am a software engineer from Ukraine. This blog is dedicated to learning Data Science through practice, and it is aimed at beginners. You can find dozens of blogs and articles with a similar description, but the main distinction is that I am a beginner too. So if you are experienced in this area, do not hesitate to point out my mistakes in the comments or just write to me via e-mail.
 
    Why have I chosen Data Science? I just like to solve complicated tasks and puzzles. I also like to play with data: plotting it and making predictions. So fasten your seatbelts, we are taking off!