Showing posts with label web scraping. Show all posts

Sunday, April 5, 2015

Part 1. Create movie database.


“It is a capital mistake to theorize before one has data”
Sherlock Holmes

 
 
    Welcome back to my blog. In this part I will describe one of the more complicated tasks I encountered, one that courses and literature mostly leave out: creating the data warehouse that will be used for our analysis and recommendations.
 
    For myself, I divided this part into three sections, each of which answers one simple question:
  1. What data do I need?
  2. Where can I find it?
  3. How shall I use it?
 
    Next I will try to answer these questions and mention the problems I faced while creating my movie database. I will describe my thoughts about data, movies and technologies, but your opinion may differ, so I will show several approaches and you can choose any of them or use your own. I will also be grateful if you share your thoughts in the comments. So let's start!