Data aggregation from hundreds of different websites into one


What would the process be for aggregating data from hundreds of websites into a master database? I would then do calculations on the master database, then resell the results.

Thank you.


asked Nov 24 '11 at 08:42
Tony Brattle
1 point

3 Answers


  1. Get legal permission to use each web site's data
  2. Hire a team of web script writers to scrape the data from each web site
  3. Collect the data
  4. Verify data collection is correct (each time you scrape a site)
  5. Store in master database
  6. Update scripts as needed for data collection
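The pipeline above can be sketched end to end. This is a minimal illustration, not production code: the HTML snippet, the `class="item"` marker, and the `name`/`price` fields are all invented, and a real run would fetch each page over HTTP instead of using an inline string. It shows steps 2-5: scrape, verify each row, and store in a master database (SQLite here for simplicity).

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page layout; each target site would need its own parser.
SAMPLE_PAGE = """
<table>
  <tr class="item"><td>Widget</td><td>9.99</td></tr>
  <tr class="item"><td>Gadget</td><td>4.50</td></tr>
</table>
"""

class RowScraper(HTMLParser):
    """Collects the text of <td> cells inside rows marked class="item"."""
    def __init__(self):
        super().__init__()
        self.rows, self._row = [], []
        self._in_row = self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr" and ("class", "item") in attrs:
            self._in_row, self._row = True, []
        elif tag == "td" and self._in_row:
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._in_row:
            self._in_row = False
            self.rows.append(tuple(self._row))

    def handle_data(self, data):
        if self._in_td:
            self._row.append(data.strip())

def scrape_and_store(html, conn):
    parser = RowScraper()
    parser.feed(html)
    # Step 4: verify before storing -- every row must have a name
    # and a price that parses as a number.
    rows = [(name, float(price)) for name, price in parser.rows if name]
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, price REAL)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
print(scrape_and_store(SAMPLE_PAGE, conn))  # 2
```

When a site changes its markup (step 6), only that site's parser class needs updating; the verify-and-store half stays the same.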
answered Nov 24 '11 at 08:55
Gary E
12,510 points


It depends on the data you are getting. If it is data that the websites you are targeting readily share, it will likely be accessible through some kind of data feed that they provide. If it's not readily available, you may need to 'scrape' it, as Gary mentions.

Once you have the data, depending on how you got it, you may need to 'map' it into your database. If you scraped it, you (or whoever did the scraping) may have been able to pull each data set from each site in the same way, so it will already be arranged in a consistent table. If you got it from a feed (a .csv or .xml file, for example), you may need to write a script to 'map' each data column onto the layout of your database.
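The 'mapping' step can be sketched in a few lines. The feed headers here ("Product Title", "Cost USD") and the target column names are hypothetical; in practice each site's feed would get its own mapping dict pointing at your master schema.

```python
import csv
import io

# Hypothetical feed from one site, as it might arrive over HTTP.
FEED = "Product Title,Cost USD,SKU\nWidget,9.99,W-1\nGadget,4.50,G-7\n"

# feed column name -> master database column name
MAPPING = {"Product Title": "name", "Cost USD": "price"}

def map_feed(text, mapping):
    """Rename each record's columns to match the master schema,
    dropping columns the schema does not need (SKU here)."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({ours: record[theirs] for theirs, ours in mapping.items()})
    return rows

print(map_feed(FEED, MAPPING))
# [{'name': 'Widget', 'price': '9.99'}, {'name': 'Gadget', 'price': '4.50'}]
```

The same `map_feed` then works for every site; only the per-site `MAPPING` dict changes.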

Specific methods depend on what type of data you're getting, what kind of database you want to use, and what you want to do with the data. In my experience I have used PHP to pull XML, CSV, and TXT feeds and aggregate them into MySQL databases, but I have not done much 'scraping'.
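That feed-to-database aggregation looks roughly like this. The sketch uses Python's standard library and SQLite as stand-ins for the PHP/MySQL stack mentioned above, and the feed's tag names (`<item>`, `<name>`, `<price>`) are invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML feed from one source site.
FEED_XML = """<feed>
  <item><name>Widget</name><price>9.99</price></item>
  <item><name>Gadget</name><price>4.50</price></item>
</feed>"""

def aggregate_xml(xml_text, conn, source):
    """Parse one site's feed and append its rows to the master table,
    tagging each row with the source it came from."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS master (source TEXT, name TEXT, price REAL)"
    )
    rows = [
        (source, item.findtext("name"), float(item.findtext("price")))
        for item in ET.fromstring(xml_text).iter("item")
    ]
    conn.executemany("INSERT INTO master VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
aggregate_xml(FEED_XML, conn, "example-feed")
print(conn.execute("SELECT COUNT(*) FROM master").fetchone()[0])  # 2
```

Calling `aggregate_xml` once per site builds up the master table, and the `source` column lets you trace any row back to the site it was pulled from.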

answered Nov 24 '11 at 10:55
31 points


Have a look at - it is the Mechanical Turk for scraping websites.

answered Feb 17 '12 at 20:31
23 points
