State of Data Last Week – Sep 05
September 5, 2010
<Cool Numbers> What do you like to eat? Look at your phone!
- iPhone users purchase fish 26 times more than Android users.
- Facebook serves 1.2M photos a second.
- Progress in the numbers game – Rafael Nadal’s forehand averages 3,300 revolutions per minute. For comparison, Agassi’s was a mere 1,800.
- What comes next? 500, 400, 300, 200, 100, –? Per Washington D.C. math it’s “UNIT”!
<Strategy> From MIT, the best-ever Data Management checklist. Every project should have this, or a similar one, validated at the start!
<Analysis> “median Fortune 1000 company could increase its revenue by $2.01 billion a year just by marginally improving the usability of the data already at its disposal”
Start toward your first million by tracking these 6 metrics for your web app. Everything is laid out in this excellent spreadsheet model for easy math! (Note: FreshBooks follows this model)
<Big Data> Pomegranate stores billions of tiny little files – no SPOF using ‘distributed extensible hash table’.
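For a feel of the idea behind that phrase, here is a minimal single-process sketch of classic extendible hashing, where the directory doubles and only the overflowing bucket splits. It is not Pomegranate's code: the class names, `bucket_size`, and the use of Python's built-in `hash` are all assumptions made for illustration, and the distributed part (spreading buckets across servers) is left out entirely.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth   # local depth: how many low hash bits this bucket "owns"
        self.items = {}


class ExtendibleHash:
    """Toy in-memory extendible hash table (illustrative only)."""

    def __init__(self, bucket_size=4):
        self.bucket_size = bucket_size
        self.global_depth = 0
        self.directory = [Bucket(0)]   # always 2 ** global_depth slots

    def _slot(self, key):
        # Index the directory by the low global_depth bits of the hash.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._slot(key)].items.get(key, default)

    def put(self, key, value):
        # Caveat: more than bucket_size keys with identical hashes would
        # make this loop split forever; a real system needs overflow handling.
        while True:
            bucket = self.directory[self._slot(key)]
            if key in bucket.items or len(bucket.items) < self.bucket_size:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the directory; slot i and slot i + old_size share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        high_bit = 1 << (bucket.depth - 1)
        # Slots that pointed at the old bucket and have the new bit set move over.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Redistribute only the keys of the bucket that overflowed.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._slot(k)].items[k] = v


table = ExtendibleHash(bucket_size=2)
for i in range(100):
    table.put(i, i * i)
print(table.get(7), table.global_depth)
```

The appeal for billions of tiny files is that growth is incremental: a full bucket splits on its own, and nothing else in the table gets rehashed.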
How CERN (LHC) manages 20PB of data in JBOD (Just a Bunch Of Disks) hooked up to Linux boxes
<Schema> How can you represent inheritance in a SQL server (RDBMS) database?
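One common answer is "class table inheritance": a base table holds the shared columns, and each subtype gets its own table whose primary key is also a foreign key to the base. The sketch below uses Python's built-in `sqlite3` so it runs anywhere. The `person` / `employee` / `customer` tables and the helper functions are hypothetical names made up for illustration, and the linked discussion may well favor a different pattern (single-table or concrete-table inheritance).

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript("""
        CREATE TABLE person (               -- base "class": shared columns
            person_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE employee (             -- subtype: key doubles as FK to the base
            person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
            salary    INTEGER NOT NULL
        );
        CREATE TABLE customer (             -- another subtype
            person_id    INTEGER PRIMARY KEY REFERENCES person(person_id),
            loyalty_tier TEXT NOT NULL
        );
    """)
    return conn


def add_employee(conn, name, salary):
    # Insert the base row first, then the subtype row with the same key.
    cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
    conn.execute(
        "INSERT INTO employee (person_id, salary) VALUES (?, ?)",
        (cur.lastrowid, salary),
    )
    return cur.lastrowid


def add_customer(conn, name, tier):
    cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
    conn.execute(
        "INSERT INTO customer (person_id, loyalty_tier) VALUES (?, ?)",
        (cur.lastrowid, tier),
    )
    return cur.lastrowid


def employees(conn):
    # Reading a subtype means joining it back to the base table.
    return conn.execute(
        "SELECT p.name, e.salary FROM person p "
        "JOIN employee e USING (person_id) ORDER BY p.person_id"
    ).fetchall()


conn = make_db()
add_employee(conn, "Ada", 120000)
add_customer(conn, "Grace", "gold")
print(employees(conn))
```

The trade-off is an extra join on every read in exchange for no NULL-filled columns and real foreign-key guarantees on each subtype.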
<Learning> While you’re at it, check out the book ‘SQL Antipatterns’. If you liked Martin Fowler, you’ll love this as well. It has been out for less than a couple of months!
The problems with ACID, and how to fix them without going NoSQL – Daniel Abadi proposes lock avoidance to fix the problem (Actual Paper – PDF)
<Visualization> 500 years of scientific progress as a subway map. Now science has a timeline, thanks to data.
<Cocktail party cheat-sheet> Heavy drinkers outlive abstainers! “even after adjusting for all covariates, abstainers and heavy drinkers continued to show increased mortality risks of 51 and 45%, respectively, compared to moderate drinkers.”