State of Data Last Week – Oct 09
October 10, 2010
<Cool Numbers> Decline in marriage, can “Perfect 10” date help?
- For the first time in US history, ‘Never Married’ exceeds ‘Married’ among young adults (the 25-34 age group), per US Census data
- For certain small business owners (florists, caterers, wedding planners), average invoices are way higher today (10/10/10). Interestingly, about 10x as many couples (39,000) are tying the knot today as on the comparable Sunday last year. Next surge: 11/11/11
- Iowa State University thinks a murder costs society $17.25M – a commenter refutes.
- Why Data Services are not so easy – each tweet (at most 140 characters) typically becomes about 1000 bytes when accessed over the detailed (‘FireHose’) API
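To put that last number in perspective, here is a rough back-of-envelope sketch in Python. It assumes a full 140-character tweet at roughly one byte per character and takes the 1000-byte payload figure from the item above. The daily tweet count is a purely hypothetical input for illustration, not a figure from any source.

```python
# Back-of-envelope: how much bigger is a tweet on the wire than its text?
TWEET_TEXT_BYTES = 140   # assumption: 140 characters at ~1 byte each
PAYLOAD_BYTES = 1000     # typical per-tweet size over the FireHose API (from the item above)


def overhead_ratio(payload_bytes: int, text_bytes: int) -> float:
    """Bytes delivered per byte of actual tweet text."""
    return payload_bytes / text_bytes


def daily_volume_gb(tweets_per_day: int, payload_bytes: int = PAYLOAD_BYTES) -> float:
    """Total daily volume in decimal gigabytes for a given tweet count."""
    return tweets_per_day * payload_bytes / 1e9


ratio = overhead_ratio(PAYLOAD_BYTES, TWEET_TEXT_BYTES)
print(f"Each byte of tweet text costs about {ratio:.1f} bytes on the wire")

# Hypothetical volume, only to show how the overhead adds up.
hypothetical_tweets_per_day = 50_000_000
print(f"{daily_volume_gb(hypothetical_tweets_per_day):.0f} GB/day at that rate")
```

In other words, the text itself is only about a seventh of what a consumer of the feed has to store and move, which is a big part of why a data service is harder than it looks.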
<Analysis> Why Macy’s loaded its Atlanta stores with hats
World Bank Data Challenge.
<Inside Intuit> Refer great data people you know to work with us –
<Strategy/Arch> What a ‘Data Scientist’ does: Obtain, Scrub, Explore, Model, and Interpret
Sometimes all we need is ‘Duct-tape Architecture’ – just scaling the system past the point where it breaks. Great example here of scaling with SSDs
<Big Data> eBay throws out its 6.5-petabyte Greenplum deployment and moves the data into Teradata.
<Schema> How to do MapReduce with good old Oracle database
<DBMS> Impending MySQL price increases.
<Visualization> Visualizing source-code – how Apache differed from PostgreSQL
<Cocktail party cheat-sheet> The Foursquare outage happened because one of the shards suddenly held more data (67GB) than it had RAM (only 66GB, on a virtual Amazon EC2 instance). Apparently, MongoDB barfs if it has to read from disk, even for little ‘I checked into Restaurant X’ details (each about 300 bytes).
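For the curious, the arithmetic behind that cheat-sheet can be sketched in a few lines of Python. This is only a rough illustration using the figures quoted above, with decimal gigabytes assumed. It ignores indexes, the operating system, and anything else competing for memory, so a real shard would run out of room even sooner.

```python
# Rough check: does a shard's data fit in RAM, and how many check-ins is that?
GB = 10**9  # assumption: decimal gigabytes

SHARD_DATA_BYTES = 67 * GB   # data on the overloaded shard (from the item above)
SHARD_RAM_BYTES = 66 * GB    # RAM on the EC2 instance (from the item above)
CHECKIN_BYTES = 300          # approximate size of one check-in record


def fits_in_ram(data_bytes: int, ram_bytes: int) -> bool:
    """True if the whole data set can be held in memory."""
    return data_bytes <= ram_bytes


def records_that_fit(ram_bytes: int, record_bytes: int) -> int:
    """How many fixed-size records fit in the given amount of memory."""
    return ram_bytes // record_bytes


overflow_bytes = SHARD_DATA_BYTES - SHARD_RAM_BYTES

print("Fits in RAM:", fits_in_ram(SHARD_DATA_BYTES, SHARD_RAM_BYTES))
print("Check-ins that fit in RAM:", records_that_fit(SHARD_RAM_BYTES, CHECKIN_BYTES))
print("Check-ins spilling to disk:", overflow_bytes // CHECKIN_BYTES)
```

The takeaway for the cocktail party: being over by just 1GB still means millions of tiny records that can only be served from disk.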