State of Data #114
August 31, 2012
Milton Friedman’s Thermostat (or, why we need to be very careful of correlations)
‘[Imagine] you were a passenger in a car watching the driver trying to keep a constant speed on a hilly road. You would see the gas pedal going up and down. You would see the car going downhill and uphill. But if the driver were skilled, and the car powerful enough, you would see the speed stay constant.
So, if you were simply looking at this particular “data generating process”, you could easily conclude: “Look! The position of the gas pedal has no effect on the speed!”’
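The driver example can be reproduced with a toy simulation. The sketch below is only an illustration: the linear speed model, the `gain` of 5, the noise level, and the function names `simulate` and `pearson` are all assumptions, not anything taken from the linked post. It compares a skilled driver who offsets every hill with the pedal against a controlled experiment on a flat road where the pedal is moved at random.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=10_000, seed=0, skilled_driver=True):
    """Return (pedal, speed) samples under a hypothetical linear model.

    speed = target + gain * pedal - gain * slope + noise

    skilled_driver=True : hilly road, pedal exactly offsets each slope.
    skilled_driver=False: flat road, pedal is moved at random
                          (a controlled experiment).
    """
    rng = random.Random(seed)
    target, gain = 100.0, 5.0  # assumed values, chosen for illustration
    pedal, speed = [], []
    for _ in range(n):
        if skilled_driver:
            slope = rng.gauss(0, 1)
            p = slope              # driver cancels the hill
        else:
            slope = 0.0
            p = rng.gauss(0, 1)    # pedal randomized by the experimenter
        noise = rng.gauss(0, 0.5)  # small disturbances the driver cannot see
        pedal.append(p)
        speed.append(target + gain * p - gain * slope + noise)
    return pedal, speed


if __name__ == "__main__":
    for skilled in (True, False):
        pedal, speed = simulate(skilled_driver=skilled)
        print(f"skilled_driver={skilled}: corr(pedal, speed) = "
              f"{pearson(pedal, speed):.3f}")
```

Under these assumptions the skilled-driver run shows a correlation near zero even though the pedal has a large causal effect on speed, while the randomized run shows a correlation close to one. The same mechanism produces both data sets; only the way the pedal is chosen differs.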
Facebook has about 180K servers, Google about 1M. How to estimate the number of servers from a company’s energy consumption data.
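One back-of-the-envelope way to make such an estimate is sketched below. This is not necessarily the method used in the linked piece: the function name `estimate_servers`, the average per-server power draw, and the PUE (power usage effectiveness, the ratio of total facility energy to IT equipment energy) are assumptions that the reader would have to supply, and the figures in the usage example are made-up round numbers, not data for any real company.

```python
HOURS_PER_YEAR = 24 * 365


def estimate_servers(annual_energy_kwh, watts_per_server, pue):
    """Rough server count implied by a yearly energy figure.

    annual_energy_kwh : total data-center energy use per year (kWh)
    watts_per_server  : assumed average draw of one server (W)
    pue               : assumed power usage effectiveness (>= 1.0)
    """
    # Energy that actually reaches the IT equipment.
    it_energy_kwh = annual_energy_kwh / pue
    # Energy one server consumes in a year.
    kwh_per_server = watts_per_server / 1000 * HOURS_PER_YEAR
    return it_energy_kwh / kwh_per_server


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formula.
    count = estimate_servers(annual_energy_kwh=2_628_000,
                             watts_per_server=200,
                             pue=1.5)
    print(f"Estimated servers: {count:,.0f}")
```

The result is only as good as the assumed per-server wattage and PUE, so an estimate like this is best read as an order of magnitude.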
Changing Panorama of Data (Martin Fowler)
“big data” is when the size of the data itself becomes part of the problem
Paper on ‘FunSQL: It is time to make SQL functional’
Amazon has changed 1B people’s purchases; Google has changed 1B people’s information access; Facebook has changed 1B people’s identity.
If you had access to ALL the data in the world, what would you do? “Dinner with Data” (courtesy: The Stanford Alumni Club of Shanghai) discusses the question.
Slides from a great talk on Data Quality from the Deep Web (AT&T Research)
- Excel 2013 suggests Best Chart for your Data
- Freeze to Sizzling – (a) Amazon Glacier – Archival @ $0.01/GB/month; (b) 2 TB SSD in EC2 for rent!
- Top 10 Most Caps Locked Words in Twitter
- ‘But if we accept Big Data in our servers, we will be saved from bankruptcy’