State of Data Last Week – Oct 23
October 23, 2010
<Analysis> Have diminishing returns set in for investments in higher education? A somewhat counter-intuitive analysis of BLS data suggests so.
<API> Nasdaq releases on-demand historical stock data – nice way to test trading algorithms with accurate, bulk data.
<Architecture> Michael Stonebraker clarifies his position – the CAP theorem should not be a justification for giving up on ACID. Also, “rm -rf” type errors cannot be recovered from in CAP theorem terms.
<Big Data> Showing off? The average Hadoop cluster has 66 nodes and 114 TB of data. eBay has 8,500 nodes and 2 PB.
<Learning> Cary Millsap continues his “Thinking Clearly about Performance” in ACM – should you open the window (global stuff) or take off your heavy sweater (local)?
<Visualization> Hipmunk's visuals for airline reservations – available flights are aggregated on a single chart and default-sorted by “agony” (price; duration; layover; red-eye).
- Two sides of reference – Wikipedia puts zero tracking files on your computer; dictionary.com puts 234 cookies / beacons.
- REST-style service APIs are steadily winning the API war over SOAP. 74% of the 2,000 most popular Web APIs now use REST.
- Why a house numbered < 31 (or with digits adding up to 6) could sell earlier – many buyers unwittingly get ‘nudged’ to live in a house numbered after a birthday / wedding anniversary.
- Using auto-increment / serial numbers for entities is standard practice in modeling. It can also give away a lot of secrets: for example, it lets outsiders estimate iPhone sales, or helps statisticians win (real) wars.
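To make the serial-number point concrete, here is a minimal sketch of the classic frequentist estimator from the "German tank problem", which is presumably the wartime story the last item alludes to. It assumes serial numbers run 1..N without gaps and that the observed ones are a random sample without replacement; the function name `estimate_total` and the sample values are made up for illustration.

```python
def estimate_total(serials):
    """Estimate how many items exist from a few observed serial numbers.

    Assumes serials are issued sequentially from 1 with no gaps, and that
    the observed ones were sampled at random without replacement.
    """
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    # Largest serial seen, plus the average gap between observations.
    return m + m / k - 1


# Hypothetical sample of four observed serial numbers.
observed = [19, 40, 42, 60]
print(estimate_total(observed))  # 74.0
```

With only four observations the estimate lands well above the largest serial seen, which is why exposing sequential IDs leaks more than it appears to. Random or hashed identifiers avoid this.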