State of Data #63
August 26, 2011
#analysis – A 46-page Internet Marketing Strategy “briefing looking at customer centricity, channel diversification, data, social media and content strategy. This is their usual high grade quality and worth a look”
Disdain Data Diving – “Today’s Big Data heavy-lifting machines and software systems were built back in the day when millions of customers made millions of phone calls and each one had to be captured, stored, and found in a heartbeat. Banking and credit card transactions by the billions had to be put into safekeeping somewhere they could be added up, averaged, and recalled if need be.”
#architecture – MongoDB loves BSON (Binary JSON) for Data Exchange –
“Fast scan-ability. For very large JSON documents, scanning can be slow. To skip a nested document or array we have to scan through the intervening field completely. In addition as we go we must count nestings of braces, brackets, and quotation marks. In BSON, the size of these elements is at the beginning of the field’s value, which makes skipping an element easy.”
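To make the length-prefix point concrete, here is a minimal sketch in Python that hand-encodes a tiny BSON document and then lists its top-level keys without ever looking inside the nested document. The helper names (`encode_document`, `value_size`, `top_level_keys`) are hypothetical and written for this illustration; only a small subset of BSON element types is handled, and a real application would use a proper BSON library instead.

```python
import struct

def cstring(s: str) -> bytes:
    """BSON element names are null-terminated UTF-8 strings."""
    return s.encode("utf-8") + b"\x00"

def encode_document(elements: bytes) -> bytes:
    # Total size counts the 4-byte size field, the elements, and the 0x00 terminator.
    return struct.pack("<i", 4 + len(elements) + 1) + elements + b"\x00"

def int32_element(name: str, value: int) -> bytes:
    return b"\x10" + cstring(name) + struct.pack("<i", value)

def string_element(name: str, value: str) -> bytes:
    raw = value.encode("utf-8") + b"\x00"
    # The string length prefix counts the trailing null byte.
    return b"\x02" + cstring(name) + struct.pack("<i", len(raw)) + raw

def document_element(name: str, doc: bytes) -> bytes:
    return b"\x03" + cstring(name) + doc

# Fixed-width types: double, boolean, null, int32, int64.
FIXED_SIZES = {0x01: 8, 0x08: 1, 0x0A: 0, 0x10: 4, 0x12: 8}

def value_size(data: bytes, pos: int, type_byte: int) -> int:
    """Return how many bytes the value at `pos` occupies, reading only its prefix."""
    if type_byte in FIXED_SIZES:
        return FIXED_SIZES[type_byte]
    (n,) = struct.unpack_from("<i", data, pos)
    if type_byte == 0x02:          # string: 4-byte prefix + n bytes
        return 4 + n
    if type_byte in (0x03, 0x04):  # embedded document / array: size includes the prefix
        return n
    raise ValueError(f"element type 0x{type_byte:02x} not handled in this sketch")

def top_level_keys(data: bytes) -> list:
    """List top-level keys, jumping over each value instead of scanning it."""
    (total,) = struct.unpack_from("<i", data, 0)
    pos, end = 4, total - 1        # stop before the document terminator
    keys = []
    while pos < end:
        type_byte = data[pos]
        name_end = data.index(b"\x00", pos + 1)
        keys.append(data[pos + 1:name_end].decode("utf-8"))
        pos = name_end + 1
        pos += value_size(data, pos, type_byte)  # skip the value in one step
    return keys

inner = encode_document(string_element("city", "Oslo") + int32_element("zip", 150))
doc = encode_document(
    int32_element("id", 1)
    + document_element("address", inner)
    + string_element("name", "Ada")
)
print(top_level_keys(doc))
```

The nested `address` document is skipped by reading a single 4-byte size, however large it is; the equivalent JSON scan would have to walk every character while tracking braces, brackets, and quotes.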
#big_pig_data – Angry Birds is played 1.4B minutes a week. Now its maker has partnered with a predictive analytics solution provider to help forecast pig-smashing abilities.
#Data_Science – Multiple packages in R to read online datasets
#DBMS – A phenomenal paper from NoCOUG on ‘NFS Tuning for Oracle’ (PDF) by Kyle Hailey.
#idea – Facebook engineer suggests reducing disk RPM to reduce data center power cost
#learning – What Every Data Programmer Needs to Know about Disks (PPT; from OSCON 2011) – very highly recommended, especially for ‘Why EC2 I/O is Slow and Unpredictable’ –
“Newer intel chips have the northbridge controller on-die. Southbridge bandwidth is usually <= 10GB/sec, and you are sharing this with other customers’ network and disk I/O. That, and you may be sharing drive spindles.”
#visualization – Stanford’s ‘Republic of Letters’ visualization – “on database of thousands of letters exchanged between prominent intellectuals in the 17th and 18th centuries” – is built in HTML5. It offers connection, volume and flow views of over 55,000 letters exchanged among 6,400 correspondents.
- ‘Modern explorer is a statistician..discovers things that no one has seen before’ – BBC4 interviews statisticians, e.g. ‘Statisticians stare at your shoes, while mathematicians stare at their own shoes while talking’
- #math What are the odds? A woman wins her FOURTH lottery! Over $20M in combined winnings.
- Normal? How a statistician views gas pumps, fast-food restaurant entrance doors and parking spots.
- Risk Analysis – King is the most dangerous job in history. #WhoKnew