State of Data #55
June 30, 2011
#analysis – ‘Web Analytics Career Guide – From Zero to Hero in Five Steps’. “I’ve said repeatedly that if I look into the next xx number of years Analyst is essentially a recession-proof job”.
#architecture – Safari now has the complete video tutorial of ‘An Introduction to Machine Learning with Web Data’ from Hilary Mason, Data Scientist at bit.ly. It runs for about 3 hours and covers four key areas. Highly recommended.
#big_data – The cost of cloud computing keeps falling: an Amazon Web Services announcement on 06/29 made “Data Transfer In” free and sharply reduced “Data Transfer Out” rates.
#DBMS – Ever wondered how complicated a simple two-table join can be? Jonathan Lewis, one of the best scientific thinkers out there, recently presented on that topic at the Turkish Oracle User Group. The video of the session runs for 55 minutes.
Key takeaways –
- ‘mid-90s through to around 2005, the database world went through dark ages’
- ‘The pace of innovation was glacial – “polishing the round ball”’
- ‘plunging cost of computing is fueling database size growth at a super-Moore pace’
- ‘disk is tape, flash is disk, ram locality is king’, ‘crossed storage chasm’
#visualization – A good play on visualizing statistical open data — ‘Peoplemovin is an experimental project in data visualization by Carlo Zapponi. The main purpose of this project is to create a flow chart visualization framework based on HTML5 technologies’
- Museum of Me – Visualize your life, friends and consciousness with help from cool robotic arms. A good interplay of social data and 3D visualization from Intel. Perhaps subtly narcissistic.
- At 20% YOY growth, when does the data quadruple? About 7 years by the rule of 140 (140 / 20), or closer to 7.6 years when compounded exactly. – ‘The math behind the rule of 72 is easy to extend to triplings (rule of 110), quadrupling (rule of 140), quintupling (rule of 160)’
- von Neumann’s Elephant can indeed be drawn with four parameters if they are complex numbers – with the sample Python code. Brilliant!
- SQL to JFK – San Carlos Airport, next to Larry Ellison’s pod, has the IATA code ‘SQL’
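
The quadrupling arithmetic in the growth-rate item above is easy to check. Here is a minimal Python sketch that compares the quoted rule-of-N shortcuts with the exact compound-growth answer. The function names and the `RULE_CONSTANTS` table are made up for illustration; only the constants 72, 110, 140 and 160 come from the quoted post.

```python
import math

# Rule-of-N constants as quoted: doubling, tripling, quadrupling, quintupling.
# (Hypothetical table name; the numbers themselves are from the quoted rule.)
RULE_CONSTANTS = {2: 72, 3: 110, 4: 140, 5: 160}


def rule_of_n_years(growth_pct: float, factor: int) -> float:
    """Shortcut estimate: years ~= N / growth rate in percent."""
    return RULE_CONSTANTS[factor] / growth_pct


def exact_years(growth_rate: float, factor: float) -> float:
    """Exact compound-growth answer: solve (1 + r)^t = factor for t."""
    return math.log(factor) / math.log(1.0 + growth_rate)


if __name__ == "__main__":
    # 20% year-over-year growth, time to quadruple.
    print(f"rule of 140: {rule_of_n_years(20, 4):.1f} years")
    print(f"exact:       {exact_years(0.20, 4):.1f} years")
```

At 20% the shortcut says 7 years and the exact figure is about 7.6, so the rule runs a little optimistic at high growth rates and gets closer as the rate drops.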