State of Data Last Week – Nov 06


<Inside Intuit> Database Engineer (10+ years of experience) position open in EMS (Reno, Nevada)

Director, Data Offerings position open in BIO (Mountain View, CA)


<Analysis> Expedia removed an optional field (Company) from its “Buy Now” page – the field had been costing it $12M in profit a year.

<Architecture> When we like a pastry shop – Yelp uses Amazon Elastic MapReduce (EMR) to analyze its data (100GB/day) using mrjob, a Python framework for writing MapReduce jobs. Yelp took down its in-house Hadoop clusters in May 2010.
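
For anyone who has not written a MapReduce job, here is a minimal sketch of the map and reduce steps in plain Python. It is not Yelp's code and it does not use mrjob itself: the function names, the tiny local driver and the sample lines are all assumptions made for illustration. With mrjob, the mapper and reducer would be methods on a job class, and the framework (not a local loop) would do the grouping.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_local(lines):
    """Hypothetical in-process driver standing in for the cluster's shuffle phase."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)  # group values by key, as the shuffle would
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Made-up sample input, not real log data.
    sample = ["great pastry shop", "great coffee", "pastry and coffee"]
    print(run_local(sample))
```

The point of the split is that each mapper call sees only one line and each reducer call sees only one key, which is what lets a cluster spread the work across machines.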

<Big Data> A list of references for mining streaming data – map-reduce is not a great fit for streaming / non-stored data, as the user does not know beforehand what data, or how much of it, there is to analyze. Yahoo’s S4 is quickly getting popular as a distributed stream computing platform.
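
To make the "you do not know how much data is coming" problem concrete, here is a small sketch of reservoir sampling, one classic single-pass technique for streams of unknown length. It is my own illustration, not taken from the linked references and not related to how S4 works; the function name and parameters are assumptions.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    The stream is read once and only k items are ever held in memory.
    """
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Item i (0-based) replaces a kept item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


if __name__ == "__main__":
    # A generator stands in for a live feed; nothing is stored up front.
    events = (f"event-{n}" for n in range(100000))
    print(reservoir_sample(events, 5))
```

Unlike a batch job, this never needs the total size of the input: at any moment the reservoir is a fair sample of everything seen so far.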

<DBMS> InnoDB – the faster storage engine for MySQL – is no longer available for free in “Classic MySQL” :(

<Learning> Adrian Cockcroft (Netflix Performance Architect; ex-Sun) wrote an amazing set of articles comparing noSQL availability models, and What Netflix needs from noSQL.

<Visualization> The best graphical analysis of the 2010 election, in 10 visuals, comes from The New York Times.


  • How long will search be the king? Twice as many people in the 18-29 age group discover a product or service through a social network (Facebook), compared to consumers of all age groups.
  • Facebook profile photo angled at 15°? Fun-loving. 16°? Uh-oh. Risky business – Fast Company analyzes.
  • Can mod_pagespeed from Google really speed up your site 2x? Here’s a quick way to find out from another proxy.
  • Stats on P2P file sharing: Largest – 746GB (all 2010 World Cup Soccer matches; ~6GB per 45 min); Oldest – The Matrix ASCII; Most data transferred by a single torrent – 15.77PB (StarCraft 2)



About Nilendu Misra
I love to learn, create and coach. Things that I do well are - Communicating ideas - verbally or through words and diagrams; Problem Solving - Logical or Abstract; Very Large Scale Systems; think about 'Frighteningly Simple' approach first. Things that I intend to do better are - Establishing Stringent Process; Exchanging Tough Feedback; Keeping up with my reading or To-Do list to be able to completely relax.
