State of Data #73
November 4, 2011
#analysis – ‘How good is your data’ for analysis? ‘It points to the “garbage in garbage out” problem. One should always be aware of the potential hazards.’ (Hat tip: Kaiser Fung). The Murky World of Student Loan Statistics – “And yes, the new numbers will show that student-loan debt exceeds credit-card debt.”
#architecture – How Instagram stored hundreds of millions of key-value pairs in Redis – “Fit the data in memory, and ideally within one of the EC2 high-memory types (the 17GB or 34GB, rather than the 68GB instance type)”
#DBMS – Serving 1M daily users, 100K DB operations/sec with no cache – journey from MySQL to Redis
#idea – SaveUp – “Unlike most loyalty programs out there that give you credit based on how much you spend, SaveUp rewards you for how much you save”
#learning – Technical papers on optimizer
#visualization – WhatsUp – Timeline view of the most popular Twitter topics (takes a while to load)
- If you use Siri, how much more data do you consume? “Average of 36.7KB per query”
- #dataviz We are 7 billion now, but each of us still has space equivalent to Red Square, Moscow, to stand on!
- If you’re buying a house in Missouri, does the price depend on the gender of your real estate agent?
- Best Statistics Question ever