State of Data #56

#analysis – Mining Twitter for consumer attitudes towards airlines (using R) – (a) “search twitter in 1 line of code”; (b) estimate sentiment from an ‘opinion lexicon’ (how to analyze sarcasm); (c) score/compare/rinse/repeat
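For readers who want a feel for step (b), here is a minimal sketch of lexicon-based scoring, written in Python rather than the R used in the linked post. It assumes the simplest possible rule: count the positive words, subtract the count of negative words. The word lists and the function names `score_sentiment` and `average_score` are made up for illustration; a real opinion lexicon has thousands of entries, and this is not the linked author's code.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
# A real opinion lexicon contains thousands of words.
POSITIVE = {"great", "love", "friendly", "comfortable", "smooth"}
NEGATIVE = {"late", "lost", "rude", "delayed", "worst", "cancelled"}


def score_sentiment(text):
    """Return (# positive words) - (# negative words) for one tweet."""
    # Lowercase and split into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives


def average_score(tweets):
    """Average score over a list of tweets (0.0 for an empty list)."""
    if not tweets:
        return 0.0
    return sum(score_sentiment(t) for t in tweets) / len(tweets)


if __name__ == "__main__":
    sample = [
        "Great crew, love the friendly service",
        "Flight delayed again and they lost my bag, worst airline",
        "Oh great, another delay",  # sarcastic, but scored as positive
    ]
    for tweet in sample:
        print(score_sentiment(tweet), tweet)
    print("average:", average_score(sample))
```

The third sample shows why sarcasm is a problem for this approach: word counting sees “great” and nothing else, so the tweet scores as positive.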



#architecture – Facebook has a “serious MySQL problem”?

1,800 servers dedicated to MySQL and 805 servers dedicated to memcached

“…it has so much user data, and because every user clicking “Like,” updating his status, joining a new group or otherwise interacting with the site constitutes a transaction its MySQL database has to process. Every second a user has to wait while a Facebook service calls the database is time that user might spend wondering if it’s worth the wait.”


#big_data – Patriot Act vs. Data Protection Acts in Europe – what happens when they conflict? Very pertinent for that ‘cloud’ thing


#conference – Another TDWI Summit – “Deep Analytics for Big Data”, San Diego, Sept 25-27


#competition – Wikimedia announces ‘a data modeling competition to develop an algorithm that predicts future editing activity on Wikipedia’

#DBMS – Counterintuitive Fact #2 – A good hardware upgrade could kill the performance of your application. “Daily WTF” analyzes one of the many “whys”:

“Prior to the upgrade, at Wal*Mart waddling speed, the application trickled through the database table, and that meant very little happened in any given second. But after the upgrade, a number of order lines processed quickly, and suddenly the fact that some orders had the same item on two lines meant that the transaction exploded. Roughly 50% of the time that an order had duplicate lines, it now failed.”


#visualization – Real estate data viz from Trulia: ‘When does crime happen in big cities’. San Francisco, beware of 9PM!

Skyscraper of Mobile Phone Call Data – Data or Abstract Art? (from New York Times)


  • Twitter acquires BackType for Social Analytics
  • Nordstrom Rack – the only winner in the Groupon war? This amazing data visualization from Harvard Business Review suggests so.
  • #math – Celebrate a truly odd day this Saturday. Next one is on 9-11-13
  • Data analysis could lead to Meatless Mondays? – “83% of the average U.S. household’s carbon footprint for food comes from growing and producing it. Transportation is only 11%”; “one day per week’s worth of calories from red meat and dairy products … achieves more GHG reduction than buying all locally sourced food”

About Nilendu Misra
I love to learn, create and coach. Things that I do well are - Communicating ideas - verbally or through words and diagrams; Problem Solving - Logical or Abstract; Very Large Scale Systems; think about 'Frighteningly Simple' approach first. Things that I intend to do better are - Establishing Stringent Process; Exchanging Tough Feedback; Keeping up with my reading or To-Do list to be able to completely relax.
