State of Data #62

#analysis Hotmail product usage data analysis and how it influences the design:

“three types based on their behavior: Filers, Pilers, and Deleters...
Deleters generally delete email after it arrives. Deleters receive an average of 211 email messages each week and end up deleting almost 80% of them... The mantra for these people is, “My kitchen has to be clean before I start cooking.”

Filers put nearly half of their email (44%) into folders immediately after it arrives.

Pilers receive the least amount of email each week (174 messages). But that means they still receive an average of 9,048 email messages per year. Because most of those messages (57%) never leave the Piler’s inbox, their email starts to pile up.”


Google has started an Analytics certification, with detailed “Analytics IQ Lessons” culminating in an exam.

#big_data The whole controversy around KissMetrics’ data collection practices, and their official response to the allegations.

#conference ACM Data Mining Camp, October 2011 – “local, cheap, and high-quality learning opportunity”

#Data_Science Verifying Benford’s Law on Tweets – it works!
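For readers who want to try the check themselves, here is a minimal sketch of a Benford’s Law comparison. It is not the linked article’s method and does not use tweet data: as a stand-in, it uses the first 1,000 powers of 2, a sequence known to follow the law closely. The function names are made up for illustration.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Benford's Law: P(leading digit = d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / digit)


def leading_digit(n: int) -> int:
    """First decimal digit of a non-zero integer."""
    return int(str(abs(n))[0])


def leading_digit_freqs(values):
    """Observed share of each leading digit 1..9 (zeros are skipped)."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Stand-in data (an assumption, not tweet counts): 2**1 .. 2**1000.
sample = [2 ** k for k in range(1, 1001)]
observed = leading_digit_freqs(sample)

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.3f}  expected {benford_expected(d):.3f}")
```

To test real data, replace `sample` with any list of counts (followers, retweets, and so on) and compare the two columns.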

#DBMS Most Big Data engineers mention ‘performance’ as the #1 priority. The ‘3-minute test: What do you know about SQL Performance’ lets you gauge your strengths; choose between MySQL, Oracle, PostgreSQL, and SQL Server, and hammer away.


Are we becoming too analytical? Some serious introspection on a possible ‘bandwagon effect’ around ‘big data’ and ‘analytics’:

“But the biggest reason I believe these two products have not taken off is their reliance on the belief that simply giving people their data and letting them analyze it is the way to improve behavior (both for health and for the environment).

One of the first things we teach in introductory human-computer interaction (HCI) is that “you are not your user” and “beware designer ego bias.” Google seemed to have fallen into this well-known trap in their design and testing for Google PowerMeter (and perhaps Google Health).”


#learning Stanford University courses on data are FREE for Fall 2011. Each course requires about 10 hrs of work a week, and classes begin on October 10.


#math/stat How likely is it for a telephone number (without area code) to be prime? About 6%. With the area code, it is somewhere around 4%.
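Those figures can be sanity-checked with the prime number theorem, which puts the density of primes near N at roughly 1/ln(N). The sketch below assumes a phone number behaves like a uniformly random 7-digit (or 10-digit, with area code) integer, which ignores real numbering rules; the function names are illustrative. It also includes a small sieve to compare the estimate against an exact count on a smaller range.

```python
import math


def pnt_density(digits: int) -> float:
    """Approximate share of primes near 10**digits, using 1/ln(N)."""
    return 1.0 / (digits * math.log(10))


def count_primes_below(limit: int) -> int:
    """Exact count of primes p < limit via the Sieve of Eratosthenes."""
    if limit < 2:
        return 0
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit - 1) + 1):
        if is_prime[p]:
            # Cross out every multiple of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit, p)))
    return sum(is_prime)


print(f"7 digits (no area code): {pnt_density(7):.1%}")
print(f"10 digits (with area code): {pnt_density(10):.1%}")
print(f"exact share of primes below 10**6: {count_primes_below(10**6) / 10**6:.1%}")
```

The estimate gives about 6.2% for 7-digit numbers and about 4.3% for 10-digit numbers, in line with the figures quoted above.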


#visualization Dichotomy or Difference? Statistical Graphics vs. Information Visualization: two crisp articles in the most recent ‘Statistical Computing and Graphics Newsletter’ (PDF) discuss it from the points of view of Computer Science and Statistics. The follow-up from Andrew Gelman is interesting too.







About Nilendu Misra
I love to learn, create and coach. Things that I do well are - Communicating ideas - verbally or through words and diagrams; Problem Solving - Logical or Abstract; Very Large Scale Systems; think about 'Frighteningly Simple' approach first. Things that I intend to do better are - Establishing Stringent Process; Exchanging Tough Feedback; Keeping up with my reading or To-Do list to be able to completely relax.
