State of Data #82

#analysis The rapidly changing landscape of mobile data – a nice aggregate of the latest values of the trends

#architecture Zero to Hadoop in 5 minutes

#big_data Extract from ‘Too Big to Know’ – a new book on Big Data and its impact on our brains – ‘designed the Eureqa computer program to find equations that make sense of large quantities of data that have stumped mere humans, including cellular signaling and the effect of cocaine on white blood cells. Eureqa looks for possible equations that explain the relation of some likely pieces of data, and then tweaks and tests those equations to see if the results more accurately fit the data. It keeps iterating until it has an equation that works.’
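
The propose, tweak and test loop described in that extract can be illustrated with a toy sketch. This is not Eureqa's actual algorithm: the three candidate equation forms, the single coefficient `a`, and the step-halving tweak rule are all assumptions made up for illustration.

```python
import math

# Hypothetical candidate equation forms; a real tool searches a far larger space.
CANDIDATES = {
    "y = a*x": lambda x, a: a * x,
    "y = a*x**2": lambda x, a: a * x ** 2,
    "y = a*sqrt(x)": lambda x, a: a * math.sqrt(x),
}

def mean_squared_error(f, a, data):
    """Average squared gap between the equation's prediction and the data."""
    return sum((f(x, a) - y) ** 2 for x, y in data) / len(data)

def tweak_coefficient(f, data, a=1.0, step=1.0, tol=1e-9):
    """Nudge `a` up or down, keep any change that fits better, else halve the step."""
    best = mean_squared_error(f, a, data)
    while step > tol:
        improved = False
        for trial in (a + step, a - step):
            err = mean_squared_error(f, trial, data)
            if err < best:
                a, best, improved = trial, err, True
                break
        if not improved:
            step /= 2
    return a, best

def search(data):
    """Tune every candidate form and return the one with the lowest error."""
    results = {name: tweak_coefficient(f, data) for name, f in CANDIDATES.items()}
    name = min(results, key=lambda n: results[n][1])
    a, err = results[name]
    return name, a, err

# Synthetic data generated from y = 3*x**2, so the search has a known answer.
data = [(x, 3.0 * x ** 2) for x in range(1, 6)]
name, a, err = search(data)
print(name, a, err)
```

On the synthetic data the quadratic form wins because it is the only candidate that can reach zero error; with real, noisy data the loop would stop at the best approximate fit instead.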


#Data_Science In Defense of Online Anonymity – Disqus data shows pseudonymous commenters are the best

#DBMS Why RAID is so important for databases – A Primer

Statistician who is building an algorithm to forecast when someone will go back to committing a crime – ‘algorithm that forecasts a particular outcome (someone committing murder, for example) Berk applied a subset of the data to “train” the computer on which qualities are associated with that outcome. “If I could use sun spots or shoe size or the size of the wristband on their wrist, I would,”’

#learning 40 years of boxplots (pdf) – ‘Boxplots use robust summary statistics that are always located at actual data points, are quickly computable (originally by hand), and have no tuning parameters. They are particularly useful for comparing distributions across groups.’
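
To show how little computation those summary statistics need, here is a minimal sketch in plain Python. It assumes one common convention (Tukey-style hinges and whiskers at the most extreme points within 1.5 × IQR of the hinges); quartile conventions differ between tools, and the function name `boxplot_stats` is made up for this example.

```python
from statistics import median

def boxplot_stats(values):
    """Summary statistics behind a boxplot, for a non-empty list of numbers."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2  # when n is odd, the median is counted in both halves
    q1 = median(xs[:half])       # lower hinge
    q3 = median(xs[n - half:])   # upper hinge
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [x for x in xs if lo_fence <= x <= hi_fence]
    return {
        "median": median(xs),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in xs if x < lo_fence or x > hi_fence],
    }

stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

Calling the function once per group and printing the results side by side gives the across-group comparison the paper highlights, without drawing anything.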


  • When compassion trumps data – ‘Doctors don’t really have a clue how to predict how long a patient will live.’ Actual paper (PDF) – ‘A patient is eligible for hospice care if they have an estimated life expectancy of six months or less. .. the actual length of stay is usually less than six weeks’
  • ‘Top 1%’ is really mostly about ‘Top 0.1%’ – the growth in the top 1% is mostly sustained by the top 0.1%

About Nilendu Misra
I love to learn, create and coach. Things that I do well are: communicating ideas, verbally or through words and diagrams; problem solving, logical or abstract; very large scale systems; thinking about the 'Frighteningly Simple' approach first. Things that I intend to do better are: establishing stringent process; exchanging tough feedback; keeping up with my reading or to-do list to be able to completely relax.


