State of Data #81

 #analysis – 1) Full Course Text of ‘Advanced Data Analysis’ from University of Michigan, including relevant course work with ‘R’                       
    2) How a product ends up being reviewed online depends a lot not on the product, but on the early reviews. Later reviewers often react by saying the opposite of what the early reviews said.

‘studied 51,854 reviews contributed to Amazon, covering 858 books from 2000 to early 2004. We found that the order in which reviews are written matters a great deal: Some newly posted reviews tend to disagree with existing reviews, instead of only focusing on the book.’


#architecture – IBM’s Architecture for Astronomical Big Data

‘A main design challenge is how to process one Exabyte of raw data per day. This is the data amount anticipated when the SKA system as the world’s largest and most sensitive radio telescope will be ready; its construction will start in 2016. IBM claims that this data amount exceeds the entire daily Internet traffic. The amount would suffice to fill over 15 million 64 GB iPods.’
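
The iPod figure is easy to sanity-check. Here is a quick sketch of the arithmetic, assuming decimal units (1 EB = 10^18 bytes, 1 GB = 10^9 bytes); the constant and function names are mine, not IBM's.

```python
# Sanity check of the "over 15 million 64 GB iPods" claim.
# Assumption: decimal (SI) units. Names below are illustrative only.
EXABYTE = 10**18          # bytes in one exabyte (decimal)
IPOD_64GB = 64 * 10**9    # bytes in one 64 GB device (decimal)
SECONDS_PER_DAY = 24 * 60 * 60


def devices_needed(total_bytes: int, device_bytes: int) -> int:
    """Number of devices needed to hold total_bytes, rounded up."""
    return -(-total_bytes // device_bytes)


ipods = devices_needed(EXABYTE, IPOD_64GB)

# Sustained ingest rate implied by one exabyte per day, in terabytes/second.
tb_per_second = EXABYTE / SECONDS_PER_DAY / 10**12

print(f"{ipods:,} iPods, about {tb_per_second:.1f} TB/s sustained")
```

With binary units (2^60 bytes over 64 × 2^30 bytes) the count comes out to 2^24, roughly 16.8 million, so the "over 15 million" claim holds either way.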

#big_data – Can Data Science predict Hit songs? Hey ya! They say you can ‘score’ your own song real soon. Insights –

  • Before the eighties, the danceability of a song was not very relevant to its hit potential. From then on, danceable songs were more likely to become a hit. Also the average danceability of all songs on the charts suddenly increased in the late seventies.
  • In the eighties slower musical styles (tempo 70-89 beats per minute), such as ballads, were more likely to become a hit.

#Data_Science – PageRank algorithm to find the ‘Best Cricket Team’ (pdf) and ‘Best Captains’ in different formats of the game
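
To give a feel for the idea, here is a minimal sketch of PageRank applied to match results. It is not the paper's exact formulation: I assume each match is an edge from the loser to the winner, so rank flows toward teams that beat strong opponents. The teams and results are made up for illustration.

```python
# Minimal PageRank sketch over match results (pure Python, power iteration).
# Assumption: one edge per match, pointing from loser to winner.
# The teams "A", "B", "C" and their results are hypothetical.

def pagerank(edges, damping=0.85, iterations=100):
    """Return a dict of node -> rank for a list of (loser, winner) edges."""
    nodes = sorted({team for edge in edges for team in edge})
    n = len(nodes)
    beaten_by = {team: [] for team in nodes}
    for loser, winner in edges:
        beaten_by[loser].append(winner)

    rank = {team: 1.0 / n for team in nodes}
    for _ in range(iterations):
        # Every team gets the baseline "teleport" share.
        new_rank = {team: (1.0 - damping) / n for team in nodes}
        for team in nodes:
            winners = beaten_by[team]
            if winners:
                # A losing team passes its rank to the teams that beat it.
                share = damping * rank[team] / len(winners)
                for winner in winners:
                    new_rank[winner] += share
            else:
                # An unbeaten team has no outgoing edges; spread its rank evenly.
                share = damping * rank[team] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical results: A beat B, A beat C, B beat C.
matches = [("B", "A"), ("C", "A"), ("C", "B")]
ranks = pagerank(matches)
best_team = max(ranks, key=ranks.get)
print(best_team, ranks)
```

The same loser-to-winner graph could in principle be built per format of the game, or for captains instead of teams; how the paper actually weights matches is not something this sketch tries to reproduce.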

– Jonathan Lewis’ ‘Oracle Core: Essential Internals’ has already been dubbed ‘likely be the best Oracle internals book out there for the coming 10 years’ by folks who are at the top of the trade.

– Next time you go to a doctor for a physical, your data collection may be ‘gamified’ and a whole lot more fun, thanks to TonicHealth

#learning – 
‘Modeling with Data – Tools and Techniques for Scientific Computing’ – the full book is now available from the author.

‘When I talk to a statistician, a model means a probability distribution over elements, and that’s about it. I’d start talking to a statistician about modeling subject-specific knowledge about the interaction of elements, and giant question marks would appear over his head. Which is not to say that the person is a moron, but just that his understanding of the meaning of the word model is much more narrowly focused than mine.’

#visualization – Visualize CPU Utilization in a Large Data Center – models and approaches



About Nilendu Misra
I love to learn, create and coach. Things that I do well are - Communicating ideas - verbally or through words and diagrams; Problem Solving - Logical or Abstract; Very Large Scale Systems; think about 'Frighteningly Simple' approach first. Things that I intend to do better are - Establishing Stringent Process; Exchanging Tough Feedback; Keeping up with my reading or To-Do list to be able to completely relax.

