State of Data #44

#analysis – ICWSM 2011 published the list of accepted papers. ‘4chan..Analysis of Large Online Community’ (PDF) is interesting for Large Data Practitioners. Folks from MIT analyzed over 5M posts to ‘quantify ephemerality’. The median life of a thread is ~4 min; the longest-lived thread in the sample was alive for only 6.2 hrs (compare that with identity-enabled sites). The authors found ‘anonymity promotes disinhibition, mob-behavior’ but the disinhibition worked better in ‘advice and discussion threads’. NSFW language warning: the paper reports content from /b/ on 4chan verbatim.

#architecture – Take a sneak peek inside the world’s 10 largest data centers


#big_data – Visualizing News Data for Defense Research & Intelligence Analysis – ‘take terabytes of data from 5000 sources and make it actionable’ (using this editor’s favorite viz tool, Spotfire) – nice 46 min presentation with Q&A later

#DBMS – Running Red Hat, Oracle, and new Xeon processors? You may get about 10% better performance by enabling Turbo Boost

#learning – Automated Processing of WikiLeaks cables showing friends (green dots), foes (red), and passersby (teal and blue) – original Stanford class project here (PDF). They found Spain to be the US’s most important ally

#visualization – Beauty of Maps – the entire BBC series is now available to watch




About Nilendu Misra
I love to learn, create and coach. Things that I do well are - Communicating ideas - verbally or through words and diagrams; Problem Solving - Logical or Abstract; Very Large Scale Systems; think about 'Frighteningly Simple' approach first. Things that I intend to do better are - Establishing Stringent Process; Exchanging Tough Feedback; Keeping up with my reading or To-Do list to be able to completely relax.
