State of Data #67

#analysis – Three Secrets of Business Analytics (from 37Signals)

How Lloyd’s of London uses R for Insurance

#architecture – How and why a Portland startup went from PostgreSQL to MongoDB and came back (PDF)

This might make some people cringe. Mongo has a single global read/write lock for the entire server. The effect this has is that if a write ever takes a non-trivial amount of time (page fault combined with slow disk, perhaps), everything backs up. We had high lock % when disk %util was only ~30-40%.
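To make the quoted complaint concrete, here is a toy queueing sketch. It is not MongoDB's actual locking code: the function `finish_times`, the operation tuples, and the per-resource alternative are all hypothetical, chosen only to show how one slow write behind a single server-wide lock delays every operation queued after it.

```python
def finish_times(ops, global_lock=True):
    """Return the finish time (ms) of each operation.

    ops: list of (arrival_ms, duration_ms, resource), sorted by arrival.
    global_lock=True  -> every op queues behind one server-wide lock.
    global_lock=False -> ops only queue behind earlier ops on the same
                         resource (a hypothetical finer-grained scheme).
    """
    free_at = {}  # lock name -> time at which that lock is next free
    finished = []
    for arrival, duration, resource in ops:
        key = "server" if global_lock else resource
        start = max(arrival, free_at.get(key, 0))
        free_at[key] = start + duration
        finished.append(start + duration)
    return finished


# One slow write (100 ms, e.g. page fault plus slow disk), then two fast ones.
ops = [(0, 100, "a"), (1, 1, "b"), (2, 1, "c")]

if __name__ == "__main__":
    print(finish_times(ops, global_lock=True))   # [100, 101, 102]
    print(finish_times(ops, global_lock=False))  # [100, 2, 3]
```

In this model the two 1 ms operations finish at 101 ms and 102 ms under the global lock, even though they touch unrelated data. Time spent waiting on the lock can be high while the disk itself sits idle for much of it.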


#big_data – Convert .csv file to MySQL Database

Yelp has opened up reviews for 7,000 businesses and is calling on talented data miners from universities to solve problems – e.g., “Top 10 Positive and Negative words ranked”

#Data_Science – Building Data Science Teams


All the top data scientists share an innate sense of curiosity. Their curiosity is broad, and extends well beyond their day-to-day activities. They are interested in understanding many different areas of the company, business, industry, and technology. As a result, they are often able to bring disparate areas together in a novel way…. I’ve seen data scientists apply novel DNA sequencing techniques to find patterns of fraud.

#DBMS – Is Database Design a dying art or a dead art already? (interesting comments too)


#idea – Story behind Opera’s $84M big data funding

#learning – 
“Is the average number of fair coin tosses required to get an HTH (Heads-Tails-Heads) pattern greater than, less than, or the same as, the number of tosses required to get an HTT pattern?” Peter Donnelly (TED talk) shows how statistics fool juries
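For readers who want to check the coin-toss question themselves, here is a minimal sketch. It computes the exact expected waiting time with the standard overlap rule for fair-coin patterns (add 2^k for every length k at which the pattern's prefix equals its suffix) and cross-checks it with a seeded simulation. The function names `expected_tosses` and `simulate` are my own and do not come from the talk.

```python
import random


def expected_tosses(pattern: str) -> int:
    """Exact expected number of fair-coin tosses until `pattern` first appears.

    Overlap rule: add 2**k for every k (1..len) where the first k symbols
    of the pattern equal its last k symbols.
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[n - k:])


def simulate(pattern: str, trials: int = 20000, seed: int = 0) -> float:
    """Average number of tosses until `pattern` appears, over `trials` runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        window = ""  # the most recent len(pattern) tosses
        tosses = 0
        while window != pattern:
            window = (window + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


if __name__ == "__main__":
    for p in ("HTH", "HTT"):
        print(p, expected_tosses(p), round(simulate(p), 2))
```

The exact values are 10 tosses for HTH and 8 for HTT, so HTH takes longer on average. The intuition: after a near miss on HTH (seeing HTT) you have to start over, whereas after a near miss on HTT (seeing HTH) the final H already begins the next attempt.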


#visualization – Meta-visualization – what are the most popular types



