State of Data #72
October 28, 2011
#analysis – The most comprehensive guide to Mobile Statistics ever
A trip to the darker side – Analyzing hackers: what they talk about most vs. real-life threats, and how precious data is traded on the ‘black market’ (PDF)
#architecture – Peter Norvig on how AI is revolutionizing search
“Google Voice Search relies on 230 billion real world search queries to learn all the different ways that people articulate given words. So people no longer need to train their speech recognition for their own voice, as Google has enough real world examples to make that step unnecessary.”
Finally, the words are strung together into a language model, which tells you which words are most likely to come after another word. There might be a soundwave that sounds like either “city” or “silly”, but if it follows the words “New York…” then the language model would tell us that “city” is more likely.
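To make the “city” vs. “silly” point concrete, here is a minimal sketch of a bigram language model. It is only an illustration: the toy corpus and the helper names `train_bigrams` and `pick_word` are made up for this post, and a production system like the one Norvig describes would be far larger and more sophisticated.

```python
from collections import Counter

# Toy corpus, invented purely for illustration (not real query data).
corpus = [
    "new york city is big",
    "i love new york city",
    "do not be silly",
    "new york city never sleeps",
    "that silly idea",
]

def train_bigrams(sentences):
    """Count how often each word directly follows another word."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts

def pick_word(previous, candidates, counts):
    """Pick the candidate seen most often right after `previous`.

    Add-one smoothing keeps unseen pairs from scoring exactly zero.
    """
    return max(candidates, key=lambda word: counts[(previous, word)] + 1)

bigrams = train_bigrams(corpus)
# The acoustic signal is ambiguous, so let the preceding word decide.
best = pick_word("york", ["city", "silly"], bigrams)
print(best)
```

With this corpus, “city” follows “york” three times and “silly” never does, so the model settles on “city”.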
#big_data – McKinsey asks ‘Are you ready for the era of Big Data’ with five questions (Jeopardy think music begins here) –
1. What happens in a world of radical transparency, with data widely available?
2. If you could test all of your decisions, how would that change the way you compete?
3. How would your business change if you used big data for widespread, real-time customization?
4. How can big data augment or even replace management?
5. Could you create a new business model based on data?
#conference – AI Challenge 2011 – “watch your ant colony fight for domination against colonies created by other people from around the world”
#Data_Science – What happens when your business is “Statistics as a Platform”
#DBMS – Want a database to play dice and pick a random plan (NOT the best plan) for a query? There’s a hidden parameter for that.
#idea – Three experts offer ideas on ‘Competing through Data’. Insights –
· Most great revolutions in science are preceded by revolutions in measurement
· A one-standard-deviation increase toward data and analytics was correlated with about a 5 to 6 percent improvement in productivity and a slightly larger increase in profitability
· I can have all the data I want to have—but I still have to communicate it to our players.
#learning – ‘Signs that you’re a Bad Programmer’. Best insight –
Inability to think in sets
Transitioning from imperative programming to functional and declarative programming will immediately require you to think about operating on sets of data as your primitive, not scalar values.
Remedies
Funny enough, visualizing a card dealer cutting a deck of cards and interleaving the two stacks together by flipping through them with his thumbs can jolt the mind into thinking about sets and how you can operate on them in bulk.
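The card dealer image translates into code fairly directly. The sketch below contrasts a card-by-card loop with a version that treats the two halves of the deck as whole collections. The function names `riffle_loop` and `riffle_bulk` are made up for this illustration, and the deck is assumed to have an even number of cards.

```python
from itertools import chain

# A small ten-card deck; an even size is assumed for this sketch.
deck = list(range(1, 11))

def riffle_loop(cards):
    """Scalar habit: walk index by index and move one card at a time."""
    half = len(cards) // 2
    shuffled = []
    for i in range(half):
        shuffled.append(cards[i])
        shuffled.append(cards[half + i])
    return shuffled

def riffle_bulk(cards):
    """Set-oriented habit: cut the deck, then interleave the two stacks."""
    half = len(cards) // 2
    top, bottom = cards[:half], cards[half:]
    # zip pairs the stacks up; chain flattens the pairs back into one deck.
    return list(chain.from_iterable(zip(top, bottom)))

print(riffle_loop(deck))
print(riffle_bulk(deck))
```

Both functions produce the same ordering. The second one just never mentions an index, which is the mental shift the remedy is after.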
#visualization – Want a handy Baseball correlation ellipse or Image Scatter Plot (like how flu spreads) to communicate data? The R Graph Gallery is there to help.
#etc
- Soon: the first DBA to go to space
- #quant Stocks that pay dividends show more fluctuation in the ex-dividend month during a bad economy
- #viz Using Chocolate to Teach Calculus and Graphing
- #math (10^2 + 11^2 + 12^2) = (13^2 + 14^2) can get only one justified reply, in the form of (21^2 + 22^2 + 23^2 + 24^2) = (25^2 + 26^2 + 27^2)
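Both sides of the #math item can be checked in a few lines. The sketch below also assumes a pattern that the original item does not state: the run for n starts at n(2n+1), with n+1 consecutive squares on the left and n on the right. The helper names `square_run` and `balanced` are made up for this illustration.

```python
def square_run(n):
    """Return (left, right) runs of consecutive integers for a given n.

    Assumed pattern (not stated in the original item): the run starts at
    n * (2n + 1), with n + 1 terms on the left and n terms on the right.
    """
    start = n * (2 * n + 1)
    left = list(range(start, start + n + 1))
    right = list(range(start + n + 1, start + 2 * n + 1))
    return left, right

def balanced(left, right):
    """Check that the sums of squares on both sides are equal."""
    return sum(x * x for x in left) == sum(x * x for x in right)

for n in (2, 3):
    left, right = square_run(n)
    print(left, right, balanced(left, right))
```

For n = 2 this gives the 10, 11, 12 and 13, 14 pair from the item above, and n = 3 gives the 21 through 27 reply.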





