State of Data Last Week – #40
March 18, 2011
#analysis – How to use statistics to find out whether the art you purchased on eBay is fake or real (PDF; from Significance Magazine, March 2011)
#finance – Wonga and Klarna are using unconventional data and algorithms to provide financial services. E.g., “Consumers who shop online at 3am may get rejected by Klarna. Having a mobile phone with a contract helps to get money from Wonga.”
#architecture – ‘HOWTO for organizations to open up data’ covers everything from ‘why open data’ to the legalities and ‘technical openness’ (e.g., Bulk API). The site is not yet fully developed, and some sections (e.g., FAQ) may lack content.
#big_data – “Google’s Ads Preferences believes I’m a guy interested in politics, Asian food, perfume, celebrity gossip, animated movies and crime but who doesn’t care about ‘books & literature’ or ‘people & society.’” From Joel Stein’s latest Time cover story, “Data Mining: How Companies Now Know Everything about You”
#DBMS – Ever had those SQL queries where two columns always appear together as filters (e.g., TRANSACTION_TYPE and ZIP)? And the data in those two columns is skewed (think of California zip codes vs. Rhode Island’s)? A very cool “extended statistics” collection feature in Oracle now equips the optimizer to estimate such combined filters more accurately.
#learning – For those using GoldenGate as a data replication tool, the book “Oracle GoldenGate 11g Implementer’s Guide” was published this week. At first glance, the coverage looks quite extensive. The ebook version can be purchased directly from the publisher.
#visualization – How ‘The New York Times uses R for Data Visualization’ – a 60-minute presentation by Amanda Cox
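To see why the #DBMS item matters, here is a minimal sketch of the estimation problem that column-group statistics address. It is not Oracle's implementation: the toy table, the row counts, and the function names are all made up for illustration. It compares the row estimate an optimizer would get by multiplying the two single-column selectivities (assuming the columns are independent) with the estimate from the actual joint frequency of the column pair.

```python
from collections import Counter

# Hypothetical toy table of (TRANSACTION_TYPE, ZIP) rows; values are invented.
rows = (
    [("ONLINE", "94105")] * 60   # California zip, mostly online
    + [("STORE", "94105")] * 20
    + [("ONLINE", "02903")] * 2  # Rhode Island zip, mostly in-store
    + [("STORE", "02903")] * 18
)


def selectivity_independent(rows, ttype, zip_code):
    """Estimate using only per-column statistics, assuming independence."""
    n = len(rows)
    p_type = sum(1 for t, _ in rows if t == ttype) / n
    p_zip = sum(1 for _, z in rows if z == zip_code) / n
    return p_type * p_zip


def selectivity_joint(rows, ttype, zip_code):
    """Estimate using the frequency of the column pair itself."""
    counts = Counter(rows)
    return counts[(ttype, zip_code)] / len(rows)


n = len(rows)
est_rows_independent = selectivity_independent(rows, "STORE", "02903") * n
est_rows_joint = selectivity_joint(rows, "STORE", "02903") * n
print(f"independent estimate: {est_rows_independent:.1f} rows")
print(f"joint estimate:       {est_rows_joint:.1f} rows")
```

With these invented counts, the independence assumption predicts about 7.6 rows for STORE in zip 02903, while the table actually holds 18. Statistics collected on the column pair close that gap, which is the kind of misestimate the feature is meant to avoid.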
- Revealing emails – are Gmail users thinner? Hunch, a recommendation engine, analyzes the data
- The call for papers has opened for Oracle OpenWorld 2011
- Numbers revealed on Twitter’s 5th birthday – It takes a week to create 1B tweets; 6,939 tweets per second is the maximum throughput so far (New Year’s 2011)
- Separate hype from reality with solid data – ‘One in five divorces linked to Facebook’ – except that the originator of the idea acknowledges ‘this may not be representative of all divorces’
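As a quick back-of-the-envelope check on the Twitter figures above, the two numbers can be put on the same scale. This sketch assumes the "1B tweets in a week" figure is spread evenly over the week, which real traffic certainly is not; the variable names are just for illustration.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds

tweets_per_week = 1_000_000_000  # "a week to create 1B tweets"
peak_tps = 6_939                 # reported maximum tweets per second

# Average rate if the weekly volume were spread evenly (an assumption).
average_tps = tweets_per_week / SECONDS_PER_WEEK
peak_to_average = peak_tps / average_tps

print(f"average: about {average_tps:.0f} tweets per second")
print(f"peak is about {peak_to_average:.1f}x the average")
```

Under that assumption the average works out to roughly 1,650 tweets per second, so the New Year's peak was a little over four times the typical rate.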