State of Data Last Week – Dec 06

#Analysis – Logistic regression is a ‘categorical tool’: it predicts which category a case falls into, e.g., fraud vs. not fraud. Here is a great starter with a worked-out case analysis you can run in minutes using R on your laptop.
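
To make the fraud/not fraud idea concrete, here is a minimal sketch in plain Python (not the R walkthrough from the linked starter). The data is made up for illustration, and `fit_logistic` and `predict_proba` are hypothetical helper names: a single feature (transaction amount) is fitted with batch gradient descent and then mapped to a probability of fraud.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b for one feature by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated probability that a case with feature value x is fraud."""
    return sigmoid(w * x + b)

# Made-up toy data: transaction amount (in thousands) and a 0/1 fraud label.
amounts = [0.1, 0.3, 0.5, 0.8, 1.0, 3.0, 3.5, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

w, b = fit_logistic(amounts, labels)
print("small amount:", round(predict_proba(0.2, w, b), 3))
print("large amount:", round(predict_proba(4.8, w, b), 3))
```

A probability above 0.5 would be read as ‘fraud’ and below as ‘not fraud’; a real analysis would use more features and a tested library rather than this hand-rolled loop.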

#Architecture – Top 5 Free, Open Source Data Mining Software

#Big Data – A single racecar streams 27GB of telemetry data from 200 sensors during a race weekend.

Build your own ‘Circular Log’ / ‘Log Rotation Routine’ with MySQL (or any other data store). In later Oracle releases, ‘interval partitioning’ is built in for this.
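
One way such a circular log could look is sketched below. It is an assumption-laden illustration, not the linked article's routine: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table name `circular_log`, the `CAPACITY` constant and the helper functions are all hypothetical. Each entry is written to slot `seq % CAPACITY`, so once the ring is full the oldest row is overwritten.

```python
import sqlite3

CAPACITY = 3  # hypothetical maximum number of log rows to keep

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE circular_log ("
    " slot INTEGER PRIMARY KEY,"  # 0 .. CAPACITY-1, reused in a ring
    " seq INTEGER NOT NULL,"      # ever-increasing sequence number
    " message TEXT NOT NULL)"
)

def append(conn, message):
    """Write a message, overwriting the oldest slot once the ring is full."""
    row = conn.execute(
        "SELECT COALESCE(MAX(seq), -1) + 1 FROM circular_log"
    ).fetchone()
    seq = row[0]
    # INSERT OR REPLACE swaps out whatever row currently occupies the slot.
    conn.execute(
        "INSERT OR REPLACE INTO circular_log (slot, seq, message)"
        " VALUES (?, ?, ?)",
        (seq % CAPACITY, seq, message),
    )
    conn.commit()

def read_all(conn):
    """Return the retained messages, oldest first."""
    rows = conn.execute("SELECT message FROM circular_log ORDER BY seq")
    return [message for (message,) in rows]

for i in range(5):
    append(conn, f"event {i}")

print(read_all(conn))
```

With a capacity of 3 and five appends, only the three most recent messages remain. In MySQL the same idea could be expressed with its `REPLACE INTO` statement; the table never grows past the chosen capacity, so no separate purge job is needed.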

#Learning – Want to learn how to write efficient SQL from a leading authority on the relational model? An excellent 16-hour ‘SQL Master Class’ video course from Chris Date shows how to avoid common traps and pitfalls. Best of all, it’s completely FREE for Intuit employees using Safari Online.

#Visualization – Logstalgia displays your web access logs as a ‘pong like battle between Web Server and a never ending torrent of requests’. Requests appear as colored balls!

glTail is a similar FREE, real-time log-visualization tool – ‘each circle is a hit on website, and size of circle indicates the size of request’.


I love to learn, create and coach. Things that I do well are - Communicating ideas - verbally or through words and diagrams; Problem Solving - Logical or Abstract; Very Large Scale Systems; think about 'Frighteningly Simple' approach first. Things that I intend to do better are - Establishing Stringent Process; Exchanging Tough Feedback; Keeping up with my reading or To-Do list to be able to completely relax.

