Friday, April 29, 2011

Diving into R

I've wanted to learn R for a long time. A new project at work is providing an ideal opportunity to finally use it. So far, it's been a great experience. R is an incredibly powerful tool for data analysis. It's allowed me to dive deep into the project's data and automate much of the analysis process.

Programming in R has been easier than expected. I've previously programmed in MATLAB, which has helped greatly. Some of the concepts are still foreign, but I'm confident that they will become less so with time.

The greatest joy has been getting "lost" for hours writing R functions to analyze the data and produce reports. R's interactive interface has made it easy to build up code in an exploratory manner. This is my preferred way of programming; I find it allows me to stay in a flow state for long periods of time. The experience has been very similar to programming in Lisp dialects, which I also deeply enjoy.

Although there is a lot of good information about R available for free on the web, I've found O'Reilly's books on R the best resource for coming up to speed quickly.

A particularly powerful library is ggplot2 by Hadley Wickham. With it, I've been able to create very complex graphs and charts with minimal code. ggplot2 builds graphics in layers using a grammar that can be challenging to learn at first. The website is informative, but the book has been the best resource and is well worth the money.
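
To give a flavor of the layered grammar, here is a minimal sketch. It uses the mtcars dataset that ships with R rather than my project's data, and the choice of columns and labels is purely illustrative.

```r
library(ggplot2)

# mtcars is a built-in example dataset, used here only for illustration.
# Start with the data and the aesthetic mapping, then add one layer at a time.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # Layer 1: raw observations, coloured by number of cylinders
  geom_point(aes(colour = factor(cyl))) +
  # Layer 2: a single linear trend line fitted across all points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Labels are not a layer; they annotate the axes and the legend
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

# Printing the plot object renders it on the active graphics device
print(p)
```

Each `+` adds a piece to the plot object, so a chart can be built up incrementally at the interactive prompt, one layer at a time.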

Another useful library is brew, which I am using to auto-generate pleasant-looking PDF reports via LaTeX.
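
As a rough sketch of the idea, a brew template is ordinary LaTeX with R expressions embedded between `<%= %>` delimiters. The template text, variable names, and output file name below are hypothetical and far simpler than my actual reports; the data is again the built-in mtcars set.

```r
library(brew)

# Hypothetical LaTeX template. Each <%= expr %> is replaced by the value
# of the R expression when the template is processed.
template <- paste(
  "\\documentclass{article}",
  "\\begin{document}",
  "Observations: <%= n_obs %>",
  "",
  "Mean mpg: <%= mean_mpg %>",
  "\\end{document}",
  sep = "\n"
)

# Values the template refers to (illustrative, computed from mtcars)
n_obs <- nrow(mtcars)
mean_mpg <- round(mean(mtcars$mpg), 2)

# Fill in the template and write the resulting LaTeX source to a file
tex_file <- file.path(tempdir(), "report.tex")
brew(text = template, output = tex_file)

# With a LaTeX installation available, the file could then be compiled
# to PDF, for example with tools::texi2pdf(tex_file).
```

Keeping the layout in the template and the numbers in R means the report can be regenerated whenever the data changes.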

I look forward to working more with R. Data science is a growing interest of mine and this opportunity to use R is adding to the momentum.