This is the first in our Blog Series “Help a Solver Succeed” (HASS), where we ask InnoCentive experts to provide resources that they think might be helpful to you in solving Challenges. Today’s post is from Innovation Development Manager Gabriel Eichler, who is a member of our Client Services team.
Our blogging team has asked me to write a piece for the first issue of the “HASS – Help A Solver Succeed” blog series. This series is dedicated to profiling enabling technologies, services or information that may help our Solvers either be more successful at solving our Challenges or be more productive in their own daily work.
Since my educational background and Challenge-writing specialty are almost exclusively focused on computational, bioinformatic or statistical Challenges, I find it apt to write about a programming language. I have decided to dedicate my HASS entry to the programming language R.
I came to know R during my PhD research at the US National Cancer Institute. Previously I had written extensive amounts of code in Matlab, my programming language of choice at the time for rapid prototyping and computational experimentation. Though Matlab has a more sophisticated look and feel, and I knew it quite well, I was told that learning R would be essential to my graduate studies. Digging in, I learned that R was first distributed in the spring of 1997 by Robert Gentleman and Ross Ihaka and that it resembles the closed source, commercial language S in many ways. However, from the beginning Gentleman and Ihaka made R an open source language that thrives off a community of volunteer developers. From nearly the very beginning, R has maintained the Comprehensive R Archive Network (CRAN), a resource where anyone can publish their own R extensions or libraries. This brilliant step quickly made R a force to be reckoned with.
I find R to be the best way to quickly model statistical questions, create powerful graphs or even supercompute a difficult but parallelizable problem. The interface and kernel are extremely lightweight, so your computer is left with maximal resources for the computation you actually care about. Beyond that, CRAN makes R even more powerful because thousands of people have created hundreds of packages meant to assist you in performing complex tasks. In fact, in my nearly 3 years of continual use of R, I have rarely (if ever) had to write complex procedures myself for any standard statistical or machine learning algorithm. For example, I was able to develop a multiprocessed, Random Forest-based algorithm using mostly code pulled from CRAN.
In summary, I’m a huge fan of the R programming language. If you haven’t already done so, I would encourage you to download a free copy and play around with it. I’ll be the first to admit that it’s not as slick as a commercial package such as Matlab or S, but the power of open source has elevated R to one of the most useful and valuable languages around. Plus, isn’t it kind of cool to participate in InnoCentive’s Crowdsourcing process by using a resource that is, in and of itself, a product of Crowdsourcing?
Thank you, R.
Gabriel Eichler, PhD.