Visual Studio versus RStudio
SQL Server 2016 introduced support for the R language and an
integrated R server. For the first time,
many Microsoft developers are looking at R programming. One of the first questions raised is “which
integrated development environment (IDE) should I use?” Many will be relieved to learn that there is
a Visual Studio extension called R Tools for Visual Studio (RTVS) that supports
the R language. However, there is another
popular IDE called RStudio. Both
products are free and open source. This
leaves the question: which one should I use? Which is better? “Better” is a
relative term, since it raises the question: better for what? In this article, we will discuss the
strengths and weaknesses of each. Where
possible, the evaluation is objective, but I feel I would be remiss if I did not
offer my assessment as well. Bear in
mind that both platforms are under constant change, so the information here will
become dated.
R Tools for Visual Studio
The screen shot below shows an R project open in R Tools for
Visual Studio (RTVS). We can see that
this is a robust tool with many features.
Note: Integrated visualizations are a new concept for Visual Studio. The left panel shows an R script in the
editor. Multiple scripts can be edited
simultaneously. The upper middle window
shows a new feature, the SQL query editor where queries can be created and
executed. Below this is the output window
which shows the SQL query results. To
the right, we see the plot windows. The
plot scrolling list on the right margin supports browsing plot history. Note:
There are more optional tabs and windows that are not displayed.
R Tools for Visual Studio Screen
RStudio
The screen shot below shows a typical RStudio session. The upper left quadrant is where scripts are
edited. Each tab is a separate script.
Below this is the R console where the scripts execute and output is displayed.
The upper right quadrant shows the workspace, which lists all objects in
memory. The hidden History tab stores a
list of all the R statements executed in the session. Below this is the Plots tab which shows the
most recent visualization. The Files tab
(hidden) shows a list of files in the current folder. The Packages tab (hidden) shows a list of all
installed packages. The Help window
provides some quick links to documentation and is where requests for help are
displayed. Note: All the features shown are also available in
RTVS but the settings used for that screen shot did not display them all.
RStudio Screen
Decision Summary
Visual Studio with RTVS and RStudio are competitive in terms
of features thanks to Microsoft’s rapid enhancements in recent months. However,
as of this writing, RTVS is still not considered a production release, so the decision of which IDE to use
needs to be made with an eye towards the future. Microsoft’s first preview of RTVS was
in March 2016, and by October of the same year it had five releases, each with
substantial improvements. With that in
mind, differences tend to reflect the strengths of each platform: RTVS has
extra features specific to Windows, .NET, and multiple languages, while RStudio
supports features that favor RStudio products, such as integrated deployment of
Shiny apps to the shinyapps.io server and R Markdown documents to RPubs, which are
both owned by RStudio.
If you are a
developer familiar with Visual Studio, use the Microsoft stack, or
work with multiple languages, RTVS is probably a good choice. Microsoft is likely to continue adding
functionality to integrate RTVS with tools like SQL Server, .NET, Office,
SharePoint, and Azure. Visual Studio’s
powerful development and deployment features, together with continuously
improving R support, make this a good choice overall.
However, if you are a data scientist or a
developer focused solely on R, RStudio may be the better choice. RStudio’s interface is simpler and more
intuitive, partly because everything is focused on R. RStudio has outstanding support for document
publishing to virtually any output format.
Creating a slide presentation, an article, a book, or even a web
application is simple, and the result is immediately viewable within RStudio. Many, if not most, books on the R language
were completely written and published with RStudio. With a click of a button, you can deploy your
files to a free web server as long as the resource demands fall within limits. If you
are still not sure which IDE is best for you, read on to get more details that may
help you make your decision.
Features
RStudio and RTVS are both robust products with many
features. To discuss these, we’ll start
out with a list of features that each product supports. Below is a table listing key product
features.
RTVS feels like a professional development platform as one
would expect. The extensive tools for
performance testing, environment analysis, deployment, etc. betray the fact
that this is a development platform first, data science tool second. For example, debugging is well supported and
Microsoft has committed to making debugging a first-class feature. Many languages, including Python, C, C++, C#,
and Java, are supported, making Visual Studio a powerful yet complex tool. RStudio comes from the data science world,
where publishing your work is critical. It has superb, easy-to-use
publishing features that developers often overlook. With a few mouse clicks (and a bit of text
formatting), your R script becomes a slide show. The company RStudio has authored many popular R packages,
including dplyr, ggplot2, and shiny.
RStudio quickly implements support for new data science features
such as R Notebooks. It seems likely
that integrated support for RStudio packages will come first to RStudio. Overall, RStudio does a better job of
supporting the iterative data exploration required by data science.
Features Specific to RStudio
- The manipulate package, which supports interactive widgets such as drop-downs and sliders to manipulate plots interactively.
- Excellent documentation. RTVS still has minimal documentation, which made even doing this comparison a challenge.
- Production release status. RTVS has not reached this status at the time of this writing.
- Excellent and extensive documentation and training materials.
- Easy one-click deployment of R Markdown documents or Shiny apps to a free web server.
- Support for R Presentations, which tie several publishing features together for ease of use.
- Wizard to import data from competing formats such as SAS and SPSS.
- Wizard to locate and install new packages.
- 32-bit and 64-bit versions of R supported. RTVS supports only 64-bit.
- Support for multiple platforms: Linux, Windows, and OS X. RTVS is available only where Visual Studio can run, which admittedly is expanding.
Pros to RTVS
In general, RStudio has more features than RTVS. Microsoft admits this on its RTVS download
site at http://microsoft.github.io/RTVS-docs/. Specifically, it says: “RStudio is a fantastic and mature IDE for R that’s been under
development for years. RTVS is a long way from RStudio, because we’ve only been
developing it since July, 2015. We hope to have all the critical features that
you need to be successful this summer.”
It’s a fair point. Considering
the short time in which they developed RTVS, it is amazing it supports so many
features. No doubt, Microsoft will
continue to improve RTVS but RStudio is also improving their product.
Beyond raw features, there are some good reasons to prefer
Visual Studio. Some of these
are:
- Already trained in Visual Studio.
- One IDE for multiple platforms and languages.
- Expectation of ongoing improvements in support for SQL Server R Integration features.
- Helpful if you need to integrate R programs with other languages such as Python or C#.
- Commercial product support.
Developers are not always aware of a difference between data
science and programming. Data science is
an iterative trial and error process of analysis, research, data wrangling, and
experimentation. Programming is the
process of taking a set of requirements and automating them. RStudio has better support for the former,
while Visual Studio is designed for the latter.
In other words, the data scientist will analyze data to build a model
that has business value but is not necessarily scalable or generalized. To enable the enterprise to get the value
from this work, it needs to be operationalized, i.e. cleaned up, redesigned for
performance, perhaps moved into SQL Server, and automated perhaps via a job
scheduler. In the long term, Visual
Studio may be the better choice for this operationalization. Currently, I think both IDEs can be used with
SQL Server/R integration quite well.
RStudio has all the features of RTVS in terms of the R
language but is limited to the R language only.
If you want to program in multiple languages in the same IDE, RTVS is
the only choice. However, the focus on R
allows RStudio to fully support the language in a very intuitive way. In fact, the intangible quality of
intuitiveness is the best reason to choose RStudio. This is where my opinion comes into play, so
others may disagree. I find RStudio’s
features are right where you expect them to be, out on top and easy to
find. Just click the Preview button to
view your Markdown document as a slide show and then publish to the server with a single button click.
Within minutes of installing
RStudio I could fully use the IDE, whereas I found RTVS less intuitive. Many of its features are not obvious, it took
time to figure out how to do things, and there is scant documentation.
RStudio was designed to do exactly what it does: support data
science. Visual Studio was not designed
for this; the tools were added in afterwards. An
example of this came when I tried to use a document type called an R
Notebook. This is a special interactive
format where output is rendered inline, right in the editor. It was inspired
by the Jupyter Notebook project. To get
an idea of what I mean, see the screen shot below.
R Notebook
The above script is in a special format called R Markdown,
hence the Rmd file extension. The code
between the ```{r} and ``` is called a code chunk, and its output is rendered
within the editor. This creates a
documentation flow very useful to data scientists and teachers, i.e. they can
show their work in real time. The code
chunk can be edited and re-executed, causing the plot to refresh. RTVS does not support R Notebooks.
Beyond the notebook functionality, RStudio has amazing
publication support. An R Markdown
script can be presented as a slide show, paginated, or flowing text, and
converted into many formats including Word, PDF, and HTML. Many books on the R language were completely
written and published from RStudio. To
give an idea of how extensive publishing support is in RStudio, consider the
screen shot below. We can see new files
of many types supported, including R Presentation, R HTML, R Markdown, and R
Sweave, which are all aimed at publishing.
Publishing
If we select R Markdown from the above list, we are
presented with several options. We can
choose HTML, PDF, or Word (which RTVS also supports), but we can also choose
Presentation and Shiny, which give us more options.
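The same output formats can also be requested from R code instead of the dialog. The following is a minimal sketch, assuming the rmarkdown package and pandoc are installed (RStudio bundles pandoc). The file name demo.Rmd and its contents are hypothetical and exist only for illustration.

```r
library(rmarkdown)

# Write a tiny, hypothetical R Markdown file to a temporary folder.
rmd_file <- file.path(tempdir(), "demo.Rmd")
writeLines(c(
  "---",
  'title: "Demo report"',
  "---",
  "",
  "A short sentence of narrative text.",
  "",
  "```{r}",
  "summary(cars)",
  "```"
), rmd_file)

# Render the same source to HTML and to Word.
# PDF output ("pdf_document") would additionally need a LaTeX installation.
html_out <- render(rmd_file, output_format = "html_document", quiet = TRUE)
word_out <- render(rmd_file, output_format = "word_document", quiet = TRUE)
```

render() returns the path of the file it created, so html_out and word_out can be opened or passed on to other tools.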
If we choose R Presentation, we can further choose among several
options as we can see in the screen shot below.
This is very handy when you need to create a slide show that
incorporates dynamic code execution.
RStudio even supports creating a file from a template as
shown below.
The support for document publishing is important, but I think it is
less appreciated by developers than by data scientists. For teaching, it is invaluable. I no longer need to create a PowerPoint slide
show from my R code. Being able to maintain
it all in one file is a godsend to a speaker.
This is also useful for preparing a presentation to managers and colleagues
explaining how you arrived at your data analysis and conclusions. For research, it means you can send your R
project directly to professional journals; in fact, many expect this format.
Shiny
Programmers like shiny things, hence the name Shiny for RStudio’s interactive web application support for R. Think of it as server-side R scripting pages, much as C# supports Active Server Pages (ASP) and Java supports Java Server Pages (JSP). When you run code that includes the Shiny package, RStudio launches an R web service to support interactive R applications. Shiny is an extensive framework with many functions to support interactive widgets. You can develop and test your Shiny application on your machine and deploy it to a cloud-based Shiny server to make it available for others to use. For a modest Shiny application, the Shiny server is free, but you can pay for commercial-scale support if desired.
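To make this concrete, here is a minimal sketch of a Shiny application, assuming the shiny package is installed. The input name bins, the output name hist, and the choice of the built-in faithful data set are illustrative assumptions, not anything taken from the screen shots in this article.

```r
library(shiny)

# User interface: one slider and one plot.
ui <- fluidPage(
  titlePanel("Histogram sketch"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server logic: redraw the histogram whenever the slider changes.
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$eruptions,
         breaks = input$bins,
         main = "Old Faithful eruptions",
         xlab = "Duration (minutes)")
  })
}

# Build the app object; in an interactive session, launch it with runApp(app).
app <- shinyApp(ui = ui, server = server)
```

Keeping the user interface and the server function as separate objects mirrors how Shiny applications are usually laid out, and the same app object is what gets deployed to a Shiny server.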
Conclusion
RStudio and RTVS are moving targets, and you will need to
monitor the progress of each as time goes on.
This article covers highlights of the tools but is not exhaustive. RTVS is rapidly expanding its features, but I
think the main point to consider is the direction of those features. Clearly, integration with
the Microsoft stack and Azure will be a high priority. Features to support operationalizing R
programs are on the horizon, and Microsoft is at the forefront of identifying
and supporting this need. However, if
you are not on board with the Microsoft ecosystem, support for other tools such
as Amazon Web Services, shinyapps.io, Shiny Server, Oracle, Jupyter Notebooks
(versus Azure ML notebooks), etc. may be
slow in coming. Visual Studio does support open source products
such as MySQL, PostgreSQL, Hadoop, Spark, and Python. RStudio is likely to continue focusing on data science features, as it
has been doing since its founding.
Better support for interactive visualizations, dynamic code, and data
wrangling can be expected.