dockerterm

A Docker console for RStudio

This details my motivation and efforts in creating dockerterm, an R package the provides an RStudio Addin for running a Docker container in the RStudio terminal.

Recently I’ve been using Docker quite a bit with various projects I’m working on. I’ve always liked the idea of Docker, but as I’ve become more familiar with it I’ve started to really appreciate its magic. As a “way too brief” overview, Docker provides a convenient way to manage system dependencies by creating lightweight virtual environments known as containers. The use cases for these containers are limitless, but in the context of what I’ve been doing I’ve mainly been using Docker in connection with R. This motivated me to explore what a connection between RStudio and Docker might look like. Essentially, I wanted a way to run R code from RStudio within a Docker container.

Now, before I get too far into this, it’s essential to note that RStudio Server plays very nicely with Docker and the combination of the two can provide the functionality I’m trying to achieve. In fact, the Rocker Project maintains excellent Docker images with various R and RStudio product configurations. When using Docker with RStudio Server, the Docker container acts as the server supporting R and the port used by RStudio Server is mapped to a port on the host machine. After properly starting a Docker container with RStudio server (ie docker run --rm -p 8787:8787 rocker/rstudio) a user simply needs to visit localhost:8787 in a browser on their machine to have access to RStudio and an r session running within the container. This is a workflow I use often. However, there are times when it would be nice to stay in my local version of RStudio and run code that I’m working on against a Docker container, which is how dockerterm was born.

Framework

As I thought about what I wanted to accomplish, an RStudio Addin seemed like the natural mechanism for delivery. The recent addition of the terminal pane to RStudio also seemed like a natural fit, especially since commands can be sent directly from an RStudio source pane to the termianl with CTRL/CMD + ALT + ENTER. The initial idea was to create an RStudio Addin that enabled a user to start a new Docker container based on a specified image that would run in the RStudio terminal. The user could interact with the “remote” R session by sending code from source files via the previously mentioned key combination.

Build

Once I had a loose concept, I started exploring how to make a working proof of concept. First, I needed some way to manipulate the RStudio terminal pane. Enter the rstudioapi package, which provides useful functions for safely interacting with RStudio through R code. For this project, I found the functions pre-pended with terminal to be most helpful for manipulating the terminal pane and executing terminal commands.

Once I could programatically manipulate the terminal pane, I needed a way to start up a Docker container running in the terminal. This can be simply accomplished using the Docker CLI and passing the appropriate options to docker run. I considered using the docker R package, but hesitated to introduce a dependency on python. However, I may move in this direction in the future.

Now I just needed a way to allow the user to specify the image for the Docker container. To accomplish this, I created a simple shiny application using miniUI (a shiny framework designed for RStudio Addin gadgets) that allows a user to select from an existing group of Docker images or to specify an image to download and run from Docker Hub. Users can also specify which command is run within the container, with the default set to R.

Once the user makes their selections and clicks on Run, a new terminal is created and the docker container is attached to that window. If R was passed in as the command, the container starts an r session that the user can interact with as previously mentioned.

PoC Performance

As seen in the clip above, this initial proof of concept works well, at least on my machine. There are some definite drawbacks and limitations that need to be addressed, but for the most part, I’m pleased with the initial functionality.

Limitations

Now, this integration between Docker and RStudio has some major (current) limitations. First and foremost, RStudio and the Docker container aren’t really aware of each other. This means that nifty RStudio things like code completion in the source panel based on the current R session aren’t available when running code in a Docker container. Plotting is also another major issue, since the Docker container doesn’t really have a display to send the plot to. The easiest way to view plots generated in the container is to save them to disk (the working directory in the container is the host directory the container was started from) and then viewing them on the host computer via some system viewer. Definitely no the ideal (or an efficient) workflow. Help files also don’t render in RStudio, althought this is a minor nuisance compared to the other issues.

Limitations aside, I do feel that this application still has its use cases. It provides an easy way to interactively run R code in a Docker container without having to stand up and access the container through RStudio Server. Using this framework, I can run check if a current analysis runs in a different system or I can even test the effect different system packages have on analysis results. In theory, I could use dockerterm to open several different containers based on different images and then run code against each one in sequence to see if an analysis is truly reproducable under different circumstances.

What’s Next

In my ideal vision, this would reach a level of integration similar to PyCharm’s Docker integration. It would be awesome to be able to use RStudio and then specify that the r session running underneath it is in a specific Docker container. However, this is a pretty ambitious goal and likely requires a skillset far beyond my own to pull off.

In the short term, I’d like to see if there’s a way to improve some of the limitations, namely around plotting. I’d also like to explore some other possible applications of this type of Docker and RStudio integration. Something with plumber comes to mind.

Regardless of what comes next, so far this has been a fun project to start to build out, and it’s generated some amount of excitement in the twitterverse. I’m anxious to see what dockerterm becomes.