GraalVM implementation of R, also known as FastR, is compatible with GNU R,can run R code at unparalleled performance,integrates with the GraalVMecosystem and provides additional R level features.
Warning: The support for R is currently experimental.
Installing R
The R language can be installed to a GraalVM build with the gu
command.See graalvm/bin/gu —help
for more information.
Requirements
GraalVM R engine requires the OpenMP runtime library and GFortran 3 runtime libraries to be installedon the target system. Following commands should install those dependencies.
- Ubuntu 18.04 and 19.04:
apt-get install libgfortran3 libgomp1
- Oracle Linux 7:
yum install libgfortran libgomp
- Oracle Linux 8:
yum install compat-libgfortran-48
- MacOS:
brew install gcc@4.9
On macOS it is necessary to run $R_HOME/bin/configure_fastr
.This script will attempt to locate the necessary runtime libraries on your computerand will fine-tune the the GraalVM R installation according to your system.On Linux systems, this script will check that the necessary libraries are installed, and if not,it will suggest how to install them.
Moreover, to install R packages that contain C/C++ or Fortran code, compilersfor those languages must be present on the target system. Following packagessatisfy the dependencies of the most common R packages:
- Ubuntu 18.04 and 19.04:
apt-get install libgfortran3 build-essential gfortran libxml2-dev libc++-dev
- Oracle Linux 7 and 8:
yum groupinstall 'Development Tools' && yum install gcc-gfortran bzip2 libxml2-devel
Note that you still need to install the GFortran 3 runtime libraries unless this isthe default version on your system.
Search Paths for Packages
The default R library location is within the GraalVM installation directory.In order to allow installation of additional packages for users thatdo not have write access to the GraalVM installation directory,edit the R_LIBS_USER
variable in the $R_HOME/etc/Renviron
file.
Running R Code
Run R code with the R
and Rscript
commands:
$ R [polyglot options] [R options] [filename]
$ Rscript [polyglot options] [R options] [filename]
GraalVM R engine uses the same polyglot options as other GraalVM languages and the same R options as GNU R, e.g., bin/R —vanilla
.Use —help
to print the list of supported options. The most important options include:
—jvm
to enable Java interoperability—polyglot
to enable interoperability with other GraalVM languages—vm.Djava.net.useSystemProxies=true
to pass any options to the JVM, this will be translated to-Djava.net.useSystemProxies=true
.
Note: unlike other GraalVM languages, R does not yet ship with aNative Image of its runtime.Therefore the —native
option, which is the default, will still start Rscript on top of JVM,but for the sake of future compatibility the Java interoperability will not be available in such case.
Users can optionally build the native image using:
jre/languages/R/bin/install_r_native_image
Running R Extensions
The GraalVM R engine can run R extensions in two modes:
- native: the native machine code is run directly on your CPU, this is the same as how GNU-R runs R extensions.
- llvm: if the LLVM bitcode is available, it can be interpreted by GraalVM LLVM.
The native mode is better suited for code that does not extensively interact with the R API, for example,plain C or Fortran numerical computations working on primitive arrays. The llvm mode provides significantlybetter performance for extensions that frequently call between R and the C/C++ code, because GraalVM LLVMinterpreter is also partially evaluated by the Truffle library like the R code, both can be inlined and optimizedas one compilation unit. Moreover, GraalVM LLVM is supported byGraalVM tools which allows to, for instance,debug R and C code together.
In one GraalVM R process, any R package can be loaded in either mode. That is, GraalVM R supportsmixing packages loaded in the native mode with packages loaded in the llvm mode in one process.
Generating LLVM Bitcode
As of version 19.3.0, the GraalVM R engine is configured to use theLLVM toolchainto compile R packages native code. This toolchain produces standard executable binaries fora given system, but it also embeds the corresponding LLVM bitcode into them.The binaries produced by the LLVM Toolchain can be loaded in both modes: native or llvm.
The GraalVM R engine can be reconfigured to use your system default compilerswhen installing R packages by running
# use local installation of GGC:
$ R -e 'fastr.setToolchain("native")'
# to revert back to using the GraalVM's LLVM toolchain:
$ R -e 'fastr.setToolchain("llvm")'
Using the system default compilers may be more reliable, but you loose theability to load the R packages built with the LLVM toolchain in the llvm mode,because they will not contain the embedded bitcode. Moreover, mixing packagesbuilt by the local system default compilers and packages built by the LLVMtoolchain in one GraalVM R process may cause linking issues.
Choosing the Running Mode
Starting from the version 19.3.0, the GraalVM R engine uses the following defaults:
- native mode to load the packages
- llvm toolchain to build their sources
To enable the llvm mode for loading the packages, use —R.BackEnd=llvm
.You can also enable each mode selectively for given R packages by using:
—R.BackEndLLVM=package1,package2
—R.BackEndNative=package1,package2
GraalVM R Engine Compatibility
GraalVM implementation of R, known as FastR, is based on GNU R and reusesthe base packages. It is currently based on GNU-R 3.6.1, and moves to new majorversions of R as they become available and stable. The FastR project, maintains an extensive set of unittests for all aspects of the R language and the builtin functionality, and thesetests are available as part of the R source code. GraalVM R engine aims to befully compatible with GNU R, including its native interface as used by R extensions. Itcan install and run unmodified complex R packages like ggplot2
, Shiny
, orRcpp
. As some packages rely on unspecified behavior or implementation detailsof GNU-R, support for packages is work in progress, and some packages might notinstall successfully or work as expected.
Packages can be installed using the install.packages
function or the R CMD INSTALL
shell command.By default, R uses fixed snapshot of the CRAN repository1.This behavior can be overridden by explicitly setting the repos
argument of the install.packages
function.This functionality does not interfere with the checkpoint
package. If you are behind a proxy server, makesure to configure the proxy either with environment variables or using the JVM options,e.g., —vm.Djava.net.useSystemProxies=true
.
Versions of some packages specifically patched for GraalVM implementation of R can be installed using the install.fastr.packages
function that downloads them from the GitHub repository.Currently, those are rJava
and data.table
.
Known limitations of GraalVM implementation of R compared to GNU R:
- Only small parts of the low-level
graphics
package are functional. However, thegrid
package is supported and R can install and run packages based on it likeggplot2
. Support for thegraphics
package in R is planned for future releases. - Encoding of character vectors. Related builtins (e.g.,
Encoding
) are available, but do not execute any useful code. Character vectors are represented as Java Strings and therefore encoded in UTF-16 format. GraalVM implementation of R will add support for encoding in future releases. - Some parts of the native API (e.g.,
DATAPTR
) expose implementation details that are hard to emulate for alternative implementations of R. These are implemented as needed while testing the GraalVM implementation of R with various CRAN packages.
You can use the compatibility checker to find whether the CRAN packages you are interested in are tested on GraalVM and whether the tests pass successfully.
High Performance
GraalVM runtime optimizes R code that runs for extended periods of time.The speculative optimizations based on the runtime behavior of the R code and dynamic compilation employed by GraalVM runtime are capable of removing most of the abstraction penalty incurred by the dynamism and complexity of the R language.
Let us look at an algorithm in R code. The following example calculates themutual information of a large matrix:
x <- matrix(runif(1000000), 1000, 1000)
mutual_R <- function(joint_dist) {
joint_dist <- joint_dist/sum(joint_dist)
mutual_information <- 0
num_rows <- nrow(joint_dist)
num_cols <- ncol(joint_dist)
colsums <- colSums(joint_dist)
rowsums <- rowSums(joint_dist)
for(i in seq_along(1:num_rows)){
for(j in seq_along(1:num_cols)){
temp <- log((joint_dist[i,j]/(colsums[j]*rowsums[i])))
if(!is.finite(temp)){
temp = 0
}
mutual_information <-
mutual_information + joint_dist[i,j] * temp
}
}
mutual_information
}
system.time(mutual_R(x))
# user system elapsed
# 1.321 0.010 1.279
Algorithms such as this one usually require C/C++ code to run efficiently:2
if (!require('RcppArmadillo')) {
install.packages('RcppArmadillo')
library(RcppArmadillo)
}
library(Rcpp)
sourceCpp("r_mutual.cpp")
x <- matrix(runif(1000000), 1000, 1000)
system.time(mutual_cpp(x))
# user system elapsed
# 0.037 0.003 0.040
(Uses r_mutual.cpp.)However, after a few iterations, GraalVM runs the R code efficiently enough tomake the performance advantage of C/C++ negligible:
system.time(mutual_R(x))
# user system elapsed
# 0.063 0.001 0.077
GraalVM implementation of R is primarily aimed at long-running applications. Therefore, the peak performance is usually only achieved after a warmup period. While startup time is currently slower than GNUR’s, due to the overhead from Java class loading and compilation, future releases will contain a native image of R with improved startup.
GraalVM Integration
The R language integration with the GraalVM ecosystem includes:
- seamless interop with other GraalVM languages and with Java
- debugging with Chrome DevTools
- CPU and memory profiling
- VisualVM integration
To start debugging the code start the R script with —inspect
option
$ Rscript --inspect myScript.R
Note that GNU R compatible debugging using, for example, debug(myFunction)
is also supported.
Interoperability
GraalVM supports several other programming languages, including JavaScript, Ruby, Python, and LLVM.GraalVM implementation of R also provides an API for programming language interoperability that lets you execute code from any other language that GraalVM supports. Note that you must start the R script with —polyglot
to have access to other GraalVM languages.
GraalVM execution of R provides the following interoperability primitives:
eval.polyglot('languageId', 'code')
evaluates code in some other language, thelanguageId
can be, e.g.,js
.eval.polyglot(path = '/path/to/file.extension')
evaluates code loaded from a file. The language is recognized from the extension.export('polyglot-value-name', rObject)
exports an R object so that it can be imported by other languages.import('exported-polyglot-value-name')
imports a polyglot value exported by some other language.
Please use the ?functionName
syntax to learn more. The following example demonstrates the interoperability features:
# get an array from Ruby
x <- eval.polyglot('ruby', '[1,2,3]')
print(x[[1]])
# [1] 1
# get a JavaScript object
x <- eval.polyglot(path='r_example.js')
print(x$a)
# [1] "value"
# use R vector in JavaScript
export('robj', c(1,2,3))
eval.polyglot('js', paste0(
'rvalue = Polyglot.import("robj"); ',
'console.log("JavaScript: " + rvalue.length);'))
# JavaScript: 3
# NULL -- the return value of eval.polyglot
(Uses r_example.js.)
R vectors are presented as arrays to other languages. This includes single element vectors, e.g. 42L
or NA
.However, single element vectors that do not contain NA
can be typically used in places where the otherlanguages expect a scalar value. Array subscript or similar operation can be used in other languages to accessindividual elements of an R vector. If the element of the vector is not NA
, the actual valueis returned as a scalar value. If the element is NA
, then a special object that looks like null
is returned. The following Ruby code demonstrates this.
vec = Polyglot.eval("R", "c(NA, 42)")
p vec[0].nil?
# true
p vec[1]
# 42
vec = Polyglot.eval("R", "42")
p vec.to_s
# "[42]"
p vec[0]
# 42
The foreign objects passed to R are implicitly treated as specific R types.The following table gives some examples.
Example of foreign object (Java) | Viewed ‘as if’ on the R side |
---|---|
int[] {1,2,3} | c(1L,2L,3L) |
int[][] { {1, 2, 3}, {1, 2, 3} } | matrix(c(1:3,1:3),nrow=3) |
int[][] { {1, 2, 3}, {1, 3} } | not supported: raises error |
Object[] {1, ‘a’, ‘1’} | list(1L, ‘a’, ‘1’) |
42 | 42L |
In the following code example, we can simply just pass the Ruby array to the R built-in function sum
,which will work with the Ruby array as if it was integer vector.
sum(eval.polyglot('ruby', '[1,2,3]'))
Foreign objects can be also explicitly wrapped into adapters that make them look like the desired R type.In such a case, no data copying occurs if possible. The code snippet below shows the most common use cases.
# gives list instead of an integer vector
as.list(eval.polyglot('ruby', '[1,2,3]'))
# assume the following Java code:
# public class ClassWithArrays {
# public boolean[] b = {true, false, true};
# public int[] i = {1, 2, 3};
# }
x <- new('ClassWithArrays'); # see Java interop below
as.list(x)
# gives: list(c(T,F,T), c(1L,2L,3L))
For more details, please refer tothe executable specificationof the implicit and explicit foreign objects conversions.
Note that R contexts started from other languages or Java (as opposed to via the bin/R
script) will default to non-interactive mode, similar to bin/Rscript
.This has implications on console output (results are not echoed) and graphics (output defaults to a file instead of a window), and some packages may behave differently in non-interactive mode.
See the Polyglot Reference and theEmbedding documentationfor more information about interoperability with other programming languages.
Interoperability with Java
GraalVM R engine provides built-in interoperability with Java. Java class objects can be obtained via java.type(…)
.The standard new
function interprets string arguments as a Java class if such class exists. new
also accepts Java types returned from java.type
.Fields and methods of Java objects can be accessed using the $
operator.Additionally, you can use awt(…)
to open an R drawing devicedirectly on a Java Graphics surface, for more details see Java Based Graphics.
The following example creates a new Java BufferedImage
object, plots random data to it using R’s grid
package,and shows the image in a window using Java’s AWT
framework. Note that you must start the R script with —jvm
to have access to Java interoperability.
library(grid)
openJavaWindow <- function () {
# create image and register graphics
imageClass <- java.type('java.awt.image.BufferedImage')
image <- new(imageClass, 450, 450, imageClass$TYPE_INT_RGB);
graphics <- image$getGraphics()
graphics$setBackground(java.type('java.awt.Color')$white);
grDevices:::awt(image$getWidth(), image$getHeight(), graphics)
# draw image
grid.newpage()
pushViewport(plotViewport(margins = c(5.1, 4.1, 4.1, 2.1)))
grid.xaxis(); grid.yaxis()
grid.points(x = runif(10, 0, 1), y = runif(10, 0, 1),
size = unit(0.01, "npc"))
# open frame with image
imageIcon <- new("javax.swing.ImageIcon", image)
label <- new("javax.swing.JLabel", imageIcon)
panel <- new("javax.swing.JPanel")
panel$add(label)
frame <- new("javax.swing.JFrame")
frame$setMinimumSize(new("java.awt.Dimension",
image$getWidth(), image$getHeight()))
frame$add(panel)
frame$setVisible(T)
while (frame$isVisible()) Sys.sleep(1)
}
openJavaWindow()
For more information on FastR interoperability with Java and other languages implemented with Truffle framework,refer to the Interoperability tutorial.
GraalVM implementation of R provides its own rJava compatible replacement package available at GitHub,which can be installed using:
$ R -e "install.fastr.packages('rJava')"
GraalVM R Engine Additional Features
Java Based Graphics
The GraalVM implementation of R includes its own Java based implementation of the grid
package and the following graphics devices: png
, jpeg
, bmp
, svg
and awt
(X11
is aliased to awt
). The graphics
package and most of its functions are not supported at the moment.
The awt
device is based on the Java Graphics2D
object and users can pass it their own Graphics2D
object instance when opening the device using the awt
function, as shown in the Java interop example.When the Graphics2D
object is not provided to awt
, it opens a new window similarly to X11
.
The svg
device in GraalVM implementation of R generates more lightweight SVG code than the svg
implementation in GNU R.Moreover, functions tailored to manipulate the SVG device are provided: svg.off
and svg.string
.The SVG device is demonstrated in the following code sample. Please use the ?functionName
syntax to learn more.
library(lattice)
svg()
mtcars$cars <- rownames(mtcars)
print(barchart(cars~mpg, data=mtcars))
svgCode <- svg.off()
cat(svgCode)
In-Process Parallel Execution
GraalVM R engine adds a new cluster type SHARED
for the parallel
package. This cluster starts new jobs as new threads inside the same process. Example:
library(parallel)
cl0 <- makeCluster(7, 'SHARED')
clusterApply(cl0, seq_along(cl0), function(i) i)
Worker nodes inherit the attached packages from the parent node with copy-on-write semantics, but not the global environment.This means that you do not need to load again R libraries on the worker nodes but values (including functions) from the globalenvironment have to be transfered to the worker nodes, e.g., using clusterExport
.
Note that unlike with the FORK
or PSOCK
clusters the child nodes in SHARED
cluster are running in the same process,therefore, e.g., locking files with lockfile
or flock
will not work. Moreover, the SHARED
cluster is based onan assumption that packages’ native code does not mutate shared vectors (which is a discouraged practice) and is threadsafe and re-entrant on the C level.
If the code that you want to parallelize does not match these expectations, you can still use the PSOCK
cluster with the GraalVM R engine.The FORK
cluster and functions depending solely on forking (e.g., mcparallel
) are not supported at the moment.
1 More technically, GraalVM implementation of R uses a fixed MRAN URL from $R_HOME/etc/DEFAULT_CRAN_MIRROR
, which is a snapshot of theCRAN repository as it was visible at a given date from the URL string.
2 When this example is run for the first time, it installs the RcppArmadillo
package,which may take few minutes. Note that this example can be run in both R executedwith GraalVM and GNU R.