# Installing the Arrow Package on Linux

On macOS and Windows, when you install.packages("arrow"), you get a binary package that contains Arrow’s C++ dependencies along with it. On Linux, install.packages() retrieves a source package that has to be compiled locally, and C++ dependencies need to be resolved as well. Generally for R packages with C++ dependencies, this requires either installing system packages, which you may not have privileges to do, or building the C++ dependencies separately, which introduces all sorts of additional ways for things to go wrong.

Our goal is to make install.packages("arrow") “just work” for as many Linux distributions, versions, and configurations as possible. This document describes how it works and the options for fine-tuning Linux installation. The intended audience for this document is arrow R package users on Linux, not developers. If you’re contributing to the Arrow project, see vignette("developing", package = "arrow") for guidance on setting up your development environment.

Note also that if you use conda to manage your R environment, this document does not apply. You can conda install -c conda-forge --strict-channel-priority r-arrow and you’ll get the latest official release of the R package along with any C++ dependencies.

Having trouble installing arrow? See the “Troubleshooting” section below.

# Installation basics

Install the latest release of arrow from CRAN with

install.packages("arrow")

Daily development builds, which are not official releases, can be installed from the Ursa Labs repository:

install.packages("arrow", repos = "https://arrow-r-nightly.s3.amazonaws.com")

or for conda users via:

conda install -c arrow-nightlies -c conda-forge --strict-channel-priority r-arrow

You can also install the R package from a git checkout:

git clone https://github.com/apache/arrow
cd arrow/r
R CMD INSTALL .

If you don’t already have the Arrow C++ libraries on your system, when installing the R package from source, it will also download and build the Arrow C++ libraries for you. To speed installation up, you can set

export LIBARROW_BINARY=true

to look for C++ binaries prebuilt for your Linux distribution/version. Alternatively, you can set

export LIBARROW_MINIMAL=false

to build the Arrow libraries from source with optional features such as compression libraries enabled. This increases the build time but provides many useful features. Prebuilt binaries are built with this flag enabled, so using them gives you the full functionality as well.

Setting the NOT_CRAN=true environment variable has the same effect as setting both LIBARROW_BINARY=true and LIBARROW_MINIMAL=false.
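As a minimal sketch of the settings described above, the following shell session exports both variables and lists them so you can confirm what child processes, including R, will inherit. The final install command is left commented out, and the CRAN mirror URL in it is an assumption chosen for illustration; substitute whichever mirror you normally use.

```shell
# Ask the installer to look for a prebuilt C++ binary and, if it has to
# build from source, to enable the optional features.
export LIBARROW_BINARY=true
export LIBARROW_MINIMAL=false

# Show what child processes (including R) will inherit.
env | grep '^LIBARROW_' | sort

# Then install as usual. The mirror URL is an assumption; substitute your own.
# Rscript -e 'install.packages("arrow", repos = "https://cloud.r-project.org")'
```

Because the variables are exported, they apply to every R process started from this shell session, not only to the next command.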

## Helper function: install_arrow()

If you already have arrow installed and want to upgrade to a different version, install a development build, or try to reinstall and fix issues with Linux C++ binaries, you can call install_arrow(). install_arrow() provides some convenience wrappers around the various environment variables described below. This function is part of the arrow package, and it is also available as a standalone script, so you can use it without first installing the package:

source("https://raw.githubusercontent.com/apache/arrow/master/r/R/install-arrow.R")

install_arrow() will install from CRAN, while install_arrow(nightly = TRUE) will give you a development build. install_arrow() does not require environment variables to be set in order to satisfy C++ dependencies.
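If you prefer to drive this from a shell, one possible approach is to source the standalone script and call install_arrow() in a single Rscript invocation. The wrapper below is only a sketch: the function name install_arrow_nightly is hypothetical and not part of arrow, and the function is defined but not called here because calling it downloads and installs the package.

```shell
# Hypothetical wrapper (the function name is ours, not part of arrow):
# source the standalone script, then install a development build.
install_arrow_nightly() {
  # Fail early with a clear message if R is not on the PATH.
  command -v Rscript >/dev/null 2>&1 || { echo "Rscript not found" >&2; return 1; }
  Rscript -e 'source("https://raw.githubusercontent.com/apache/arrow/master/r/R/install-arrow.R")' \
          -e 'install_arrow(nightly = TRUE)'
}

# Usage (downloads and installs, so it is not run here):
# install_arrow_nightly
```

Dropping the nightly = TRUE argument would install the CRAN release instead.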

Note that, unlike packages like tensorflow, blogdown, and others that require external dependencies, you do not need to run install_arrow() after a successful arrow installation.

# Contributing

As mentioned above, please report an issue if you find ways to improve the installation process. If you find that your Linux distribution or version is not supported, we welcome the contribution of Docker images (hosted on Docker Hub) that we can use in our continuous integration. These Docker images should be minimal, containing only R and the dependencies it requires. (For reference, see the images that R-hub uses.)

You can test the arrow R package installation using the docker-compose setup included in the apache/arrow git repository. For example,

R_ORG=rhub R_IMAGE=ubuntu-gcc-release R_TAG=latest docker-compose build r
R_ORG=rhub R_IMAGE=ubuntu-gcc-release R_TAG=latest docker-compose run r

installs the arrow R package, including the C++ source build, on the rhub/ubuntu-gcc-release image.
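To try several base images without retyping the three variables, the two commands above could be wrapped in a small helper. This is a sketch only: test_arrow_install and the DRY_RUN switch are hypothetical names introduced here, not part of the apache/arrow repository, and the helper assumes it is run from the root of an apache/arrow checkout.

```shell
# Hypothetical helper: build, then run, the "r" docker-compose service
# against one base image. Arguments: org, image, and an optional tag
# (defaults to "latest").
test_arrow_install() {
  for step in build run; do
    if [ -n "${DRY_RUN:-}" ]; then
      # Print the command instead of executing it.
      echo "R_ORG=$1 R_IMAGE=$2 R_TAG=${3:-latest} docker-compose $step r"
    else
      # Stop at the first failing step.
      R_ORG="$1" R_IMAGE="$2" R_TAG="${3:-latest}" docker-compose "$step" r || return 1
    fi
  done
}

# Preview the commands for the image used in the example above.
DRY_RUN=1 test_arrow_install rhub ubuntu-gcc-release
```

With DRY_RUN unset, the same call runs the build and then the installation test, returning a nonzero status if either step fails.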