2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:
- rprojroot: This is the main package with functions to help you express paths in a way that will "just work" when developing interactively in an RStudio Project and when you render your file.
- here: A lightweight wrapper around rprojroot that anticipates the most likely scenario: you want to write paths relative to the top-level directory, defined as an RStudio project or Git repo. TRY THIS FIRST.
I love these packages so much I wrote an ode to here.
I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆
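For a sense of what the recommended approach looks like, here is a minimal sketch using here. The `data` sub-directory and `raw.csv` file name are hypothetical placeholders, and the call is guarded so the snippet still runs where here is not installed.

```R
# Sketch only: "data" and "raw.csv" are hypothetical names, not from this gist.
csv_path <- if (requireNamespace("here", quietly = TRUE)) {
  # here::here() builds a path starting from the project root that here
  # detects (for example, a directory holding an .Rproj file or a Git repo).
  here::here("data", "raw.csv")
} else {
  NA_character_  # here is not installed; nothing to demonstrate
}
csv_path
```

The same call should give the same file whether it runs interactively in the RStudio Project or from a script in a sub-directory, which is the whole point.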
Include this in the `.Rprofile` in the top-level directory of an RStudio project:
RPROJ <- list(PROJHOME = normalizePath(getwd()))
attach(RPROJ)
rm(RPROJ)
Then build paths like so:
file.path(PROJHOME, <the_sub_dir>, <the_file_name>)
and never worry about working directory again(?). Read on for the problem I am trying to solve.
My near-daily dilemma
- An R project -- and RStudio Project -- that is big enough to require sub-directory structure
- R scripts and R Markdown files in more than one sub-directory that I want to `render()` and `source()`
  - During development and informal testing, I want to iterate fast and enjoy RStudio's facilities for running bits of code or sourcing/compiling entire files. The "Compile Notebook" and "Knit HTML" buttons (and `knitr` and `rmarkdown` packages in general) assume that working directory = directory where source file lives. In some rather theoretical sense, this is not strictly true, but life is much easier if you resign yourself to this.
  - In "production," I want to use a `Makefile` or similar to run scripts and compile R Markdown; this file can obviously live in only one place, with the most obvious and canonical choice being the top-level Project directory, where no R scripts or R Markdown files are to be found.
- I'm against `setwd()` for all the usual reasons, e.g. portability.
So, what's working directory going to be, folks? The above rules and needs admit no obvious solution. Is everyone else faffing around with working directory as much as I am?
In the pre-RStudio era, I used to define a path object at the top of every file, `whereAmI`, and I constructed absolute paths based on that. I'm returning to this idea but want to upgrade the smarts, so the solution is more general. I think I've answered my own question.
I define the Project home directory to be the directory where the `<project_name>.Rproj` file sits.
I cannot believe I am using `attach()` but here goes.
Create a `.Rprofile` file in the Project home directory that includes these lines:
RPROJ <- list(PROJHOME = normalizePath(getwd()))
attach(RPROJ)
cat("sourcing Project-specific .Rprofile\n")
cat('retrieve the top-level Project directory at any time with PROJHOME or via get("PROJHOME", "RPROJ"):\n',
get("PROJHOME", "RPROJ"), "\n")
rm(RPROJ)
This creates a new environment on the search path, named `RPROJ`, containing an object `PROJHOME` giving the normalized absolute path to Project home. Since the value is determined at the time of R session start, this should work for different collaborators/machines/OSes. In theory.
The easiest way to retrieve the Project home is simply via `PROJHOME`, though in theory that could be masked by objects with the same name earlier in the search path. The most proper way to access is via `get("PROJHOME", "RPROJ")`.
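The masking scenario is easy to reproduce. A minimal sketch, in which `tempdir()` stands in for the Project home directory and `"somewhere/else"` is a made-up value:

```R
# Sketch only: tempdir() stands in for the Project home directory.
RPROJ <- list(PROJHOME = normalizePath(tempdir()))
attach(RPROJ)
rm(RPROJ)

# A same-named object earlier in the search path (here, the current
# environment) masks the attached value ...
PROJHOME <- "somewhere/else"
masked <- PROJHOME

# ... but naming the attached environment bypasses the mask.
proper <- get("PROJHOME", "RPROJ")

# Clean up so the demo leaves no trace.
rm(PROJHOME)
detach("RPROJ", character.only = TRUE)
```

Here `masked` holds the made-up value while `proper` holds the normalized path that was attached.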
Now I can build absolute-but-portable paths like so:
file.path(PROJHOME, <the_sub_dir>, <the_file_name>)
Here's what I see at R session start:
sourcing Project-specific .Rprofile
retrieve the top-level Project directory at any time with PROJHOME or via get("PROJHOME", "RPROJ"):
/Users/jenny/path/to/my-project
The interactive workspace is how I left it the last time I worked on this Project; in particular, it's not cluttered up with `PROJHOME`. The working directory of R Console is also how I left it, though this suddenly becomes much less important and I think this work style should eliminate fussing around with working directory. I can clean out the workspace with `rm(list = ls())` or RStudio's broom button without harming my ability to build robust paths.
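The reason the workspace clean-up is harmless can be checked directly: the attached object lives on the search path, not in the global environment. A small sketch, again assuming `tempdir()` as a stand-in for the Project home:

```R
# Sketch only: tempdir() stands in for the Project home directory.
RPROJ <- list(PROJHOME = normalizePath(tempdir()))
attach(RPROJ)
rm(RPROJ)

# rm(list = ls()) only targets what ls() reports for the workspace.
in_workspace <- "PROJHOME" %in% ls(globalenv())

# The object is still found by walking the search path.
still_findable <- exists("PROJHOME")

detach("RPROJ", character.only = TRUE)
```

With nothing else named `PROJHOME` around, `in_workspace` is `FALSE` and `still_findable` is `TRUE`.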
Added 2014-12-12, after using above approach for a couple of months. Since people from #rrhack showed a glimmer of interest, want to add this missing piece.
Above approach will work if and only if `path/to/my-project/.Rprofile` is processed upon R start up. When does that happen?
- Use of R through your RStudio Project. Behind the scenes RStudio launches R with working directory set to Project's home directory, before it restores working directory to its last known state.
- Any R process with working directory of `path/to/my-project/`.
What other situations are likely to arise in practice? When will `path/to/my-project/.Rprofile` not get processed, leaving `PROJHOME` undefined?
- You have R scripts or RMarkdown files in a subdirectory, e.g., `path/to/my-project/code/my_script.R`.
- You execute or render those files outside of RStudio from a working directory other than `path/to/my-project/`, e.g., via Make or from the shell: `~/path/to/my-project/code$ Rscript my_script.R`
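One way to make this failure mode obvious is to check for `PROJHOME` at the top of a script and stop with a clear message. A minimal sketch; `require_defined()` is a hypothetical helper, not part of the setup described here:

```R
# Hypothetical helper: return the named object, or fail with a hint
# about the most likely cause.
require_defined <- function(name = "PROJHOME") {
  if (!exists(name)) {
    stop(name, " is not defined; was the Project's .Rprofile processed?",
         call. = FALSE)
  }
  get(name)
}

# Intended use at the top of a script (not run here):
# input <- file.path(require_defined(), "code", "my_script.R")
```

Failing early like this is arguably friendlier than the "object 'PROJHOME' not found" error that would otherwise surface at the first `file.path()` call.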
My current solution: create an additional `.Rprofile` in any subdirectory that holds R scripts or RMarkdown files. Continuing the above example, we create `path/to/my-project/code/.Rprofile`. The only difference is the specification of `PROJHOME` as the parent of working directory:
```R
RPROJ <- list(PROJHOME = normalizePath(".."))
attach(RPROJ)
rm(RPROJ)
```
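If maintaining one `.Rprofile` per sub-directory gets tedious, the parent-directory logic can be generalized by walking upward until a `.Rproj` file is found. This is only a sketch of that idea, not what this gist does: `find_proj_home()` is a hypothetical helper, and it assumes the Project home is marked by exactly the `.Rproj` file described earlier.

```R
# Hypothetical helper: climb from `start` until a directory containing
# a *.Rproj file is found, and return that directory.
find_proj_home <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (length(list.files(dir, pattern = "\\.Rproj$")) > 0) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      # Reached the filesystem root without finding a Project file.
      stop("no .Rproj file found at or above ", start, call. = FALSE)
    }
    dir <- parent
  }
}

# Intended use in any sub-directory .Rprofile (not run here):
# RPROJ <- list(PROJHOME = find_proj_home())
```

The same file could then be copied unchanged into sub-directories at any depth, at the cost of a few directory listings at start up.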
Jekyll is a static website generator. It supports the construction of relative paths through the notion that a website has a root directory. Within files for individual webpages, the path to root can be specified via YAML frontmatter. This, in turn, allows the construction of paths relative to root. The rationale is to encourage use of relative paths over absolute and to make it easy to develop content before the entire directory structure of a site is fixed.
Example of YAML frontmatter specifying relative path to website root:
---
title: My Page title
root: "../"
---
and here's how links would be built within a page:
<img src="{{ post.root }}images/happy.png" />
<a href="{{ post.root }}2010/01/01/another_post">Relative link to another post</a>
The Project home directory `PROJHOME` is equivalent to Jekyll's website root directory `post.root`. The use of `.Rprofile` to define `PROJHOME` is equivalent to Jekyll's use of `root: "../"` in YAML frontmatter.
Example from this stackoverflow thread.
This stackoverflow thread is kind of relevant.
It was helpful to re-read the Environments chapter of Hadley's Advanced R book.
Should I just go ahead and set an environment variable, i.e. via `Sys.setenv()`?
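For comparison, the environment-variable variant might look like the sketch below. It assumes `getwd()` is the Project home when the code runs, as it would be in a top-level `.Rprofile`; the `data` and `raw.csv` names are hypothetical.

```R
# Sketch only: assumes getwd() is the Project home at this point.
Sys.setenv(PROJHOME = normalizePath(getwd()))

# Environment variables come back as strings, so paths are built the
# same way; "data" and "raw.csv" are hypothetical names.
csv_path <- file.path(Sys.getenv("PROJHOME"), "data", "raw.csv")
```

An environment variable cannot be masked by an R object of the same name, and `Sys.getenv()` returns an empty string rather than an error when the variable is unset, so a typo or a missing `.Rprofile` would fail less loudly.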
Should I worry about where the `RPROJ` environment ends up in the search path?
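The position can at least be inspected. A small sketch, with `tempdir()` standing in for the Project home:

```R
# Sketch only: tempdir() stands in for the Project home directory.
RPROJ <- list(PROJHOME = normalizePath(tempdir()))
attach(RPROJ)  # attach() defaults to pos = 2, just after the global environment
rm(RPROJ)

# Where did RPROJ land on the search path?
pos_now <- match("RPROJ", search())

detach("RPROJ", character.only = TRUE)
```

Packages attached later with `library()` also go to position 2 by default, so `RPROJ` drifts further down the search path as a session goes on; any of them exporting an object called `PROJHOME` would then take precedence over it.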
This seems tied up with other issues, like building whole websites with `rmarkdown`, which currently also has a very "one directory to rule them all" approach (I'm looking at you `_output.yaml`, `libs`, `include`). It needs to be easier to designate home directory for a project or website and then write paths relative to that. The way `jekyll` works seems worth considering.