1  Introduction

In R, the fundamental unit of shareable code is the package. A package bundles together code, data, documentation, and tests, and is easy to share with others. As of June 2022, there were over 18,000 packages available on the Comprehensive R Archive Network, or CRAN, the public clearing house for R packages. This huge variety of packages is one of the reasons that R is so successful: the chances are that someone has already solved a problem that you’re working on, and you can benefit from their work by downloading their package.

If you’re reading this book, you already know how to work with packages in the following ways:
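A minimal sketch of that everyday usage, with the base package tools standing in as a placeholder for any installed package. The name "somepkg" is hypothetical, so the install step is left commented out.

```r
# install.packages("somepkg")  # install from CRAN; "somepkg" is a placeholder name

library(tools)           # attach an installed package for the current session
help(package = "tools")  # show the package's help index
```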

The goal of this book is to teach you how to develop packages so that you can write your own, not just use other people’s. Why write a package? One compelling reason is that you have code that you want to share with others. Bundling your code into a package makes it easy for other people to use it, because like you, they already know how to use packages. If your code is in a package, any R user can easily download it, install it and learn how to use it.

But packages are useful even if you never share your code. As Hilary Parker says in her introduction to packages: “Seriously, it doesn’t have to be about sharing your code (although that is an added benefit!). It is about saving yourself time.” Organising code in a package makes your life easier because packages come with conventions. For example, you put R code in R/, you put tests in tests/ and you put data in data/. These conventions are helpful because:

It’s even possible to use packages to structure your data analyses (e.g., Marwick, Boettiger, and Mullen (2018a) or Marwick, Boettiger, and Mullen (2018b)), although we won’t delve deeply into that use case here.
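To make the directory conventions mentioned above (R/, tests/ and data/) concrete, here is a small base R sketch that lays out that skeleton in a temporary directory. The package name mypkg and the function add_one() are hypothetical, and the result is only an illustration of the layout: a real package also needs metadata such as a DESCRIPTION file.

```r
# Hypothetical package name and location, for illustration only
pkg_dir <- file.path(tempdir(), "mypkg")

# Conventional subdirectories: R code, tests, and data
for (d in c("R", "tests", "data")) {
  dir.create(file.path(pkg_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# A single (hypothetical) function stored under R/
writeLines(
  "add_one <- function(x) x + 1",
  file.path(pkg_dir, "R", "add_one.R")
)

# List everything that was created
list.files(pkg_dir, recursive = TRUE, include.dirs = TRUE)
```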

1.1 Philosophy

This book espouses our philosophy of package development: anything that can be automated, should be automated. Do as little as possible by hand. Do as much as possible with functions. The goal is to spend your time thinking about what you want your package to do rather than thinking about the minutiae of package structure.

This philosophy is realized primarily through the devtools package, which is the public face for a suite of R functions that automate common development tasks. The release of version 2.0.0 in October 2018 marked its internal restructuring into a set of more focused packages, with devtools becoming more of a meta-package. The usethis package is the sub-package you are most likely to interact with directly; we explain the devtools-usethis relationship in Section 3.2.
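As a rough illustration of "do as much as possible with functions", the sketch below strings together four devtools functions that cover common development tasks. The wrapper dev_cycle() is a hypothetical helper written for this example, not something devtools provides, and defining it runs nothing until you call it on a package directory.

```r
# Hypothetical convenience wrapper; only the devtools:: calls are real functions
dev_cycle <- function(pkg = ".") {
  devtools::load_all(pkg)   # load the package code for interactive use
  devtools::document(pkg)   # regenerate documentation from roxygen comments
  devtools::test(pkg)       # run the package's tests
  devtools::check(pkg)      # build the package and run R CMD check
}
```

In practice these functions are usually called one at a time as you work, rather than bundled together like this.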

As always, the goal of devtools is to make package development as painless as possible. It encapsulates the best practices developed by Hadley Wickham, initially from his years as a prolific solo developer. More recently, he has assembled a team of developers at RStudio, who collectively look after hundreds of open source R packages, including those known as the tidyverse. The reach of this team allows us to explore the space of all possible mistakes at an extraordinary scale. Fortunately, it also affords us the opportunity to reflect on both the successes and failures, in the company of expert and sympathetic colleagues. We try to develop practices that make life more enjoyable for both the maintainer and users of a package. The devtools meta-package is where these lessons are made concrete.

devtools works hand-in-hand with RStudio, which we believe is the best development environment for most R users. The main alternative is Emacs Speaks Statistics (ESS), which is a rewarding environment if you’re willing to put in the time to learn Emacs and customise it to your needs. The history of ESS stretches back over 20 years (predating R!), but it’s still actively developed and many of the workflows described in this book are also available there. For those loyal to vim, we recommend the Nvim-R plugin.

RStudio

Throughout the book, we highlight specific ways that RStudio can expedite your package development workflow, in specially formatted sections like this.

Together, devtools and RStudio insulate you from the low-level details of how packages are built. As you start to develop more packages, we highly recommend that you learn more about those details. The definitive resource for the details of package development is the official Writing R Extensions manual [1]. However, this manual can be hard to understand if you’re not already familiar with the basics of packages. It’s also exhaustive, covering every possible package component, rather than focusing on the most common and useful components, as this book does. Writing R Extensions is a useful resource once you’ve mastered the basics and want to learn what’s going on under the hood.

1.2 In this book

The first part of the book gives you the tools you need to start your package development journey, and we highly recommend that you read it in order. We begin in Chapter 2 with a run through the complete development of a small package. It’s meant to paint the big picture and suggest a workflow, before we descend into the detailed treatment of the key components of an R package. Then in Chapter 3 you’ll learn how to prepare your system for package development, and in Chapter 4 you’ll learn the basic structure of a package and how that varies across different states. Next, in Chapter 5, we’ll cover the core workflows that come up repeatedly for package developers. We then conclude the first part of the book in Chapter 6 with another case study, this time focusing on how you might convert a script to a package and discussing the challenges you’ll face along the way.

The remainder of the book is designed to be read as needed. Pick and choose between the chapters as you need them.

First we cover key package components: Chapter 7 discusses where your code lives and how to organize it, Chapter 8 shows you how to include data in your package, and Chapter 9 covers a few less important files and directories that need to be discussed somewhere.

Next we’ll dive into the package metadata, starting with DESCRIPTION and NAMESPACE in Chapter 10. We’ll then go deep into dependencies in Chapter 11, discussing when and how to depend on another package. We’ll finish off this part with a look at licensing in Chapter 12.

To ensure your package works as designed (and continues to work as you make changes), it’s essential to test your code, so the next three chapters cover the art and science of testing. Chapter 13 gets you started with the basics of testing with the testthat package. Chapter 14 teaches you how to design and organise tests in the most effective way. Then we finish off our coverage of testing in Chapter 15, which teaches you advanced skills to tackle the most challenging cases.

If you want other people (including future-you!) to understand how to use the functions in your package, you’ll need to document them. Chapter 16 gets you started using roxygen2 to document the functions in your package. Function documentation is only helpful if you know what function to look up, so next in Chapter 17 we’ll discuss vignettes, which help you document the package as a whole. We’ll finish up documentation with a discussion of other important markdown files like README.md and NEWS.md in Chapter 18, and creating a package website with pkgdown in Chapter 19.

The book concludes with a look at the release of your package to CRAN and the maintenance thereafter. Chapter 20 dives into R CMD check, the most important tool for verifying your package is free from major defects. You’ll then learn how you can run R CMD check automatically every time you change your package in Chapter 21. If you’re planning on submitting your package to CRAN, Chapter 22 shows you the process and gives you our tips and tricks to maximize the chances of success. We conclude with a discussion of the post-release lifecycle in Chapter 23.

This is a lot to learn, but don’t feel overwhelmed. Start with a minimal subset of useful features (e.g. just an R/ directory!) and build up over time. To paraphrase the Zen monk Shunryu Suzuki: “Each package is perfect the way it is — and it can use a little improvement”.

1.3 What you won’t learn

There are a few very important topics that you won’t learn about in this book:

  • Git and GitHub: mastering a version control system is vital to easily collaborate with others, and is useful even for solo work because it allows you to easily undo mistakes. Learn from http://happygitwithr.com/.

  • Compiled code: R code is designed for human efficiency, not computer efficiency, so it’s useful to have a tool in your back pocket that allows you to write fast code. Learn more in https://adv-r.hadley.nz/rcpp.html, or https://cpp11.r-lib.org.

  • Markdown and RMarkdown.

1.4 Acknowledgments

Since the first edition of R Packages was published, the packages supporting the workflows described here have undergone extensive development. The original trio of devtools, roxygen2, and testthat has expanded to include the packages created by the “conscious uncoupling” of devtools, as described in Section 3.2. Most of these packages originate with Hadley Wickham (HW), because of their devtools roots. There are many other significant contributors, many of whom now serve as maintainers:

This book and the R package development community benefit tremendously from experts who smooth over specific pain points:

  • Kevin Ushey, JJ Allaire, and Dirk Eddelbuettel tirelessly answered all sorts of C, C++, and Rcpp questions.
  • Craig Citro wrote much of the initial code to facilitate using Travis-CI with R packages.
  • Jeroen Ooms also helps to maintain R community infrastructure, such as the current R support for Travis-CI (along with Jim Hester), and the Windows toolchain.

TODO: revisit rest of this section when 2nd edition nears completion. Currently applies to and worded for 1st edition.

Often the only way I learn how to do it the right way is by doing it the wrong way first. For suffering through many package development errors, I’d like to thank all the CRAN maintainers, especially Brian Ripley, Uwe Ligges and Kurt Hornik.

This book was written and revised in the open and it is truly a community effort: many people read drafts, fix typos, suggest improvements, and contribute content. Without those contributors, the book wouldn’t be nearly as good as it is, and we are deeply grateful for their help.

A special thanks goes to Peter Li, who read the book from cover to cover and provided many fixes. I also deeply appreciate the time the reviewers (Duncan Murdoch, Karthik Ram, Vitalie Spinu and Ramnath Vaidyanathan) spent reading the book and giving me thorough feedback.

Thanks go to all contributors who submitted improvements via GitHub (in alphabetical order): @aaronwolen, @adessy, Adrien Todeschini, Andrea Cantieni, Andy Visser, @apomatix, Ben Bond-Lamberty, Ben Marwick, Brett K, Brett Klamer, @contravariant, Craig Citro, David Robinson, David Smith, @davidkane9, Dean Attali, Eduardo Ariño de la Rubia, Federico Marini, Gerhard Nachtmann, Gerrit-Jan Schutten, Hadley Wickham, Henrik Bengtsson, @heogden, Ian Gow, @jacobbien, Jennifer (Jenny) Bryan, Jim Hester, @jmarshallnz, Jo-Anne Tan, Joanna Zhao, Joe Cainey, John Blischak, @jowalski, Justin Alford, Karl Broman, Karthik Ram, Kevin Ushey, Kun Ren, @kwenzig, @kylelundstedt, @lancelote, Lech Madeyski, @lindbrook, @maiermarco, Manuel Reif, Michael Buckley, @MikeLeonard, Nick Carchedi, Oliver Keyes, Patrick Kimes, Paul Blischak, Peter Meissner, @PeterDee, Po Su, R. Mark Sharp, Richard M. Smith, @rmar073, @rmsharp, Robert Krzyzanowski, @ryanatanner, Sascha Holzhauer, @scharne, Sean Wilkinson, @SimonPBiggs, Stefan Widgren, Stephen Frank, Stephen Rushe, Tony Breyal, Tony Fischetti, @urmils, Vlad Petyuk, Winston Chang, @winterschlaefer, @wrathematics, @zhaoy.

1.5 Conventions

Throughout this book, we write fun() to refer to functions, var to refer to variables and function arguments, and path/ for paths.

Larger code blocks intermingle input and output. Output is commented so that if you have an electronic version of the book, e.g., https://r-pkgs.org, you can easily copy and paste examples into R. Output comments look like #> to distinguish them from regular comments.
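For instance, a block following these conventions might look like the small example below. The example itself is ours, chosen only to show the layout: a regular comment starts with #, while the printed result is prefixed with #>.

```r
# Compute the mean of a small numeric vector
x <- c(2, 4, 6)
mean(x)
#> [1] 4
```

Because the output line is itself a comment, pasting the whole block into R runs only the two lines of input.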

1.6 Colophon

This book was authored using Quarto inside RStudio. The website is hosted with Netlify, and automatically updated after every commit by GitHub Actions. The complete source is available from GitHub.

This version of the book was built with:

library(devtools)
#> Loading required package: usethis
library(roxygen2)
library(testthat)
#> 
#> Attaching package: 'testthat'
#> The following object is masked from 'package:devtools':
#> 
#>     test_file
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       Ubuntu 20.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  C.UTF-8
#>  ctype    C.UTF-8
#>  tz       UTC
#>  date     2022-08-09
#>  pandoc   2.5 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  brio          1.1.3      2021-11-30 [1] RSPM
#>  cachem        1.0.6      2021-08-19 [1] RSPM
#>  callr         3.7.1      2022-07-13 [1] RSPM
#>  cli           3.3.0      2022-04-25 [1] RSPM
#>  crayon        1.5.1      2022-03-26 [1] RSPM
#>  devtools    * 2.4.4      2022-07-20 [1] RSPM
#>  digest        0.6.29     2021-12-01 [1] RSPM
#>  ellipsis      0.3.2      2021-04-29 [1] RSPM
#>  evaluate      0.15       2022-02-18 [1] RSPM
#>  fastmap       1.1.0      2021-01-25 [1] RSPM
#>  fs            1.5.2      2021-12-08 [1] RSPM
#>  glue          1.6.2      2022-02-24 [1] RSPM
#>  htmltools     0.5.3      2022-07-18 [1] RSPM
#>  htmlwidgets   1.5.4      2021-09-08 [1] RSPM
#>  httpuv        1.6.5      2022-01-05 [1] RSPM
#>  jsonlite      1.8.0      2022-02-22 [1] RSPM
#>  knitr         1.39       2022-04-26 [1] RSPM
#>  later         1.3.0      2021-08-18 [1] RSPM
#>  lifecycle     1.0.1      2021-09-24 [1] RSPM
#>  magrittr      2.0.3      2022-03-30 [1] RSPM
#>  memoise       2.0.1      2021-11-26 [1] RSPM
#>  mime          0.12       2021-09-28 [1] RSPM
#>  miniUI        0.1.1.1    2018-05-18 [1] RSPM
#>  pkgbuild      1.3.1      2021-12-20 [1] RSPM
#>  pkgload       1.3.0      2022-06-27 [1] RSPM
#>  prettyunits   1.1.1      2020-01-24 [1] RSPM
#>  processx      3.7.0      2022-07-07 [1] RSPM
#>  profvis       0.3.7      2020-11-02 [1] RSPM
#>  promises      1.2.0.1    2021-02-11 [1] RSPM
#>  ps            1.7.1      2022-06-18 [1] RSPM
#>  purrr         0.3.4      2020-04-17 [1] RSPM
#>  R6            2.5.1      2021-08-19 [1] RSPM
#>  Rcpp          1.0.9      2022-07-08 [1] RSPM
#>  remotes       2.4.2      2021-11-30 [1] RSPM
#>  rlang         1.0.4      2022-07-12 [1] RSPM
#>  rmarkdown     2.14       2022-04-25 [1] RSPM
#>  roxygen2    * 7.2.1      2022-07-18 [1] RSPM
#>  sessioninfo   1.2.2      2021-12-06 [1] RSPM
#>  shiny         1.7.2      2022-07-19 [1] RSPM
#>  stringi       1.7.8      2022-07-11 [1] RSPM
#>  stringr       1.4.0.9000 2022-07-28 [1] Github (tidyverse/stringr@9b46754)
#>  testthat    * 3.1.4      2022-04-26 [1] RSPM
#>  urlchecker    1.0.1      2021-11-30 [1] RSPM
#>  usethis     * 2.1.6      2022-05-25 [1] RSPM
#>  vctrs         0.4.1      2022-04-13 [1] RSPM
#>  xfun          0.31       2022-05-10 [1] RSPM
#>  xml2          1.3.3      2021-11-30 [1] RSPM
#>  xtable        1.8-4      2019-04-21 [1] RSPM
#> 
#>  [1] /home/runner/work/_temp/Library
#>  [2] /opt/R/4.2.1/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────

  1. You might also enjoy the “quarto-ized” version at https://rstudio.github.io/r-manuals/r-exts/.