Second edition

You are reading the work-in-progress second edition of R Packages. This chapter is undergoing heavy restructuring and may be confusing or incomplete.

10.1 Introduction

Two important files provide metadata about your package: DESCRIPTION and NAMESPACE. DESCRIPTION provides overall metadata about the package, and NAMESPACE describes which functions you use from other packages and which you expose to the world. In this chapter, you’ll learn the basic structure of these files and some of their simple applications, such as recording the name and title of your package and who wrote it.

Later chapters continue with two larger topics:

  • The License field, which defines who can use your package. Licensing is a big enough topic that it has a dedicated chapter (Chapter 12). If you have no plans to share your package, you may be able to ignore licensing. But if you plan to share, even if only by putting the code where others can see it, you really should specify a license.

  • The dependencies of your package.


10.2 The DESCRIPTION file

The job of the DESCRIPTION file is to store important metadata about your package. When you first start writing packages, you’ll mostly use these metadata to record what packages are needed to run your package. However, as time goes by, other aspects of the metadata file will become useful to you, such as revealing what your package does (via the Title and Description) and whom to contact (you!) if there are any problems.

Every package must have a DESCRIPTION. In fact, it’s the defining feature of a package (RStudio and devtools consider any directory containing DESCRIPTION to be a package; see footnote 1). To get you started, usethis::create_package("mypackage") automatically adds a bare-bones DESCRIPTION file. This allows you to start writing the package without having to worry about the metadata until you need to. This minimal DESCRIPTION will vary a bit depending on your settings, but should look something like this:

Package: mypackage
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R: 
    person("First", "Last", , "first.last@example.com", role = c("aut", "cre"),
           comment = c(ORCID = "YOUR-ORCID-ID"))
Description: What the package does (one paragraph).
License: `use_mit_license()`, `use_gpl3_license()` or friends to pick a
    license
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.2

If you create a lot of packages, you can customize the default content of new DESCRIPTION files by setting the global option usethis.description to a named list. You can pre-configure your preferred name, email, license, etc. See the article on usethis setup for more details.
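As a rough sketch of that customization, the call below sets the usethis.description option the way you might in your .Rprofile. The name, email address, and license shown here are placeholders chosen for illustration, not recommendations.

```r
# Typically placed in your .Rprofile; every value here is a placeholder.
options(
  usethis.description = list(
    "Authors@R" = utils::person(
      "Jane", "Doe",
      email = "jane@example.com",
      role = c("aut", "cre")
    ),
    License = "MIT + file LICENSE"
  )
)

# Inspect which fields have been pre-configured.
names(getOption("usethis.description"))
```

Field names that are not syntactic R names, such as Authors@R, need to be quoted inside list().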

DESCRIPTION uses a simple file format called DCF, the Debian control format. You can see most of the structure in the examples in this chapter. Each line consists of a field name and a value, separated by a colon. When values span multiple lines, they need to be indented:

Description: The description of a package is usually long,
    spanning multiple lines. The second and subsequent lines
    should be indented, usually with four spaces.
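To see how the format is parsed, here is a minimal sketch that writes a small DESCRIPTION-like file to a temporary location and reads it back with base R’s read.dcf(). The file contents are made up for illustration.

```r
# Write a tiny DCF file; the Description value spans three lines.
desc_lines <- c(
  "Package: mypackage",
  "Title: What the Package Does (One Line, Title Case)",
  "Description: The description of a package is usually long,",
  "    spanning multiple lines. The second and subsequent lines",
  "    should be indented, usually with four spaces."
)
path <- tempfile("DESCRIPTION")
writeLines(desc_lines, path)

# read.dcf() returns a character matrix: one row per record,
# one column per field.
fields <- read.dcf(path)
colnames(fields)

# The indented continuation lines belong to the Description field.
fields[1, "Description"]
```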

If you ever need to work with a DESCRIPTION file programmatically, take a look at the desc package, which usethis uses heavily under the hood.

This chapter will show you how to use the straightforward DESCRIPTION fields.

10.3 Title and description: What does your package do?

The title and description fields describe what the package does. They differ only in length:

  • Title is a one line description of the package, and is often shown in a package listing. It should be plain text (no markup), capitalised like a title, and NOT end in a period. Keep it short: listings will often truncate the title to 65 characters.
  • Description is more detailed than the title. You can use multiple sentences, but you are limited to one paragraph. If your description spans multiple lines (and it should!), each line must be no more than 80 characters wide. Indent subsequent lines with 4 spaces.

The Title and Description for ggplot2 are:

Title: Create Elegant Data Visualisations Using the Grammar of Graphics
Description: A system for 'declaratively' creating graphics,
    based on "The Grammar of Graphics". You provide the data, tell 'ggplot2'
    how to map variables to aesthetics, what graphical primitives to use,
    and it takes care of the details.
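The mechanical parts of these rules are easy to check yourself. The sketch below applies them to the ggplot2 fields; check_title() is a hypothetical helper written for this example, not a function from any package, and it covers only the length and trailing-period rules.

```r
title <- "Create Elegant Data Visualisations Using the Grammar of Graphics"
description <- c(
  "Description: A system for 'declaratively' creating graphics,",
  "    based on \"The Grammar of Graphics\". You provide the data, tell 'ggplot2'",
  "    how to map variables to aesthetics, what graphical primitives to use,",
  "    and it takes care of the details."
)

# Hypothetical helper: checks two of the Title rules described above.
check_title <- function(title) {
  c(
    at_most_65_chars = nchar(title) <= 65,
    no_final_period = !endsWith(title, ".")
  )
}

check_title(title)

# Each line of the Description field should be at most 80 characters wide.
all(nchar(description) <= 80)
```

Checks like these do not replace reading the text with a critical eye, since most of CRAN’s expectations about Title and Description are enforced by humans.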

A good title and description are important, especially if you plan to release your package to CRAN, because they appear on the CRAN download page as follows:

Figure 10.1: The CRAN page for ggplot2, highlighting Title and Description.

If you plan to submit your package to CRAN, both the Title and Description are a frequent source of rejections for reasons not covered by the automated R CMD check. In addition to the basics above, here are a few more tips:

  • Put the names of R packages, software, and APIs inside single quotes. This goes for both the Title and the Description. See the ggplot2 example above.
  • If you need to use an acronym, try to do so in Description, not in Title. In either case, explain the acronym in Description, i.e. fully expand it.
  • Don’t include the package name, especially in Title, which is often prefixed with the package name.
  • Do not start with “A package for …” or “This package does …”. This rule makes sense once you look at the list of CRAN packages by name. The information density of such a listing is much higher without a universal prefix like “A package for …”.

If these constraints give you writer’s block, it often helps to spend a few minutes reading Title and Description of packages already on CRAN. Once you read a couple dozen, you can usually find a way to say what you want to say about your package that is also likely to pass CRAN’s human-enforced checks.

You’ll notice that Description only gives you a small amount of space to describe what your package does. This is why it’s so important to also include a README.md file that goes into much more depth and shows a few examples. You’ll learn about that in Section 18.2.

10.3.1 Author: who are you?

To identify the package’s author, and whom to contact if something goes wrong, use the Authors@R field. This field is unusual because it contains executable R code rather than plain text. Here’s an example:

Authors@R: person("Hadley", "Wickham", email = "hadley@rstudio.com",
  role = c("aut", "cre"))

person("Hadley", "Wickham", email = "hadley@rstudio.com",
  role = c("aut", "cre"))
#> [1] "Hadley Wickham <hadley@rstudio.com> [aut, cre]"

This command says that Hadley Wickham is both the maintainer (cre) and an author (aut) and that his email address is hadley@rstudio.com. The person() function has four main inputs:

  • The name, specified by the first two arguments, given and family (these are normally supplied by position, not name). In English cultures, given (first name) comes before family (last name). In many cultures, this convention does not hold. For a non-person entity, such as “R Core Team” or “RStudio”, use the given argument (and omit family).

  • The email address. It’s important to note that this is the address CRAN uses to let you know if your package needs to be fixed in order to stay on CRAN. Make sure to use an email address that’s likely to be around for a while. CRAN policy requires that this be for a person, as opposed to, e.g., a mailing list.

  • One or more three letter codes specifying the role. These are the most important roles to know about:

    • cre: the creator or maintainer, the person you should bother if you have problems. Despite being short for “creator”, this is the correct role to use for the current maintainer, even if they are not the initial creator of the package.

    • aut: authors, those who have made significant contributions to the package.

    • ctb: contributors, those who have made smaller contributions, like patches.

    • cph: copyright holder. This is used if the copyright is held by someone other than the author, typically a company (i.e. the author’s employer).

    • fnd: funder, the people or organizations that have provided financial support for the development of the package.

    (The full list of roles is extremely comprehensive. Should your package have a woodcutter (wdc), lyricist (lyr) or costume designer (cst), rest comfortably that you can correctly describe their role in creating your package. However, note that packages destined for CRAN must limit themselves to the subset of MARC roles listed in the documentation for person().)

  • The optional comment argument has become more relevant, since person() and CRAN landing pages have gained some nice features around ORCID identifiers. Here’s an example of such usage (note the auto-generated URI):

    person(
      "Jennifer", "Bryan",
      email = "jenny@rstudio.com",
      role = c("aut", "cre"),
      comment = c(ORCID = "0000-0002-6983-2759")
    )
    #> [1] "Jennifer Bryan <jenny@rstudio.com> [aut, cre] (<https://orcid.org/0000-0002-6983-2759>)"

You can list multiple authors with c():

Authors@R: c(
    person("Hadley", "Wickham", email = "hadley@rstudio.com", role = "cre"),
    person("Winston", "Chang", email = "winston@rstudio.com", role = "aut"),
    person("RStudio", role = c("cph", "fnd")))

Every package must have at least one author (aut) and one maintainer (cre) (they might be the same person). The maintainer (cre) must have an email address. These fields are used to generate the basic citation for the package (e.g. citation("pkgname")). Only people listed as authors will be included in the auto-generated citation. There are a few extra details if you’re including code that other people have written, which you can learn about in Section 12.5.
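Because Authors@R is ordinary R code, you can evaluate it and check these requirements yourself. The sketch below does this for the three-person example above, using only the person() function and its accessors from base R. The variable names are chosen for illustration.

```r
authors <- c(
  person("Hadley", "Wickham", email = "hadley@rstudio.com", role = "cre"),
  person("Winston", "Chang", email = "winston@rstudio.com", role = "aut"),
  person("RStudio", role = c("cph", "fnd"))
)

# For several people, `$role` returns one character vector of roles per person.
roles <- authors$role
all_roles <- unlist(roles)

# Is there at least one author, and who is the maintainer?
has_author <- "aut" %in% all_roles
is_maintainer <- vapply(roles, function(r) "cre" %in% r, logical(1))

# The maintainer must have an email address.
maintainer_email <- authors[is_maintainer]$email

has_author
sum(is_maintainer)
maintainer_email
```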

An older, still valid approach is to have separate Maintainer and Author fields in DESCRIPTION. However, we strongly recommend the more modern approach of Authors@R and the person() function, because it offers richer metadata for various downstream uses.

10.3.2 URL and BugReports

As well as the maintainer’s email address, it’s a good idea to list other places people can learn more about your package. The URL field is commonly used to advertise the package’s website and to link to a public source repository, where development happens. Multiple URLs are separated with a comma. BugReports is the URL where bug reports should be submitted, e.g., as GitHub issues. For example, devtools has:

URL: https://devtools.r-lib.org/, https://github.com/r-lib/devtools
BugReports: https://github.com/r-lib/devtools/issues

If you use usethis::use_github() to connect your local package to a remote GitHub repository, it will automatically populate URL and BugReports for you. If a package is already connected to a remote GitHub repository, usethis::use_github_links() can be called to just add the relevant links to DESCRIPTION.

10.3.3 Other fields

A few other DESCRIPTION fields are heavily used and worth knowing about:

  • Encoding describes the character encoding of files throughout your package. Our package development workflow always assumes that this is set to Encoding: UTF-8, as this is now the most commonly used text encoding, and we are not aware of any reasons to use a different value.

  • Collate controls the order in which R files are sourced. This only matters if your code has side-effects; most commonly because you’re using S4.

  • Version is really important as a way of communicating where your package is in its lifecycle and how it is evolving over time. Learn more in Chapter 23.

  • LazyData is relevant if your package makes data available to the user. If you specify LazyData: true, the datasets are lazy-loaded, which makes them more immediately available, i.e. users don’t have to use data(). The addition of LazyData: true is handled automatically by usethis::use_data(). More detail is given when we talk about external data in Chapter 8.

There are actually many other rarely, if ever, used fields. A complete list can be found in the “The DESCRIPTION file” section of the R extensions manual.

10.3.4 Custom fields

There is also some flexibility to create your own fields to add additional metadata. In the narrowest sense, the only restriction is that you shouldn’t re-purpose the official field names used by R. You should also limit yourself to valid English words, so the field names aren’t flagged by the spell-check.

In practice, if you plan to submit to CRAN, we recommend that any custom field name should start with Config/. We featured an example of this earlier, where Config/Needs/website is used to record additional packages needed to build a package’s website.
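As an illustration of the Config/ convention, the sketch below writes a small DESCRIPTION containing one such field and then picks out the custom fields by their prefix. The package name and the field value are assumptions made for this example.

```r
# A made-up DESCRIPTION with one custom field.
path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: mypackage",
  "Version: 0.0.0.9000",
  "Encoding: UTF-8",
  "Config/Needs/website: tidyverse/tidytemplate"
), path)

fields <- read.dcf(path)

# Custom fields are the ones whose names start with "Config/".
custom <- colnames(fields)[startsWith(colnames(fields), "Config/")]
custom
fields[1, custom]
```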

You might notice that create_package() writes two more fields we haven’t discussed yet, relating to the use of the roxygen2 package for documentation:

Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.1

You will learn more about these in Chapter 16. The use of these specific field names is basically an accident of history and, if it were re-done today, they would follow the Config/* pattern recommended above.

10.4 The NAMESPACE file

The following code is an excerpt of the NAMESPACE file from the testthat package.

# Generated by roxygen2 (4.0.2): do not edit by hand

You can see that the NAMESPACE file looks a bit like R code. Each line contains a directive: S3method(), export(), exportClasses(), and so on. Each directive describes an R object, and says whether it’s exported from this package to be used by others, or it’s imported from another package to be used locally.

The main namespace directives fall into two groups. The first group describes exports:

  • export(): export functions (including S3 and S4 generics).
  • exportPattern(): export all functions that match a pattern.
  • exportClasses(), exportMethods(): export S4 classes and methods.
  • S3method(): export S3 methods.

The second group describes imports:

  • import(): import all functions from a package.
  • importFrom(): import selected functions (including S4 generics).
  • importClassesFrom(), importMethodsFrom(): import S4 classes and methods.

We don’t recommend writing these directives by hand. Instead, in this chapter you’ll learn how to generate the NAMESPACE file with roxygen2. There are three main advantages to using roxygen2:

  • Namespace definitions live next to their associated functions, so when you read the code it’s easier to see what’s being imported and exported.

  • Roxygen2 abstracts away some of the details of NAMESPACE. You only need to learn one tag, @export, which will automatically generate the right directive for functions, S3 methods, S4 methods and S4 classes.

  • Roxygen2 makes NAMESPACE tidy. No matter how many times you use @importFrom foo bar you’ll only get one importFrom(foo, bar) in your NAMESPACE. This makes it easy to attach import directives to every function that needs them, rather than trying to manage them in one central place.

Note that you can choose to use roxygen2 to generate just NAMESPACE, just man/*.Rd, or both. If you don’t use any namespace related tags, roxygen2 won’t touch NAMESPACE. If you don’t use any documentation related tags, roxygen2 won’t touch man/.
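To make the tags concrete, here is a small sketch of an R file that uses @export and @importFrom. The functions and the "myclass" S3 class are hypothetical, invented for this example.

```r
#' Add two numbers
#'
#' @param x,y Numeric vectors.
#' @export
add <- function(x, y) {
  x + y
}

#' Print method for the hypothetical "myclass" S3 class
#'
#' @param x An object of class "myclass".
#' @param ... Ignored.
#' @export
print.myclass <- function(x, ...) {
  cat("<myclass>\n")
  invisible(x)
}

#' Centre a numeric vector on its median
#'
#' @param x A numeric vector.
#' @importFrom stats median
#' @export
centre <- function(x) {
  x - median(x)
}
```

If a file like this lived in a package’s R/ directory, running roxygen2 over the package should produce directives along the lines of export(add), export(centre), S3method(print,myclass), and importFrom(stats,median). Notice that the same @export tag yields export() for the plain functions and S3method() for the print method.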

  1. The relationship between “has a DESCRIPTION file” and “is a package” is not quite this clear-cut. Many non-package projects use a DESCRIPTION file to declare their dependencies, i.e. which packages they rely on. In fact, the project for this book does exactly this! This off-label use of DESCRIPTION makes it easy to piggy-back on package development tooling to install all the packages necessary to work with a non-package project.