ETC5523: Communicating with Data

Automated tools for package development and documentation

Lecturer: Michael Lydeamore

Department of Econometrics and Business Statistics



Aim

  • Document your functions
  • Create vignettes for your package
  • Distribute your package

Why

  • Documentation informs users of how to use your package
  • Adopting a best-practice development workflow will make package development easier
  • Distributing your package is needed for adoption of your package by others

Recap of package development

Communicating about your R package

  • What is the goal of the package?
  • What do your functions do?
  • How do we use it?
  • Why should we use it?
  • Where do we find and install it?

Documentation is vital

random_number() function

random_number <- function(range) {
  random_numbers <- data.frame(numbers = runif(n = 100, min = 0, max = range))

  ggplot2::ggplot(data = random_numbers, ggplot2::aes(x = numbers)) +
    ggplot2::geom_histogram()
}

random_number(50)

Using our own data

data-raw/smartmeter.R:

smartmeter <- read.csv("data-raw/smartmeter.csv") |>
  clean_solar_data() |>
  dplyr::select(-c(nmi, meter_serial_number))

usethis::use_data(smartmeter, overwrite = TRUE)

Then in a package function:

last_n_days <- function(n) {
  smartmeter |>
    dplyr::filter(date >= (max(date) - n))
}

Using data in your package

People are very used to the pipe syntax, so the above function is better written to take the data as its first argument:

#' @examples
#' 
#' smartmeter |> last_n_days(n = 7)
last_n_days <- function(.data, n) {
  .data |>
    dplyr::filter(date >= (max(date) - n))
}
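
To see the piped version in action without the real data, here is a minimal sketch on a made-up data frame. The toy_meter data and the last_n_days_base() name are assumptions for illustration, and base R subsetting stands in for dplyr::filter() so the block runs with no packages installed.

```r
# Toy stand-in for the smartmeter data (made-up values)
toy_meter <- data.frame(
  date = as.Date("2024-01-01") + 0:29,
  energy_kwh = seq(1, 30)
)

# Same logic as last_n_days(), written with base R subsetting
last_n_days_base <- function(.data, n) {
  .data[.data$date >= (max(.data$date) - n), , drop = FALSE]
}

# The data is piped in, rather than hard-coded inside the function
recent <- toy_meter |> last_n_days_base(n = 7)
nrow(recent)
```

Note that n = 7 gives 8 rows here, because the >= comparison also keeps the day exactly n days before the latest date.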

Tip

Your case may not be as straightforward as this: do not feel like you can never include a data object inside a function. It is still useful!

Using pipe operator in your package

The base pipe (|>) is fine for 99% of cases - I recommend just using that.

If you need the magrittr pipe (%>%) then you can use

usethis::use_pipe()

Including Shiny apps in your package

The process to include a Shiny app is a bit more complicated than data or a function.

Put your app code in inst/[appname]/app.R

In your function to launch your app:

run_app <- function() {
  app_dir <- system.file("appname", package = "solardash") # "appname" is your folder under inst/
  shiny::runApp(app_dir, display.mode = "normal")
}

Files in inst/ are installed with your package but are considered “internal”.

system.file() tells R where to find your Shiny app code once the package is installed.
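
system.file() returns an empty string, not an error, when the file or folder it is asked for does not exist, so a typo in the path only shows up later as a confusing failure. A minimal sketch of a guard, using the stats package (installed with every copy of R) as a stand-in; find_app_dir() is a hypothetical helper, not part of shiny or usethis.

```r
# Hypothetical helper: locate a file or folder shipped with an installed package
find_app_dir <- function(folder, package) {
  app_dir <- system.file(folder, package = package)
  # system.file() signals "not found" with an empty string
  if (!nzchar(app_dir)) {
    stop("Could not find '", folder, "' in package '", package, "'.", call. = FALSE)
  }
  app_dir
}

# A file that exists in every R installation
desc_path <- find_app_dir("DESCRIPTION", package = "stats")
file.exists(desc_path)

# A path that does not exist now gives a clear error message
result <- tryCatch(
  find_app_dir("not-a-real-folder", package = "stats"),
  error = function(e) conditionMessage(e)
)
result
```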

Including Shiny apps in your package

Warning

Don’t include a bare example that launches a shiny app! It will cause R to hang when the examples are run.

Instead, include your example inside an if statement:

if (interactive()) {
  run_app()
}
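
In a roxygen2 block, the guard sits under the @examples tag. A sketch of how that might look; the title, the @return wording, and the "appname" folder are assumptions for illustration, not the actual solardash source.

```r
#' Launch the dashboard
#'
#' @return Called for its side effect of starting the app.
#' @examples
#' if (interactive()) {
#'   run_app()
#' }
#' @export
run_app <- function() {
  # "appname" is a placeholder for the folder under inst/ that holds app.R
  app_dir <- system.file("appname", package = "solardash")
  shiny::runApp(app_dir, display.mode = "normal")
}
```

Defining the function does not need shiny to be installed; it is only looked up when run_app() is actually called.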

Demo

Documentation

Documenting R functions with roxygen2

  • use #' above a function to write documentation for that function
  • roxygen2 uses @ tags to structure documentation, e.g. 
    • any text after @description is the description
    • any text after @param describes the arguments of the function
    • @export signals that it is an exported function
    • any text after @return describes the return object
    • the full list of tags is in the roxygen2 documentation
  • devtools::document() converts the roxygen2 tags to the appropriate sections of .Rd files, written to the man/ folder
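
Putting those tags together, a fully documented function might look like the sketch below. wh_to_kwh() is a made-up helper for illustration and is not part of any package discussed here.

```r
#' Convert watt-hours to kilowatt-hours
#'
#' @description Divides energy readings by 1000.
#' @param wh A numeric vector of energy readings in watt-hours.
#' @return A numeric vector of the same length, in kilowatt-hours.
#' @examples
#' wh_to_kwh(c(500, 1500))
#' @export
wh_to_kwh <- function(wh) {
  stopifnot(is.numeric(wh))
  wh / 1000
}

wh_to_kwh(c(500, 1500))
```

The #' lines are ordinary comments as far as R is concerned, so the function works the same whether or not devtools::document() has been run.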

Documenting the solardash package

R/cleaning.R

#' Cleans Solar Data from the Jemena Electricity Portal
#' 
#' @param .data A data frame containing raw solar data extracted from the Jemena Electricity Portal.
#' @return A cleaned data frame with columns: date, datetime, energy_kwh, and other relevant columns.
#' 
#' @importFrom tidyr pivot_longer
#' @importFrom dplyr mutate
clean_solar_data <- function(.data) {
  ...
}

Documenting data

  • usethis::use_data_raw() to store R code to process raw data,
  • usethis::use_data() to save a binary file in data/ directory,
  • Here the data is named smartmeter.
  • Documentation is contained in R/data.R or R/name-of-data.R
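
The @format section of a data documentation block has to state the row count and list every column, and it's easy for that to drift out of sync with the data. One way around that is to draft it from the data frame itself. A minimal sketch: draft_format_block() and toy_meter are made up for illustration, and the descriptions it writes are just the column classes, so you'd still edit them by hand.

```r
# Hypothetical helper: draft the @format section from the data itself
draft_format_block <- function(df) {
  col_classes <- vapply(
    df, function(col) class(col)[1], character(1),
    USE.NAMES = FALSE
  )
  c(
    sprintf("#' @format A data frame with %d rows and %d variables:", nrow(df), ncol(df)),
    "#' \\describe{",
    sprintf("#'   \\item{%s}{%s}", names(df), col_classes),
    "#' }"
  )
}

# Made-up data standing in for a real dataset
toy_meter <- data.frame(
  date = as.Date("2024-01-01") + 0:2,
  energy_kwh = c(1.2, 0.8, 1.5)
)

# Print the draft so it can be pasted into R/data.R
writeLines(draft_format_block(toy_meter))
```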

Documenting data

R/data.R

#' Example Smart Meter Data from the Jemena Electricity Portal
#' 
#' A dataset containing example smart meter data extracted from the Jemena Electricity Portal.
#' @format A data frame with 17520 rows and 5 variables:
#' \describe{
#'   \item{date}{Date of the reading (YYYY-MM-DD)}
#'   \item{datetime}{Date and time of the reading (POSIXct)}
#'   \item{energy_kwh}{Energy consumption in kilowatt-hours (kWh)}
#'   \item{con_gen}{Whether the reading is consumption or generation}
#'   \item{estimated}{Whether the reading is estimated or actual. In this dataset, all readings are actual.}
#' }
#' 
#' @source \url{https://jemena.com.au/}
"smartmeter"

Make package documentation

  • Add documentation of the “big picture” of your package
usethis::use_package_doc()
  • Above creates the file below

R/solardash-package.R

#' @keywords internal
"_PACKAGE"

## usethis namespace: start
## usethis namespace: end
NULL
  • Default package documentation is built from your DESCRIPTION file
library(solardash)
?solardash

Vignette: long-form documentation

  • Some documentation doesn’t fit into package-level or function-level documentation.
  • You may want to build a vignette (article) for these cases.
usethis::use_vignette(name = "my-amazing-package", 
                      title = "My amazing package")
  • Edit the created Rmd file
  • Knit the vignette to see what it looks like
  • Use devtools::build() to build package with vignettes included

Quarto vignettes

It is possible!

Add to your DESCRIPTION file:

VignetteBuilder:
  quarto

and change your YAML header to:

vignette: >
  %\VignetteIndexEntry{Vignette's Title}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}

Quarto vignettes

Warning

This is still new, and some package checkers may have issues with it. Our experience is that CRAN has accepted packages even when they’ve failed tests because of the Quarto engine.

I believe it is where things are heading, so it is worth trying out, even if it costs you some pain now.

Dependencies

Adding dependencies

  • Dependencies are specified in DESCRIPTION file under three categories:
    • Depends: Specify the version of R that the package will work with or package that it is dependent on (e.g. for ggplot2 extension packages, it depends on ggplot2).
    • Imports: External packages that are imported to use in your package. Most external packages are in this category.
    • Suggests: Packages that are not strictly needed but are nice to have, i.e. you use them in examples or vignettes.
  • You can easily add these via usethis::use_package()
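
To make the three categories concrete, the sketch below writes a small made-up DESCRIPTION to a temporary file and reads the dependency fields back with base R's read.dcf(). The field values are assumptions for illustration, not the real solardash DESCRIPTION; the R (>= 4.1.0) entry is there because the base pipe |> needs at least R 4.1.0.

```r
# A made-up DESCRIPTION, written to a temporary file
desc_file <- tempfile()
writeLines(c(
  "Package: solardash",
  "Depends: R (>= 4.1.0)",
  "Imports: dplyr, tidyr",
  "Suggests: shiny, DT"
), desc_file)

# read.dcf() returns a character matrix with one column per field
deps <- read.dcf(desc_file, fields = c("Depends", "Imports", "Suggests"))

# Split a field into individual package names
imports <- strsplit(deps[1, "Imports"], ",\\s*")[[1]]
imports
```

In a real package you would not edit these fields by reading and writing the file yourself; usethis::use_package() takes a type argument ("Imports", "Depends", "Suggests") to choose the category.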

Sharing

Share and collaborate on your package

  • Track changes to your code with Git
usethis::use_git()
  • Collaborate with others via GitHub (or otherwise)
usethis::use_github()

or, for an existing repo, run from the terminal:

git remote add origin https://github.com/user/repo.git
  • You can install your R package now using:
devtools::install_github("user/repo")

Installing solardash package

devtools::install_github("MikeLydeamore/solardash")
  • The package is found at https://github.com/MikeLydeamore/solardash.
  • It’s a good idea to add a README file with installation instructions – this is displayed in the GitHub repo.
  • You can create a README.Rmd file with
usethis::use_readme_rmd() 
# OR usethis::use_readme_md() if you have no code
  • Make sure you knit the README.Rmd when you modify its contents.

Package documentation website with pkgdown

  • Automatically turns all package documentation into a website.
  • Documentation can now be easily viewed outside of R.
  • Easy to customise appearance of the site using YAML

Using pkgdown

usethis::use_pkgdown()
  • Build site locally with pkgdown::build_site()
  • Site appearance is modified in the _pkgdown.yml file
    • bootswatch themes for the appearance of the whole site
    • organising function / vignette documentation with reference
  • See the pkgdown vignette for more details
  • Automatically build and deploy your site with GitHub actions
usethis::use_pkgdown_github_pages() # if using this, no need for usethis::use_pkgdown()

Customising pkgdown site

First, modify _pkgdown.yml file

template:
  bootstrap: 5

And then the world is your customising oyster:

Add dark mode with

template:
  light-switch: true
  theme: gruvbox-light
  theme-dark: gruvbox-dark

Choose from bootswatch themes. Strangely theme here means syntax highlighting theme, not theme theme. Computer scientists.

template:
  bootswatch: cyborg
  theme: arrow-dark

By default maths is rendered with mathml which is pretty good but not perfect. You can switch with

template:
  math-rendering: katex # or mathjax

You can move your navigation bar around:

navbar:
  structure:
    left: [intro, reference, articles]
    right: [github, search]

Note that intro is the package-named vignette we made before.

The whole package development workflow

available::available("pkgname") # check if package name is available (if planning to publish publicly)
usethis::create_package("pkgname")
usethis::use_git() # set up version control
usethis::use_github() # optional
usethis::use_r("myfile")
# write some functions in a script
usethis::use_data_raw() # if adding data
devtools::load_all() # try it out in the console
usethis::use_package("import-pkgname") # add package to import (or depends or suggests)
usethis::use_package_doc() # add package documentation
usethis::use_pipe() # if you want to add %>% from `magrittr`
usethis::use_vignette("vignette-name") # add vignette
devtools::build() # build vignettes
devtools::install() # to install package
devtools::check() # to build and check a package 
usethis::use_readme_rmd() # to add a README Rmd file
styler::style_pkg() # optional (commit first, then review changes carefully)
usethis::use_pkgdown_github_pages() # for setting up pkgdown website on github
# `usethis::use_pkgdown()` if not using github pages

So you’re ready for CRAN

Getting ready for CRAN

CRAN has quite the set of policies to follow. Some are “optional” but most are mandatory.

Start with:

usethis::use_release_issue(version = NULL)

This will create a GitHub issue with a checklist of things to do before submitting to CRAN.

Extra checks

Some key points:

  1. devtools::check() must give 0 errors, 0 warnings, and 0 notes. Realistically it will give one note (new submission), but try to avoid all others.
  2. devtools::run_examples() must run without error
  3. checkhelper::find_missing_tags(): All functions must have @noRd or @export tag
  4. urlchecker::url_check(): All URLs in the package must be valid
  5. rhub::rhub_check(): Check the package on other platforms
  6. usethis::use_cran_comments(): Add a cran-comments.md file with responses to common questions

Common errors

One of the most common check notes is

no visible binding for global variable

This is a result of using the non-standard evaluation in dplyr and ggplot2 (and other places). Unfortunately the fix is very unsatisfying.

Include a zzz.R file in your R folder with a call to:

utils::globalVariables(c("var1", "var2", "var3"))

Common errors

All declared Imports should be used

We have packages that are only used in our shiny app but nowhere else.

We handle these like packages used only in examples or tests: move them to Suggests.

But, what would happen now if we tried to launch the shiny app without these suggested packages installed?

Error handling is also communication

The cli package is a modern way to handle messages, warnings, and errors.

Documentation: https://cli.r-lib.org/

run_solar_dashboard <- function() {
  required_packages <- c("DT", "ggplot2", "readr")
  for (pkg in required_packages) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      cli::cli_abort("Package '{pkg}' must be installed to run the dashboard.")
    }
  }
  
  ...
}

Error handling is also communication

Adding a hint about how to fix the problem makes the error more useful:

run_solar_dashboard <- function() {
  required_packages <- c("DT", "ggplot2", "readr")
  for (pkg in required_packages) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      cli::cli_abort(c(
        "Package {.pkg {pkg}} must be installed to run the dashboard.",
        "i" = "You can install it with {.code install.packages('{pkg}')}"
      ))
    }
  }
  
  ...
}
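
One variation on the loop above: it stops at the first missing package, so a user missing three packages has to hit the error three times. The sketch below collects all missing packages and reports them in one go. It uses base stop() instead of cli so that it runs with nothing installed, and check_installed_pkgs() is a hypothetical helper name.

```r
# Hypothetical helper: report every missing package in a single error
check_installed_pkgs <- function(required_packages) {
  is_installed <- vapply(
    required_packages, requireNamespace, logical(1),
    quietly = TRUE
  )
  missing_pkgs <- required_packages[!is_installed]

  if (length(missing_pkgs) > 0) {
    stop(
      "Missing packages: ", paste(missing_pkgs, collapse = ", "),
      "\nInstall with: install.packages(c(",
      paste0('"', missing_pkgs, '"', collapse = ", "), "))",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Packages that ship with R pass silently
check_installed_pkgs(c("stats", "utils"))
```

Inside a package you would likely swap stop() for cli::cli_abort() to keep the nicer formatting shown above.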

Other tips

  • All software names in the Title and Description fields of DESCRIPTION must be in single quotes
  • You need a license: usethis::use_mit_license("Your Name")
  • The Copyright holder in LICENSE and DESCRIPTION should match
  • Find functions that don’t have an example:
system("grep -c examples man/*Rd", intern = TRUE) |> 
  grep(":0$", x = _, value = TRUE)
  • Make sure your description doesn’t start with “This package …”
  • Make sure you have BugReports and URL fields with usethis::use_github_links()
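
The grep call in the list above relies on a system grep being available, which may not be the case on Windows. A pure R sketch of the same check is below. find_rd_without_examples() is a hypothetical helper, and it is slightly stricter than the grep version: it looks for the literal \examples{ section rather than the word "examples" anywhere in the file.

```r
# Hypothetical helper: list .Rd files that have no \examples{} section
find_rd_without_examples <- function(man_dir = "man") {
  rd_files <- list.files(man_dir, pattern = "\\.Rd$", full.names = TRUE)
  has_examples <- vapply(
    rd_files,
    function(path) {
      any(grepl("\\examples{", readLines(path, warn = FALSE), fixed = TRUE))
    },
    logical(1)
  )
  basename(rd_files[!has_examples])
}

# Try it on two made-up .Rd files in a temporary folder
demo_dir <- file.path(tempdir(), "man-demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("\\name{a}", "\\examples{", "a()", "}"), file.path(demo_dir, "a.Rd"))
writeLines("\\name{b}", file.path(demo_dir, "b.Rd"))

find_rd_without_examples(demo_dir)
```

Run from the package root, find_rd_without_examples() with no arguments checks the real man/ folder.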

Submit it!

Don’t be afraid to submit your package to CRAN!

They are helpful and although they may reject your package, they will give you feedback on what to fix.

Week 11 Lesson

Summary

  • Package documentation is important to let others know about the goal of the package, what your function does, and how to use your package.
  • Sharing your package by making it easy to install, committing to good documentation, and making the documentation accessible helps to build trust in your package.
  • You can make package development and distribution easy with usethis, devtools, roxygen2, and pkgdown.