Tips and tools for co-maintenance of R packages
‘Rencontres R 2025’ lightning talk companion post
I created GitHub Actions workflows that warn all package authors or maintainers, rather than just the officially listed maintainer, when an R package is at risk of archival. You can find the workflows in the actions/ subfolder of this post.
Collaborative work can be a great way to improve the resilience of a project. But collaborative open-source work comes with unique challenges because some open-source projects happen outside of the transactional relationship inherent to businesses. Because this is my own situation, and my main interest, my tips will be more relevant to “smaller” hobby projects and informal collaborations. In these projects, hierarchies and processes are not as clearly defined as they might be in a company.
Of these open-source projects, R packages come with an extra unique set of challenges for co-maintenance due to the specificities of their ecosystem, and in particular, CRAN requirements.
In this blog post, I will only briefly mention what has worked in terms of communication and organization processes in the open-source projects I have been involved in. However, I will not be exhaustive, as this has been covered elsewhere, and because I believe the solutions cannot be detached from the environment in which they are implemented.
This is why I want to spend more time on explaining the unique challenges of co-maintaining R packages, where they come from, and the specific strategies or tools I have developed to address them.
By co-maintainers, I mean individuals who all have write access to the source code repository. They are not engaged in a hierarchical relationship, and can make independent decisions regarding the project.
A typical example of co-maintenance in the R community is the R language itself, where all of R Core co-maintains the R source code.
Generic tips for co-maintaining an open-source project
Establish a communication channel
Many projects I have been involved in initially underestimated the importance of choosing a future-proof communication channel, and later came to regret it.
As many interactions as possible should probably happen in the open, ideally on the platform where the code is hosted (e.g., Codeberg, GitLab, GitHub).
But not everything can be discussed there, and private conversations will sometimes be necessary. I mention below the case of emails from CRAN, which they do not like to see shared in public. Another example is the lightr R package, which I maintain, where we sometimes discuss proprietary documentation that was shared in confidence by spectrometer manufacturers.
I don’t think I have found a perfect communication channel for co-maintaining an open-source project yet but:
- email history may be cumbersome to share when onboarding new maintainers, unless a proper mailing list infrastructure is used.
- you should ideally never be dependent on a private proprietary platform such as Slack, which is now locking access to years of archived conversations between maintainers of the pavo R package behind a paywall.
Make all changes via a Pull Request
A low-effort change to your workflow that will reduce the need for dedicated communication is to make all changes via pull requests.
In low-resource or small-scale projects, I do not believe it should always be a hard requirement to review every single pull request. But even changes that do not warrant a review should go through a pull request, so that the other maintainers get a notification that something has happened.
Bonus: streamline communication and avoid uncertainty
It may be helpful to align expectations with the other maintainers about how you communicate certain requests for action.
For example, how do I signal that a review is required rather than merely desired? Is every pull request ready to be merged? How do I signal work in progress? etc.
A good example of an attempt to make these processes explicit, to enable collaboration at scale in a project involving multiple universities across the world, is the Epiverse blueprints.
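As a lightweight illustration, such conventions can also be encoded directly in the repository. The pull request template below is a hypothetical sketch (not taken from any of the projects mentioned), showing one way to make review expectations explicit:

```markdown
<!-- .github/PULL_REQUEST_TEMPLATE.md (hypothetical sketch) -->
## What does this PR change?

## Review expectations (tick one)
- [ ] Review required before merging
- [ ] Review welcome but not blocking: I will merge in a week if there are no comments
- [ ] FYI only: trivial change, merging without review
```

The exact categories matter less than agreeing on them once, so that no pull request sits in limbo because its author and reviewers had different assumptions.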
The specific case of co-maintaining an R package
CRAN requirements and the unique challenges of co-maintaining an R package
R packages come with an extra set of constraints, mostly due to CRAN's requirements and submission process.
Indeed, the CRAN policy, as defined in the ‘Writing R Extensions’ manual, states that:
The mandatory ‘Maintainer’ field should give a single name followed by a valid (RFC 2822) email address in angle brackets. […] For a CRAN package it should be a person, not a mailing list and not a corporate entity: do ensure that it is valid and will remain valid for the lifetime of the package.
So even if, in practice, you have chosen to share the maintenance of the package on an equal footing, only one person can communicate directly with the CRAN maintainers.
In particular, only one person will receive the emails about failing checks and archival threats. This is particularly difficult because (as of writing this post) CRAN usually gives short deadlines: maintainers usually only have a couple of weeks to resubmit. It is entirely possible that a maintainer is unreachable during this period because they are on holiday, or busy with other matters.
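For context, this constraint is visible directly in the DESCRIPTION file: only one person can hold the "cre" (maintainer) role, while co-maintainers can only be listed as authors. A minimal sketch, with hypothetical names:

```
Authors@R: c(
    person("Ada", "Lovelace", email = "ada@example.org",
           role = c("aut", "cre")),  # the single official maintainer
    person("Charles", "Babbage", role = "aut")  # co-maintainer, listed as author only
  )
```

Whatever the actual distribution of work, only the "cre" person receives CRAN's emails.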
In this specific situation, the “official” maintainer, indicated by the “cre” role in DESCRIPTION, is a bottleneck at two moments:
- when sharing the information about the archival risk with their co-maintainers 1
- when resubmitting to CRAN, as they need to manually validate the submission by clicking a link received by email.
Note that outside these two bottlenecks, the other maintainers can help by providing a fix, testing it on different versions of R, etc. as all of this will be necessary before submitting a new version to CRAN.
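In the meantime, any co-maintainer can also check whether a deadline has been set without waiting for the official email, by querying the CRAN package database from base R's tools package. This is the same query the workflows below rely on; it requires an internet connection, and "pavo" is used here only as an example package name:

```r
# Fetch the CRAN package database, which includes a "Deadline" column
# for packages at risk of archival (NA otherwise).
crandb <- tools::CRAN_package_db()
crandb[crandb$Package == "pavo", "Deadline"]
```

This manual check is tedious to do daily by hand, which is exactly what the workflows below automate.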
Tools supporting the co-maintenance of R packages
I have developed two GitHub Actions workflows which can alert all watchers of a given GitHub repository that the package is at risk of archival. While this does not solve the CRAN submission bottleneck, it resolves the archival risk alert bottleneck. In doing so, it allows co-maintainers, or potential external contributors, to jump into action and prepare the release that the listed maintainer can submit as soon as they are back.
In a single package
The following workflow, placed in the repository of an R package, will open a new issue each time a new archival deadline is set by CRAN. It is used, for example, in the pavo R package, which I co-maintain with Thomas White:
```yaml
on:
  workflow_dispatch:
  schedule:
    - cron: '42 1 * * *'

jobs:
  fetch-deadlines:
    runs-on: ubuntu-latest
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
    permissions:
      issues: write
      actions: write
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
        with:
          use-public-rspm: true
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          packages: gh
      - name: Fetch deadline for this package
        shell: Rscript {0}
        run: |
          crandb <- tools::CRAN_package_db()
          pkgname <- drop(read.dcf("DESCRIPTION", "Package"))
          deadline <- crandb[crandb$Package == pkgname, "Deadline"]
          if (!is.na(deadline)) {
            gh::gh(
              "POST /repos/{owner_repo}/issues",
              owner_repo = Sys.getenv("GITHUB_REPOSITORY"),
              title = paste("Fix CRAN R CMD check issues by", deadline),
              body = "This GHA workflow has been disabled. Please re-enable it when closing this issue."
            )
            gh::gh(
              "PUT /repos/{owner_repo}/actions/workflows/{workflow_id}/disable",
              owner_repo = Sys.getenv("GITHUB_REPOSITORY"),
              workflow_id = basename(Sys.getenv("GITHUB_WORKFLOW"))
            )
          }
```
A variant where the same issue is re-opened each time is possible. You can take inspiration from the ‘open issue’ job in the render-dashboard.yaml workflow from cransays, although it doesn’t apply directly to R packages and archival deadlines.
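As a minimal sketch of that variant (assuming a dedicated tracking issue already exists; issue number 1 below is hypothetical), the R step would re-open the known issue with a PATCH request instead of creating a new one:

```r
# Re-open a pre-existing tracking issue instead of creating a new one.
gh::gh(
  "PATCH /repos/{owner_repo}/issues/{issue_number}",
  owner_repo = Sys.getenv("GITHUB_REPOSITORY"),
  issue_number = 1, # hypothetical: number of the dedicated tracking issue
  state = "open"
)
```

Re-opening keeps the full history of past archival scares in a single thread, at the cost of having to create and track that issue once per repository.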
In a universe of packages
I have also developed a centralized system for a universe of packages, as part of my work on Epiverse-TRACE. This system handles all the packages listed in the R-universe of the organization where it is added.
This system has two facets:
- a daily updated dashboard, which allows checking the status of all packages at a glance
- an archival risk alert system, where each package's archival risk is tracked in a dedicated issue that co-maintainers can subscribe to
```yaml
on:
  workflow_dispatch:
  schedule:
    - cron: '42 1 * * *'

name: check-deadlines

jobs:
  fetch-deadlines:
    runs-on: ubuntu-latest
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
    permissions:
      contents: write
      issues: write
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-pandoc@v2
      - uses: r-lib/actions/setup-r@v2
        with:
          use-public-rspm: true
      - uses: r-lib/actions/setup-renv@v2
      - name: Fetch deadlines for universe packages
        shell: Rscript {0}
        run: |
          # CRAN packages registered in this organization's R-universe
          org_cran_pkgs <- glue::glue(
            "https://{org}.r-universe.dev/api/packages/",
            org = Sys.getenv("GITHUB_REPOSITORY_OWNER")
          ) |>
            jsonlite::fromJSON() |>
            dplyr::filter(
              `_registered`,
              `_cranurl`
            ) |>
            dplyr::pull(Package)

          # Packages from the universe with an archival deadline set by CRAN
          crandb <- tools::CRAN_package_db()
          org_pkgs_deadline <- crandb |>
            dplyr::filter(
              Package %in% org_cran_pkgs,
              !is.na(Deadline)
            )

          # Closed tracking issues (one per package, titled with the package name)
          issues <- gh::gh(
            "/repos/{owner_repo}/issues",
            owner_repo = Sys.getenv("GITHUB_REPOSITORY"),
            state = "all"
          ) |>
            purrr::map(\(x) x[c("title", "number", "state")]) |>
            dplyr::bind_rows() |>
            dplyr::filter(state == "closed")

          org_pkgs_archiveable <- org_pkgs_deadline |>
            dplyr::inner_join(issues, by = dplyr::join_by(Package == title))

          for (i in seq_len(nrow(org_pkgs_archiveable))) {
            pkg <- org_pkgs_archiveable$Package[i]
            deadline <- org_pkgs_archiveable$Deadline[i]
            issue <- org_pkgs_archiveable$number[i]
            gh::gh(
              "PATCH /repos/{owner_repo}/issues/{issue_number}",
              owner_repo = Sys.getenv("GITHUB_REPOSITORY"),
              state = "open",
              issue_number = issue
            )
            gh::gh(
              "POST /repos/{owner_repo}/issues/{issue_number}/comments",
              owner_repo = Sys.getenv("GITHUB_REPOSITORY"),
              issue_number = issue,
              body = glue::glue("Package {pkg} is at risk of being archived by {deadline}.")
            )
          }
          # FIXME: do we need a mechanism to close issues?
```
Future changes in CRAN policy
At useR! 2024 in Salzburg, Kurt Hornik mentioned in his keynote presentation that CRAN had been looking into reaching out to all package authors for archival notices by opening an issue on GitHub for packages that document a GitHub repository in DESCRIPTION (either in the URL: or BugReports: field).
However, this was mentioned in passing as something that might eventually happen; no promise was made and no clear timeline was established.
Footnotes
Note that several CRAN & R Core members view communication with package maintainers as private communication and frown upon sharing it verbatim on a public channel.↩︎