Expand names from abbreviated forms or initials

Usage

expand_names(short, expanded)

Arguments

short

A character vector of potentially abbreviated names.

expanded

A character vector of potentially expanded names.

Value

A character vector with the same length as short.

Details

When you have a list x of abbreviated and non-abbreviated names and you want to deduplicate them, call this function as expand_names(x, x). It returns, for each name, the most expanded version available in x.

Examples

expand_names(
  c("W A Mozart", "Wolfgang Mozart", "Wolfgang A Mozart"),
  "Wolfgang Amadeus Mozart"
)
#> [1] "Wolfgang Amadeus Mozart" "Wolfgang Amadeus Mozart"
#> [3] "Wolfgang Amadeus Mozart"
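
# Minimal sketch of the self-deduplication pattern described in "Details":
# the same vector is passed as both `short` and `expanded`.
# Assumption: expand_names() is provided by the 'authoritative' package;
# adjust the library() call if it is loaded from elsewhere.
# The vector `x` is illustrative and reuses the names from the example above.
library(authoritative)
x <- c(
  "W A Mozart", "Wolfgang Mozart", "Wolfgang A Mozart",
  "Wolfgang Amadeus Mozart"
)
deduplicated <- expand_names(x, x)
# Per "Value", the result has one element per element of `short`
length(deduplicated) == length(x)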

# Real-world application example
# Deduplicate names in a list, as described in "Details"
epi_pkg_authors <- cran_epidemiology_packages |>
  subset(!is.na(`Authors@R`), `Authors@R`, drop = TRUE) |>
  parse_authors_r() |>
  # Drop email, role, ORCID and format as string rather than person object
  lapply(function(x) format(x, include = c("given", "family"))) |>
  unlist()

# With all duplicates
length(unique(epi_pkg_authors))
#> [1] 367

# Deduplicate
epi_pkg_authors_normalized <- expand_names(epi_pkg_authors, epi_pkg_authors)

length(unique(epi_pkg_authors_normalized))
#> [1] 357