Expand names from abbreviated forms or initials
Details
When you have a list x of abbreviated and non-abbreviated names and you want to deduplicate them, this function can be used as expand_names(x, x), which will return the most expanded version available in x for each name.
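As a minimal sketch of this self-matching pattern, the snippet below passes the same vector as both arguments. It assumes expand_names() is provided by the authoritative package (the package name is an assumption, not stated on this page) and that the result has one element per input name, as the examples suggest.

```r
# Assumption: expand_names() is exported by the 'authoritative' package
library(authoritative)

# A small list mixing abbreviated and fuller forms of the same name
x <- c("W A Mozart", "Wolfgang Mozart", "Wolfgang A Mozart")

# Passing x as both arguments matches each name against the list itself
x_expanded <- expand_names(x, x)

# The result should be aligned with the input, one entry per name
length(x_expanded) == length(x)

# Distinct names remaining after expansion
unique(x_expanded)
```

Because the result is aligned with the input, it can be stored next to the original names to check which entries were changed.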
Examples
expand_names(
  c("W A Mozart", "Wolfgang Mozart", "Wolfgang A Mozart"),
  "Wolfgang Amadeus Mozart"
)
#> [1] "Wolfgang Amadeus Mozart" "Wolfgang Amadeus Mozart"
#> [3] "Wolfgang Amadeus Mozart"
# Real-case application example
# Deduplicate names in a list, as described in "Details"
epi_pkg_authors <- cran_epidemiology_packages |>
  subset(!is.na(`Authors@R`), `Authors@R`, drop = TRUE) |>
  parse_authors_r() |>
  # Drop email, role, ORCID and format as string rather than person object
  lapply(function(x) format(x, include = c("given", "family"))) |>
  unlist()
# With all duplicates
length(unique(epi_pkg_authors))
#> [1] 367
# Deduplicate
epi_pkg_authors_normalized <- expand_names(epi_pkg_authors, epi_pkg_authors)
length(unique(epi_pkg_authors_normalized))
#> [1] 357