---
title: "From R to CFF"
subtitle: "Crosswalk"
description: >
  A comprehensive description of the internal mappings performed by `cffr`.
author: Diego Hernangómez
bibliography: REFERENCES.bib
link-citations: yes
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{From R to CFF}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(cffr)
```

The goal of this vignette is to provide an explicit map between the metadata
fields used by **cffr** and each one of the valid keys of the [Citation File
Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys).

## Summary {#summary}

We summarize here the fields that **cffr** can parse and the original source of
information for each one of them. The details on each key are presented in the
next section of the document.

The assessment of fields is based on the [Guide to Citation File Format schema
version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys)
[@druskat_citation_2021].

```{r summary , echo=FALSE}
keys <- cff_schema_keys(sorted = TRUE)
origin <- vector(length = length(keys))
origin[keys == "cff-version"] <- "parameter on function"
origin[keys == "type"] <- "Fixed value: 'software'"
origin[keys == "identifiers"] <- "DESCRIPTION/CITATION files"
origin[keys == "references"] <- "DESCRIPTION/CITATION files"

origin[keys %in% c(
  "message",
  "title",
  "version",
  "authors",
  "abstract",
  "repository",
  "repository-code",
  "url",
  "date-released",
  "contact",
  "keywords",
  "license"
)] <- "DESCRIPTION file"

origin[keys %in% c(
  "doi",
  "preferred-citation"
)] <- "CITATION file"

origin[origin == FALSE] <- "Not parsed by cffr"

df <- data.frame(
  key = paste0("", keys, ""),
  source = origin
)

knitr::kable(df, escape = FALSE)
```

## Details

### abstract

This key is extracted from the "Description" field of the DESCRIPTION file.

Example

```{r abstract}
library(cffr)

# Create a cff object for rmarkdown
cff_obj <- cff_create("rmarkdown")

# Get DESCRIPTION of rmarkdown to check
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))

cat(cff_obj$abstract)

cat(pkg$get("Description"))
```

[Back to summary](#summary).

### authors

This key is parsed from the "Author" or "Authors\@R" field of the DESCRIPTION
file. By default, persons with the role "aut" or "cre" are considered; this can
be changed via the `authors_roles` parameter.

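The role-based selection described above can be illustrated with base R `person` objects. This is only a minimal sketch: `filter_by_role()` is a hypothetical helper written for this vignette, not a function exported by **cffr**, and the persons are made up.

```r
# Hypothetical helper (not part of cffr): keep the persons that have at
# least one of the requested roles
filter_by_role <- function(persons, roles = c("aut", "cre")) {
  keep <- vapply(
    seq_along(persons),
    function(i) any(persons[i]$role %in% roles),
    logical(1)
  )
  persons[keep]
}

# Made-up persons, for illustration only
people <- c(
  person("Jane", "Doe", role = c("aut", "cre")),
  person("John", "Smith", role = "ctb"),
  person("ACME Corp", role = "cph")
)

# Default roles: only Jane Doe is kept
default_authors <- filter_by_role(people)

# Maintainers and copyright holders: Jane Doe and ACME Corp
alt_authors <- filter_by_role(people, roles = c("cre", "cph"))
```

The second call mirrors what passing `authors_roles = c("cre", "cph")` is meant to achieve.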
Example

```{r authors}
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_many_persons", package = "cffr")

pkg <- desc::desc(path)

# See persons listed
pkg$get_authors()

# Default behaviour, use authors and creators (maintainers)
cff_obj <- cff_create(path)
cff_obj$authors

# Now use maintainers and copyright holders
cff_obj_alt <- cff_create(path, authors_roles = c("cre", "cph"))
cff_obj_alt$authors
```

[Back to summary](#summary).

### cff-version

This key can be set via the `cff_version` parameter of the
`cff_create()`/`cff_write()` functions:

Example

```{r cffversion}
cff_objv110 <- cff_create("jsonlite", cff_version = "v1.1.0")

cat(cff_objv110$`cff-version`)
```

[Back to summary](#summary).

### commit

This key is not extracted from the metadata of the package. See the description
in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#commit).

> -   **description**: The commit hash or revision number of the software
>     version.
>
> -   **usage**:
>
> ``` yaml
> commit: 1ff847d81f29c45a3a1a5ce73d38e45c2f319bba
>
> commit: "Revision: 8612"
> ```

[Back to summary](#summary).

### contact

This key is parsed from the "Author" or "Authors\@R" field of the DESCRIPTION
file. Only persons with the role "cre" (i.e., the maintainer(s)) are
considered.

Example

```{r contact}
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))

cff_obj$contact

pkg$get_author()
```

[Back to summary](#summary).

### date-released

This key is extracted following this logic:

-   From the "Date" field or,
-   If not present, from "Date/Publication". This field is present on packages
    built on CRAN and Bioconductor. Or,
-   If not present, from "Packaged", which is present on packages built by the
    [r-universe](https://r-universe.dev/search/).

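The fallback order above can be sketched in base R as follows. `pick_release_date()` is a hypothetical helper, not the **cffr** implementation, and it assumes that the DESCRIPTION fields are available as a named list and that only the leading `YYYY-MM-DD` part of each value is wanted.

```r
# Hypothetical helper (not part of cffr): return the first available
# date among "Date", "Date/Publication" and "Packaged"
pick_release_date <- function(fields) {
  for (key in c("Date", "Date/Publication", "Packaged")) {
    value <- fields[[key]]
    if (!is.null(value) && !is.na(value) && nzchar(value)) {
      # Assumption: keep only the leading YYYY-MM-DD part
      return(substr(value, 1, 10))
    }
  }
  # None of the three fields is present
  NULL
}

# Made-up fields, as they could appear on a package built by the r-universe
fields <- list(
  Package = "mypkg",
  Packaged = "2021-11-03 10:20:30 UTC; builder"
)

release <- pick_release_date(fields)
```

If "Date" were added to `fields`, it would take precedence over the other two fields.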
Example

```{r date-released}
# From an installed package
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))

cat(pkg$get("Date/Publication"))

cat(cff_obj$`date-released`)

# A DESCRIPTION file without a Date
nodate <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp <- tempfile("DESCRIPTION")

# Create a temporary file
file.copy(nodate, tmp)

pkgnodate <- desc::desc(tmp)
cffnodate <- cff_create(tmp)

# Won't appear
cat(cffnodate$`date-released`)

pkgnodate

# Adding a Date
desc::desc_set("Date", "1999-01-01", file = tmp)

cat(cff_create(tmp)$`date-released`)
```

[Back to summary](#summary).

### doi {#doi}

This key is parsed from the "doi" field of the
[preferred-citation](#preferred-citation) object.

Example

```{r doi}
cff_doi <- cff_create("cffr")

cat(cff_doi$doi)

cat(cff_doi$`preferred-citation`$doi)
```

[Back to summary](#summary).

### identifiers

This key includes all the possible identifiers of the package:

-   From the DESCRIPTION file, it includes all the URLs not included in
    [url](#url) or [repository-code](#repository-code).

-   From the CITATION file, it includes all the DOIs not included in
    [doi](#doi) and the identifiers (if any) not included in the "identifiers"
    key of [preferred-citation](#preferred-citation).

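For the DESCRIPTION part, the idea of "everything that was not already used" can be sketched with `setdiff()`. The URLs below are made up, and the `type`/`value` layout follows the CFF schema for identifier entries; this is an illustration of the rule, not the code used by **cffr**.

```r
# Made-up inputs: every URL found in DESCRIPTION, plus the two URLs
# already assigned to the `repository-code` and `url` keys
all_urls <- c(
  "https://github.com/user/pkg",
  "https://user.github.io/pkg/",
  "https://example.org/pkg-manual"
)
repository_code <- "https://github.com/user/pkg"
url <- "https://user.github.io/pkg/"

# URLs not used elsewhere become identifiers of type "url"
leftover <- setdiff(all_urls, c(repository_code, url))
identifiers <- lapply(leftover, function(u) list(type = "url", value = u))

str(identifiers)
```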
Example

```{r identifiers}
file <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")

pkg <- desc::desc(file)

cat(pkg$get_urls())

cat(cff_create(file)$url)

cat(cff_create(file)$`repository-code`)

cff_create(file)$identifiers
```

[Back to summary](#summary).

### keywords

This key is extracted from the DESCRIPTION file. The keywords should appear in
the DESCRIPTION as:

```
...
X-schema.org-keywords: keyword1, keyword2, keyword3
```

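A comma-separated field like the one above can be turned into a vector of keywords with base R. `parse_keywords()` is a hypothetical helper used only to illustrate the expected format; it is not the parser used by **cffr**.

```r
# Hypothetical helper (not part of cffr): split a comma-separated field
# into trimmed, non-empty, unique keywords
parse_keywords <- function(x) {
  kw <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  unique(kw[nzchar(kw)])
}

kw <- parse_keywords("keyword1, keyword2, keyword3")
kw
```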
Example

```{r keyword}
# A DESCRIPTION file without keywords
nokeywords <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp2 <- tempfile("DESCRIPTION")

# Create a temporary file
file.copy(nokeywords, tmp2)

pkgnokeywords <- desc::desc(tmp2)
cffnokeywords <- cff_create(tmp2)

# Won't appear
cat(cffnokeywords$keywords)

pkgnokeywords

# Adding Keywords
desc::desc_set("X-schema.org-keywords", "keyword1, keyword2, keyword3",
  file = tmp2
)

cat(cff_create(tmp2)$keywords)
```

Additionally, if the source code of the package is hosted on GitHub, **cffr** can retrieve the topics of your repo via the [GitHub API](https://docs.github.com/en/rest) and include those topics as keywords. This option is controlled via the `gh_keywords` parameter:
Example

```{r ghkeyword}
# Get cff object from jsonvalidate
jsonval <- cff_create("jsonvalidate")

# Keywords are retrieved from the GitHub repo
jsonval

# Check keywords
jsonval$keywords

# The repo
jsonval$`repository-code`
```

[Back to summary](#summary).

### license

This key is extracted from the "License" field of the DESCRIPTION file.

Example

```{r license}
cff_obj <- cff_create("yaml")

cat(cff_obj$license)

pkg <- desc::desc(file.path(find.package("yaml"), "DESCRIPTION"))

cat(pkg$get("License"))
```

[Back to summary](#summary).

### license-url

This key is not extracted from the metadata of the package. See the description
in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#license-url).

> -   **description**: The URL of the license text under which the software or
>     dataset is licensed (only for non-standard licenses not included in the
>     [SPDX License
>     List](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#definitionslicense-enum)).
>
> -   **usage**:
>
> ``` yaml
> license-url: "https://obscure-licenses.com?id=1234"
> ```

[Back to summary](#summary).

### message

This key is built from the name of the package, as stated in the DESCRIPTION
file, specifically as:

```{r eval=FALSE}
msg <- paste0(
  'To cite package "',
  "NAME_OF_THE_PACKAGE",
  '" in publications use:'
)
```

Example

```{r message}
cat(cff_create("jsonlite")$message)
```

[Back to summary](#summary).

### preferred-citation {#preferred-citation}

This key is extracted from the CITATION file. If several references are
provided, **cffr** selects the first citation as the "preferred-citation" and
the rest of them as [references](#references).

Example

```{r preferred-citation}
cffobj <- cff_create("rmarkdown")

cffobj$`preferred-citation`

citation("rmarkdown")[1]
```

[Back to summary](#summary).

### references {#references}

This key is extracted from the CITATION file if several references are
provided. The first citation is considered as the
[preferred-citation](#preferred-citation) and the rest of them as "references".

It also extracts the package dependencies and adds them to this field, using
`citation(auto = TRUE)` on each dependency.

Example

```{r references}
cffobj <- cff_create("rmarkdown")

cffobj$references

citation("rmarkdown")[-1]
```

[Back to summary](#summary).

### repository

This key is extracted from the "Repository" field of the DESCRIPTION file.
Usually, this field is auto-populated when a package is hosted on a repo (like
CRAN or the [r-universe](https://r-universe.dev/)). For packages without this
field in the DESCRIPTION (the typical case for an in-development package),
**cffr** tries to find the package in any of the default repositories specified
in `options("repos")`.

[Bioconductor](https://bioconductor.org/) packages are identified by the
presence of a
["biocViews"](https://contributions.bioconductor.org/description.html#biocviews)
field in the DESCRIPTION file.

If **cffr** detects that the package is available on CRAN, it returns the
canonical URL form of the package.

Example

```{r repository}
# Installed package
inst <- cff_create("jsonlite")

cat(inst$repository)

# Demo file downloaded from the r-universe
runiv <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
runiv_cff <- cff_create(runiv)

cat(runiv_cff$repository)

desc::desc(runiv)$get("Repository")

# For an in-development package
norepo <- system.file("examples/DESCRIPTION_basic", package = "cffr")

# No repo
norepo_cff <- cff_create(norepo)

cat(norepo_cff[["repository"]])

# Change the name to a known package on CRAN: ggplot2
tmp <- tempfile("DESCRIPTION")
file.copy(norepo, tmp)

# Change name
desc::desc_set("Package", "ggplot2", file = tmp)

cat(cff_create(tmp)[["repository"]])

# Show what happens if another repo is set

# Save original config
orig_options <- options()
getOption("repos")

# Set new repos
options(repos = c(
  tidyverse = "https://tidyverse.r-universe.dev",
  CRAN = "https://cloud.r-project.org"
))

# Load again the library
# Repos are evaluated on load
unloadNamespace("cffr")
library(cffr)

cat(cff_create(tmp)[["repository"]])

# Now it is the tidyverse repo, due to our new config!

# Reset original config
options(orig_options)
getOption("repos")
```

[Back to summary](#summary).

### repository-artifact

This key is not extracted from the metadata of the package. See the description
in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#repository-artifact).

> -   **description**: The URL of the work in a build artifact/binary repository
>     (when the work is software).
>
> -   **usage**:
>
> ``` yaml
> repository-artifact: "https://search.maven.org/artifact/org.corpus-tools/cff-maven-plugin/0.4.0/maven-plugin"
> ```

[Back to summary](#summary).

### repository-code {#repository-code}

This key is extracted from the "BugReports" or "URL" fields of the DESCRIPTION
file. **cffr** tries to identify the URL of the source code on the following
hosting platforms:

-   [GitHub](https://github.com/).
-   [GitLab](https://about.gitlab.com/).
-   [R-Forge](https://r-forge.r-project.org/).
-   [Bitbucket](https://bitbucket.org/).

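One way to picture this detection is a regular expression that matches the host of each candidate URL. The sketch below is an assumption about how such matching could be done, not the implementation in **cffr**: `find_repo_url()` is a hypothetical helper, the host names are inferred from the platforms listed above, and dropping a trailing `/issues` is an additional assumption.

```r
# Hypothetical helper (not part of cffr): return the first URL hosted on
# one of the supported code hosting platforms
find_repo_url <- function(urls) {
  # Assumed host names for GitHub, GitLab, R-Forge and Bitbucket
  pattern <- paste0(
    "^https?://(www\\.)?",
    "(github\\.com|gitlab\\.com|r-forge\\.r-project\\.org|bitbucket\\.org)/"
  )
  hit <- grep(pattern, urls, value = TRUE, ignore.case = TRUE)
  if (length(hit) == 0) {
    return(NULL)
  }
  # Assumption: a "BugReports" link ending in /issues points to the same repo
  sub("/issues/?$", "", hit[1])
}

# Made-up URLs, as they could appear in the "URL" and "BugReports" fields
repo <- find_repo_url(c(
  "https://example.org/docs",
  "https://github.com/user/pkg/issues"
))
repo
```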
Example

```{r repository-code}
# Installed package on GitHub
cff_create("jsonlite")$`repository-code`

# GitLab
gitlab <- system.file("examples/DESCRIPTION_gitlab", package = "cffr")

cat(cff_create(gitlab)$`repository-code`)

# Check
desc::desc(gitlab)
```

[Back to summary](#summary).

### title

This key is built from the "Package" and "Title" fields of the DESCRIPTION
file, as:

```{r eval=FALSE}
title <- paste0(
  "NAME_OF_THE_PACKAGE",
  ": ",
  "TITLE_OF_THE_PACKAGE"
)
```

Example

```{r title}
# Installed package
cat(cff_create("testthat")$title)
```

[Back to summary](#summary).

### type

Fixed value equal to "software". The other possible value is "dataset". See the
description in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#type).

[Back to summary](#summary).

### url {#url}

This key is extracted from the "BugReports" or "URL" fields of the DESCRIPTION
file. It corresponds to the first URL that is different from
[repository-code](#repository-code).

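The "first URL that is different" rule can be written in a couple of lines of base R. `pick_url()` is a hypothetical helper for illustration only, and the URLs are made up.

```r
# Hypothetical helper (not part of cffr): first URL that is not the
# repository-code
pick_url <- function(urls, repository_code = NULL) {
  # setdiff() keeps the original order of `urls`
  candidates <- setdiff(urls, repository_code)
  if (length(candidates) == 0) NULL else candidates[1]
}

main_url <- pick_url(
  c("https://github.com/user/pkg", "https://user.github.io/pkg/"),
  repository_code = "https://github.com/user/pkg"
)
main_url
```

Under this rule, a package whose only URL is its code repository would have no `url` key.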
Example

```{r url}
# Many urls
manyurls <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")

cat(cff_create(manyurls)$url)

# Check
desc::desc(manyurls)
```

[Back to summary](#summary).

### version

This key is extracted from the "Version" field of the DESCRIPTION file.

```{r version}
# Should be (>= 3.0.0)
cat(cff_create("testthat")$version)
```

[Back to summary](#summary).

## References