---
title: "From R to CFF"
subtitle: "Crosswalk"
description: >
  A comprehensive description of the internal mappings performed by `cffr`.
author: Diego Hernangómez
bibliography: REFERENCES.bib
link-citations: yes
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{From R to CFF}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(cffr)
```
The goal of this vignette is to provide an explicit map between the metadata
fields used by **cffr** and each one of the valid keys of the [Citation File
Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys).
## Summary {#summary}
This section summarizes the fields that **cffr** can parse and the source of
information for each of them. The details of each key are presented in the
next section of the document. The assessment of fields is based on the [Guide
to Citation File Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys)
[@druskat_citation_2021].
```{r summary , echo=FALSE}
keys <- cff_schema_keys(sorted = TRUE)
origin <- vector(length = length(keys))
origin[keys == "cff-version"] <- "parameter on function"
origin[keys == "type"] <- "Fixed value: 'software'"
origin[keys == "identifiers"] <- "DESCRIPTION/CITATION files"
origin[keys == "references"] <- "DESCRIPTION/CITATION files"
origin[keys %in% c(
  "message",
  "title",
  "version",
  "authors",
  "abstract",
  "repository",
  "repository-code",
  "url",
  "date-released",
  "contact",
  "keywords",
  "license"
)] <- "DESCRIPTION file"
origin[keys %in% c(
  "doi",
  "preferred-citation"
)] <- "CITATION file"
origin[origin == FALSE] <- "Not parsed by cffr"
df <- data.frame(
  key = paste0("", keys, ""),
  source = origin
)
knitr::kable(df, escape = FALSE)
```
## Details
### abstract
This key is extracted from the "Description" field of the DESCRIPTION file.
Example
```{r abstract}
library(cffr)
# Create a cff object for rmarkdown
cff_obj <- cff_create("rmarkdown")
# Get DESCRIPTION of rmarkdown to check
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))
cat(cff_obj$abstract)
cat(pkg$get("Description"))
```
[Back to summary](#summary).
### authors
This key is parsed from the "Author" or "Authors\@R" field of the DESCRIPTION
file. By default, only persons with the role "aut" or "cre" are considered. This
can be modified via the `authors_roles` parameter.
Example
```{r authors}
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_many_persons", package = "cffr")
pkg <- desc::desc(path)
# See persons listed
pkg$get_authors()
# Default behaviour, use authors and creators (maintainers)
cff_obj <- cff_create(path)
cff_obj$authors
# Use now Copyright holders and maintainers
cff_obj_alt <- cff_create(path, authors_roles = c("cre", "cph"))
cff_obj_alt$authors
```
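The role filter can be pictured with base R `person` objects. The following is a minimal sketch, not the internal code of **cffr**: `keep_roles()` is a hypothetical helper and the persons are made up for illustration.

```r
# Hypothetical helper (not part of cffr): keep only the persons that hold
# at least one of the requested roles.
keep_roles <- function(persons, roles = c("aut", "cre")) {
  keep <- vapply(
    seq_along(persons),
    function(i) any(persons[i]$role %in% roles),
    logical(1)
  )
  persons[keep]
}

# Made-up persons, as they could be declared in Authors@R
people <- c(
  person("Ana", "Author", role = c("aut", "cre")),
  person("Carl", "Contributor", role = "ctb"),
  person("Holder Inc.", role = "cph")
)

# Default roles: authors and maintainers
default_sel <- keep_roles(people)
format(default_sel, include = c("given", "family"))

# Maintainers and copyright holders instead
alt_sel <- keep_roles(people, roles = c("cre", "cph"))
format(alt_sel, include = c("given", "family"))
```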
[Back to summary](#summary).
### cff-version
This key can be set via the parameters of the `cff_create()`/`cff_write()`
functions:
Example
```{r cffversion}
cff_objv110 <- cff_create("jsonlite", cff_version = "v1.1.0")
cat(cff_objv110$`cff-version`)
```
[Back to summary](#summary).
### commit
This key is not extracted from the metadata of the package. See the description
in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#commit).
> - **description**: The commit hash or revision number of the software
> version.
>
> - **usage**:
>
> ``` yaml
> commit: 1ff847d81f29c45a3a1a5ce73d38e45c2f319bba
>
> commit: "Revision: 8612"
> ```
[Back to summary](#summary).
### contact
This key is parsed from the "Author" or "Authors\@R" field of the DESCRIPTION
file. Only persons with the role "cre" (i.e., the maintainers) are considered.
Example
```{r contact}
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))
cff_obj$contact
pkg$get_author()
```
[Back to summary](#summary).
### date-released
This key is extracted from the first of these fields that is available:
- The "Date" field.
- If not present, "Date/Publication". This field is present on packages built
on CRAN and Bioconductor.
- If not present, "Packaged". This field is present on packages built by the
[r-universe](https://r-universe.dev/search/).
Example
```{r date-released}
# From an installed package
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))
cat(pkg$get("Date/Publication"))
cat(cff_obj$`date-released`)
# A DESCRIPTION file without a Date
nodate <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp <- tempfile("DESCRIPTION")
# Create a temporary file
file.copy(nodate, tmp)
pkgnodate <- desc::desc(tmp)
cffnodate <- cff_create(tmp)
# Won't appear
cat(cffnodate$`date-released`)
pkgnodate
# Adding a Date
desc::desc_set("Date", "1999-01-01", file = tmp)
cat(cff_create(tmp)$`date-released`)
```
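The fallback order described above can be sketched with base R only. This is an illustration under stated assumptions, not the internal code of **cffr**: `first_date()` is a hypothetical helper, the DESCRIPTION content is made up, and every date field is assumed to start with a date formatted as `YYYY-MM-DD`.

```r
# Hypothetical helper (not part of cffr): return the first available date
# among "Date", "Date/Publication" and "Packaged", in that order.
first_date <- function(desc_file) {
  fields <- c("Date", "Date/Publication", "Packaged")
  dcf <- read.dcf(desc_file, fields = fields)
  values <- dcf[1, ]
  values <- values[!is.na(values)]
  if (length(values) == 0) {
    return(NULL)
  }
  # Assumption: the value starts with "YYYY-MM-DD", keep only that part
  substr(values[[1]], 1, 10)
}

# A made-up DESCRIPTION that only has the "Packaged" field
tmp_desc <- tempfile("DESCRIPTION")
writeLines(
  c(
    "Package: demo",
    "Version: 0.1.0",
    "Packaged: 2022-03-01 10:15:00 UTC; builder"
  ),
  tmp_desc
)
date_packaged <- first_date(tmp_desc)
date_packaged

# "Date" takes precedence once it is present
cat("Date: 1999-01-01\n", file = tmp_desc, append = TRUE)
date_explicit <- first_date(tmp_desc)
date_explicit
```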
[Back to summary](#summary).
### doi {#doi}
This key is parsed from the "doi" field of the
[preferred-citation](#preferred-citation) object.
Example
```{r doi}
cff_doi <- cff_create("cffr")
cat(cff_doi$doi)
cat(cff_doi$`preferred-citation`$doi)
```
[Back to summary](#summary).
### identifiers
This key includes all the possible identifiers of the package:
- From the DESCRIPTION file, it includes all the urls not included in
[url](#url) or [repository-code](#repository-code).
- From the CITATION file, it includes all the dois not included in [doi](#doi)
and the identifiers (if any) not included in the "identifiers" key of
[preferred-citation](#preferred-citation).
Example
```{r identifiers}
file <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")
pkg <- desc::desc(file)
cat(pkg$get_urls())
cat(cff_create(file)$url)
cat(cff_create(file)$`repository-code`)
cff_create(file)$identifiers
```
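The way the urls of a DESCRIPTION file are distributed across keys can be sketched as a set difference. This is a simplified illustration, not the internal code of **cffr**: the urls are made up and `repository_code` is assumed to be already known.

```r
# Made-up "URL" field of a DESCRIPTION file
url_field <- paste(
  "https://github.com/user/demo,",
  "https://user.github.io/demo/,",
  "https://example.org/demo"
)
all_urls <- trimws(strsplit(url_field, ",", fixed = TRUE)[[1]])

# Assumption: this url was already identified as "repository-code"
repository_code <- "https://github.com/user/demo"

# "url" is the first url that differs from "repository-code"
main_url <- setdiff(all_urls, repository_code)[1]

# The remaining urls are reported as identifiers
identifiers <- setdiff(all_urls, c(repository_code, main_url))
identifiers
```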
[Back to summary](#summary).
### keywords
This key is extracted from the DESCRIPTION file. The keywords should appear in
the DESCRIPTION as:
```
...
X-schema.org-keywords: keyword1, keyword2, keyword3
```
Example
```{r keyword}
# A DESCRIPTION file without keywords
nokeywords <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp2 <- tempfile("DESCRIPTION")
# Create a temporary file
file.copy(nokeywords, tmp2)
pkgnokeywords <- desc::desc(tmp2)
cffnokeywords <- cff_create(tmp2)
# Won't appear
cat(cffnokeywords$keywords)
pkgnokeywords
# Adding Keywords
desc::desc_set("X-schema.org-keywords", "keyword1, keyword2, keyword3",
  file = tmp2
)
cat(cff_create(tmp2)$keywords)
```
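The comma-separated field maps naturally to a vector of keywords. A minimal base R sketch of that split, which is an assumption about the processing and not the internal code of **cffr**:

```r
# Raw value of the "X-schema.org-keywords" field (made-up example)
raw_keywords <- "keyword1, keyword2,  keyword3"

# Split on commas, trim whitespace and drop empty or repeated entries
keywords <- trimws(strsplit(raw_keywords, ",", fixed = TRUE)[[1]])
keywords <- unique(keywords[nzchar(keywords)])
keywords
```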
Additionally, if the source code of the package is hosted on GitHub, **cffr**
can retrieve the topics of your repo via the [GitHub
API](https://docs.github.com/en/rest) and include those topics as keywords. This
option is controlled via the `gh_keywords` parameter:
Example
```{r ghkeyword}
# Get cff object from jsonvalidate
jsonval <- cff_create("jsonvalidate")
# Keywords are retrieved from the GitHub repo
jsonval
# Check keywords
jsonval$keywords
# The repo
jsonval$`repository-code`
```
[Back to summary](#summary).
### license
This key is extracted from the "License" field of the DESCRIPTION file.
Example
```{r license}
cff_obj <- cff_create("yaml")
cat(cff_obj$license)
pkg <- desc::desc(file.path(find.package("yaml"), "DESCRIPTION"))
cat(pkg$get("License"))
```
[Back to summary](#summary).
### license-url
This key is not extracted from the metadata of the package. See the description
in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#license-url).
> - **description**: The URL of the license text under which the software or
> dataset is licensed (only for non-standard licenses not included in the
> [SPDX License
> List](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#definitionslicense-enum)).
> - **usage**:
> `license-url: "https://obscure-licenses.com?id=1234"`
[Back to summary](#summary).
### message
This key is built from the name of the package, as declared in the DESCRIPTION
file:
```{r eval=FALSE}
msg <- paste0(
  'To cite package "',
  "NAME_OF_THE_PACKAGE",
  '" in publications use:'
)
```
Example
```{r message}
cat(cff_create("jsonlite")$message)
```
[Back to summary](#summary).
### preferred-citation {#preferred-citation}
This key is extracted from the CITATION file. If several references are
provided, **cffr** selects the first citation as the "preferred-citation" and
the rest of them as [references](#references).
Example
```{r preferred-citation}
cffobj <- cff_create("rmarkdown")
cffobj$`preferred-citation`
citation("rmarkdown")[1]
```
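The split between the first citation and the remaining ones can be reproduced with base R `bibentry` objects. This is a minimal sketch with made-up entries and does not show how **cffr** converts them to CFF.

```r
# Two made-up citation entries, as a CITATION file could provide them
cites <- c(
  bibentry(
    bibtype = "Manual",
    title = "demo: A Demo Package",
    author = person("Ana", "Author"),
    year = "2022"
  ),
  bibentry(
    bibtype = "Article",
    title = "A Paper About demo",
    author = person("Ana", "Author"),
    journal = "Journal of Examples",
    year = "2021"
  )
)

# The first entry plays the role of "preferred-citation"
preferred <- cites[1]
# The rest of the entries play the role of "references"
others <- cites[-1]

preferred$title
others$title
```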
[Back to summary](#summary).
### references {#references}
This key is extracted from the CITATION file if several references are provided.
The first citation is considered as the
[preferred-citation](#preferred-citation) and the rest of them as "references".
**cffr** also extracts the package dependencies and adds them to this field
using `citation(auto = TRUE)` on each dependency.
Example
```{r references}
cffobj <- cff_create("rmarkdown")
cffobj$references
citation("rmarkdown")[-1]
```
[Back to summary](#summary).
### repository
This key is extracted from the "Repository" field of the DESCRIPTION file.
Usually, this field is auto-populated when a package is hosted on a repo (like
CRAN or the [r-universe](https://r-universe.dev/)). For packages without this
field in the DESCRIPTION (the typical case for an in-development package),
**cffr** searches for the package in the default repositories specified in
`options("repos")`.
[Bioconductor](https://bioconductor.org/) packages are identified by the
presence of a
["biocViews"](https://contributions.bioconductor.org/description.html#biocviews)
field in the DESCRIPTION file.
If **cffr** detects that the package is available on CRAN, it returns the
canonical url form of the package (i.e.
`https://CRAN.R-project.org/package=<name of the package>`).
Example
```{r repository}
# Installed package
inst <- cff_create("jsonlite")
cat(inst$repository)
# Demo file downloaded from the r-universe
runiv <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
runiv_cff <- cff_create(runiv)
cat(runiv_cff$repository)
desc::desc(runiv)$get("Repository")
# For an in-development package
norepo <- system.file("examples/DESCRIPTION_basic", package = "cffr")
# No repo
norepo_cff <- cff_create(norepo)
cat(norepo_cff[["repository"]])
# Change the name to a known package on CRAN: ggplot2
tmp <- tempfile("DESCRIPTION")
file.copy(norepo, tmp)
# Change name
desc::desc_set("Package", "ggplot2", file = tmp)
cat(cff_create(tmp)[["repository"]])
# Show what happens if another repo is set
# Save original config
orig_options <- options()
getOption("repos")
# Set new repos
options(repos = c(
  tidyverse = "https://tidyverse.r-universe.dev",
  CRAN = "https://cloud.r-project.org"
))
# Load again the library
# Repos are evaluated on load
unloadNamespace("cffr")
library(cffr)
cat(cff_create(tmp)[["repository"]])
# Now it is the tidyverse repo, due to our new config!
# Reset original config
options(orig_options)
getOption("repos")
```
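Two of the rules above, the canonical CRAN url and the detection of Bioconductor packages, can be sketched with base R. Both helpers are hypothetical and are not functions exported by **cffr**. The url pattern is the canonical form recommended by CRAN, and the DESCRIPTION content is made up.

```r
# Hypothetical helper: canonical CRAN url of a package
cran_canonical_url <- function(pkg) {
  paste0("https://CRAN.R-project.org/package=", pkg)
}

# Hypothetical helper: a DESCRIPTION with a "biocViews" field hints at a
# Bioconductor package
has_biocviews <- function(desc_file) {
  !is.na(read.dcf(desc_file, fields = "biocViews")[1, 1])
}

cran_url <- cran_canonical_url("ggplot2")
cran_url

# A made-up DESCRIPTION with "biocViews"
tmp_bioc <- tempfile("DESCRIPTION")
writeLines(
  c(
    "Package: demo",
    "Version: 0.1.0",
    "biocViews: Software"
  ),
  tmp_bioc
)
bioc_found <- has_biocviews(tmp_bioc)
bioc_found
```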
[Back to summary](#summary).
### repository-artifact
This key is not extracted from the metadata of the package. See the description
in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#repository-artifact).
> - **description**: The URL of the work in a build artifact/binary repository
> (when the work is software).
>
> - **usage**:
>
> ``` yaml
> repository-artifact: "https://search.maven.org/artifact/org.corpus-tools/cff-maven-plugin/0.4.0/maven-plugin"
> ```
[Back to summary](#summary).
### repository-code {#repository-code}
This key is extracted from the "BugReports" or "URL" fields on the DESCRIPTION
file. **cffr** tries to identify the url of the source on the following
repositories:
- [GitHub](https://github.com/).
- [GitLab](https://about.gitlab.com/).
- [R-Forge](https://r-forge.r-project.org/).
- [Bitbucket](https://bitbucket.org/).
Example
```{r repository-code}
# Installed package on GitHub
cff_create("jsonlite")$`repository-code`
# GitLab
gitlab <- system.file("examples/DESCRIPTION_gitlab", package = "cffr")
cat(cff_create(gitlab)$`repository-code`)
# Check
desc::desc(gitlab)
```
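The detection of a source code repository can be approximated with a regular expression on the host of each url. This is a rough sketch, not the internal code of **cffr**: `find_repo_code()` is a hypothetical helper, only the public hosts listed above are matched, and the handling of the `/issues` suffix is an assumption.

```r
# Hypothetical helper (not part of cffr): pick the first url hosted on a
# known code repository.
find_repo_code <- function(urls) {
  hosts <- c(
    "github\\.com",
    "gitlab\\.com",
    "r-forge\\.r-project\\.org",
    "bitbucket\\.org"
  )
  pattern <- paste0(
    "^https?://(www\\.)?(",
    paste(hosts, collapse = "|"),
    ")/"
  )
  hits <- urls[grepl(pattern, urls, ignore.case = TRUE)]
  if (length(hits) == 0) {
    return(NULL)
  }
  # Assumption: a "BugReports" url ending in "/issues" points to the repo
  sub("/issues/?$", "", hits[1])
}

# Made-up urls taken from the "URL" and "BugReports" fields
repo <- find_repo_code(c(
  "https://example.org/demo",
  "https://github.com/user/demo/issues"
))
repo

# No known host: nothing is returned
no_repo <- find_repo_code("https://example.org/demo")
```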
[Back to summary](#summary).
### title
This key is built from the "Package" and "Title" fields of the DESCRIPTION file:
```{r eval=FALSE}
title <- paste0(
  "NAME_OF_THE_PACKAGE",
  ": ",
  "TITLE_OF_THE_PACKAGE"
)
```
Example
```{r title}
# Installed package
cat(cff_create("testthat")$title)
```
[Back to summary](#summary).
### type
Fixed value equal to "software". The other possible value is "dataset". See the
description in the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#type).
[Back to summary](#summary).
### url {#url}
This key is extracted from the "BugReports" or "URL" fields of the DESCRIPTION
file. It corresponds to the first url that is different from
[repository-code](#repository-code).
Example
```{r url}
# Many urls
manyurls <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")
cat(cff_create(manyurls)$url)
# Check
desc::desc(manyurls)
```
[Back to summary](#summary).
### version
This key is extracted from the "Version" field of the DESCRIPTION file.
Example
```{r version}
# Should be (>= 3.0.0)
cat(cff_create("testthat")$version)
```
[Back to summary](#summary).
## References