---
title: "Using the tidyverse with terra objects: the tidyterra package"
subtitle: "JOSS paper"
tags:
  - R
  - CRAN
  - spatial
  - vector
  - raster
  - tidyverse
  - terra
  - ggplot2
author: Diego Hernangómez
date: "2023-07-18"
description: >-
  Paper published in The Journal of Open Source Software.
bibliography: paper.bib
link-citations: yes
editor_options:
  markdown:
    wrap: 80
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using the tidyverse with terra objects: the tidyterra package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

[![DOI](https://joss.theoj.org/papers/10.21105/joss.05751/status.svg)](https://doi.org/10.21105/joss.05751)

## Summary

**tidyterra** is an **R** [@r-project] package that allows manipulation of
spatial data objects as provided by the **terra** package [@R-terra], using the
verbs of the packages included in the **tidyverse** [@R-tidyverse], such as
**dplyr** [@R-dplyr], **tidyr** [@R-tidyr], or **tibble** [@R-tibble]. This
addition enables users who are already familiar with the **tidyverse** to
approach spatial data manipulation and analysis more easily and much faster.

Furthermore, **tidyterra** extends the functionality of the **ggplot2** package
[@R-ggplot2] by providing additional `geoms` and `stats`[^1] like
`geom_spatraster()` and `geom_spatvector()`, as well as carefully chosen scales
and color palettes specifically designed for map production.

[^1]: The term `geoms` refers to geometric objects, and `stats` refers to
    statistical transformations, following the naming conventions of
    **ggplot2**.

**tidyterra** can manipulate the following classes of **terra** objects:

1.  `SpatVector` objects, which represent vector data such as points, lines, or
    polygon geometries.
2.  `SpatRaster` objects, which represent raster data in the form of a grid
    consisting of equally sized rectangles. Each rectangle can contain one or
    more values.

The first stable version of **tidyterra** was released on CRAN on April 24,
2022. Since then, it has been actively used by other packages (such as
**ebvcube** [@R-ebvcube], **biomod2** [@R-biomod2], **inlabru** [@R-inlabru],
**RCzechia** [@R-rczechia] and **sparrpowR** [@R-sparrpowr]) and cited in
academic research and publications [@bahlburg2023; @moraga2023; @Leonardi2023;
@meister2023].

## Statement of need

The [**tidyverse**](https://www.tidyverse.org/){.uri} is a compilation of **R**
packages that share an underlying design philosophy, grammar, and data
structures. The packages within the **tidyverse** are widely used by **R** users
for tidying, transforming, and visualizing data.

The **tidyverse** is designed to work with tidy data (*"every column is a
variable, every row is an observation, every cell is a single value"*),
represented in the form of data frames or **tibbles**. However, it is possible
to extend the functionality of **tidyverse** packages to work with new **R**
object classes by registering the corresponding S3 methods [@wickham_s32019].
This means that `dplyr::mutate()` can be adapted to work with any object of
class `foo` by creating the corresponding S3 method `mutate.foo()`.

While other popular packages designed for spatial data handling, such as **sf**
[@R-sf] or **stars** [@R-stars], already provide integration with the
**tidyverse** as part of their infrastructure, **terra** objects lack this
integration natively. Although **terra** offers a wide set of functions for
transforming and visualizing `SpatRaster` and `SpatVector` objects, users who
are not familiar with this package need to make an additional effort to learn
its syntax. This can pose an additional challenge during their initial steps in
the field of spatial analysis. The **tidyterra** package was developed to
address this integration gap.

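
The S3 extension mechanism mentioned above can be illustrated with a minimal
sketch. The class `foo` and the constructor `new_foo()` are hypothetical and
exist only for this illustration; this is not how **tidyterra** implements its
own methods. In a package, the method would additionally need to be registered
in the `NAMESPACE` file.

``` r
library(dplyr)

# Hypothetical class `foo`: a thin wrapper around a data frame
new_foo <- function(df) {
  structure(list(data = df), class = "foo")
}

# S3 method for the dplyr::mutate() generic, dispatched on class `foo`.
# It applies the regular data frame method to the wrapped data and
# returns an object of the same class.
mutate.foo <- function(.data, ...) {
  .data$data <- dplyr::mutate(.data$data, ...)
  .data
}

x <- new_foo(data.frame(a = 1:3))

# mutate() now works on `foo` objects with the usual dplyr syntax
y <- mutate(x, b = a * 2)

class(y)
y$data$b
```

The design point is that the user-facing call `mutate(x, ...)` stays the same,
while the method decides how the verb translates to the underlying object.
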
By providing the corresponding S3 methods, data analysts can apply the same
syntax and functions they are already familiar with for rectangular data to the
objects provided by **terra**. This enables users who are not familiar with
spatial data analysis to approach this area more easily.

In addition, **tidyterra** also offers functions for plotting **terra** objects
using the **ggplot2** syntax. Although packages like **rasterVis**
[@R-rastervis] and **ggspatial** [@R-ggspatial] already allow the representation
of `SpatRaster` objects via **ggplot2**, **tidyterra** functions provide
additional support for advanced mapping. This support includes the integration
of faceted maps, contours, and the automatic conversion of spatial layers to the
same CRS[^2] via `ggplot2::coord_sf()`. Furthermore, **tidyterra** also provides
support for `SpatVector` objects, similar to the native support of **sf**
objects in the **ggplot2** package.

[^2]: CRS: Coordinate reference system.

Lastly, **tidyterra** provides a collection of color palettes specifically
designed for representing spatial phenomena [@whitebox]. Additionally, it
implements the cross-blended hypsometric tints described by
@Patterson_Jenny_2011.

## A note on performance

The development philosophy of **tidyterra** consists of adapting **terra**
objects to data frame-like structures by performing several data
transformations, which may ultimately affect the performance of the package.
When manipulating large raster files (i.e., more than 10,000,000 cells), it is
recommended to use the native **terra** syntax, which is specifically designed
for handling this type of file. When plotting, the geoms provided resample by
default any `SpatRaster` with more than 500,000 cells to speed up the process
(as `terra::plot()` does). This upper limit can be modified using the `maxcell`
parameter of the geom function.

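
As an illustration of the `maxcell` parameter, the following sketch plots the
same layer with the default limit and with a much lower one. The value `1000` is
an arbitrary choice for this example, and the visible effect assumes that the
sample raster shipped with **tidyterra** has more cells than that.

``` r
library(tidyterra)
library(ggplot2)

# Sample raster shipped with tidyterra
r <- terra::rast(system.file("extdata/cyl_temp.tif", package = "tidyterra"))

# Number of cells per layer, to compare against the chosen limit
terra::ncell(r)

# Default behavior: resampling only happens above the default limit
p_default <- ggplot() +
  geom_spatraster(data = r, aes(fill = tavg_04))

# Lower limit (arbitrary value): trades detail for plotting speed
p_coarse <- ggplot() +
  geom_spatraster(data = r, aes(fill = tavg_04), maxcell = 1000)

p_default
p_coarse
```
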
Note also that, when possible, the help page of each **tidyterra** function
references its equivalent in **terra**.

## Example of use

**tidyterra** is available on
[**CRAN**](https://CRAN.R-project.org/package=tidyterra), so it can be easily
installed using the following command in **R**:

``` r
install.packages("tidyterra")
```

The latest development version is hosted on
[GitHub](https://github.com/dieghernan/tidyterra) and can be installed using the
following command in **R**:

``` r
remotes::install_github("dieghernan/tidyterra")
```

The following example demonstrates how to manipulate a `SpatRaster` object using
the **dplyr** syntax. Additionally, it illustrates how to seamlessly plot a
`SpatRaster` object with **ggplot2** using the `geom_spatraster()` function:

``` r
library(tidyterra)
library(tidyverse) # Load all the packages of tidyverse at once
library(scales) # Additional library for labels

# Temperatures in Castille and Leon (selected months)
rastertemp <- terra::rast(system.file("extdata/cyl_temp.tif",
  package = "tidyterra"
))

# Rename with the tidyverse
rastertemp <- rastertemp %>%
  rename(April = tavg_04, May = tavg_05, June = tavg_06)

# Plot with facets
ggplot() +
  geom_spatraster(data = rastertemp) +
  facet_wrap(~lyr, ncol = 2) +
  scale_fill_whitebox_c(
    palette = "muted",
    labels = label_number(suffix = "º"),
    n.breaks = 12,
    guide = guide_legend(reverse = TRUE)
  ) +
  labs(
    fill = "",
    title = "Average temperature in Castille and Leon (Spain)",
    subtitle = "Months of April, May and June"
  )
```
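
The same approach can be applied to `SpatVector` objects. The following is a
minimal sketch rather than part of the original example: it assumes the sample
file `ex/lux.shp` distributed with **terra** and its `POP`, `AREA` and `NAME_2`
columns, and the derived column `pop_dens` is a name introduced here for
illustration only.

``` r
library(tidyterra)
library(dplyr)
library(ggplot2)

# Sample polygons shipped with terra (assumed columns: NAME_2, AREA, POP)
v <- terra::vect(system.file("ex/lux.shp", package = "terra"))

# dplyr verbs applied directly to the SpatVector; the geometry is kept
v_dens <- v %>%
  mutate(pop_dens = POP / AREA) %>%
  select(NAME_2, pop_dens)

# Plot the result with geom_spatvector()
p <- ggplot() +
  geom_spatvector(data = v_dens, aes(fill = pop_dens)) +
  scale_fill_whitebox_c(palette = "muted") +
  labs(fill = "Population / area")

p
```
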