July 23, 2019
I’ve spent most of the summer as an intern with the Texas Policy Lab (TPL), working primarily on data science tasks such as data cleaning and visualization. Most recently, I set out to create a custom ggplot2 theme for TPL.
The project was my first experience developing my own R package. Before this, my only familiarity with packages came from the install.packages() and library() commands.
Hadley Wickham’s book R Packages was enormously helpful in introducing package development to me. I ran into (a lot of) issues in building the package, specifically encountering problems related to local file paths and logo placement on plots.
Creating your own package is a great exercise in trial and error, and it taught me a lot about programming in R that I wouldn’t have learned otherwise. I was also struck by how remarkably easy it is to create a package (seriously, it requires the same number of clicks as starting a new R project) and by how thorough the online resources are.
The catalyst for creating this package was coming across the Urban Institute’s urbnthemes package on GitHub. I also gathered a lot of inspiration (and borrowed some code) from ggthemes (Jeffrey Arnold), bbplot (BBC News), and hrbrthemes (Bob Rudis). I was impressed by the fact that these organizations were able to use R to create publication-ready plots despite the fact that base ggplot figures can look rather ugly (if we’re being honest).
Because the organization I intern with is still in its infancy, I thought it would be a perfect time to create a standardized theme for figures made in the future. So long as future employees adopt the theme, this package has the potential to create figures specific to our publications, lending TPL organizational credibility and creating cross-report consistency.
I thought a lot about some basic tenets of design, such as font readability, text size, and color contrast, and I learned a lot about visual and aesthetic design that I wouldn’t have known otherwise. (Kieran Healy’s section on how graphs can deceive the reader, intentionally or not, opened my eyes to a lot of important visual concepts.)
Here’s an overview of some of the package’s key features:
You can install the package via GitHub:
Always load library(tpltheme) after library(ggplot2) and/or library(tidyverse).
The package creates a standardized format for plots used in reports created by the Texas Policy Lab. It relies primarily on set_tpl_theme(), which lets the user specify whether the plot theme should follow the standard style (style = "print") or one created specifically for plotting geographical data (style = "Texas"). Calling set_tpl_theme() after library(tpltheme) does most of the work for this package!
The user is able to specify whether they want to use Lato or Adobe Caslon Pro in their figures.
To check that these fonts are installed and registered, use tpl_font_test(). If they are not, install both fonts from the web and then run tpl_font_install().
Here are some sample TPL plots with different specifications for style and font.
By specifying style = "Texas" within set_tpl_theme(), the user may also create Texas-specific plots.
And it also works for categorical variables:
If the number of colors required exceeds the number of colors in the TPL palette (9), the function tpl_color_pal() drops the TPL color palette and returns the greatest number of unique colors possible from RColorBrewer’s “Paired” palette (for more information on the use of RColorBrewer palettes, see this chapter).
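To make that fallback concrete, here is a minimal sketch of the logic. It is not the actual tpl_color_pal() source: pick_colors() is a hypothetical helper, and the nine hex codes are placeholders rather than the real TPL colors. The only real API it relies on is RColorBrewer’s brewer.pal() and brewer.pal.info.

```r
library(RColorBrewer)

# Placeholder 9-colour palette; these are NOT the real TPL colours.
tpl_placeholder <- c(
  "#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e",
  "#e6ab02", "#a6761d", "#666666", "#1f78b4"
)

# Hypothetical helper mimicking the fallback described above.
pick_colors <- function(n) {
  # Enough colours in the house palette: use it.
  if (n <= length(tpl_placeholder)) {
    return(tpl_placeholder[seq_len(n)])
  }
  # Otherwise fall back to "Paired", capped at the most it can supply.
  max_paired <- brewer.pal.info["Paired", "maxcolors"]
  brewer.pal(min(n, max_paired), "Paired")
}

length(pick_colors(4))   # drawn from the placeholder palette
length(pick_colors(11))  # drawn from "Paired"
length(pick_colors(20))  # capped at the size of "Paired"
```

Capping the request at the palette’s maximum is one reading of “the greatest number of unique colors possible”; the real function may handle very large requests differently.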
The user also has the option to include the TPL logo in single plots. This may be preferred for reports intended for wide public release, or as a pseudo-watermark on proprietary plots.
The user can specify the position of the logo as well as its scale. The scale argument multiplies the default logo size; in other words, scale = 2 doubles the size of the logo. By default, the logo is 1/7th of the size of the plot.
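The arithmetic behind scale can be written out in a few lines. This is only an illustration: logo_width() is a hypothetical helper, and I am assuming here that “size” means width measured in the same units as the plot.

```r
# Hypothetical helper: logo width as a function of plot width and scale.
# Assumes the default logo is 1/7 of the plot width (scale = 1).
logo_width <- function(plot_width, scale = 1) {
  stopifnot(is.numeric(plot_width), is.numeric(scale), scale > 0)
  (plot_width / 7) * scale
}

logo_width(7)             # default: one seventh of a 7-unit-wide plot
logo_width(7, scale = 2)  # twice the default
```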
There may be instances when a full logo is not warranted or preferred. If the user would still like to watermark their figures in that case, they can use the function add_tpl_logo_text() to add text to an existing plot object:
The user may also need to specify align, which moves the plot horizontally across the bottom of the page.
To drop an axis, use drop_axis(). The function can drop any combination of axes depending on the user’s input (drop = "x", drop = "y", drop = "both", drop = "neither").
Unlike add_tpl_logo(), drop_axis() should be added to an existing plot object.
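For readers curious how a helper like this can be built, here is a rough ggplot2-only sketch. drop_axis_sketch() is a hypothetical stand-in, not the package’s code; exactly which theme elements the real drop_axis() blanks out, and what its default is, are assumptions on my part.

```r
library(ggplot2)

# Hypothetical stand-in for drop_axis(). Assumes "dropping" an axis means
# blanking its line and ticks, and that "neither" is the default.
drop_axis_sketch <- function(drop = c("neither", "x", "y", "both")) {
  drop <- match.arg(drop)
  blank <- element_blank()
  switch(drop,
    x       = theme(axis.line.x = blank, axis.ticks.x = blank),
    y       = theme(axis.line.y = blank, axis.ticks.y = blank),
    both    = theme(axis.line = blank, axis.ticks = blank),
    neither = theme()  # empty theme: changes nothing
  )
}

# Because the helper returns a theme object, it is added to a plot with `+`.
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p + drop_axis_sketch(drop = "x")
```

Returning a theme object is what lets the helper be chained onto an existing plot rather than wrapping it.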
I also put a lot of time into creating a color palette that is both aesthetically pleasing and accessible to colorblind viewers. This was somewhat difficult because there are quite a few types of colorblindness. Thankfully, my boss is colorblind, which made test cases much easier to come by!
The function view_palette plots the base color palettes included in tpltheme. All TPL color palette names follow the pattern palette_tpl_*, so they can be easily autocompleted within RStudio.
These palettes are colorblind-friendly. The diverging and sequential color palettes are from http://colorbrewer2.org, and the categorical palette is composed of a variety of colors from https://coolors.co/ and the TPL website.
In action, the color palette looks like this:
The user may specify the color palette in the scale_fill_* or scale_color_* functions in a ggplot call. Specifically, the user can specify the palette (categorical, diverging, sequential) and whether the palette should be reversed.
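A scale function with a palette and a reverse option can be sketched on top of ggplot2’s scale_fill_manual(). Everything named here is hypothetical: scale_fill_sketch(), the sketch_palettes list, and the hex codes are placeholders, and for simplicity the sketch treats all three palettes as discrete, which the real package may not do.

```r
library(ggplot2)

# Placeholder colours; these are NOT the real TPL palettes.
sketch_palettes <- list(
  categorical = c("#1b9e77", "#d95f02", "#7570b3"),
  diverging   = c("#ef8a62", "#f7f7f7", "#67a9cf"),
  sequential  = c("#deebf7", "#9ecae1", "#3182bd")
)

# Hypothetical fill scale: pick a palette by name, optionally reverse it.
scale_fill_sketch <- function(palette = c("categorical", "diverging", "sequential"),
                              reverse = FALSE) {
  palette <- match.arg(palette)
  cols <- sketch_palettes[[palette]]
  if (reverse) cols <- rev(cols)
  scale_fill_manual(values = cols)
}

# mtcars$cyl has three levels, matching the three placeholder colours.
ggplot(mtcars, aes(factor(cyl), fill = factor(cyl))) +
  geom_bar() +
  scale_fill_sketch(palette = "categorical", reverse = TRUE)
```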
Calling undo_tpl_theme removes the TPL-specific theme settings and restores the ggplot defaults (but why would you want to do that?).
To restore the TPL theme, simply call set_tpl_theme() again.
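A set/undo pair like this can be built on ggplot2’s global theme, which theme_set() replaces and theme_get() reads. The sketch below is my own illustration, not the package source: set_sketch_theme() and undo_sketch_theme() are hypothetical names, and theme_minimal() merely stands in for the TPL theme.

```r
library(ggplot2)

# Hypothetical "set": install a custom theme as the session-wide default.
# theme_minimal() is a placeholder for the real TPL theme.
set_sketch_theme <- function() {
  invisible(theme_set(theme_minimal()))
}

# Hypothetical "undo": put ggplot2's stock default, theme_gray(), back.
undo_sketch_theme <- function() {
  invisible(theme_set(theme_gray()))
}

set_sketch_theme()   # later plots use the custom theme
undo_sketch_theme()  # back to the ggplot2 default
set_sketch_theme()   # and on again
```

The real functions likely do more than this (fonts and geom defaults, for instance), so treat it as the bare mechanism only.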