July 23, 2019
I’ve spent the majority of the summer as an intern with the Texas Policy Lab (TPL), working primarily on data science-related matters such as data cleaning and visualization. Most recently, I sought to create a custom theme in ggplot2 for TPL.
The project was my first experience in developing my own R package. Prior to this project, the most familiarity I had with packages was from the
Hadley Wickham’s book R Packages was enormously helpful in introducing package development to me. I ran into (a lot of) issues in building the package, specifically encountering problems related to local file paths and logo placement on plots.
Creating your own package is a great exercise in trial and error, and it taught me a lot about programming in R that I wouldn’t have learned otherwise. I was also struck by how remarkably easy it was to create one’s own package (seriously, it requires the same number of clicks as starting a new R project), and by how thorough the online resources were.
The catalyst for creating this package was coming across the Urban Institute’s urbnthemes package on GitHub. I also gathered a lot of inspiration (and borrowed some code) from ggthemes (Jeffrey Arnold), bbplot (BBC News), and hrbrthemes (Bob Rudis). I was impressed by the fact that these organizations were able to use R to create publication-ready plots despite the fact that base ggplot figures can look rather ugly (if we’re being honest).
Because the organization I intern with is still in its infancy, I thought it would be a perfect time to create a standardized theme for figures made in the future. So long as future employees adopt the theme, this package has the potential to create figures specific to our publications, lending TPL organizational credibility and creating cross-report consistency.
I thought a lot about some basic tenets of design, such as font readability, text size, and color contrast. I learned a lot about visual and aesthetic design that I wouldn’t have known otherwise (Kieran Healy’s section on how graphs can deceive the reader, intentionally or not, opened my eyes to a lot of important visual concepts).
Here’s an overview of some of the package’s key features:
You can install the package via GitHub:
The package creates a standardized format for plots to be used in reports created by the Texas Policy Lab. It primarily relies on set_tpl_theme(), which allows the user to specify whether the plot theme should align with a standard plot (style = "print") or one specially created for plotting geographical data (style = "Texas"). Calling library(tpltheme) does most of the work for this package!
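To give a sense of what a theme setter like this involves, here is a minimal sketch built only on ggplot2. The function name set_demo_theme() and every styling choice inside it are my own placeholders for illustration, not the tpltheme source; the only thing borrowed from the package is the idea of a style argument with "print" and "Texas" options.

```r
library(ggplot2)

# Hypothetical stand-in for a theme setter; NOT the tpltheme implementation.
set_demo_theme <- function(style = c("print", "Texas")) {
  style <- match.arg(style)

  # Shared base look (assumed values, chosen only for the example).
  base <- theme_minimal(base_size = 12) +
    theme(
      plot.title = element_text(face = "bold"),
      legend.position = "bottom"
    )

  if (style == "Texas") {
    # Assumption: a map-oriented style hides axes and grid lines.
    base <- base +
      theme(
        axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        panel.grid = element_blank()
      )
  }

  # theme_set() replaces the session default and invisibly returns the old theme.
  invisible(theme_set(base))
}

old_theme <- set_demo_theme("print")
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()  # rendered with the new default
```

Because theme_set() changes the session-wide default, every plot drawn afterwards picks up the theme without any extra code, which is presumably why a single setter call can do so much of the work.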
The user is able to specify whether they want to use Lato or Adobe Caslon Pro in their figures.
To ensure that these fonts are installed and registered, use tpl_font_test(). If fonts are not properly installed, install both fonts online and then run
Here are some examples of sample TPL plots with different specifications for
By specifying style = "Texas" within set_tpl_theme(), the user may also create Texas-specific plots.
And it also works for categorical variables:
If the number of colors requested exceeds the number of colors in the TPL palette (9), the function tpl_color_pal() will drop the TPL color palette and return the greatest number of unique colors possible within RColorBrewer’s “Paired” palette (for more information on the use of RColorBrewer palettes, see this chapter).
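The fallback logic can be sketched roughly as follows. This is an illustration under stated assumptions, not the body of tpl_color_pal(): the function name demo_color_pal() is hypothetical, and the nine hex codes are placeholders rather than TPL's actual colors. It requires the RColorBrewer package.

```r
# Hypothetical nine-color house palette (placeholder hex codes, not TPL's).
house_colors <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E",
                  "#E6AB02", "#A6761D", "#666666", "#1F78B4")

# Hypothetical helper: use the house palette when it is large enough,
# otherwise fall back to RColorBrewer's "Paired" palette.
demo_color_pal <- function(n) {
  if (n <= length(house_colors)) {
    return(house_colors[seq_len(n)])
  }
  # "Paired" has a fixed maximum size, so cap the request at that size.
  max_paired <- RColorBrewer::brewer.pal.info["Paired", "maxcolors"]
  RColorBrewer::brewer.pal(min(n, max_paired), "Paired")
}

demo_color_pal(4)   # first four house colors
demo_color_pal(11)  # eleven colors from "Paired"
```

Anything past the maximum size of "Paired" is capped in this sketch, so some groups would end up without a distinct color. At that point it is probably worth rethinking the plot rather than the palette.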
The user also has the option to include the TPL logo in single plots. This may be preferred for those reports being made especially public, or to serve as a pseudo-watermark in proprietary plots.
The user can specify the position of the logo as well as its scale. The scale argument is a multiplier applied to the default logo size. In other words, scale = 2 will double the size of the logo. The logo defaults to 1/7th of the size of the plot.
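As a quick worked example of that arithmetic, here is a tiny sketch. The helper logo_width() is hypothetical, and it assumes that "size" refers to the logo's width relative to the plot's width; the package may measure it differently.

```r
# Hypothetical helper: logo width as a share of plot width.
# Assumption: the 1/7 default and the scale multiplier both apply to width.
logo_width <- function(plot_width, scale = 1) {
  stopifnot(scale > 0)
  plot_width * scale / 7
}

logo_width(7)             # default: 1 unit wide on a 7-unit-wide plot
logo_width(7, scale = 2)  # doubled: 2 units wide
```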
There may be some instances when an all-out logo is not warranted or preferred. If that is the case and the user would still like to watermark their figures, they can use the function add_tpl_logo_text() to add text to an existing plot object:
The user may also need to specify align, which moves the plot horizontally across the bottom of the page.
In the event that the user wishes to drop an axis, they may do so with drop_axis(). The function may drop any combination of axes depending on the user’s input (drop = "x", drop = "y", drop = "both", drop = "neither"). drop_axis() should be added to an existing plot object:
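A helper of this kind can be written as a function that returns a ggplot2 theme object, which is what lets it be added to a plot with +. The sketch below is my own approximation, not the package's code: drop_axis_demo() is a hypothetical name, the default value is arbitrary, and "dropping" is assumed to mean blanking the axis line, ticks, text, and title.

```r
library(ggplot2)

# Hypothetical approximation of an axis-dropping helper; NOT tpltheme's drop_axis().
# The default ("neither") is an arbitrary choice for this sketch.
drop_axis_demo <- function(drop = c("neither", "x", "y", "both")) {
  drop <- match.arg(drop)
  blank <- element_blank()

  # Assumption: "dropping" an axis removes its line, ticks, text, and title.
  drop_x <- theme(axis.line.x = blank, axis.ticks.x = blank,
                  axis.text.x = blank, axis.title.x = blank)
  drop_y <- theme(axis.line.y = blank, axis.ticks.y = blank,
                  axis.text.y = blank, axis.title.y = blank)

  switch(drop,
    neither = theme(),        # empty theme: changes nothing
    x       = drop_x,
    y       = drop_y,
    both    = drop_x + drop_y
  )
}

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_no_x <- p + drop_axis_demo("x")
```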
I also put a lot of time into creating a color palette which was both aesthetically pleasing and accessible to color-blind viewers. This was somewhat difficult because there are quite a few types of colorblindness. Thankfully, my boss is colorblind, making test cases a lot more accessible!
view_palette plots the base color palettes included in tpltheme. All TPL color palettes are prefixed with palette_tpl_* and can therefore be easily autocompleted within RStudio.
These palettes were created using http://colorbrewer2.org and http://coolors.co and are colorblind friendly.
The diverging and sequential color palettes are from http://colorbrewer2.org and the categorical palette is composed of a variety of colors from https://coolors.co/ and the TPL website.
In action, the color palette looks like this:
The user may specify the color palette in the scale_color_* functions in a ggplot call. Specifically, the user can specify the palette (categorical, diverging, sequential) and whether the palette should be reversed.
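For readers curious how a scale function with palette and reverse options might be put together, here is a small sketch on top of ggplot2's scale_color_manual(). Everything named here is hypothetical: scale_color_demo(), the demo_palettes list, and the three-color placeholder palettes are mine, and the sketch only handles discrete variables.

```r
library(ggplot2)

# Placeholder palettes for illustration only (not the TPL colors).
demo_palettes <- list(
  categorical = c("#1B9E77", "#D95F02", "#7570B3"),
  diverging   = c("#EF8A62", "#F7F7F7", "#67A9CF"),
  sequential  = c("#DEEBF7", "#9ECAE1", "#3182BD")
)

# Hypothetical discrete color scale with a palette choice and a reverse switch.
scale_color_demo <- function(palette = c("categorical", "diverging", "sequential"),
                             reverse = FALSE, ...) {
  palette <- match.arg(palette)
  cols <- demo_palettes[[palette]]
  if (reverse) cols <- rev(cols)
  scale_color_manual(values = cols, ...)
}

p <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_demo("sequential", reverse = TRUE)
```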
With undo_tpl_theme, you can remove TPL-specific theme settings and restore the ggplot defaults (but why would you want to do that?).
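Under the hood, undoing a theme in ggplot2 mostly comes down to theme_set(). The sketch below uses only ggplot2 and shows the general mechanism; it is not a copy of undo_tpl_theme, which may reset more than the theme itself.

```r
library(ggplot2)

# Install a custom default and keep a handle on whatever was active before.
previous <- theme_set(theme_minimal())

# Option 1: go back to ggplot2's stock default, theme_gray().
theme_set(theme_gray())

# Option 2: go back to the theme that was active before the custom one.
theme_set(previous)
```

Saving the return value of theme_set() is handy because it restores whatever was active before, even if that was not the stock default.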
To restore the TPL theme, simply call