From Field to Figure

A Beginner’s Guide to Taming Messy Ecological Data in R

Annie Adams

What We’ll Cover Today

All our data cleaning and wrangling is powered by the tidyverse, a collection of R packages designed for clean, readable data workflows.

Q1: What’s your current familiarity with R?

Q2: How do you plan to use these skills?

Source: USDA National Agricultural Statistics Service (NASS)
esmis.nal.usda.gov/publication/honey-bee-colonies

The dataset tracks the percentage of U.S. honey bee colonies affected by various stressors across 44 states and 6 quarters (Q1 2024 – Q2 2025).

Variable	Type	Description
`Quarter`	character	Survey quarter (e.g. `Q1_2024`)
`Period`	character	Text label (e.g. `Jan-Mar 2024`)
`State`	character	U.S. state name (inconsistently cased in the raw data)
`Varroa_Mites_Pct`	numeric	% of colonies affected by Varroa mites
`Other_Pests_Parasites_Pct`	numeric	% affected by other pests or parasites
`Diseases_Pct`	numeric	% affected by diseases
`Pesticides_Pct`	numeric	% affected by pesticide exposure
`Other_Pct`	numeric	% affected by other causes
`Unknown_Pct`	numeric	% affected by unknown causes
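
As a rough sketch of what cleaning this table might look like, the snippet below builds a tiny stand-in data frame and tidies it. The values are made up for illustration and are not real NASS figures, only a subset of the columns is included, and the names `bees_raw`, `bees_clean`, `bees_long`, `Year`, `Stressor`, and `Pct_Affected` are assumptions rather than names from the workshop materials.

```r
library(tibble)
library(dplyr)
library(stringr)
library(tidyr)

# Toy rows with invented values (NOT real NASS data), using the
# column names from the variable table above.
bees_raw <- tibble(
  Quarter          = c("Q1_2024", "Q1_2024", "Q2_2024"),
  Period           = c("Jan-Mar 2024", "Jan-Mar 2024", "Apr-Jun 2024"),
  State            = c("california", "NEW YORK", " Texas"),
  Varroa_Mites_Pct = c(30.1, 12.5, 45.0),
  Pesticides_Pct   = c(5.2, 3.3, 8.8)
)

bees_clean <- bees_raw |>
  mutate(
    # Trim stray whitespace, then standardise casing ("NEW YORK" becomes "New York").
    # Note: title case also capitalises small words such as "of".
    State = str_to_title(str_squish(State)),
    # Pull the four-digit year off the end of the quarter code.
    Year  = as.integer(str_extract(Quarter, "\\d{4}$"))
  )

# Reshape to one row per state, quarter, and stressor, which is the
# layout ggplot2 generally expects.
bees_long <- bees_clean |>
  pivot_longer(
    cols      = ends_with("_Pct"),
    names_to  = "Stressor",
    values_to = "Pct_Affected"
  )

print(bees_long)
```

With the real file, the same steps would start from whatever import call the workshop uses in place of the hand-built `bees_raw`.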

To follow along, please download the workshop materials:

The folder includes:

Resource	Link
R for Data Science	r4ds.hadley.nz
ggplot2 Cheatsheet	rstudio.github.io/cheatsheets
Tidyverse Docs	tidyverse.org

Thank you for joining!
Questions? Feel free to email me!