Code
::tags$iframe(style="width:100%; height:600px;", src="output/infographic.pdf") htmltools
March 2, 2024
As an avid bike rider and lover of alternative transit, I played around with some bike data from Seattle and created an infographic to showcase the trends that I found. I wanted to inform viewers on how bike usage changes seasonally and annually in Seattle, Washington. My goals of this infographic were to quantify the general number of bikes on the road in Seattle, compare bike usage across six different bike counting locations, and look into seasonal biking trends. In order to do this, I utilized data from Data.Seattle.gov. 1 I specifically utilized 6 different data sets that are all analogous in structure, but each for a different location. Each dataset contains four columns: the date, the number of bikes counted that were travelling northbound, the number of bikes counted that were travelling southbound, and the total number of bikes counted (the northbound column + the southbound column). The date column includes hourly data, so each observation in the dataset is a date and an individual hour. When I aggregated the six different locations into one data set, I added a column to specify the location, and removed the north and south bound columns.
To achieve the goals stated above, I created three different visualizations to include in my infographic. I first created a bar plot to show the total number of bikes counted in my aggregated dataset for each year (2014-2020). I also created an area graph that shows the total number of bikes counted monthly at each of the six different locations. Unlike the bar plot, this graph takes month and location into consideration. The third plot I created was a heatmap that aims to look at trends both monthly and yearly. I hoped that this plot could reveal any potential global warming trends- i.e. did biking start to become less popular in June as years progressed?
When creating these plots, I made many different considerations to implement my design. The first decision I made was which types of plots I would use to convey each of my goals. I decided on a bar plot for the yearly total number of bikes counted because it easily conveyed the differences among the years. I used an area plot specifically for the monthly data as I wanted to visualize how usage varies from location to location. This graph also allowed me to see if certain locations were more popular in certain months than others- for example, were bike trails more popular in summer than a busy street? Finally, I decided on a heatmap to visualize how bike usage changed across both month and year because the different tiles make the count of bikes digestable with a continuous color scale. On all these plots, I made updates to the plot titles. In the case of the bar plot, I removed the title all together and utilized annotations instead. To remove extra clutter, I removed every other x axis tick mark (month) in the area plot. I updated the font for all plots to be consistent with the infographic title font. I made these updates in the theme()
layer of each plot. Other updates to the theme()
layer include adjusting the margins of the legend, rotating the legend (for the heatmap), increasing legend size (for the heatmap and area plot), and updating the size of x and y axis tick marks/ labels (all three plots).
When arranging the elements in my infographic, I considered many different options to get my message across in a clean and concise way that was not visually overwhelming. I wanted my area plot and bar plot to be near each other, towards the bottom of the plot. I wanted these two plots to represent land type objects (the area plot a mountain, and the bar plot to be city buildings). I wanted them towards the bottom so they looked more grounded and didn’t represent a floating mountain/ buildings. Because of this, I placed the heatmap toward the top of the map. To eliminate too much text and blank space, I also added a space needle building image. I broke up the text into smaller chunks and spread them out to avoid paragraphs of text. In order to contextualize my data, I added a paragraph that provided a bit of background on the data itself and when it started to be collected. I also provided average annual and monthly temperatures via text boxes to allow viewers to contextualize what the temperatures might be when they see peaks and valleys of bike counts, both yearly and monthly. The central message that I wanted to get across was that time of year/ temperature played a large role in the number of bikers that were counted. I made this the central message not only through my plots, but also through the text and annotations that incorporated temperature in relation to each of the three plots.
Once I decided on the current color palette, I used color blind simulator to see if the palette I chose was color friendly. The color palette I chose was centered around differentiating the different locations in my area plot, and these differences were still clear among all color deficiencies. Before creating my infographic, I used a DEI lens by looking into the different locations that the bike sensors were implemented to see if they were in only wealthy neighborhoods that would likely count more bikes. The bike locations came from different road types (i.e. bike pathways, public roads, etc.) and were not privy to any certain area.
I have included all code to my three different visualizations below! All plots were generated using ggplot()
and I used magick()
to add these plots together create a base infographic. When utilizing magick()
, some text was rendering fuzzy, despite being clear in the individual PNGs.To make text as legible as possible, I ended up individually adding the PNGs to a Canva document. I used the base infographic that I created below (with the background color and title) as the background, and then added all other elements on top in Canva. I hope you enjoy exploring my infographic and code below!
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## import data ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
broadway<- read.csv(here::here("posts", "bike_infographic","raw-data","Broadway_Cycle_Track_North_Of_E_Union_St_Bicycle_Counter__Out_of_Service__20240202.csv"))
burke <- read.csv(here::here("posts", "bike_infographic","raw-data","Burke_Gilman_Trail_north_of_NE_70th_St_Bicycle_and_Pedestrian_Counter_20240202.csv"))
chief <- read.csv(here::here("posts", "bike_infographic","raw-data","Chief_Sealth_Trail_North_of_Thistle_Bicycle_Counter__Out_of_Service__20240202.csv"))
elliott <- read.csv(here::here("posts", "bike_infographic","raw-data","Elliott_Bay_Trail_in_Myrtle_Edwards_Park_Bicycle_and_Pedestrian_Counter__Out_of_Service__20240202.csv"))
mts <- read.csv(here::here("posts", "bike_infographic","raw-data","MTS_Trail_west_of_I-90_Bridge_Bicycle_and_Pedestrian_Counter__Out_of_Service__20240202.csv"))
fifty_eight <- read.csv(here::here("posts", "bike_infographic","raw-data","NW_58th_St_Greenway_at_22nd_Ave_NW_Bicycle_Counter__Out_of_Service__20240202.csv"))
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## merge data ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# rename total bike column for dataframes with bike only, create total bike column for dataframes with pedestrians ( since we dont want to include pedestrians in our dataframe)
#do this renaming and totaling schema for all 7 datasets
broadway <- broadway %>% rename("bike_total" = "Broadway.Cycle.Track.North.Of.E.Union.St.Total")
burke$bike_total <- burke$Bike.North + burke$Bike.South
chief$bike_total <- chief$Bike.North + chief$Bike.South
elliott$bike_total <- elliott$Bike.North + elliott$Bike.South
fifty_eight <- fifty_eight %>% rename("bike_total" = "NW.58th.St.Greenway.st.22nd.Ave.NW.Total")
mts$bike_total <- mts$Bike.East + mts$Bike.West
#rename southbound and northbound columns to have consistent naming for all datasets
#add locations column for each dataset with a string of what the location is
#do this renaming and adding column step for all 7 datasets
burke_clean <- burke %>%
rename("SB" = "Bike.South" , "NB" = "Bike.North") %>%
select(Date, bike_total, NB, SB, ) %>%
mutate(loc = "Burke")
chief_clean <- chief %>%
rename("SB" = "Bike.South" , "NB" = "Bike.North") %>%
select(Date, bike_total, NB, SB, ) %>%
mutate(loc = "Chief")
elliott_clean <- elliott %>%
rename("SB" = "Bike.South" , "NB" = "Bike.North") %>%
select(Date, bike_total, NB, SB, ) %>%
mutate(loc = "Elliott")
fifty_eight_clean <- fifty_eight %>%
rename ("EB" = "East", "WB" = "West") %>%
select(Date, bike_total, EB, WB) %>%
mutate(loc = "58th")
mts_clean <- mts %>%
rename("WB" = "Bike.West" , "EB" = "Bike.East") %>%
select(Date, bike_total, WB, EB, ) %>%
mutate(loc = "MTS Trail")
broadway_clean <- broadway %>%
mutate(loc = "Broadway")
#merge all cleaned dataframes that track north and south traffic
bike_data <- bind_rows(broadway_clean, burke_clean, chief_clean, elliott_clean, mts_clean, fifty_eight_clean)
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## create filtered dataframes ----
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# update bike data ( base data frame) date column to be in correct format, add year and month column for future filtering ------
#update Date column to type POSIXct for future wrangling, format = current way date column is formatted
bike_data$Date <- as.POSIXct(bike_data$Date, format = "%m/%d/%Y %I:%M:%S %p")
# create year column with year
bike_data$Year <- year(bike_data$Date)
#create month column with month
bike_data$Month <- month(bike_data$Date)
years <- c(2014,2015,2016,2017,2018,2019,2020)
bike <- bike_data[bike_data$Year %in% years, ]
#create a dataframe of daily with the date, location and sum of bike counts for that day (i.e. aggregate hourly counts to be daily)-----
bike_data_daily <- bike %>%
#create date column that is aggregated by day
mutate(date = floor_date(Date, unit = "day")) %>%
#group by date and location
group_by(date, loc) %>%
#create new column that has the daily count of bikes at each location each day
summarize(daily_sum = sum(bike_total, na.rm = TRUE), .groups = 'drop') %>%
#drop na values (0)
drop_na()
#create a dataframe that aggregates the monthly bike counts across all locations, should have two columns only ( month, monthly_total)
bike_data_monthly <-bike %>%
#group by month
group_by(Month) %>%
#create new column that has the monthly count of bikes at all locations
summarize(monthly_total = sum(bike_total, na.rm = TRUE)) %>%
#drop na values (0)
drop_na()
# create data frame of monthly bike counts at each locations, should have three columns ( month, monthly total, location)
bike_data_monthly_loc <-bike %>%
#group by month and location
group_by(Month, loc) %>%
#create new column that has the monthly count of bikes for each location
summarize(monthly_total = sum(bike_total, na.rm = TRUE)) %>%
#drop na values (0)
drop_na()
bike_data_year_month<- bike %>%
group_by(Year, Month) %>%
summarize(monthly_total = sum(bike_total, na.rm = TRUE)) %>%
#drop na values (0)
drop_na()
#make month a factor
bike_data_year_month$Month <- factor(month.abb[bike_data_year_month$Month], levels = month.abb)
#make year a factor
bike_data_year_month$Year <- factor(bike_data_year_month$Year)
#create filtered dataframe for heatmap
bike_data_yearly <- bike %>%
group_by(Year) %>%
summarize(yearly_total = sum(bike_total, na.rm = TRUE)) # add yearly totals for every month/year
#add font awesome icons
font_add(family = "fa-brands",
regular = here::here("posts", "bike_infographic","otfs", "Font Awesome 6 Brands-Regular-400.otf"))
font_add(family = "fa-regular",
regular = here::here("posts", "bike_infographic","otfs", "Font Awesome 6 Free-Regular-400.otf"))
font_add(family = "fa-solid",
regular = here::here("posts", "bike_infographic","otfs", "Font Awesome 6 Free-Solid-900.otf"))
#..........................import fonts..........................
# `name` is the name of the font as it appears in Google Fonts
# `family` is the user-specified id that you'll use to apply a font in your ggpplot
#add montserrat font
font_add_google(name = "Montserrat", family = "montserrat")
#................enable {showtext} for rendering.................
showtext_auto()
#create breaks and labels for x axis labeling
month_breaks <- 1:12
month_labels <- c("Jan", "", "Mar","", "May", "", "Jul","", "Sep","", "Nov","")
#month_labels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
#create color palette
custom_colors <- c(
"Broadway" = "#706513",
"Burke" = "#B57114",
"Elliott" = "#962B09",
"Chief"= "#F2C078",
"MTS Trail" = "#C1DBB3",
"58th"= "#3891A6")
#create geom area plot to show how bike traffic changes seasonally across diff locations
# add data, fill area by location of bike sensor
mountain <- ggplot(data = bike_data_monthly_loc, aes(x = Month, y = monthly_total, fill = loc)) +
#decrease the opacity
geom_area(alpha = 0.6) +
#add color palette defined above
scale_fill_manual(values = custom_colors, labels = c("58th St.", "Broadway St.", "Burke Gilman Trail", "Chief Sealth Trail", "Elliott Bay Trail", "MTS Trail")) +
#add rotated upward bike image, play around with sizes to fit top ridge of graph
geom_image(y = 510000, x = 2.5, image = "images/rotate_up_bike.png", size = .2 ) +
#add rotated downward bike image, play around with sizes to fit top ridge of graph
geom_image(y = 750000, x = 9.5, image = "images/rotate_down_bike.png", size = .2 ) +
#add title, subtitle, x and y axis labels, and legend title
labs(#title = "Seasonal Changes see heavier\nbike traffic in Spring & Summer",
#subtitle = "Data from 2014 - 2020",
x = "Month",
y = "Number of Bikers Counted",
fill = "Location") +
theme_minimal() +
#add values and labels to x axis
scale_x_continuous(breaks = month_breaks, labels = month_labels)+
#convert y axis labels to be a standard number ( including e before)
scale_y_continuous(labels = scales::comma)+
#update theme
theme(
#remove grid elements and background elements
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text.x = element_text(angle = 45, hjust = 1, colour = "#FFFDD0", size = 28, margin = margin(t = .4)),
axis.text.y = element_text( colour = "#FFFDD0", size = 28),
axis.line = element_blank(), # Removes axis lines
axis.ticks = element_blank(), # Removes axis ticks
axis.title.x = element_blank(), # Removes x-axis title
axis.title.y = element_blank(), # Removes y-axis title
# make background transparent for infographic
panel.background = element_rect(fill = "transparent", colour = NA),
plot.background = element_rect(fill = "transparent", colour = NA),
#update font to montserrat
text = element_text(family = "montserrat") ,
#update legend margins to be closer to plot
legend.margin = margin(t = 0, r = 0, b = -5, l = -10, unit = "pt"),
legend.box.margin = margin(t = 0, r = -5, b = -25, l = -5, unit = "pt"),
#updaete space and size between legend elements
legend.spacing.x = unit(6, "pt"),
legend.spacing.y = unit(.75, "cm"),
legend.position = c(.8,1),
legend.key.size = unit(30, "pt"),
legend.justification = c(1, 1),
legend.text = element_text(size = 20, colour = "#FFFDD0"),
legend.title = element_text(colour = "#FFFDD0", size = 28))+ # update font size
guides(fill = guide_legend(byrow = TRUE))
#save plot as png to add to infographic base using magick
png('output/mountain.png',width = 22, height = 20, units = 'in', res = 300, bg = "transparent")
print(mountain) # Ensure the plot is explicitly printed
invisible(dev.off())
mountain
#initiate ggplot with year
city_buildings <- ggplot(bike_data_yearly, aes(x = Year, y = yearly_total)) +
#add geom column layer, lower opacity, adjust width to represent buidlings, make bars creme with a think black outline
geom_col(width = .7, alpha = .7, fill = "#FFFDD0", color = "black", size = .2) +
#add year at the bottom of each bar in green
geom_text(aes(label = Year, y = 0.05 * max(yearly_total)), size = 5, color = "black")+
theme_minimal()+
#update theme
theme(
#remove grid elements and background elements
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
plot.background = element_blank(),
axis.line = element_blank(), # Removes axis lines
axis.text.x = element_blank(), # Removes x-axis labels
axis.text.y = element_blank(), # Removes y-axis labels
axis.ticks = element_blank(), # Removes axis ticks
axis.title.x = element_blank(), # Removes x-axis title
axis.title.y = element_blank(), # Removes y-axis title
text = element_text(family = "montserrat") ) # update font
#save plot as png to add to infographic base using magick
ggsave('output/city_buildings.png', plot = city_buildings, device = 'png',width = 700, height = 800, units = 'px', dpi = 700)
city_buildings
#initiate ggplot with year and month, fill by monthly total
heatmap <- ggplot(bike_data_year_month, aes(x = Month, y = Year, fill = monthly_total)) +
geom_tile(color = "white") + # Add white border to the tile
scale_fill_gradient(low = "#FFFDD0", high = "#B57114", labels = scales::comma) + # Define colors for the gradient
theme_minimal() +
scale_y_discrete(limits = rev(levels(bike_data_year_month$Year))) + # Reverse order of years on y axis
#labs(title = "Seasonal Changes see heavier\nbike traffic in Spring & Summer") + # Add title
guides(
fill = guide_colourbar(title = "Monthly Total", title.position = "bottom", title.hjust = 0.5, barwidth = 5, barheight = .25, # Adjust the size of the gradient bar and place legend underneath plot
label.position = "bottom")
) +
theme(
#angle x axis ticks, update size and typeface color
axis.text.x = element_text(angle = 45, hjust = 1, colour = "#FFFDD0", size = 11),
#update size and typeface color for y axis ticks
axis.text.y = element_text(colour = "#FFFDD0", size = 11),
#remove grid elements
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
# remove background by making transparent
panel.background = element_rect(fill = "transparent", colour = NA),
plot.background = element_rect(fill = "transparent", colour = NA),
#remove axis line and x and y axis labels
axis.line = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
#update typface
text = element_text(family = "montserrat"),
#place legend at bottom
legend.position = "bottom",
#update legend marigins
legend.margin = margin(t = -5, r = -10, b = -5, l = -10, unit = "pt"),
#update legend box margins
legend.box.margin = margin(t = 0, r = 0, b = 0, l = 0, unit = "pt"),
#update spacing around x and y axis of legend
legend.spacing.x = unit(2, "pt"),
legend.spacing.y = unit(2, "pt"),
legend.key.size = unit(1, "pt"),
#update legend text size
legend.text = element_text(size = 10, colour = "#FFFDD0"),
#remove legend title
legend.title = element_blank(),
plot.title = element_text(colour = "#FFFDD0", hjust = 0.5, size = 14)
)
#save plot as png to add to infographic base using magick
ggsave('output/heatmap.png', plot = heatmap, device = 'png',width = 500, height = 500, units = 'px', dpi =300)
heatmap
#.........................create caption.........................
#create bike color palette for future reference
bike_colors <- c("#706513", "#B57114", "#962B09", "#F2C078", "#C1DBB3","#3891A6")
#text color for infographic
txt <- bike_colors[4]
#background color for infographic
bg <- bike_colors[3]
#initialize infographic background
g_base <- ggplot() +
labs(
title = "Seattle Bike Trends", # add title to infographic
subtitle = "Bikes counts from six bike sensor stations from 2014 to 2020", # add subtitle to infographic
) +
theme_void() +
#update theme elements
theme(
# specify typeface and font size for title
text = element_text(family = "montserrat", size = 20, lineheight = 1.2, colour = txt),
#fill infographic background with specified background color
plot.background = element_rect(fill = bg, colour = bg),
#make title bold, adjust margins and horizontal justification
plot.title = element_text(size = 35, face = "bold", hjust = 0.5, margin = margin(b = 20)),
#update subtitle typeface, adjst margins and horizontal justification
plot.subtitle = element_text(family = "montserrat", hjust = 0.5, margin = margin( t = -20, b = 30), size = 12),
#adjut plot margins
plot.margin = margin(b = 10, t = 7, r = 10, l = 10)
)
g_base
#read all images created above and remove edges that are the background color from the plots
mountain_image <- image_read('output/mountain.png') %>%
image_trim()
buildings_image <- image_read('output/city_buildings.png') %>%
image_trim()
heatmap_image <- image_read('output/heatmap.png') %>%
image_trim()
space_needle_image <- image_read('output/SeattleSpaceNeedle.png')
title_image <- image_read('output/g_base.png')
#arrange all plots onto infographic base
title_image %>%
# add mountain plot on the middle left of plot
image_composite(image_scale(mountain_image, '400x'), offset = '+2+450') %>%
#add the buildings image on the lower right
image_composite(image_scale(buildings_image, "300x"), gravity = "Center", offset = "+180+425") %>%
#add the heatmap on the top right
image_composite(image_scale(heatmap_image, "350"), offset = "+280+125") %>%
#add the space needle png next to the buildings
image_composite(image_scale(space_needle_image, '100x'), gravity = "Center", offset = '-10+444') %>%
#create png
image_write(path = "output/postcard.png")
#assign png to an R variable
infographic_base <- readPNG('output/postcard.png')
#print infographic
grid.raster(infographic_base)
Seattle Department of Transportation. (2023). Bike Counters. https://www.seattle.gov/transportation/projects-and-programs/programs/bike-program/bike-counters↩︎