class: inverse, center, middle # 36-315: Statistical Graphics and Visualization ## Lecture 3 Meghan Hall <br> Department of Statistics & Data Science <br> Carnegie Mellon University <br> May 26, 2021 --- layout: true <div class="my-footer"><span>cmu-36315.netlify.app</span></div> --- # From last time <br> .large[The grammar of graphics] <br> .medium[How graphics are constructed in R] <br> .large[Tidyverse principles] <br> .medium[For any necessary data manipulation] --- # Updates <br> .large[Lab 1] <br> .medium[Solution has been posted] <br> .medium[Piazza reminder] <br> .large[Homework 1] <br> .medium[Due Tuesday June 1, posted soon] --- # Today <br> .large[Bar graphs] <br> .medium[Of all shapes & sizes] <br> .large[Tidyverse principles] <br> .medium[For any necessary data manipulation] --- # Tidyverse review: the pipe <br> <br> .huge.center[`%>%`] <br> <br> .medium[Allows you to "chain" multiple operations together] <br> <br> .medium[Helps avoid intermediate steps] <br> <br> .medium[Easier to read!] 
<br>
<br>
.medium[Remember the `ggplot` equivalent is `+`]

---

# Tidyverse review: the pipe

```r
penguins1 <- mutate(penguins,
                    big_penguin = ifelse(body_mass_g >= 5000, "yes", "no"))
penguins2 <- filter(penguins1, !is.na(body_mass_g))
```

--

```r
penguins <- mutate(penguins,
                   big_penguin = ifelse(body_mass_g >= 5000, "yes", "no"))
penguins <- filter(penguins, !is.na(body_mass_g))
```

--

```r
# With the pipe, the data frame is passed along as the first argument,
# so it is not repeated inside mutate()
big_penguins <- penguins %>%
  mutate(big_penguin = ifelse(body_mass_g >= 5000, "yes", "no")) %>%
  filter(!is.na(body_mass_g))
```

---

# Tidyverse review: pivoting

.medium[Let's look at some sample data]

<table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> id </th>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> 2007 </th>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> 2008 </th>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> 2009 </th>
</tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2005 </td> <td style="text-align:right;"> 2983 </td> <td style="text-align:right;"> 3124 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3076 </td> <td style="text-align:right;"> 2965 </td> <td style="text-align:right;"> 3231 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3687 </td> <td style="text-align:right;"> 3642 </td> <td style="text-align:right;"> 3810 </td> </tr>
</tbody>
</table>

--

.medium[Components of "tidy" data:]

1. Each variable must have its own column
2. Each observation must have its own row
3. Each value must have its own cell

---

# Tidyverse review: pivoting

```r
penguins_pivoted <- penguins_example %>%
  pivot_longer(`2007`:`2009`, names_to = "year", values_to = "body_mass")
```

--

<table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> id </th>
<th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> year </th>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> body_mass </th>
</tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 2007 </td> <td style="text-align:right;"> 2005 </td> </tr>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 2008 </td> <td style="text-align:right;"> 2983 </td> </tr>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 2009 </td> <td style="text-align:right;"> 3124 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 2007 </td> <td style="text-align:right;"> 3076 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 2008 </td> <td style="text-align:right;"> 2965 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 2009 </td> <td style="text-align:right;"> 3231 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> 2007 </td> <td style="text-align:right;"> 3687 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> 2008 </td> <td style="text-align:right;"> 3642 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> 2009 </td> <td style="text-align:right;"> 3810 </td> </tr>
</tbody>
</table>

---

# Tidyverse review: pivoting

<br>
.left[![pivot](figs/Lec3/pivot.png)]
<br>
<br> .right[From [R for Data Science](https://r4ds.had.co.nz/)] --- # Today's data .center[![GOT](figs/Lec3/GOT.png)] --- # Today's data ```r got %>% glimpse() ``` ``` ## Rows: 38 ## Columns: 25 ## $ name <chr> "Battle of the Golden Tooth", "Battle at the Mummer… ## $ year <dbl> 298, 298, 298, 298, 298, 298, 298, 299, 299, 299, 2… ## $ battle_number <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, … ## $ attacker_king <chr> "Joffrey/Tommen Baratheon", "Joffrey/Tommen Barathe… ## $ defender_king <chr> "Robb Stark", "Robb Stark", "Robb Stark", "Joffrey/… ## $ attacker_1 <chr> "Lannister", "Lannister", "Lannister", "Stark", "St… ## $ attacker_2 <chr> NA, NA, NA, NA, "Tully", "Tully", NA, NA, NA, NA, N… ## $ attacker_3 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,… ## $ attacker_4 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,… ## $ defender_1 <chr> "Tully", "Baratheon", "Tully", "Lannister", "Lannis… ## $ defender_2 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,… ## $ defender_3 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,… ## $ defender_4 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,… ## $ attacker_outcome <chr> "win", "win", "win", "loss", "win", "win", "win", "… ## $ battle_type <chr> "pitched battle", "ambush", "pitched battle", "pitc… ## $ major_death <dbl> 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, … ## $ major_capture <dbl> 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, … ## $ attacker_size <dbl> 15000, NA, 15000, 18000, 1875, 6000, NA, NA, 1000, … ## $ defender_size <dbl> 4000, 120, 10000, 20000, 6000, 12625, NA, NA, NA, N… ## $ attacker_commander <chr> "Jaime Lannister", "Gregor Clegane", "Jaime Lannist… ## $ defender_commander <chr> "Clement Piper, Vance", "Beric Dondarrion", "Edmure… ## $ summer <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, … ## $ location <chr> "Golden Tooth", "Mummer's Ford", "Riverrun", "Green… ## $ region <chr> "The Westerlands", "The Riverlands", 
"The Riverland… ## $ note <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, "Greyjoy's troo… ``` --- # Today's data <table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> name </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_king </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> region </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> year </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_size </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Battle of the Golden Tooth </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> The Westerlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> 15000 </td> </tr> <tr> <td style="text-align:left;"> Battle at the Mummer's Ford </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> The Riverlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> Battle of Riverrun </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> The Riverlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> 15000 </td> </tr> <tr> <td style="text-align:left;"> Battle of the Green Fork </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> The Riverlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> 18000 </td> </tr> <tr> <td style="text-align:left;"> Battle of the Whispering Wood </td> 
<td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> The Riverlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> 1875 </td> </tr> <tr> <td style="text-align:left;"> Battle of the Camps </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> The Riverlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> 6000 </td> </tr> <tr> <td style="text-align:left;"> Sack of Darry </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> The Riverlands </td> <td style="text-align:right;"> 298 </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> Battle of Moat Cailin </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:left;"> The North </td> <td style="text-align:right;"> 299 </td> <td style="text-align:right;"> </td> </tr> <tr> <td style="text-align:left;"> Battle of Deepwood Motte </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:left;"> The North </td> <td style="text-align:right;"> 299 </td> <td style="text-align:right;"> 1000 </td> </tr> <tr> <td style="text-align:left;"> Battle of the Stony Shore </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:left;"> The North </td> <td style="text-align:right;"> 299 </td> <td style="text-align:right;"> 264 </td> </tr> </tbody> </table> --- # Data types <br> <br> .large[Categorical/qualitative] <br> .medium[Ordered vs. unordered/nominal] <br> .medium[`name`, `attacker_king`, `region`] <br> -- <br> .large[Numeric/quantitative] <br> .medium[Discrete vs. continuous] <br> .medium[`attacker_size`, `defender_size`] <br> -- <br> .large[What about `year`?] 
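---

# Data types: a closer look at `year`

.medium[A minimal sketch, not part of the original analysis. `year` is stored as a number (`<dbl>`), but it takes only a few distinct values, so in a plot it often behaves like a category. The `got_sample` tibble is a hypothetical stand-in that copies four rows from the preview table, and `year_chr` is an illustrative column name. Converting with `as.character()` is one way to make `ggplot` treat `year` as discrete.]

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Hypothetical stand-in for `got`: four rows copied from the preview table
got_sample <- tibble(
  name = c("Battle of the Golden Tooth", "Battle of Riverrun",
           "Battle of Moat Cailin", "Battle of Deepwood Motte"),
  region = c("The Westerlands", "The Riverlands", "The North", "The North"),
  year = c(298, 298, 299, 299)
)

# As stored, `year` is numeric, so ggplot would map it to a continuous scale
class(got_sample$year)

# Converting to character turns `year` into a discrete (categorical) variable
got_year <- got_sample %>%
  mutate(year_chr = as.character(year))

# Counting battles per year gives one row, and later one bar, per category
year_counts <- got_year %>%
  count(year_chr)

year_plot <- year_counts %>%
  ggplot(aes(x = year_chr, y = n)) +
  geom_bar(stat = "identity")
```

.medium[Whether `year` should be numeric or categorical depends on the question: ordering and spacing matter on a time axis, while a fill or grouping variable usually works better as a category.]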
---

# The beauty of a bar chart

<br>
.large[Comparing a numeric variable and a categorical one]
<br>
.medium[Standard, grouped, stacked]
<br>
.large[Easy for eyes to perceive comparisons by length]
<br>
.medium[Assuming a fixed start point]

---
class: left

# .left[Today's agenda]

.pull-left[
.large[standard bar chart] <br>
ordered <br>
with a label <br> <br> <br>
.large[grouped bar chart] <br>
legend.position <br>
aggregate data <br> <br> <br>
.large[stacked bar chart] <br>
percent stacked <br>
add_count <br> <br>
]

--

.pull-right[
.large[stacked bar chart] <br>
pivot_longer <br>
case_when <br> <br> <br>
.large[standard bar chart] <br>
separate <br>
pivot_longer <br> <br> <br>
.large[grouped bar chart] <br>
year as character <br>
string manipulation
]

---
class: left

# .left[Today's agenda]

.pull-left[
.large[**standard bar chart**] <br>
*ordered* <br>
*with a label* <br>
**in which regions do battles happen?** <br> <br> <br>
.large[grouped bar chart] <br>
legend.position <br>
aggregate data <br> <br> <br>
.large[stacked bar chart] <br>
percent stacked <br>
add_count <br> <br>
]

.pull-right[
.large[stacked bar chart] <br>
pivot_longer <br>
case_when <br> <br> <br>
.large[standard bar chart] <br>
separate <br>
pivot_longer <br> <br> <br>
.large[grouped bar chart] <br>
year as character <br>
string manipulation
]

---

# 1. Standard bar chart

```r
got %>%
  ggplot(aes(x = region)) +
  geom_bar()
```

<img src="figs/Lec3/standard-1-1.png" width="504" style="display: block; margin: auto;" />

---

# 1. Standard bar chart

```r
got %>%
  ggplot(aes(x = region)) +
  geom_bar() +
  coord_flip()
```

<img src="figs/Lec3/standard-2-1.png" width="504" style="display: block; margin: auto;" />

---

# 1. Standard bar chart

```r
got %>%
* count(region) %>%
  ggplot(aes(x = reorder(region, n), y = n)) +
  geom_bar(stat = "identity") +
  coord_flip()
```

--

<img src="figs/Lec3/standard-3-1.png" width="504" style="display: block; margin: auto;" />

---

# 1. Standard bar chart

```r
got %>%
  count(region) %>%
  ggplot(aes(x = reorder(region, n), y = n)) +
  geom_bar(stat = "identity") +
  coord_flip() +
* geom_text(aes(label = n), hjust = -0.5) +
  scale_y_continuous(limits = c(0, 18))
```

---

# 1. Standard bar chart

<img src="figs/Lec3/standard-4-1.png" width="504" style="display: block; margin: auto;" />

---
class: left

# .left[Today's agenda]

.pull-left[
.large[standard bar chart] <br>
ordered <br>
with a label <br> <br> <br>
.large[**grouped bar chart**] <br>
legend.position <br>
aggregate data <br>
**attacking kings use which battle types?** <br> <br> <br>
.large[stacked bar chart] <br>
percent stacked <br>
add_count <br> <br>
]

.pull-right[
.large[stacked bar chart] <br>
pivot_longer <br>
case_when <br> <br> <br>
.large[standard bar chart] <br>
separate <br>
pivot_longer <br> <br> <br>
.large[grouped bar chart] <br>
year as character <br>
string manipulation
]

---

# 2. Grouped bar chart

```r
got %>%
  ggplot(aes(x = attacker_king, fill = battle_type)) +
  geom_bar(position = "dodge")
```

--

<img src="figs/Lec3/grouped-1-1.png" width="504" style="display: block; margin: auto;" />

---

# 2. Grouped bar chart

```r
got %>%
* filter(!is.na(attacker_king) & !is.na(battle_type)) %>%
  ggplot(aes(x = attacker_king, fill = battle_type)) +
  geom_bar(position = "dodge") +
* theme(legend.position = c(0.1, 0.8))
```

--

<img src="figs/Lec3/grouped-2-1.png" width="504" style="display: block; margin: auto;" />

---

# 2. Grouped bar chart

```r
got %>%
* group_by(attacker_king, battle_type) %>%
* summarize(mean_size = mean(attacker_size, na.rm = TRUE)) %>%
  filter(!is.na(attacker_king) & !is.na(battle_type)) %>%
  ggplot(aes(x = attacker_king, y = mean_size, fill = battle_type)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(legend.position = c(0.1, 0.8))
```

---

# 2. Grouped bar chart

<img src="figs/Lec3/grouped-3-1.png" width="504" style="display: block; margin: auto;" />

---
class: left

# .left[Today's agenda]

.pull-left[
.large[standard bar chart] <br>
ordered <br>
with a label <br> <br> <br>
.large[grouped bar chart] <br>
legend.position <br>
aggregate data <br> <br> <br>
.large[**stacked bar chart**] <br>
percent stacked <br>
add_count <br>
**in which regions do kings attack?** <br>
**which attacking kings win?** <br> <br>
]

.pull-right[
.large[stacked bar chart] <br>
pivot_longer <br>
case_when <br> <br> <br>
.large[standard bar chart] <br>
separate <br>
pivot_longer <br> <br> <br>
.large[grouped bar chart] <br>
year as character <br>
string manipulation
]

---

# 3. Stacked bar chart

```r
got %>%
  filter(!is.na(attacker_king) & !is.na(region)) %>%
  ggplot(aes(x = attacker_king, fill = region)) +
  geom_bar(position = "stack")
```

--

<img src="figs/Lec3/stacked-1-1.png" width="504" style="display: block; margin: auto;" />

---

# 3. Stacked bar chart

```r
got %>%
  filter(!is.na(attacker_king) & !is.na(region)) %>%
  ggplot(aes(x = attacker_king)) +
  geom_bar() +
  facet_wrap(~region)
```

--

<img src="figs/Lec3/stacked-2-1.png" width="504" style="display: block; margin: auto;" />

---

# 3. Stacked bar chart

```r
got %>%
  filter(!is.na(attacker_king) & !is.na(region)) %>%
  ggplot(aes(x = region)) +
  geom_bar() +
  coord_flip() +
  facet_wrap(~attacker_king)
```

--

<img src="figs/Lec3/stacked-3-1.png" width="504" style="display: block; margin: auto;" />

---

# 3. 
Stacked bar chart ```r got %>% * add_count(attacker_king) %>% ggplot(aes(x = attacker_king, y = n, fill = attacker_outcome)) + geom_bar(stat = "identity", position = "fill") ``` --- # `add_count` <table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_king </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> n </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:right;"> 7 </td> </tr> <tr> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td 
style="text-align:right;"> 7 </td> </tr>
<tr> <td style="text-align:right;"> 10 </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:right;"> 7 </td> </tr>
</tbody>
</table>

---

# 3. Stacked bar chart

```r
got %>%
  add_count(attacker_king) %>%
  ggplot(aes(x = attacker_king, y = n, fill = attacker_outcome)) +
* geom_bar(stat = "identity", position = "fill")
```

--

<img src="figs/Lec3/percent-2-1.png" width="504" style="display: block; margin: auto;" />

---

# 3. Stacked bar chart

```r
got %>%
  filter(!is.na(attacker_king) & !is.na(attacker_outcome)) %>%
  add_count(attacker_king) %>%
  ggplot(aes(x = attacker_king, y = n, fill = attacker_outcome)) +
  geom_bar(stat = "identity", position = "fill") +
* scale_y_continuous(name = "Percent Win", labels = scales::percent)
```

--

<img src="figs/Lec3/percent-3-1.png" width="504" style="display: block; margin: auto;" />

---
class: left

# .left[Today's agenda]

.pull-left[
.large[standard bar chart] <br>
ordered <br>
with a label <br> <br> <br>
.large[grouped bar chart] <br>
legend.position <br>
aggregate data <br> <br> <br>
.large[stacked bar chart] <br>
percent stacked <br>
add_count <br> <br>
]

.pull-right[
.large[**stacked bar chart**] <br>
pivot_longer <br>
case_when <br>
**which kings win?** <br> <br> <br>
.large[standard bar chart] <br>
separate <br>
pivot_longer <br> <br> <br>
.large[grouped bar chart] <br>
year as character <br>
string manipulation
]

---

# `pivot_longer`

<table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th>
<th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_king </th>
<th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> defender_king </th>
<th
style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_outcome </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> loss </td> </tr> <tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Balon/Euron Greyjoy </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 10 </td> <td style="text-align:left;"> 
Balon/Euron Greyjoy </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> win </td> </tr> </tbody> </table> --- # `pivot_longer` ```r got %>% select(battle_number, attacker_king, defender_king, attacker_outcome) %>% pivot_longer(ends_with("king"), names_to = "type", values_to = "king", values_drop_na = TRUE) ``` -- <table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_outcome </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> type </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> king </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Robb Stark </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Robb Stark </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> </tr> <tr> <td 
style="text-align:right;"> 3 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Robb Stark </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> loss </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Robb Stark </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> loss </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> </tr> </tbody> </table> --- # `case_when` ```r got %>% select(battle_number, attacker_king, defender_king, attacker_outcome) %>% pivot_longer(ends_with("king"), names_to = "type", values_to = "king", values_drop_na = TRUE) %>% mutate(outcome = case_when(type == "attacker_king" ~ attacker_outcome, attacker_outcome == "win" ~ "loss", attacker_outcome == "loss" ~ "win")) ``` -- <table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_outcome </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> type </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> king </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> outcome </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td 
style="text-align:left;"> win </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> loss </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> loss </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> attacker_king </td> <td style="text-align:left;"> Joffrey/Tommen Baratheon </td> <td style="text-align:left;"> win </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> win </td> <td style="text-align:left;"> defender_king </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> loss </td> </tr> </tbody> </table> --- # 4. Stacked bar chart ```r got %>% select(battle_number, attacker_king, defender_king, attacker_outcome) %>% filter(!is.na(attacker_outcome)) %>% pivot_longer(ends_with("king"), names_to = "type", values_to = "king", values_drop_na = TRUE) %>% mutate(outcome = case_when(type == "attacker_king" ~ attacker_outcome, attacker_outcome == "win" ~ "loss", attacker_outcome == "loss" ~ "win")) %>% add_count(king) %>% ggplot(aes(x = king, y = n, fill = outcome)) + geom_bar(stat = "identity", position = "fill") + scale_y_continuous(name = "Percent Win", labels = scales::percent) ``` --- # 4. 
Stacked bar chart

<img src="figs/Lec3/pivot-9-1.png" width="504" style="display: block; margin: auto;" />

---
class: left

# .left[Today's agenda]

.pull-left[
.large[standard bar chart] <br>
ordered <br>
with a label <br> <br> <br>
.large[grouped bar chart] <br>
legend.position <br>
aggregate data <br> <br> <br>
.large[stacked bar chart] <br>
percent stacked <br>
add_count <br> <br>
]

.pull-right[
.large[stacked bar chart] <br>
pivot_longer <br>
case_when <br> <br> <br>
.large[**standard bar chart**] <br>
separate <br>
pivot_longer <br>
**which commanders initiated the most battles?** <br> <br> <br>
.large[grouped bar chart] <br>
year as character <br>
string manipulation
]

---

# `separate`

<table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;">
<thead>
<tr>
<th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th>
<th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> attacker_commander </th>
</tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Jaime Lannister </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Gregor Clegane </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Jaime Lannister, Andros Brax </td> </tr>
<tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Roose Bolton, Wylis Manderly, Medger Cerwyn, Harrion Karstark, Halys Hornwood </td> </tr>
<tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Robb Stark, Brynden Tully </td> </tr>
<tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Robb Stark, Tytos Blackwood, Brynden Tully </td> </tr>
<tr> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Gregor Clegane </td> </tr>
<tr> <td style="text-align:right;"> 8 </td> <td
style="text-align:left;"> Victarion Greyjoy </td> </tr> <tr> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Asha Greyjoy </td> </tr> <tr> <td style="text-align:right;"> 10 </td> <td style="text-align:left;"> Theon Greyjoy </td> </tr> </tbody> </table> --- # `separate` ```r got %>% select(battle_number, attacker_commander) %>% * separate(attacker_commander, c("a","b","c","d","e","f"), * sep = ", ") ``` -- <table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> a </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> b </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> c </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> d </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> e </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> f </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Jaime Lannister </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Gregor Clegane </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> <tr> <td 
style="text-align:right;"> 3 </td> <td style="text-align:left;"> Jaime Lannister </td> <td style="text-align:left;"> Andros Brax </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Roose Bolton </td> <td style="text-align:left;"> Wylis Manderly </td> <td style="text-align:left;"> Medger Cerwyn </td> <td style="text-align:left;"> Harrion Karstark </td> <td style="text-align:left;"> Halys Hornwood </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> Brynden Tully </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Robb Stark </td> <td style="text-align:left;"> Tytos Blackwood </td> <td style="text-align:left;"> Brynden Tully </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Gregor Clegane </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Victarion Greyjoy </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> </tbody> </table> --- # `pivot_longer` ```r got %>% select(battle_number, attacker_commander) %>% separate(attacker_commander, c("a","b","c","d","e","f"), sep = ", ") %>% * pivot_longer(a:f, values_to = 
"commander", * names_to = NULL, values_drop_na = TRUE) ``` -- <table class="table" style="font-size: 16px; width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> battle_number </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #bb0000 !important;"> commander </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Jaime Lannister </td> </tr> <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Gregor Clegane </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Jaime Lannister </td> </tr> <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Andros Brax </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Roose Bolton </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Wylis Manderly </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Medger Cerwyn </td> </tr> <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Harrion Karstark </td> </tr> </tbody> </table> --- # 5. Standard bar chart ```r got %>% select(battle_number, attacker_commander) %>% separate(attacker_commander, c("a","b","c","d","e","f"), sep = ", ") %>% pivot_longer(a:f, values_to = "commander", names_to = NULL, values_drop_na = TRUE) %>% count(commander) %>% ggplot(aes(x = commander, y = n)) + geom_bar(stat = "identity") ``` --- # 5. Standard bar chart <img src="figs/Lec3/separate-6-1.png" width="504" style="display: block; margin: auto;" /> --- # 5. 
Standard bar chart ```r got %>% select(battle_number, attacker_commander) %>% separate(attacker_commander, c("a","b","c","d","e","f"), sep = ", ") %>% pivot_longer(a:f, values_to = "commander", names_to = NULL, values_drop_na = TRUE) %>% count(commander) %>% * filter(n > 2) %>% * ggplot(aes(x = reorder(commander, -n), y = n)) + geom_bar(stat = "identity") ``` --- # 5. Standard bar chart <img src="figs/Lec3/separate-7-1.png" width="504" style="display: block; margin: auto;" /> --- class: left # .left[Today's agenda] .pull-left[ .large[standard bar chart] <br> ordered <br> with a label <br> <br> <br> .large[grouped bar chart] <br> legend_position <br> aggregate data <br> <br> <br> .large[stacked bar chart] <br> percent stacked <br> add_count <br> <br> ] .pull-right[ .large[stacked bar chart] <br> pivot_longer <br> case_when <br> <br> <br> .large[standard bar chart] <br> separate <br> pivot_longer <br> <br> <br> .large[**grouped bar chart**] <br> year as character <br> string manipulation <br> **which kings attacked in which years?** <br> ] --- # 6. Grouped bar chart ```r got %>% filter(!is.na(attacker_king)) %>% count(attacker_king, year) %>% ggplot(aes(x = attacker_king, y = n, fill = year)) + geom_bar(stat = "identity", position = "dodge") ``` -- <img src="figs/Lec3/string-1-1.png" width="504" style="display: block; margin: auto;" /> --- # 6. Grouped bar chart ```r got %>% filter(!is.na(attacker_king)) %>% count(attacker_king, year) %>% * ggplot(aes(x = attacker_king, y = n, fill = as.character(year))) + geom_bar(stat = "identity", position = "dodge") ``` -- <img src="figs/Lec3/string-2-1.png" width="504" style="display: block; margin: auto;" /> --- # 6. Grouped bar chart ```r got %>% filter(!is.na(attacker_king)) %>% count(attacker_king, year) %>% ggplot(aes(x = attacker_king, y = n, fill = as.character(year))) + geom_bar(stat = "identity", position = "dodge") + * geom_text(aes(label = n), position = position_dodge(0.9), * vjust = -0.5) ``` --- # 6.
Grouped bar chart <br> <br> <img src="figs/Lec3/string-3-1.png" width="504" style="display: block; margin: auto;" /> --- # 6. Grouped bar chart ```r got %>% filter(!is.na(attacker_king)) %>% count(attacker_king, year) %>% ggplot(aes(x = attacker_king, y = n, fill = as.character(year))) + geom_bar(stat = "identity", position = "dodge") + geom_text(aes(label = n), position = position_dodge(0.9), vjust = -0.5) + * labs(fill = "Year", * title = "Number of battles by attacking king and year") + * theme(axis.title = element_blank(), * axis.text = element_text(size = 14), * plot.title = element_text(size = 16)) ``` --- # 6. Grouped bar chart <br> <br> <img src="figs/Lec3/string-4-1.png" width="504" style="display: block; margin: auto;" /> --- # 6. Grouped bar chart ```r got %>% filter(!is.na(attacker_king)) %>% count(attacker_king, year) %>% * mutate(attacker_king = str_replace(attacker_king, "/", "/\n"), * attacker_king = str_replace(attacker_king, " ", "\n")) %>% ggplot(aes(x = attacker_king, y = n, fill = as.character(year))) + geom_bar(stat = "identity", position = "dodge") + geom_text(aes(label = n), position = position_dodge(0.9), vjust = -0.5) + labs(fill = "Year", title = "Number of battles by attacking king and year") + theme(axis.title = element_blank(), axis.text = element_text(size = 14), plot.title = element_text(size = 16)) ``` --- # 6. Grouped bar chart <br> <br> <img src="figs/Lec3/string-5-1.png" width="504" style="display: block; margin: auto;" /> --- # Upcoming <br> .large[Lab 2 on Thursday May 27] <br> .medium[Assignments due 11:30am EDT Friday] <br> .large[Lecture 4 on Friday May 28] <br> .medium[Histograms and box plots]
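
---

# Recap sketch: `separate` + `pivot_longer`

.medium[A minimal sketch of the reshaping steps from the standard bar chart example, run on a made-up `toy` data frame so it can be tried without the `got` data. The commander names, the column names `x1` to `x3`, and the `fill = "right"` argument are assumptions for illustration, not part of the lecture code.]

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `got`: one row per battle,
# with commanders packed into a single comma-separated string
toy <- tibble(
  battle_number = 1:3,
  attacker_commander = c("A One", "A One, B Two", "C Three, A One, B Two")
)

toy_long <- toy %>%
  # split the string into one column per commander;
  # fill = "right" pads shorter rows with NA instead of warning
  separate(attacker_commander, c("x1", "x2", "x3"),
           sep = ", ", fill = "right") %>%
  # stack the columns into one row per (battle, commander), dropping the NAs
  pivot_longer(x1:x3, values_to = "commander",
               names_to = NULL, values_drop_na = TRUE)

# one row per commander with the number of battles, ready for a bar chart
toy_counts <- toy_long %>% count(commander)
```

.medium[Piping `toy_counts` into `ggplot(aes(x = commander, y = n)) + geom_bar(stat = "identity")` would then draw the same kind of bar chart as in the slides.]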