36-315: Statistical Graphics and Visualization

Lecture 1

Meghan Hall
Department of Statistics & Data Science
Carnegie Mellon University
May 21, 2021

Teaching team

Instructor: Meghan Hall


Grad TA: Galen Vincent

Undergrad TAs



Office hours TBD

Course objectives

  1. Create statistical graphics.

  2. Understand the fundamentals of data and reproducible data analysis.

  3. Write about statistical graphics.

  4. Speak about statistical graphics and data analyses.

  5. Assess and critique statistical graphics.

Course tools



R & RStudio



ggplot2 and related packages



R Markdown



Course components




[Figure: syllabus snippet]

Lectures

Labs

Homework

Code style

Code style



Code must follow the tidyverse style guide

Ignore Part II (Packages); focus on sections I.2 (Syntax), I.4 (Pipes), and I.5 (ggplot2)

This matches the style used in lecture notes, lab notes, etc.

You can use the R package styler to format your code if you want

Course components



Lectures

Labs

Homework

Code style

Graphics discussion

Midterm

Group project

Various logistics



Course website(s)

Piazza

Communication

Office hours

Extensions

Regrades

Integrity

Why do we visualize data?



x        y
55.3846  97.1795
51.5385  96.0256
46.1538  94.4872
42.8205  91.4103
40.7692  88.3333
38.7179  84.8718
35.6410  79.8718
33.0769  77.5641
28.9744  74.4872
26.1538  71.4103

Variable  Mean      SD        Min      Max
x         54.26327  16.76514  22.3077  98.2051
y         47.83225  26.9354   2.9487   99.4872
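
Summaries like the ones above describe each variable separately and say nothing about how x and y relate. A minimal Python sketch of that point follows, using hypothetical toy data rather than the dataset on the slide (the course itself uses R; Python is used here only for illustration). The names `y_line`, `y_shuffled`, and `summarize` are made up for this example.

```python
import statistics

# Hypothetical toy data (NOT the dataset shown on the slide).
x = [1, 2, 3, 4, 5, 6]
y_line = [2, 4, 6, 8, 10, 12]      # y rises steadily with x
y_shuffled = [8, 2, 12, 4, 10, 6]  # the same six values, paired with x differently


def summarize(values):
    """Return the summaries used on the slide: mean, SD, min, and max."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary_line = summarize(y_line)
summary_shuffled = summarize(y_shuffled)

# The two summaries are identical even though the (x, y) pairs differ.
print(summary_line)
print(summary_shuffled)
print("Same summaries:", summary_line == summary_shuffled)
print("Same (x, y) pairs:", list(zip(x, y_line)) == list(zip(x, y_shuffled)))
```

Plotting the two sets of pairs would show a straight line in one case and a scatter with no clear trend in the other, which is the kind of difference a summary table cannot reveal.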
Why do we visualize data?




Explore


Diagnose


Explain

By the end of the class



You can...

     Ask relevant questions of data

     Know which types of visualizations are appropriate for your data

     Know which types of visualizations are appropriate for your audience

     Create plots that are

         Effective in their properties

         Elegant and aesthetically pleasing

Some golden rules of graphs



Don’t add complexity without a good reason.

Everything (everything!) must be readable.

Don’t distort data, intentionally or not.

Be mindful of the data-to-ink ratio.

All axes, labels, etc. should have real titles, not code variable names.

Always strive for clarity.

Titles, subtitles, and captions should add information.

Upcoming


Lecture 2 on Monday, May 24
grammar of graphics and tidyverse principles


Lab 1 on Tuesday, May 25
be on the lookout for a survey about times
