class: center, middle, inverse, title-slide

# Introduction to Data Visualization for Meta-Analysis
## with tidymeta and ggplot2
### Malcolm Barrett
Install R: bit.ly/pm605_r
Handout: bit.ly/pm605_tut
### 04/23/2018
---
class: inverse-ns, center

# Data Visualization with R
## <span style = 'color:#E69F00'>ggplot2</span> and the <span style = 'color:#E69F00'>tidyverse</span> are <span style = 'color:#56B4E9'>friendly and consistent</span> tools for data analysis and visualization

---
class: inverse-ns, center

# Data Visualization with R
## <span style = 'color:#6C7B7F'>ggplot2 and the tidyverse are friendly and consistent tools for data analysis and visualization</span>
## <span style = 'color:#E69F00'>Better plots</span> are <span style = 'color:#56B4E9'>better communication</span>

---
class: inverse-ns, center

# Data Visualization with R
## <span style = 'color:#6C7B7F'>ggplot2 and the tidyverse are friendly and consistent tools for data analysis and visualization</span>
## <span style = 'color:#6C7B7F'>Better plots are better communication</span>
## <span style = 'color:#E69F00'>tidymeta</span> makes it easy to manipulate and plot meta-analysis results

---

# Introduction to the Data
## What's the impact of <span style = 'color:#E69F00'>intrauterine device (IUD)</span> use on risk of <span style = 'color:#56B4E9'>cervical cancer</span>?
---

# Introduction to the Data
## <span style = 'color:#E5E5E5'>What's the impact of intrauterine device (IUD) use on risk of cervical cancer?</span>
## 16 studies: 4,945 cases and 7,537 controls

---

# Introduction to the Data
## <span style = 'color:#E5E5E5'>What's the impact of intrauterine device (IUD) use on risk of cervical cancer?</span>
## <span style = 'color:#E5E5E5'>16 studies: 4,945 cases and 7,537 controls</span>
## Women who used IUDs had about a <span style = 'color:#E69F00'>third lower risk</span> than those who didn't (OR 0.64)

---

```r
library(tidymeta)
*iud_cxca
```

```
## # A tibble: 16 x 26
##    study_id study_name    author    es    l95   u95    lnes  lnl95   lnu95
##       <int> <chr>         <chr>  <dbl>  <dbl> <dbl>   <dbl>  <dbl>   <dbl>
##  1        1 Roura, 2016   Roura  0.600  0.300  1.20  -0.511  -1.20   0.182
##  2        2 Lassise, 1991 Lassi… 0.800  0.500  1.20  -0.223 -0.693   0.182
##  3        3 Li, 2000      Li     0.890  0.730  1.08  -0.117 -0.315  0.0770
##  4        4 Shields, 2004 Shiel… 0.500  0.300 0.820  -0.693  -1.20  -0.198
##  5        5 Castellsague… Caste… 0.630  0.380  1.06  -0.462 -0.968  0.0583
##  6        6 Castellsague… Caste… 0.450  0.300 0.670  -0.799  -1.20  -0.400
##  7        7 Brinton, 1990 Brint… 0.690  0.500 0.900  -0.371 -0.693  -0.105
##  8        8 Parazzini, 1… Paraz… 0.600  0.300  1.10  -0.511  -1.20  0.0953
##  9        9 Williams, 19… Willi…  1.00  0.600  1.60      0. -0.511   0.470
## 10       10 Hammouda, 20… Hammo… 0.300  0.100 0.500   -1.20  -2.30  -0.693
## 11       11 Castellsague… Caste…  1.08  0.370  3.20  0.0770 -0.994    1.16
## 12       12 Castellsague… Caste… 0.340 0.0500  2.56   -1.08  -3.00   0.940
## 13       13 Castellsague… Caste… 0.870  0.340  2.23  -0.139  -1.08   0.802
## 14       14 Castellsague… Caste… 0.490  0.190  1.23  -0.713  -1.66   0.207
## 15       15 Castellsague… Caste… 0.240 0.0900 0.660   -1.43  -2.41  -0.416
## 16       16 Celentano, 1… Celen… 0.500  0.170  1.47  -0.693  -1.77   0.385
## # ...
with 17 more variables: selnes <dbl>, group <fct>, case_num <dbl>,
## #   control_num <dbl>, start_recruit <dbl>, stop_recruit <dbl>,
## #   pub_year <dbl>, numpap <dbl>, ses <dbl>, gravidity <dbl>,
## #   lifetimepart <dbl>, coitarche <dbl>, hpvstatus <dbl>, smoking <dbl>,
## #   location <chr>, aair <dbl>, hpvrate <dbl>
```

---

# Five variables from `iud_cxca` we'll use
## `study_name`
## `lnes`
## `selnes`
## `group`
## `pub_year`

---

# Five variables from `iud_cxca` we'll use
## <span style = 'color:#E69F00'><code>study_name</code></span> = <span style = 'color:#56B4E9'>Author + study year</span>
## `lnes`
## `selnes`
## `group`
## `pub_year`

---

# Five variables from `iud_cxca` we'll use
## `study_name`
## <span style = 'color:#E69F00'><code>lnes</code></span> = <span style = 'color:#56B4E9'>ln(Odds Ratio)</span>
## `selnes`
## `group`
## `pub_year`

---

# Five variables from `iud_cxca` we'll use
## `study_name`
## `lnes`
## <span style = 'color:#E69F00'><code>selnes</code></span> = <span style = 'color:#56B4E9'>SE of ln(OR)</span>
## `group`
## `pub_year`

---

# Five variables from `iud_cxca` we'll use
## `study_name`
## `lnes`
## `selnes`
## <span style = 'color:#E69F00'><code>group</code></span> = <span style = 'color:#56B4E9'>Study design</span>
## `pub_year`

---

# Five variables from `iud_cxca` we'll use
## `study_name`
## `lnes`
## `selnes`
## `group`
## <span style = 'color:#E69F00'><code>pub_year</code></span> = <span style = 'color:#56B4E9'>Publication year</span>

---
class: inverse-ns, center, middle

# Meta-Analysis Plot Types

---
class: inverse-ns, center, middle

# Meta-Analysis Plot Types
## <span style = 'color:#E69F00'>Forest Plot</span>

---

## `forest_plot()`

<img src="ma_workshop_files/figure-html/unnamed-chunk-2-1.png" width="60%" style="display: block; margin: auto;" />

---
class: inverse-ns, center, middle

# Meta-Analysis Plot Types
## <span style = 'color:#6C7B7F'>Forest Plot</span>
## <span style = 'color:#E69F00'>Funnel Plot</span>

---

##
`funnel_plot()`

<img src="ma_workshop_files/figure-html/unnamed-chunk-3-1.png" width="75%" style="display: block; margin: auto;" />

---
class: inverse-ns, center, middle

# Meta-Analysis Plot Types
## <span style = 'color:#6C7B7F'>Forest Plot</span>
## <span style = 'color:#6C7B7F'>Funnel Plot</span>
## <span style = 'color:#E69F00'>Influence/Sensitivity Plot</span>

---

## `influence_plot()`

<img src="ma_workshop_files/figure-html/unnamed-chunk-4-1.png" width="60%" style="display: block; margin: auto;" />

---
class: inverse-ns, center, middle

# Meta-Analysis Plot Types
## <span style = 'color:#6C7B7F'>Forest Plot</span>
## <span style = 'color:#6C7B7F'>Funnel Plot</span>
## <span style = 'color:#6C7B7F'>Influence/Sensitivity Plot</span>
## <span style = 'color:#E69F00'>Cumulative Plot</span>

---

## `cumulative_plot()`

<img src="ma_workshop_files/figure-html/unnamed-chunk-5-1.png" width="60%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# A Crash Course in the Tidyverse

---
background-image: url(http://hexb.in/hexagons/ggplot2.png)
background-position: 90% 10%

# ggplot2: Elegant Data Visualizations in R

---
background-image: url(http://hexb.in/hexagons/ggplot2.png)
background-position: 90% 10%

# ggplot2: Elegant Data Visualizations in R
## Based on a Grammar of Graphics

---
background-image: url(http://hexb.in/hexagons/ggplot2.png)
background-position: 90% 10%

# ggplot2: Elegant Data Visualizations in R
## <span style = 'color:#E5E5E5'>Based on a Grammar of Graphics</span>
## Data is mapped to aesthetics; Statistics and plot are linked

---
background-image: url(http://hexb.in/hexagons/ggplot2.png)
background-position: 90% 10%

# ggplot2: Elegant Data Visualizations in R
## <span style = 'color:#E5E5E5'>Based on a Grammar of Graphics</span>
## <span style = 'color:#E5E5E5'>Data is mapped to aesthetics; Statistics and plot are linked</span>
## Sensible defaults; Infinitely extensible

---

```r
library(ggplot2)
p <- ggplot(iud_cxca,
            aes(case_num + control_num, lnes, color = group))
p
```

<img src="ma_workshop_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---

```r
library(ggplot2)
p <- p + geom_point()
p
```

<img src="ma_workshop_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

```r
p <- p + geom_smooth(method = "lm", se = FALSE)
p
```

<img src="ma_workshop_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---

```r
p +
  labs(title = "The Effect of Sample Size on Estimate",
       x = "Sample Size",
       y = "ln(Odds Ratio)") +
  scale_color_discrete(name = "Study Design") +
  theme_minimal() +
  theme(text = element_text(size = 16))
```

<img src="ma_workshop_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---

# Tidy Data is Easier to Plot

---

# Tidy Data is Easier to Plot
### .medium[Each <span style = 'color:#E69F00'>column</span> is a single <span style = 'color:#56B4E9'>variable</span>]

---

# Tidy Data is Easier to Plot
### <span style = 'color:#E5E5E5'>.medium[Each column is a single variable]</span>
### .medium[Each <span style = 'color:#E69F00'>row</span> is a single <span style = 'color:#56B4E9'>observation</span>]

---

# Tidy Data is Easier to Plot
### <span style = 'color:#E5E5E5'>.medium[Each column is a single variable]</span>
### <span style = 'color:#E5E5E5'>.medium[Each row is a single observation]</span>
### .medium[Each <span style = 'color:#E69F00'>cell</span> is a <span style = 'color:#56B4E9'>value</span>]

---

# Our Tidy Tools

.pull-left[
### `%>%`
### `mutate()`
### `arrange()`
### `group_by()`
### `tidy()`
]

.pull-right[
]

---

# Our Tidy Tools

.pull-left[
### <span style = 'color:#E69F00'><code>%>%</code></span>: <span style = 'color:#56B4E9'>passes</span> the results of one function to the next
### `mutate()`
### `arrange()`
### `group_by()`
### `tidy()`
]

.pull-right[
]

---

# Our Tidy Tools

.pull-left[
### `%>%`
### <span
style = 'color:#E69F00'><code>mutate()</code></span>: <span style = 'color:#56B4E9'>changes</span> or creates a new variable
### `arrange()`
### `group_by()`
### `tidy()`
]

.pull-right[
]

---

# Our Tidy Tools

.pull-left[
### `%>%`
### `mutate()`
### <span style = 'color:#E69F00'><code>arrange()</code></span>: <span style = 'color:#56B4E9'>sorts</span> a data set by a variable
### `group_by()`
### `tidy()`
]

.pull-right[
]

---

# Our Tidy Tools

.pull-left[
### `%>%`
### `mutate()`
### `arrange()`
### <span style = 'color:#E69F00'><code>group_by()</code></span>: <span style = 'color:#56B4E9'>groups</span> a data set by a variable
### `tidy()`
]

.pull-right[
]

---

# Our Tidy Tools

.pull-left[
### `%>%`
### `mutate()`
### `arrange()`
### `group_by()`
### <span style = 'color:#E69F00'><code>tidy()</code></span>: <span style = 'color:#56B4E9'>tidies</span> statistical results
]

.pull-right[
]

---

# Tidy Meta-Analysis
## <span style = 'color:#E69F00'><code>meta_analysis()</code></span>

---

# Tidy Meta-Analysis
## <span style = 'color:#E69F00'><code>meta_analysis()</code></span>

```r
ma <- iud_cxca %>%
  group_by(group) %>%
  meta_analysis(yi = lnes, sei = selnes, slab = study_name, exponentiate = TRUE)

ma
```

```
## # A tibble: 21 x 11
##    group    study     type  estimate std.error statistic  p.value conf.low
##    <fct>    <chr>     <chr>    <dbl>     <dbl>     <dbl>    <dbl>    <dbl>
##  1 Nested … Roura, 2… study    0.600     0.354     -1.44       NA    0.300
##  2 Nested … Subgroup… summ…    0.600     0.354     -1.44    0.149    0.300
##  3 Populat… Lassise,… study    0.800     0.223    -0.999       NA    0.516
##  4 Populat… Li, 2000  study    0.890    0.0999     -1.17       NA    0.732
##  5 Populat… Shields,… study    0.500     0.257     -2.70       NA    0.302
##  6 Populat… Castells… study    0.630     0.262     -1.77       NA    0.377
##  7 Populat… Castells… study    0.450     0.205     -3.90       NA    0.301
##  8 Populat… Subgroup… summ…    0.655     0.146     -2.90  0.00374    0.492
##  9 Clinic-… Brinton,… study    0.690     0.150     -2.47       NA    0.514
## 10 Clinic-… Parazzin… study    0.600     0.331     -1.54       NA    0.313
## # ...
with 11 more rows, and 3 more variables: conf.high <dbl>,
## #   meta <list>, weight <dbl>
```

---

# Forest Plot
## <span style = 'color:#E69F00'><code>forest_plot()</code></span>

---

# Forest Plot
## <span style = 'color:#E69F00'><code>forest_plot()</code></span>

```r
ma %>%
  forest_plot(group = group)
```

---

# Forest Plot
## `forest_plot()`

```r
ma %>%
  forest_plot(group = group)
```

## <span style = 'color:#E69F00'><code>text_table()</code></span>

---

# Forest Plot
## `forest_plot()`

```r
ma %>%
  forest_plot(group = group)
```

## <span style = 'color:#E69F00'><code>text_table()</code></span>

```r
ma %>%
  text_table(group = group, "Weights" = weight)
```

---

# patchwork: Compose ggplots

.pull-right[
]

---

# patchwork: Compose ggplots

.pull-left[
## <span style = 'color:#E69F00'>Join</span> ggplots quickly and accurately
]

.pull-right[
]

---

# patchwork: Compose ggplots

.pull-left[
## <span style = 'color:#E69F00'>Join</span> ggplots quickly and accurately

```r
library(patchwork)
forest_plot() + text_table()
```
]

.pull-right[
]

---

# Funnel Plot
## <span style = 'color:#E69F00'><code>funnel_plot()</code></span>

---

# Funnel Plot
## <span style = 'color:#E69F00'><code>funnel_plot()</code></span>

```r
ma %>%
  funnel_plot(log_summary = TRUE)
```

---

# Influence Plot
## <span style = 'color:#E69F00'><code>sensitivity()</code></span>

---

# Influence Plot
## <span style = 'color:#E69F00'><code>sensitivity()</code></span>

```r
ma %>%
  sensitivity(exponentiate = TRUE)
```

---

# Influence Plot
## `sensitivity()`

```r
ma %>%
  sensitivity(exponentiate = TRUE)
```

## <span style = 'color:#E69F00'><code>influence_plot()</code></span>

---

# Influence Plot
## `sensitivity()`

```r
ma %>%
  sensitivity(exponentiate = TRUE)
```

## <span style = 'color:#E69F00'><code>influence_plot()</code></span>

```r
ma %>%
  sensitivity(exponentiate = TRUE) %>%
  influence_plot()
```

---

# Cumulative Plot
## <span style = 'color:#E69F00'><code>cumulative()</code></span>

---

# Cumulative Plot
## <span style
= 'color:#E69F00'><code>cumulative()</code></span>

```r
ma %>%
  arrange(desc(weight)) %>%
  cumulative(exponentiate = TRUE)
```

---

# Cumulative Plot
## `cumulative()`

```r
ma %>%
  arrange(desc(weight)) %>%
  cumulative(exponentiate = TRUE)
```

## <span style = 'color:#E69F00'><code>cumulative_plot()</code></span>

---

# Cumulative Plot
## `cumulative()`

```r
ma %>%
  arrange(desc(weight)) %>%
  cumulative(exponentiate = TRUE)
```

## <span style = 'color:#E69F00'><code>cumulative_plot()</code></span>

```r
ma %>%
  arrange(desc(weight)) %>%
  cumulative(exponentiate = TRUE) %>%
  cumulative_plot(sum_lines = FALSE)
```

---

# Importing Stata data, saving ggplots

---

# Importing Stata data, saving ggplots
## <span style = 'color:#E69F00'>haven</span>: <span style = 'color:#56B4E9'><code>read_dta()</code></span>

---

# Importing Stata data, saving ggplots
## <span style = 'color:#E69F00'>haven</span>: <span style = 'color:#56B4E9'><code>read_dta()</code></span>

```r
library(haven)
data <- read_dta("stata_data.dta")
```

---

# Importing Stata data, saving ggplots
## haven: `read_dta()`

```r
library(haven)
data <- read_dta("stata_data.dta")
```

## <span style = 'color:#E69F00'>ggplot2</span>: <span style = 'color:#56B4E9'><code>ggsave()</code></span>

---

# Importing Stata data, saving ggplots
## haven: `read_dta()`

```r
library(haven)
data <- read_dta("stata_data.dta")
```

## <span style = 'color:#E69F00'>ggplot2</span>: <span style = 'color:#56B4E9'><code>ggsave()</code></span>

```r
library(ggplot2)
p <- forest_plot(ma, group = group)
ggsave("forest_plot.png", plot = p, dpi = 320, height = 8)
```

---
class: inverse, center

# tidymeta

---
class: inverse, center

# tidymeta
## <span style = 'color:#E69F00'><code>meta_analysis()/your_favorite_function() + tidy()</code></span>

---
class: inverse, center

# tidymeta
## `meta_analysis()/your_favorite_function() + tidy()`
## <span style = 'color:#E69F00'><code>forest_plot()/text_table()</code></span>

---
class: inverse, center

# tidymeta
##
`meta_analysis()/your_favorite_function() + tidy()`
## `forest_plot()/text_table()`
## <span style = 'color:#E69F00'><code>sensitivity()/influence_plot()</code></span>

---
class: inverse, center

# tidymeta
## `meta_analysis()/your_favorite_function() + tidy()`
## `forest_plot()/text_table()`
## `sensitivity()/influence_plot()`
## <span style = 'color:#E69F00'><code>cumulative()/cumulative_plot()</code></span>

---
class: inverse, center

# Resources
## [R for Data Science](http://r4ds.had.co.nz/): A comprehensive but friendly introduction to the tidyverse. Free online.
## [DataCamp](https://www.datacamp.com/): ggplot2 courses and tidyverse courses
## [ggplot2: Elegant Graphics for Data Analysis](https://smile.amazon.com/ggplot2-Elegant-Graphics-Data-Analysis/dp/331924275X/ref=sr_1_2?ie=UTF8&qid=1524362742&sr=8-2&keywords=ggplot2): The official ggplot2 book

---
class: inverse, center, middle

### [github.com/malcolmbarrett/tidymeta](https://github.com/malcolmbarrett/tidymeta)
### [github.com/malcolmbarrett/ma_viz_workshop](https://github.com/malcolmbarrett/ma_viz_workshop)

Slides created via the R package [**xaringan**](https://github.com/yihui/xaringan).
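
---

# Appendix: from `lnes` and `selnes` to an odds ratio

The deck works with `lnes` (ln(OR)) and `selnes` (SE of ln(OR)), while `es`, `l95`, and `u95` are on the odds ratio scale. A minimal sketch of how the two might relate, assuming a Wald-type 95% interval: `or_ci()` is a hypothetical helper written for this slide, not a tidymeta function, and the SE used below is assumed to be the `std.error` that `meta_analysis()` printed for Roura, 2016.

```r
# Hypothetical helper (not part of tidymeta): back-transform a log odds ratio
# and its standard error to an odds ratio with a Wald confidence interval.
or_ci <- function(lnes, selnes, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # normal quantile, about 1.96 for 95%
  data.frame(
    es  = exp(lnes),               # odds ratio
    l95 = exp(lnes - z * selnes),  # lower confidence limit
    u95 = exp(lnes + z * selnes)   # upper confidence limit
  )
}

# Roura, 2016: lnes from the iud_cxca listing; the SE is assumed to be the
# std.error that meta_analysis() reported for that study.
or_ci(lnes = -0.511, selnes = 0.354)
```

The result should land close to the `es`, `l95`, and `u95` values listed for that study in `iud_cxca`.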