To get ready for class:
1. Get births ready as usual
2. Install package dagitty

[Image: Brad Pitt in the movie Snatch]

DAGs (& the class project)

EPID 799C, Fall 2017

Overview
Class project chat
Overview / Review of DAGs
DAGs in R

Class Project
Let's review last year's projects: http://learnr.web.unc.edu/files/2018/10/EPID-799C-Projects-2017.zip
(Find it at the bottom of the schedule, by the due date.)

Project Themes
Use your own data / existing projects
Play with new visuals (extensions to ggplot? Maps?)
Replicate a previous analysis (718?)
Explore a new technique (multi-level)
Use new R tools (purrr, advanced dplyr)
It's for you! Invest in yourself! We're just making you do it.

Back to DAGs, Confounding & EMM
From last class (and upcoming HW4): these were all CRUDE effects. Still, they're different! DAGs inform our control set.

What is a DAG?
DAGs (Directed Acyclic Graphs) document causal assumptions / knowledge from our heads or the literature. They can be used to guide us to better answer questions like: Does a change in A prompt a change in B? Or the other way around? Or not in either direction, but their association is caused by some third thing?

DAG Requirements
Directed: →, not -
Acyclic: A→B, not A→B and B→A
Graph: Connected, not dangling*.
* Other disciplines, from engineering to other forms of statistical modeling, relax these requirements.
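The acyclic requirement can be checked in software rather than by eye. This is a minimal sketch, assuming the dagitty package is installed. The node names A, B, and C are placeholders for illustration, and a three-node loop stands in for the two-node loop described above.

```r
library(dagitty)

# A valid DAG: arrows only flow forward from A to C
ok_graph <- dagitty("dag { A -> B -> C }")

# Violates the acyclic requirement: following the arrows from A leads back to A
cyclic_graph <- dagitty("dag { A -> B -> C -> A }")

# isAcyclic() returns TRUE only when the graph has no directed cycle
isAcyclic(ok_graph)
isAcyclic(cyclic_graph)
```

If a graph fails this check, dagitty's adjustment-set tools are not appropriate for it as drawn.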

Overview
Mediation
Confounders
Colliders: inducing a biased causal association by controlling for a collider is collider stratification bias.
Effect Measure Modifiers
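Of the structures listed above, the collider is the one that controlling makes worse, so a small simulation may help. This is only a sketch with made-up data in base R: the variables a, b, and k, the sample size, and the seed are all assumptions for illustration, not class data. Here a has no effect on b at all, and k is caused by both.

```r
set.seed(799)            # arbitrary seed so the sketch is reproducible
n <- 10000

a <- rnorm(n)            # "exposure", generated independently of b
b <- rnorm(n)            # "outcome", with no causal effect of a on b
k <- a + b + rnorm(n)    # collider: caused by both a and b

# Crude model: the collider is left alone, so the a-k-b path stays blocked
crude <- unname(coef(lm(b ~ a))["a"])

# Adjusted model: controlling for the collider opens the a-k-b path
adjusted <- unname(coef(lm(b ~ a + k))["a"])

round(c(crude = crude, adjusted = adjusted), 2)
```

The crude estimate should sit near zero (the truth), while the adjusted estimate is pulled clearly negative. That induced association is collider stratification bias.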

EMM: Two notes
Note that an EMM may or may not be a confounder (influencing the values of A and B directly, vs. modifying the effect of A on B), so it may not be in the DAG node network. EMMs are good for DAG notes, though!
Also note that it may be rare that an exposure / intervention switches direction entirely. Effect measure modification worth acknowledging may be a matter of degree, or important to report because of context, regardless of statistical interaction, p = whatever.

In Sum:
Create a model, throw things in, reduce by p-value / backwards selection, or any of a number of techniques. May be good at predicting outcome from exposure and other variables.
There is nuance here, but...

In Sum:
If we want the causal effect, we have to be intentional about what parts of this flow we block. We leave direct and indirect causal paths open, and leave paths already blocked by colliders alone (do not control!).

How do we do this?
Encode the nodes and directed edges of the DAG from the literature / content knowledge.
By eye, hand, or software:
1. Document all paths between Exposure and Outcome.
2. Identify whether they are already blocked (collider), backdoor (confounded), or causal (direct or indirect through a mediator).
3. Select nodes to statistically control, often ideally as few as possible, to block the open backdoor paths without blocking causal paths.

A note on functional form!
Remember maternal age? In order to improve the precision of estimates (and acknowledge the linearity assumptions of GLMs), it behooves us to model covariates as well as possible, balancing parsimony and interpretation. Hence a squared term (mage2) or splines of some kind. This does not apply as directly to our exposures, which we want to interpret in actionable, communicable ways!

...and a note on interpretation!
Relatedly, a reminder: the Table 2 fallacy! (Westreich & Greenland 2013). If those covariate estimates are largely uninterpretable in a causal inference context, we might as well let them go and model them more precisely (albeit obfuscated).
Westreich, D., and S. Greenland. "The Table 2 Fallacy: Presenting and Interpreting Confounder and Modifier Coefficients." American Journal of Epidemiology 177, no. 4 (February 15, 2013): 292-98. https://doi.org/10.1093/aje/kws412.

DAG critiques
Reality isn't DAGGY
DAGs (even if large) are a model, and so a simplified version of reality, in this particular case requiring unidirectionality and acyclicity assumptions. Reality is often a system with feedback loops and inter-relationships that may not be modeled well with unidirectional models, perhaps especially with social / network processes. There are other methods!
Not all nodes / edges are alike
Race-ethnicity in particular is a heavily overloaded construct for causal inference (VanderWeele & Robinson 2014), reaching back to represent historical and current systemic oppression and racism, physical phenotype, and experiences of cultural and ascribed identity. Parts of this construct may have different causal relationships. Break it apart if you can, and regardless, be mindful / nuanced in your interpretation.
Not always interested in causal effects
Prediction, association, and other techniques have a place in a public health toolbox.
VanderWeele, Tyler J., and Whitney R. Robinson. "On the Causal Interpretation of Race in Regressions Adjusting for Confounding and Mediating Variables." Epidemiology 25, no. 4 (July 2014): 473-84. doi:10.1097/EDE.0000000000000105.

Our (toy) DAG
Directed Acyclic Graphs (DAGs) inform our variable selection and treatment in models (based on their status as mediators, confounders, effect measure modifiers, etc.). We will not elaborate in this class! Take the Epi sequence for more.
DAG from EPID 716 / Christy Avery

Let's Try: Dagitty
Check it out here: www.dagitty.net
Premade DAG for you: dagitty.net/moAh6a6

DAGs in R
The online dagitty tool (http://www.dagitty.net/dags.html#) exports R code for the dagitty package. It can be directly downloaded into R! The new ggdag package is a beefed-up version using tidy data structures we can recognize.
DL the R script from the website.

Let's Try: DAGs in R with dagitty
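A minimal sketch of a dagitty session, following the "document paths, classify them, pick a control set" routine. It assumes the dagitty package is installed. The graph is a hypothetical toy DAG written for this example, not the class DAG: A is the exposure, B the outcome, C a confounder, M a mediator, and K a collider.

```r
library(dagitty)

# Hypothetical toy DAG (node names are placeholders, not class variables)
g <- dagitty("dag {
  A [exposure]
  B [outcome]
  C -> A
  C -> B
  A -> B
  A -> M
  M -> B
  A -> K
  B -> K
}")

# 1. Document all paths between exposure and outcome,
#    along with whether each is open before any adjustment
p <- paths(g, from = "A", to = "B")
data.frame(path = p$paths, open = p$open)

# 2-3. Ask for the minimal set(s) of nodes that block the open backdoor paths
sets <- adjustmentSets(g, type = "minimal")
sets

# Graph-navigation helpers
children(g, "A")
parents(g, "B")
```

In this toy graph the path through K is already closed, because K is a collider. The only backdoor path runs through C, so C alone is the minimal adjustment set. The mediator M is left uncontrolled.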

dagitty(): makes DAGs
adjustmentSets(): minimal / all adjustment sets
paths(): paths!
children(), etc.
downloadGraph(): design in dagitty, pull down in R
instrumentalVariables(), SEM stuff, testing your data against a DAG, etc. Currently beyond me!

Let's Try: DAGs in R with ggdag
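A minimal sketch of the ggdag workflow, assuming the dagitty, ggdag, and ggplot2 packages are installed. The three-node graph and the plot title are hypothetical, chosen only to keep the example short.

```r
library(dagitty)
library(ggdag)
library(ggplot2)

# Hypothetical toy DAG: C confounds the effect of A on B
g <- dagitty("dag {
  A [exposure]
  B [outcome]
  C -> A
  C -> B
  A -> B
}")

# Convert to a tidy structure (one row per node/edge, with layout coordinates)
dag_df <- tidy_dagitty(g)

# Draw it; the result is a ggplot object, so the usual ggplot2 additions apply
dag_plot <- ggdag(dag_df) + theme_dag() + ggtitle("Toy DAG")
dag_plot
```

Node positions are chosen by a layout algorithm, so the picture can differ from run to run unless a layout or seed is fixed.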

tidy_dagitty()
ggdag(dag_df) + theme_dag() gets you nice ggplot2 geometries; edit the plot as usual.
See: https://cran.r-project.org/web/packages/ggdag/vignettes/intro-to-ggdag.html
And: https://github.com/malcolmbarrett/ggdag

Practical Uses
R may be helpful for quickly changing a few different DAG scenarios and rerunning them for minimal adjustment sets. Probably better than by hand. Pairs nicely with the dagitty website. Maybe useful for SEM? I dunno. But you can stick your DAGs in papers / R Markdown, make ggplots of them, etc.

Other Packages
If you do this a lot, find your favorite! R skills let you translate data structures across packages as you need to.
dagitty, ggdag: Today!
DiagrammeR: Prettier, but no adjustment sets? https://donlelek.github.io/2015-03-31-dags-with-r/
dagR: Prettier and adjustment sets, but funny syntax? http://rstudio-pubs-static.s3.amazonaws.com/2609_e3d86d0748c04eb18d5f56d6a99feb3f.html
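Returning to the practical use mentioned above (rerunning minimal adjustment sets across a few DAG scenarios), one way this might look is sketched here. It assumes the dagitty package is installed. Both scenarios and their node names (A, B, C, U) are hypothetical.

```r
library(dagitty)

# Two competing (hypothetical) sets of causal assumptions for the same question
scenarios <- list(
  one_confounder = "dag {
    A [exposure]
    B [outcome]
    C -> A
    C -> B
    A -> B
  }",
  two_confounders = "dag {
    A [exposure]
    B [outcome]
    C -> A
    C -> B
    U -> A
    U -> B
    A -> B
  }"
)

# Rerun the minimal adjustment set calculation for every scenario
minimal_sets <- lapply(scenarios, function(dag_text) {
  adjustmentSets(dagitty(dag_text), type = "minimal")
})

minimal_sets
```

Changing an assumption then means editing one string and rerunning, rather than retracing paths by hand.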