Introduction
Causal inference often requires investigating how an effect occurs. Does X affect Y directly, or does it work through a mediator M?
The because package provides a fully automated Bayesian mediation analysis tool, because_mediation(), which decomposes the Total Effect of an exposure on an outcome into:
1. Direct Effect: The effect of the exposure (X) on the outcome (Y) that is not mediated by other variables in the graph.
2. Indirect Effect(s): The effect propagated through intermediate variables (X -> M -> Y).
Each indirect effect is calculated by multiplying the posterior distributions of the coefficients along its path, which preserves full uncertainty quantification.
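One common way to carry out this multiplication is draw by draw, so that each posterior draw yields one value of the indirect effect. The sketch below illustrates the idea with simulated stand-in draws. The coefficient values and the names a, b, and c_direct are hypothetical, and it is an assumption, not a confirmed detail, that because_mediation() works exactly this way internally.

```r
set.seed(1)
n_draws <- 4000

# Hypothetical posterior draws for a single path X -> M -> Y.
# These are simulated stand-ins, not output of because().
a        <- rnorm(n_draws, mean = -0.6, sd = 0.05)  # X -> M
b        <- rnorm(n_draws, mean =  0.4, sd = 0.08)  # M -> Y
c_direct <- rnorm(n_draws, mean = -0.1, sd = 0.07)  # X -> Y (direct)

# Multiply draw by draw: every draw gives one indirect-effect value
indirect <- a * b
total    <- c_direct + indirect

# Posterior mean and 95% interval for each effect
summarise_draws <- function(x) {
  c(mean = mean(x), quantile(x, probs = c(0.025, 0.975)))
}

rbind(
  direct   = summarise_draws(c_direct),
  indirect = summarise_draws(indirect),
  total    = summarise_draws(total)
)
```

In a real fit the draws for all coefficients come from the same MCMC iterations, so multiplying them row by row also carries over any posterior correlation between coefficients.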
Example: Ecological Mediation (Elevation Gradient)
In this example, we investigate how Elevation affects Plant Abundance. We hypothesize that Elevation acts through a causal chain involving Temperature and Soil Moisture:
- Elevation determines Temperature (higher elevation -> lower temperature).
- Temperature influences Soil Moisture (lower temperature -> lower evaporation -> higher moisture).
- Abundance is driven by Moisture, Temperature, and potentially a direct effect of Elevation (e.g., due to UV radiation or partial pressure of gases).
1. Simulate Data
We simulate plots along an elevation gradient.
library(because)
set.seed(42)
N <- 200
# 1. Elevation (Exogenous variable)
# Ranges roughly from 500m to 1500m
Elevation <- rnorm(N, mean = 1000, sd = 200)
# 2. Temperature (Mediator 1)
# Decreases with Elevation (approximate lapse-rate effect)
# Coef: -0.01 implies a 100 m climb -> -1 degree C
Temp <- 25 - 0.01 * Elevation + rnorm(N, sd = 2)
# 3. Moisture (Mediator 2)
# Driven by Temperature. Cooler -> Moister.
# We model a negative relationship with Temp.
Moisture <- 20 - 2 * Temp + rnorm(N, sd = 5)
# 4. Plant Abundance (Outcome)
# - Positive effect of Moisture (+0.5)
# - Positive effect of Temperature (+1.5)
# - Direct negative effect of Elevation (-0.005) due to harsh conditions
Abundance <- 10 + 0.5 * Moisture + 1.5 * Temp - 0.005 * Elevation + rnorm(N, sd = 10)
eco_data <- data.frame(Elevation, Temp, Moisture, Abundance)
head(eco_data)
2. Standardize Data
Important: For mediation analysis, it is highly recommended to standardize your continuous variables (mean = 0, sd = 1) before fitting.
Standardization ensures that:
1. Coefficients are comparable: All effects are expressed in standard deviation units (standardized effects).
2. Scale Invariance: The calculation of Indirect Effects (product of coefficients) is more interpretable relative to the Total Effect.
3. Convergence: MCMC sampling often behaves better with standardized scales.
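A minimal sketch of this step using base R's scale(), which centers each column to mean 0 and rescales it to sd 1. The helper name standardize_df and the small toy data frame are hypothetical and are used only so the example runs on its own.

```r
# Hypothetical helper: z-score every column of a numeric data frame
standardize_df <- function(df) {
  as.data.frame(scale(df))  # scale() returns a matrix, so convert back
}

# Small stand-in data so this sketch is self-contained
toy <- data.frame(
  Elevation = c(800, 1000, 1200, 1400),
  Temp      = c(17, 15, 13, 10)
)

toy_std <- standardize_df(toy)

round(colMeans(toy_std), 10)  # column means after standardization
sapply(toy_std, sd)           # column standard deviations after standardization
```

Applied to the simulated data from step 1, the same call would be eco_data_std <- standardize_df(eco_data), which produces the eco_data_std object passed to because() in the next step.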
3. Fit the Structural Equation Model
We define the structural equations reflecting our causal DAG. Notice
the chain:
Elevation -> Temp -> Moisture -> Abundance.
# Define the structural equations
eco_eqs <- list(
Temp ~ Elevation,
Moisture ~ Temp,
Abundance ~ Moisture + Temp + Elevation
)
# Fit the model
# We use a short chain for demonstration purposes. Use more iterations for real analysis.
fit <- because(
equations = eco_eqs,
data = eco_data_std,
n.iter = 2000
)
summary(fit)
We can also plot the fitted causal model with its standardized paths:
plot_dag(fit)
4. Perform Mediation Analysis
We want to understand the Total Effect of Elevation on Abundance, and decompose it into its direct and indirect components.
# Run Mediation Analysis for Elevation -> Abundance
med_results <- because_mediation(fit, exposure = "Elevation", outcome = "Abundance")
Inspect the Summary
med_results$summary
Interpretation (Standardized Units):
- Total Effect: The net impact of Elevation on Abundance in SD units.
- Direct Effect: The path Elevation -> Abundance (independent of mediators).
- Total Indirect Effect: The sum of all mediated paths.
In our simulation, Elevation lowers Temp, which raises Moisture, which
increases Abundance.
Inspect Individual Paths
The because_mediation function automatically traces all
valid paths from exposure to outcome in the DAG.
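To illustrate what path tracing means for this DAG, the sketch below enumerates every directed path from exposure to outcome using base R only. It is an illustration of the idea, not the package's implementation. The helper find_paths is hypothetical, and the recursion assumes the graph is acyclic.

```r
# Structural equations from step 3
eqs <- list(
  Temp ~ Elevation,
  Moisture ~ Temp,
  Abundance ~ Moisture + Temp + Elevation
)

# Turn each equation into directed edges: predictor -> response
edges <- do.call(rbind, lapply(eqs, function(f) {
  vars <- all.vars(f)  # first entry is the response, the rest are predictors
  data.frame(from = vars[-1], to = vars[1], stringsAsFactors = FALSE)
}))

# Hypothetical helper: depth-first search over the edge list (assumes a DAG)
find_paths <- function(from, to, edges, visited = character()) {
  if (from == to) {
    return(list(c(visited, to)))
  }
  paths <- list()
  for (child in edges$to[edges$from == from]) {
    paths <- c(paths, find_paths(child, to, edges, c(visited, from)))
  }
  paths
}

paths <- find_paths("Elevation", "Abundance", edges)
path_labels <- vapply(paths, paste, character(1), collapse = " -> ")
path_labels
```

For the equations above this yields three paths: the direct path, the short path through Temp, and the long chain through Temp and Moisture.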
med_results$paths
We expect to see three distinct paths:
- Direct: Elevation -> Abundance
- Short Indirect: Elevation -> Temp -> Abundance
  - Elevation lowers Temp (negative effect); Temp increases Abundance (positive effect).
  - The product of these effects is negative.
- Long Indirect (Chain): Elevation -> Temp -> Moisture -> Abundance
  - Elevation lowers Temp (negative).
  - Lower Temp raises Moisture (negative relationship -> positive change in moisture).
  - Higher Moisture raises Abundance (positive).
  - The chain involves two negative links and one positive link, resulting in a positive indirect effect.
The function handles this decomposition automatically, providing credible intervals for each specific mechanism.
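The expected signs can be checked by hand against the coefficients used in the simulation. The sketch below works in raw (unstandardized) units, so the magnitudes differ from the standardized estimates. Standardizing rescales each coefficient by a positive factor, so the signs are unchanged. The variable names are chosen here for illustration only.

```r
# True coefficients from the data-generating code in step 1 (raw units)
b_elev_temp  <- -0.01   # Elevation -> Temp
b_temp_moist <- -2      # Temp -> Moisture
b_moist_abun <-  0.5    # Moisture -> Abundance
b_temp_abun  <-  1.5    # Temp -> Abundance
b_elev_abun  <- -0.005  # Elevation -> Abundance (direct)

# Product of coefficients along each path
direct         <- b_elev_abun
short_indirect <- b_elev_temp * b_temp_abun
long_indirect  <- b_elev_temp * b_temp_moist * b_moist_abun

# The total effect is the sum over all paths
total <- direct + short_indirect + long_indirect

effects <- c(
  direct         = direct,
  short_indirect = short_indirect,
  long_indirect  = long_indirect,
  total          = total
)
effects        # -0.005, -0.015, 0.01, -0.01
sign(effects)  # -1, -1, 1, -1
```

With a finite sample the posterior estimates will only approximate these values, so an interval that overlaps zero for a small effect is not in itself a sign of a problem.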
