Network effects and peer effects

Why networks break the standard toolkit

The first econometrics class teaches you to think of observations as independent. Unit i gets treatment, unit j does not, and the only way i’s treatment affects j’s outcome is through some omitted aggregate channel (general equilibrium, market price). You assume this away with SUTVA, the Stable Unit Treatment Value Assumption, and move on.

Networks break SUTVA on contact. If i takes up microfinance and tells j about it, j’s decision is partially caused by i’s treatment. If a Community Development Entity receives an NMTC allocation and funds a project that employs workers from three towns, the spillover from treated to untreated workers is part of the effect. If a cooperative bank passes through a subsidised rate to a small business, and that business pays its supplier on better terms, the supplier experiences the program too without ever being enrolled.

The same logic shows up wherever applied work happens:

Microfinance spreads through gossip networks (Banerjee, Chandrasekhar, Duflo, Jackson 2013): who hears about a new loan product first depends on the social graph, and takeup propagates through ties.
Trade is a weighted directed graph where countries are nodes and bilateral flows are edges. A shock to one country’s terms of trade hits its partners directly and reaches the rest of the graph through chains.
Supplier-customer linkages transmit credit and demand shocks (Carvalho, Nirei, Saito, Tahbaz-Salehi 2021 on the Tohoku earthquake): a plant that loses an upstream supplier loses output, and so do the plant’s customers.
Peer effects in technology adoption: a farmer adopts a new seed variety partly because their neighbors did, partly because the same extension worker visited everyone in the village, partly because rainfall was favorable for all of them. Three different mechanisms with three different policy implications.
Blended-finance intermediaries (CDEs, Mutual Guarantee Societies, cooperative banks) form a network of pass-through entities sharing LPs and clients. Systemic shocks propagate through these intermediary ties.
Interbank lending is a network of exposures (the Acemoglu, Carvalho, Ozdaglar, Tahbaz-Salehi 2012 logic): idiosyncratic shocks to highly connected banks aggregate into systemic volatility instead of averaging out.

In all of these, the cross-sectional regression of y_i on x_i misses the part of the story that lives on the edges. You need a way to put the network into the model.

The reflection problem

Manski (1993) is the right place to start because it makes the identification problem painfully clear before you spend any time estimating anything.

Write the linear-in-means model. Let N(i) denote i’s set of neighbors and let \bar y_{N(i)} = |N(i)|^{-1} \sum_{j \in N(i)} y_j be the leave-one-out average of neighbor outcomes. Same for \bar x_{N(i)}. The model is

y_i = \alpha + \beta \bar y_{N(i)} + \gamma' \bar x_{N(i)} + \delta' x_i + \varepsilon_i.

Three things could generate a positive correlation between y_i and \bar y_{N(i)}:

Endogenous peer effects (\beta \neq 0): your neighbors’ outcomes affect yours. You study more because your friends study more.
Contextual effects (\gamma \neq 0): your neighbors’ characteristics affect yours. You study more because your friends come from educated families.
Correlated effects (Cov(\varepsilon_i, \varepsilon_j) \neq 0 for j \in N(i)): you and your neighbors share an environment. You and your friends study more because the school is good for all of you.

The reflection problem is the mechanical fact that \bar y_{N(i)} is itself caused by y_i in the structural sense. If everyone in the neighborhood is solving the same equation, then i’s decision affects j’s decision and vice versa. The cross-sectional regression cannot disentangle who is reflecting whom.

The simultaneity becomes algebraic if you stack the equations. Let W be the row-normalized adjacency matrix (W_{ij} = 1/|N(i)| if j \in N(i) and zero otherwise). Then

Y = \alpha \mathbf{1} + \beta W Y + W X \gamma + X \delta + \varepsilon,

which solves to

Y = (I - \beta W)^{-1}\left[\alpha \mathbf{1} + W X \gamma + X \delta + \varepsilon\right].

The reduced form has W X and W^2 X and W^3 X etc. all entering, because (I - \beta W)^{-1} = I + \beta W + \beta^2 W^2 + \dots when |\beta| < 1 (the spectral radius of a row-normalized W is 1). The cross-sectional moments do not separately identify \beta, \gamma, and the variance of \varepsilon unless something extra restricts the system.
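To see exactly which terms enter the reduced form, it helps to expand the inverse as a series. The display below is a sketch under three assumptions: |\beta| < 1, no isolated nodes (so W \mathbf{1} = \mathbf{1}), and E[\varepsilon \mid X] = 0 with W treated as fixed. It uses only the symbols already defined above.

```latex
\begin{aligned}
E[Y \mid X] &= (I - \beta W)^{-1}\left[\alpha \mathbf{1} + W X \gamma + X \delta\right] \\
            &= \frac{\alpha}{1-\beta}\,\mathbf{1} + X\delta
               + \sum_{k=0}^{\infty} \beta^{k}\, W^{k+1} X\,(\gamma + \beta\delta).
\end{aligned}
```

Every power of W multiplies the same vector \gamma + \beta\delta. If \gamma + \beta\delta = 0, neighbors’ characteristics drop out of the reduced form entirely and no cross-sectional variation is left to separate \beta from \gamma.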

If you regress y_i on \bar y_{N(i)} and \bar x_{N(i)} and x_i without thinking about this, you get a number. The number is not interpretable as an endogenous peer effect.

Bramoullé, Djebbari, Fortin (2009): identification from network topology

The Bramoullé, Djebbari, Fortin (2009) result is that under the linear-in-means model, \beta, \gamma, \delta are identified if the matrices I, W, W^2 are linearly independent and \beta\delta + \gamma \neq 0 (so that neighbors’ characteristics actually move neighbors’ outcomes). With network-level fixed effects the requirement strengthens to linear independence of I, W, W^2, W^3. A sufficient condition for I, W, W^2 to be linearly independent (call this the BDF condition) is that there exists some pair of nodes (i, k) with k a neighbor-of-a-neighbor of i but not a direct neighbor of i. In words, your network must have “intransitive triads”: some friend of your friend is not your friend.

The intuition is that W^2 X (average of friends-of-friends’ characteristics) gives you a valid instrument for W Y (average of friends’ outcomes) when intransitivity holds. The exclusion is that, conditional on X, W X, and the network structure, the characteristics of your friends-of-friends affect you only through their effect on your friends’ outcomes. The relevance is that friends-of-friends’ X correlates with friends’ Y through the structural model.

The rank condition: identification holds when I, W, W^2 are linearly independent. It fails outright in the first of the two cases below and holds only weakly in the second.

Complete graphs: when everyone is everyone else’s neighbor, W^2 is an exact linear combination of W and I. There are no friends-of-friends who are not also direct friends. Linear-in-means peer effects in a single classroom where everyone interacts with everyone are not identified.

Tight clusters: in highly clustered networks (think a small village where everyone knows everyone), the BDF condition is technically satisfied but the empirical analogue is weak. The instrument W^2 X has very little variation independent of W X. The variance bound on \hat\beta blows up.
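For the complete-graph case, the algebra is short enough to write out. A sketch, assuming n \geq 2 nodes, with J denoting the n \times n matrix of ones (a symbol introduced here only for illustration):

```latex
\begin{aligned}
W   &= \frac{1}{n-1}\,(J - I), \qquad J^2 = nJ, \\
W^2 &= \frac{1}{(n-1)^2}\left[(n-2)\,J + I\right]
     = \frac{n-2}{n-1}\,W + \frac{1}{n-1}\,I .
\end{aligned}
```

So W^2 X is an exact linear combination of W X and X, and the would-be instrument is perfectly collinear with the included regressors.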

Sparse networks and trees are where the BDF approach works well. Friendship graphs in schools, supplier-customer chains, and gossip networks across villages have enough intransitivity for W^2 X to be a useful instrument.

You verify the rank condition by examining the network. A practical diagnostic is the global clustering coefficient: the ratio of closed triples (three times the number of triangles) to all connected triples. Clustering coefficients near 1 are warning signs. As a rule of thumb, networks with clustering below 0.4 and meaningful diameter are usually fine.
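Both diagnostics can be computed directly from the adjacency matrix. The following is a minimal base-R sketch, not a packaged routine: the helper names (row_normalize, bdf_rank, global_clustering) and the two toy graphs are assumptions made for illustration, and the code assumes an undirected 0/1 adjacency matrix with no isolated nodes.

```r
# Row-normalize a 0/1 adjacency matrix (assumes no isolated nodes).
row_normalize <- function(A) A / rowSums(A)

# Rank of [vec(I), vec(W), vec(W^2)]: 3 means the matrices are linearly independent.
bdf_rank <- function(A) {
  W <- row_normalize(A)
  M <- cbind(c(diag(nrow(A))), c(W), c(W %*% W))
  qr(M)$rank
}

# Global clustering: closed ordered triples over connected ordered triples.
global_clustering <- function(A) {
  A2 <- A %*% A
  closed    <- sum(diag(A2 %*% A))      # 6 x number of triangles
  connected <- sum(A2) - sum(diag(A2))  # ordered paths i-j-k with i != k
  if (connected == 0) return(NA_real_)
  closed / connected
}

# Complete graph on 4 nodes: everyone is everyone's neighbor.
K4 <- matrix(1, 4, 4) - diag(4)

# Path graph 1-2-3-4: node 3 is a friend-of-friend of node 1 but not a friend.
P4 <- matrix(0, 4, 4)
P4[cbind(1:3, 2:4)] <- 1
P4 <- P4 + t(P4)

rank_K4 <- bdf_rank(K4); clus_K4 <- global_clustering(K4)
rank_P4 <- bdf_rank(P4); clus_P4 <- global_clustering(P4)
print(c(rank_K4 = rank_K4, clustering_K4 = clus_K4))
print(c(rank_P4 = rank_P4, clustering_P4 = clus_P4))
```

A rank below 3 means the rank condition fails exactly. A rank of 3 with clustering close to 1 is the weak-instrument case. For an undirected graph held as an igraph object, transitivity(g, type = "global") should report the same clustering quantity.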

Friends-of-friends variants

De Giorgi, Pellizzari, Redaelli (2010), in their paper on Bocconi students choosing majors, exploit partially overlapping peer groups: students are assigned to different classes across courses, so a student’s classmates have classmates the student never shares a class with. Calvó-Armengol, Patacchini, Zenou (2009) use the AddHealth dataset and exploit characteristics of friends-of-friends as the exclusion. The structural model is the same; what changes is the empirical specification of W.

The common theme: any time your network has intransitivity, you have an in-built instrument for peer outcomes using peers’ peers’ characteristics. The differences across papers are mostly about how you choose to weight the network (binary vs weighted ties, row-normalized vs raw degree) and how you handle the inevitable measurement error in network ties.

Estimation in practice

The three workhorses are 2SLS with network instruments, maximum likelihood for spatial autoregressive models, and GMM.

2SLS with network instruments. Use W^2 X as instruments for W Y, controlling for W X and X. In R:

library(igraph)
library(Matrix)
library(fixest)

# Build the adjacency matrix from edgelist.
# Rows of df must be in the same order as the vertices of g (i.e. as nodes).
g <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)
A <- as_adjacency_matrix(g, sparse = TRUE)
deg <- rowSums(A)
# Row-normalize; isolated nodes get a zero row instead of NaN
W <- Diagonal(x = ifelse(deg > 0, 1 / deg, 0)) %*% A
W2 <- W %*% W                # friends-of-friends

# Construct WY, WX, W2X
WY  <- as.numeric(W  %*% df$y)
WX  <- as.numeric(W  %*% df$x)
W2X <- as.numeric(W2 %*% df$x)

# 2SLS: instrument WY with W2X, control for WX and X
m <- feols(y ~ x + WX | WY ~ W2X, data = cbind(df, WY, WX, W2X))
summary(m, stage = 1:2)      # first stage (instrument strength) and second stage

The spatialreg::lagsarlm function fits SAR models by maximum likelihood when you are willing to assume normality, but for most applied work the 2SLS approach is cleaner because it does not require distributional assumptions and the instruments are interpretable.

In Stata:

* Construct WY, WX, W2X outside Stata (Python/R) and merge in
ivreg2 y x WX (WY = W2X), cluster(network_id) first

* Or use spregress with the GS2SLS estimator (swap gs2sls for ml to get the SAR ML version)
spregress y x, gs2sls dvarlag(W) ivarlag(W: x)

Maximum likelihood for SAR. The spatial autoregressive model

y_i = \rho \sum_j W_{ij} y_j + x_i' \beta + \varepsilon_i

is estimated by maximizing the Gaussian likelihood, which involves the log-determinant of (I - \rho W). Here \rho plays the role of the endogenous peer effect (\beta in the linear-in-means model above), and \beta is now the coefficient vector on own covariates. This is the Cliff-Ord, Anselin, LeSage tradition. R: spatialreg::lagsarlm. Stata: spregress, ml. Use ML when the network is a clean, well-measured spatial structure (counties sharing borders, students in fixed classrooms) and when you actually believe the Gaussian assumption.
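For reference, the Gaussian log-likelihood being maximized can be written out as follows. This is the standard textbook form, sketched under the assumption \varepsilon \sim N(0, \sigma^2 I) with n observations; \sigma^2 and the residual vector e are notation introduced here.

```latex
\ln L(\rho, \beta, \sigma^2)
  = -\frac{n}{2}\ln\!\left(2\pi\sigma^2\right)
    + \ln\left|I - \rho W\right|
    - \frac{1}{2\sigma^2}\, e' e,
\qquad e = (I - \rho W)\,Y - X\beta .
```

The log-determinant is the Jacobian of the map from \varepsilon to Y. It is the term a plain least-squares fit of Y on W Y leaves out, and it is the expensive part of the computation when the network is large.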

GMM (Lee, Liu 2010) is the right answer when the network is large, the disturbances may be heteroskedastic, and you want efficient inference under weaker assumptions than ML. It uses both linear and quadratic moment conditions and is more efficient than the BDF 2SLS in many settings. The cost is implementation complexity; in practice 2SLS gets you 90% of the way there with much less code.
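A small simulation shows the 2SLS estimator from this section at work. It is a base-R sketch under assumptions chosen purely for illustration: a random directed network in which each node names three others, a single covariate x, and made-up parameter values. It compares naive OLS, which treats W Y as exogenous, with just-identified 2SLS using W^2 X as the instrument.

```r
set.seed(123)
n <- 1500

# Sparse directed network: each node names 3 others at random.
A <- matrix(0, n, n)
for (i in seq_len(n)) A[i, sample(setdiff(seq_len(n), i), 3)] <- 1
W  <- A / rowSums(A)          # row-normalize
W2 <- W %*% W                 # friends-of-friends weights

# Structural parameters (illustrative values, not estimates from any paper).
alpha <- 1; beta <- 0.4; gamma <- 0.5; delta <- 1

x   <- rnorm(n)
eps <- rnorm(n, sd = 0.5)

# Solve Y = alpha + beta W Y + gamma W x + delta x + eps for Y.
y <- solve(diag(n) - beta * W, alpha + gamma * (W %*% x) + delta * x + eps)
y <- as.numeric(y)

Wy  <- as.numeric(W  %*% y)
Wx  <- as.numeric(W  %*% x)
W2x <- as.numeric(W2 %*% x)

# Naive OLS treats Wy as exogenous.
ols <- coef(lm(y ~ Wy + Wx + x))

# Just-identified 2SLS: instruments Z = (1, W2x, Wx, x) for regressors R = (1, Wy, Wx, x).
R <- cbind(1, Wy, Wx, x)
Z <- cbind(1, W2x, Wx, x)
tsls <- as.numeric(solve(crossprod(Z, R), crossprod(Z, y)))
names(tsls) <- c("alpha", "beta", "gamma", "delta")

print(round(rbind(ols = unname(ols), tsls = tsls), 3))
```

The second column of the printed table holds the two estimates of \beta. In this design the instrument is strong because the random network has almost no triangles; rerunning with a highly clustered network is a quick way to see the weak-instrument problem appear.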

Network spillovers in experimental designs

If you control the assignment, you can design around the reflection problem rather than instrument your way out of it.

Partial-population designs (Moffitt 2001). Vary treatment intensity across clusters. In Crépon, Devoto, Duflo, Parienté (2015) on microfinance in Morocco, villages were randomized into treatment and control, and within treatment villages, only some households got the microfinance offer. The within-treatment-village comparison of treated and untreated households identifies the direct effect, provided both groups are exposed to the same within-village spillover. The between-village comparison of untreated households in treated villages vs households in control villages identifies the pure spillover.

Two-stage randomization (Hudgens, Halloran 2008; Baird, Bohren, McIntosh, Özler 2018). Stage 1: randomize village-level treatment saturation (say 25%, 50%, 75%, 100%). Stage 2: within each village, randomize which households are treated at the assigned saturation. This setup identifies three quantities cleanly:

The direct effect of treatment, holding spillover exposure constant.
The spillover effect on untreated households as a function of the saturation.
The total effect on the village.

The saturation gradient is what gives you identification. If spillovers are linear in saturation, the slope of the untreated-household outcome with respect to saturation is the spillover. If spillovers are non-linear, the saturation design lets you trace out the shape.
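The three quantities can be stated in potential-outcome notation. This is a sketch in the spirit of the papers cited above, not their exact notation: Y_i(d, \pi) denotes the outcome of a household with own treatment d \in \{0, 1\} in a village with saturation \pi, which assumes that other households’ treatments matter only through \pi. The labels D, S and T are introduced here for illustration.

```latex
\begin{aligned}
D(\pi) &= E[Y_i(1,\pi)] - E[Y_i(0,\pi)]
        && \text{(direct effect at saturation } \pi\text{)} \\
S(\pi) &= E[Y_i(0,\pi)] - E[Y_i(0,0)]
        && \text{(spillover on the untreated)} \\
T(\pi) &= \pi\,E[Y_i(1,\pi)] + (1-\pi)\,E[Y_i(0,\pi)] - E[Y_i(0,0)]
        = \pi\,D(\pi) + S(\pi)
        && \text{(total effect on the village)}
\end{aligned}
```

Pure-control villages (\pi = 0) pin down E[Y_i(0,0)], which is why most saturation designs keep a zero-saturation arm.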

Saturation designs in practice. Most field designs use four or five saturation levels: \{0, 25, 50, 75, 100\}\% or similar. With V villages and S saturation levels you need V \gg S for power, which usually means several hundred villages.

A small simulation makes the intuition concrete:

library(tidyverse)
library(fixest)
set.seed(42)

n_villages <- 200
sat_levels  <- c(0, 0.25, 0.5, 0.75, 1.0)
df <- expand_grid(village = 1:n_villages, hh = 1:20) %>%
  group_by(village) %>%
  mutate(saturation = sample(sat_levels, 1),
         treated    = rbinom(n(), 1, saturation),
         u          = rnorm(n()),
         y          = 0.5 * treated + 1.2 * saturation + u) %>%
  ungroup()

# Direct effect (within village, controlling for saturation)
feols(y ~ treated + saturation, data = df, cluster = ~village)

# Pure spillover: untreated households only, on saturation
feols(y ~ saturation, data = filter(df, treated == 0),
      cluster = ~village)

The coefficient on treated is the direct effect (0.5 in the DGP), and the coefficient on saturation in the untreated-only regression is the pure spillover (1.2 in the DGP). The two-stage design has separated them.

Network formation and selection

A deeper problem: friends choose each other. The network W is not exogenously given; it is a realization of a joint formation process where ties are more likely between pairs of nodes whose unobservables are correlated. If high-ability students befriend each other, then W Y correlates with the unobservable component of y_i even before any peer effects operate.

This is the endogenous-network problem, and it is the deepest threat to identifying peer effects in observational data.
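The size of the problem is easy to demonstrate. The base-R sketch below builds a purely homophilous network, generates outcomes with no peer effect at all, and runs the naive regression of y on W y. The sample size, the tie rule (two nearest neighbors in ability rank on each side) and the noise level are assumptions chosen for illustration.

```r
set.seed(7)
n <- 400

# Unobserved ability, sorted so that index distance equals ability-rank distance.
ability <- sort(rnorm(n))

# Homophilous ties: each student befriends the two nearest in ability rank on each side.
A <- (abs(outer(seq_len(n), seq_len(n), "-")) <= 2) * 1
diag(A) <- 0
W <- A / rowSums(A)

# The outcome has NO peer effect: it depends only on own ability plus noise.
y  <- ability + rnorm(n)
Wy <- as.numeric(W %*% y)

# Naive regression of own outcome on the average outcome of friends.
naive    <- lm(y ~ Wy)
beta_hat <- unname(coef(naive)["Wy"])
print(round(beta_hat, 3))
```

The estimated coefficient on W y is large and positive even though the true \beta is zero: the network itself carries the information about ability that the regression attributes to peers.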

Three responses, in increasing order of ambition:

Argue your network is plausibly exogenous. Best when ties are imposed by administrative process (random roommate assignments in college dorms, military barracks, randomized desk arrangements). The Sacerdote (2001) Dartmouth roommate study is the canonical example.
Model the network formation explicitly. Exponential random graph models (ERGMs) write the probability of a network g as P(g) \propto \exp(\theta' s(g)) where s(g) is a vector of network statistics (number of edges, triangles, two-stars, homophily counts). R: the ergm package in the statnet suite. Estimate \theta by MCMC, then condition on the predicted network in the second-stage peer-effects regression. The Graham (2017 Econometrica) result on dyadic regression with degree heterogeneity is the modern theoretical anchor.
Use a quasi-experiment in network formation. Cai, De Janvry, Sadoulet (2015) on insurance uptake in China use random variation in the timing of information sessions to generate exogenous variation in who-knew-whom-when. The cleanest network econometrics papers usually find some exogenous variation in network exposure rather than relying on observational W.

A practical point: even if you cannot estimate the formation model, you should describe it. Tell the reader who forms ties with whom and why. If the network is plausibly endogenous on observables, condition on the right covariates. If it is endogenous on unobservables, say so and discuss the sign of the bias.

library(ergm)
library(network)

# Convert edgelist to a network object
net <- network(edges, directed = FALSE, vertices = nodes)
net %v% "education" <- nodes$education
net %v% "village"   <- nodes$village

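# Fit ERGM with edges, shared-partner clustering, education- and village-homophily.
# gwesp() stands in for a raw triangle count, which is prone to model degeneracy.
fit <- ergm(net ~ edges + gwesp(0.5, fixed = TRUE) + nodematch("education")
                + nodematch("village"))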
summary(fit)

Chandrasekhar, Lewis (2011) make a different point: if you only observe a sample of the true network (you elicit gossip ties from 30% of households and infer the rest), the measurement error in W is non-classical and biases peer-effects estimates, often toward zero but not necessarily so. Sampled networks need correction, and the correction is non-trivial.

Centrality as outcome or treatment

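A node’s position in the network is informative. In this section W denotes the raw (not row-normalized) adjacency matrix; with a row-normalized W the eigenvector and Katz-Bonacich measures below would be identical for every node. The four standard centrality measures: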

Degree centrality: number of edges. Simplest, most local.
Eigenvector centrality: a node is central if it is connected to central nodes. Solve \lambda v = W v; the leading eigenvector ranks nodes. Stable in well-connected networks.
Betweenness centrality: fraction of shortest paths that pass through a node. Captures brokerage.
Katz-Bonacich centrality: b_i(\alpha) = \sum_{k=0}^\infty \alpha^k (W^k \mathbf{1})_i for some decay 0 < \alpha < 1/\lambda_{\max}(W). Counts walks of all lengths, discounted by length. This is the centrality measure that emerges naturally from the equilibrium of the linear peer-effects model: with homogeneous agents and interaction matrix W, each agent’s equilibrium outcome is proportional to their Katz-Bonacich centrality with decay equal to the peer-effect coefficient \beta.

The Banerjee, Chandrasekhar, Duflo, Jackson (2013 Science) paper introduces diffusion centrality, which is a finite-horizon variant of Katz-Bonacich that tracks how information spreads through the network over T rounds:

DC_i(p, T) = \left[\sum_{t=1}^T (pW)^t \mathbf{1}\right]_i.

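Their finding from the 43 Karnataka villages that the microfinance organization entered (out of 75 surveyed): the diffusion centrality of the households first informed about the product predicts eventual community-wide takeup. The identification is clean because the first-informed households (“leaders” such as teachers, shopkeepers, and self-help-group leaders) were picked by a predetermined rule based on their role, not their network position; conditional on village covariates, the variation in their centrality is plausibly exogenous.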
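The link between Katz-Bonacich and diffusion centrality can be checked numerically. The base-R sketch below uses a five-node star graph and a transmission probability p = 0.2, both chosen for illustration. It works with the raw adjacency matrix, called A in the code, and the function name diffusion_centrality is likewise an assumption of this example.

```r
# Star graph on 5 nodes: node 1 is the hub. A is the raw 0/1 adjacency matrix.
n <- 5
A <- matrix(0, n, n)
A[1, 2:n] <- 1
A[2:n, 1] <- 1
ones <- rep(1, n)

lambda_max <- max(abs(eigen(A, symmetric = TRUE, only.values = TRUE)$values))
p <- 0.2                       # decay / transmission probability
stopifnot(p < 1 / lambda_max)  # needed for the infinite sum to converge

# Katz-Bonacich: sum_{k >= 0} p^k A^k 1 = (I - pA)^{-1} 1.
katz <- as.numeric(solve(diag(n) - p * A, ones))

# Diffusion centrality: sum_{t = 1}^{T} (pA)^t 1, accumulated term by term.
diffusion_centrality <- function(A, p, T_horizon) {
  term  <- rep(1, nrow(A))
  total <- rep(0, nrow(A))
  for (t in seq_len(T_horizon)) {
    term  <- as.numeric((p * A) %*% term)
    total <- total + term
  }
  total
}

dc_short <- diffusion_centrality(A, p, T_horizon = 2)
dc_long  <- diffusion_centrality(A, p, T_horizon = 200)
print(rbind(katz = katz, dc_short = dc_short, dc_long = dc_long))
```

As the horizon grows, diffusion centrality converges to Katz-Bonacich centrality minus one, because the Katz sum starts at k = 0 and the diffusion sum at t = 1. With a short horizon the two can rank nodes differently in less symmetric graphs, which is the reason to pick T to match the actual duration of the information campaign.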

Acemoglu, Carvalho, Ozdaglar, Tahbaz-Salehi (2012 Econometrica) make the macro version of the same point. Idiosyncratic productivity shocks to firms aggregate into GDP volatility at a rate that depends on the network of intermediate-input linkages. In a star network with one central supplier, a shock to that supplier propagates to all downstream firms. This is the network counterpart of the granular hypothesis: the law of large numbers does not wash out idiosyncratic shocks when the input-output network is sufficiently asymmetric.
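A numerical comparison of a ring and a star makes the asymmetry point concrete. The sketch below follows the influence-vector formulation of Acemoglu et al. (2012) as best I can reproduce it, so treat the exact formula as an assumption to check against the paper. The labor share a = 0.5, the 50-firm economy and the function name influence_vector are illustrative choices.

```r
# Influence vector in the spirit of Acemoglu et al. (2012):
#   v = (a / n) * (I - (1 - a) W')^{-1} 1,
# where W[i, j] is the share of j's output in i's intermediate inputs (rows sum to 1)
# and `a` is the labor share (a hypothetical value is used below).
influence_vector <- function(W, a) {
  n <- nrow(W)
  as.numeric((a / n) * solve(diag(n) - (1 - a) * t(W), rep(1, n)))
}

n <- 50
a <- 0.5

# Ring: firm i buys all intermediate inputs from firm i + 1.
W_ring <- matrix(0, n, n)
W_ring[cbind(1:n, c(2:n, 1))] <- 1

# Star: every firm buys all intermediate inputs from firm 1.
W_star <- matrix(0, n, n)
W_star[, 1] <- 1

v_ring <- influence_vector(W_ring, a)
v_star <- influence_vector(W_star, a)

# With i.i.d. shocks of standard deviation sigma, aggregate volatility is sigma * ||v||_2.
vol_ring <- sqrt(sum(v_ring^2))
vol_star <- sqrt(sum(v_star^2))
print(c(ring = vol_ring, star = vol_star, lln_benchmark = 1 / sqrt(n)))
```

In the ring every firm has the same influence and volatility sits at the law-of-large-numbers benchmark 1/\sqrt{n}. In the star the hub’s influence does not shrink as n grows, so aggregate volatility stays well above that benchmark.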

The practical lesson for blended finance: in a network of CDEs sharing LPs, or MGS sharing reinsurance partners, the central intermediary’s failure is not just one failure; it is the aggregate shock you are not pricing.

Worked example: microfinance diffusion in 75 Indian villages

The setup. Banerjee, Chandrasekhar, Duflo, Jackson surveyed 75 villages in Karnataka. For each village they elicited the full social network across multiple relationship types (medical advice, money-borrowing, religious participation, etc.). The microfinance NGO BSS later entered 43 of these villages and, in each, first informed a small group of seed households about the new microfinance product. The question: does the diffusion centrality of the seed households predict village-level takeup?

The identification: seeds were chosen by BSS according to a rule that, conditional on village-level covariates, did not depend on the network position of any particular household. The variation in seed centrality across villages is exogenous given the village covariates.

A simplified version in R:

library(igraph)
library(dplyr)
library(fixest)

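# Per-village diffusion centrality of the seed set.
# `seeds` holds vertex indices of g; W is the raw (unnormalized) adjacency matrix.
compute_diffusion_centrality <- function(g, seeds, p, T_horizon) {
  n <- vcount(g)
  W <- as_adjacency_matrix(g, sparse = FALSE)
  seed_vec <- rep(0, n); seed_vec[seeds] <- 1
  reach <- rep(0, n)
  cur   <- seed_vec
  for (t in 1:T_horizon) {
    cur   <- as.numeric((p * W) %*% cur)   # (pW)^t applied to the seed indicator
    reach <- reach + cur
  }
  # Total expected reach from the seeds; for an undirected graph this equals
  # the sum of DC_i(p, T) over the seed households.
  sum(reach)
}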

village_data <- village_list %>%
  rowwise() %>%
  mutate(
    DC_seeds = compute_diffusion_centrality(
      g = network_list[[village_id]],
      seeds = seed_households[[village_id]],
      p = 0.15, T_horizon = 5
    )
  ) %>%
  ungroup()

# Regress village-level takeup on seed diffusion centrality
m <- feols(takeup_rate ~ DC_seeds + log(village_size) + caste_share
                       + literacy_rate,
           data = village_data, vcov = "hetero")
summary(m)

The coefficient on DC_seeds is the estimated effect of seed centrality on adoption. Banerjee et al. (2013) find a strong positive effect: a one-standard-deviation increase in seed diffusion centrality raises takeup by roughly 20% of the mean. Eigenvector centrality of seeds works as a proxy when diffusion centrality is too computationally heavy.

The Stata version uses nwcommands for the graph operations:

* Load edgelists and node attributes
use village_edges, clear
nwfromedge tail head, name(g) undirected
nwsummarize g
* Centrality commands and the variables they store (_degree, _evcent, _between)
* are as I recall them from nwcommands; confirm with help nwcommands
nwdegree g
nwevcent g
nwbetween g

* Merge centrality with village data
collapse (mean) deg_seed = _degree eig_seed = _evcent ///
  if seeded == 1, by(village_id)
merge 1:1 village_id using village_takeup
reg takeup_rate eig_seed log_size caste_share literacy_rate, robust

A point about the diffusion-centrality choice. The parameter p in (pW)^t governs how lossy information transmission is per step. Banerjee et al. estimate p jointly with the takeup model, but you can approximate by computing diffusion centrality for several plausible p and showing the takeup regression is robust across them. This is the “multiple specifications of W as robustness” point from the reporting checklist below.

Common traps

A few mistakes recur often enough to flag explicitly.

Estimating peer effects in a complete network. If your “network” is a classroom where everyone interacts with everyone, the BDF rank condition fails by construction and you cannot identify endogenous and contextual effects separately. Reframe as a contextual-effects model and stop there, or find a sub-network with intransitivity.

Forgetting that sampled networks measure W with error. If you elicit gossip ties from 30% of villagers and infer the rest, your \hat W is noisy and the peer-effect coefficient is biased, typically attenuated. Chandrasekhar, Lewis (2011) is the reference. Either correct (they propose analytical corrections and a graphical-reconstruction estimator) or report sensitivity to network density.

Confounding peer effects with shared environment. Two farmers adopting the same seed variety in the same village could be (a) peer-influenced or (b) both responding to the same extension officer’s visit. The “correlated effects” channel is the hardest to rule out. Village fixed effects help with constant correlated shocks. Variation in within-village ties (some pairs are tightly linked, others not) identifies the peer effect off the differential exposure.

Two-step estimation that ignores first-stage uncertainty in W. If you estimate the network in step 1 (say, by ERGM or by gossip-tie elicitation with sampling error) and use it as a known quantity in step 2, your second-stage standard errors are too narrow. Bootstrap at the village level: resample villages, re-estimate W in each draw, re-run the peer-effects regression, get the distribution of \hat\beta across draws.

Network endogeneity from selection. Your friends are not a random sample of your community. If high-ability students befriend each other, the cross-section will show a positive peer effect even when none exists structurally. The cleanest defenses are random tie assignment (roommates), instrumented tie formation (rare), or panel data with within-individual variation in network exposure.

Misspecifying W. A binary tie matrix and a weighted tie matrix (by interaction frequency) can give very different \hat\beta in finite samples. Row-normalized and raw W likewise. Best practice: report at least two specifications and show the results are robust.
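The village-level bootstrap recommended above for two-step estimation can be sketched in a few lines of base R. The data here are simulated and the second-stage regression is a plain lm standing in for the peer-effects regression. The function name estimate, the number of draws and all parameter values are assumptions for illustration.

```r
set.seed(2024)
n_villages <- 60
hh_per_village <- 15

# Simulated data with a village-level shock, so households within a village are correlated.
dat <- data.frame(village = rep(seq_len(n_villages), each = hh_per_village))
village_shock <- rnorm(n_villages)
dat$x <- rnorm(nrow(dat))
dat$y <- 1 + 0.5 * dat$x + village_shock[dat$village] + rnorm(nrow(dat))

# The whole pipeline goes inside one function. In a real application this is where
# W would be re-estimated from the resampled villages before the peer-effects regression.
estimate <- function(d) unname(coef(lm(y ~ x, data = d))["x"])

by_village <- split(dat, dat$village)
B <- 200
boot_est <- replicate(B, {
  draw <- sample(names(by_village), replace = TRUE)   # resample whole villages
  estimate(do.call(rbind, by_village[draw]))
})

point_est <- estimate(dat)
boot_se   <- sd(boot_est)
print(c(estimate = point_est, cluster_boot_se = boot_se))
```

The design choice that matters is that the resampling unit is the village and that every estimation step sits inside estimate(), so first-stage noise in W is carried through to the spread of the bootstrap draws.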

A reporting checklist

When you write up a network paper, your reader needs to be able to verify the identification story. The minimum:

Visualize a representative network. A force-directed layout of one cluster or village. Include the seeds (if a diffusion paper), color by treatment status, label hubs.
Show the BDF rank condition holds, or its empirical analogue. Report the clustering coefficient, network diameter, average path length. If clustering is above 0.5, justify why you still claim identification.
Estimate under multiple specifications of W. Binary vs weighted, row-normalized vs raw, immediate neighbors vs friends-of-friends. Table the results side by side.
Cluster or bootstrap standard errors at the network level, not the individual level. Within-village correlation is exactly the failure mode the standard errors need to absorb.
Describe the network-measurement methodology. Was it elicited (gossip questions), administrative (school records of co-enrollment), or observed (transaction data)? What is the measurement error model? Are you sampling the network or observing it in full?
Test robustness to peer-group definition. Immediate neighbors, friends-of-friends only, weighted shortest-path neighborhoods. If the result depends sensitively on definition, say so and explain which definition is structural.
Discuss the network-formation process honestly. Endogenous network formation is rarely fully solved; you should at least describe how the network came to be and what you have or have not done about selection into ties.

The trade-network angle

This is the angle I work in, so a longer treatment.

Trade is a weighted directed graph. Nodes are countries (or country-sectors), edges are bilateral trade flows. The graph is dense at the top (major economies trade with everyone) and sparse at the bottom (many country pairs have zero recorded trade). The weight on an edge from j to i is some function of imports, often \log(\text{imports}_{ij} + 1) or the share \text{imports}_{ij} / \text{GDP}_i.

Standard gravity equations treat bilateral trade as a function of bilateral characteristics (distance, common language, RTA membership) plus country-side fixed effects. The network structure is implicit; the regression does not exploit the topology.

Effective distance (Brockmann, Helbing 2013 Science) replaces geographic distance with a network-shortest-path distance, weighted by edge intensity. Define d_{ij} = 1 - \log P_{ij} where P_{ij} is the share of i’s flow that goes to j, and take the shortest-path sum along the chain from i to j. Effective distance is large when there is no high-intensity route between countries; small when there is. Brockmann and Helbing showed this metric predicts the spread of H1N1 across airline networks far better than geographic distance.
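Written out, the definition has two layers: an edge length and a shortest-path minimization. The display below is a sketch of that construction using the P_{ij} defined above; the path symbol \Gamma, the capital D_{ij} for the path-level distance, and the hop count L(\Gamma) are notation introduced here.

```latex
d_{mn} = 1 - \log P_{mn} \;\geq\; 1,
\qquad
D_{ij} = \min_{\Gamma:\, i \to j} \; \sum_{(m,n) \in \Gamma} d_{mn}
       = \min_{\Gamma:\, i \to j} \left[ L(\Gamma) - \log \prod_{(m,n) \in \Gamma} P_{mn} \right].
```

The constant 1 penalizes every additional hop, and the logarithm turns the product of flow shares along a path into a sum, which is what lets a standard shortest-path algorithm do the work.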

In the trade setting, effective distance reorders the world map. Singapore and Rotterdam are “close” to a great many partners because they are central hubs. Landlocked countries with no major transit links are “far” from most partners, even those that are geographically nearby. Helfrich, Gonchar (2025) “Trade in the Spotlight” applies the effective-distance metric to bilateral export panels and shows that trade-shock exposure measured via effective distance predicts country-pair outcomes better than geographic-distance gravity does.

Why this matters: a country’s exposure to a shock in country j is not just direct trade with j; it is also indirect trade through the chains that pass through j. If China imposes a tariff on Vietnamese exports, the effect on Cambodia is partly direct (Cambodian goods substituted out by Vietnamese diversion) and partly indirect (Cambodian inputs into Vietnamese exports lose their final-demand market). The effective-distance measure absorbs both channels into a single weighted exposure.

A simple implementation:

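library(igraph)

# trade_flows: data frame with cols (exporter, importer, value)
# P_ij: share of exporter i's total flow that goes to importer j
trade_flows <- subset(trade_flows, value > 0)
trade_flows$share <- trade_flows$value /
  ave(trade_flows$value, trade_flows$exporter, FUN = sum)

g <- graph_from_data_frame(trade_flows, directed = TRUE)
E(g)$weight <- 1 - log(E(g)$share)   # d_ij = 1 - log P_ij, always >= 1

# Shortest-path distance under the effective-distance weights
eff_dist <- distances(g, mode = "out", weights = E(g)$weight)

# Build country-level shock exposure: weighted avg of partner shocks,
# with weights decaying exponentially in effective distance.
# shocks must be ordered like the columns of eff_dist.
exposure_i <- function(i, shocks, eff_dist) {
  w <- exp(-eff_dist[i, ])   # unreachable partners (Inf) get weight 0
  w[i] <- 0; w <- w / sum(w)
  sum(w * shocks, na.rm = TRUE)
}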

The same logic generalizes to migration networks, capital-flow networks, and intermediate-input networks. The methodological move (replace geographic distance with network distance) is portable.

Why this matters for applied finance and development researchers

The agenda I want you to take from this chapter is that the systemic and supply-chain literatures have absorbed network econometrics in the last decade, and a development or finance researcher who cannot speak the language is locked out of a growing share of the frontier.

Three places this shows up in the work I see:

Blended-finance intermediaries form a network. CDEs that hold NMTC allocations share LPs, share project pipelines, and sometimes co-invest. A shock to one CDE’s portfolio (a project failure, a tax-credit recapture event) propagates to LPs and through them to other CDEs. Modeling this as a star-of-stars network (a few hub CDEs, many spoke projects) and applying Acemoglu et al. (2012) logic gives you a way to estimate the systemic component of CDE risk that the project-level data alone cannot reveal.

Microfinance and adoption diffusion is canonically a network problem. Banerjee et al. (2013) is the reference. The empirical question of “which households should we seed first” turns into a centrality optimization problem, with the additional twist that the seed selection is now your treatment.

Supplier-customer credit transmission. Carvalho, Nirei, Saito, Tahbaz-Salehi (2021 QJE) use the Tohoku earthquake to identify input-output network propagation in Japanese manufacturing. The effect of losing an upstream supplier propagates downstream with a magnitude that is invisible in industry-aggregate data. This is the template for thinking about credit-shock propagation through supplier networks; cooperative-banking networks fit the same mold.

The connecting thread: in all three, the “treatment” is a node-level shock, the “spillover” is a network-weighted exposure, and the identification comes from variation in network position or from quasi-random shocks to nodes whose downstream linkages you can measure.

References

Acemoglu, D., Carvalho, V. M., Ozdaglar, A., Tahbaz-Salehi, A. (2012). The network origins of aggregate fluctuations. Econometrica, 80(5), 1977-2016.

Baird, S., Bohren, J. A., McIntosh, C., Özler, B. (2018). Optimal design of experiments in the presence of interference. Review of Economics and Statistics, 100(5), 844-860.

Banerjee, A., Chandrasekhar, A. G., Duflo, E., Jackson, M. O. (2013). The diffusion of microfinance. Science, 341(6144), 1236498.

Banerjee, A., Chandrasekhar, A. G., Duflo, E., Jackson, M. O. (2019). Using gossips to spread information: Theory and evidence from two randomized controlled trials. Review of Economic Studies, 86(6), 2453-2490.

Bramoullé, Y., Djebbari, H., Fortin, B. (2009). Identification of peer effects through social networks. Journal of Econometrics, 150(1), 41-55.

Brockmann, D., Helbing, D. (2013). The hidden geometry of complex, network-driven contagion phenomena. Science, 342(6164), 1337-1342.

Cai, J., De Janvry, A., Sadoulet, E. (2015). Social networks and the decision to insure. American Economic Journal: Applied Economics, 7(2), 81-108.

Calvó-Armengol, A., Patacchini, E., Zenou, Y. (2009). Peer effects and social networks in education. Review of Economic Studies, 76(4), 1239-1267.

Carvalho, V. M., Nirei, M., Saito, Y. U., Tahbaz-Salehi, A. (2021). Supply chain disruptions: Evidence from the Great East Japan earthquake. Quarterly Journal of Economics, 136(2), 1255-1321.

Chandrasekhar, A. G., Lewis, R. (2011). Econometrics of sampled networks. Working paper.

Crépon, B., Devoto, F., Duflo, E., Parienté, W. (2015). Estimating the impact of microcredit on those who take it up: Evidence from a randomized experiment in Morocco. American Economic Journal: Applied Economics, 7(1), 123-150.

De Giorgi, G., Pellizzari, M., Redaelli, S. (2010). Identification of social interactions through partially overlapping peer groups. American Economic Journal: Applied Economics, 2(2), 241-275.

Graham, B. S. (2017). An econometric model of network formation with degree heterogeneity. Econometrica, 85(4), 1033-1063.

Helfrich, I., Gonchar, E. (2025). Trade in the spotlight: Effective distance and bilateral exposure. SSRN 5202676.

Hudgens, M. G., Halloran, M. E. (2008). Toward causal inference with interference. Journal of the American Statistical Association, 103(482), 832-842.

Lee, L. F., Liu, X. (2010). Efficient GMM estimation of high order spatial autoregressive models with autoregressive disturbances. Econometric Theory, 26(1), 187-230.

Manski, C. F. (1993). Identification of endogenous social effects: The reflection problem. Review of Economic Studies, 60(3), 531-542.

Moffitt, R. A. (2001). Policy interventions, low-level equilibria, and social interactions. In Durlauf, S. N., Young, H. P. (Eds.), Social Dynamics (pp. 45-82). MIT Press.

Sacerdote, B. (2001). Peer effects with random assignment: Results for Dartmouth roommates. Quarterly Journal of Economics, 116(2), 681-704.

Beaman, L., BenYishay, A., Magruder, J., Mobarak, A. M. (2021). Can network theory-based targeting increase technology adoption? American Economic Review, 111(6), 1918-1943.

Banerjee, A., Duflo, E., Glennerster, R., Kinnan, C. (2015). The miracle of microfinance? Evidence from a randomized evaluation. American Economic Journal: Applied Economics, 7(1), 22-53.

Aral, S., Walker, D. (2012). Identifying influential and susceptible members of social networks. Science, 337(6092), 337-341.