The primary user-facing constructor for catgraph. It computes a
pairwise effect size for every pair of categorical variables (by default,
phi for 2x2 tables and Cramer's V otherwise), stores the resulting
weighted igraph network, and preserves the processed data and metadata
for downstream analysis. Use this function for standard workflows; use
build_graph only when a raw igraph object is required.
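The pairwise computation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: cramers_v_table is a hypothetical helper, and it covers only the classical metric, without p-values, coercion, or constant-column removal.

```r
# Hypothetical helper, not part of catgraph: classical Cramer's V for one
# two-way contingency table (equal to |phi| when the table is 2x2).
cramers_v_table <- function(tab) {
  chi2 <- suppressWarnings(
    unname(stats::chisq.test(tab, correct = FALSE)$statistic)
  )
  n <- sum(tab)
  k <- min(dim(tab)) - 1L
  sqrt(chi2 / (n * k))
}

# Titanic ships with base R as a 4-way table; collapse it to each pair of
# margins instead of expanding it to one row per passenger.
vars  <- names(dimnames(Titanic))
pairs <- utils::combn(vars, 2, simplify = FALSE)

edges <- do.call(rbind, lapply(pairs, function(p) {
  tab <- margin.table(Titanic, match(p, vars))
  data.frame(var1 = p[1], var2 = p[2], effect_size = cramers_v_table(tab))
}))

# Strongest pairs first.
edges[order(-edges$effect_size), ]
```

With four variables there are six pairs, and each row of the result corresponds to one candidate edge of the network.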
Arguments
- data
A data frame or tibble whose columns represent categorical variables. Factor, character, and logical columns are supported. Numeric columns are coerced to character with a message.
- method
Character. Association metric for edge weights. One of "cramers_v" (default), "cramers_v_corrected", "nmi", "ami", or "bayesian_cramers_v". See build_graph for details.
- corrected
Logical. Deprecated shortcut for method = "cramers_v_corrected". Kept for backward compatibility. Default FALSE.
- correct
Logical. Yates' continuity correction for the chi-square test. Default FALSE.
- simulate_p
Logical. Monte Carlo p-value simulation. Default FALSE.
- B
Integer. Monte Carlo resamples when simulate_p = TRUE. Default 2000L.
- alpha
Numeric. Dirichlet prior concentration for method = "bayesian_cramers_v". Default 0.5 (Jeffreys prior). Ignored for all other methods.
- x
A catgraph object.
- ...
Ignored.
- object
A catgraph object.
- top
Integer. Number of strongest edges to display. Use Inf for all edges. Default 10L.
Value
An S3 object of class catgraph containing:
- graph
An undirected weighted igraph object. True zero associations are absent edges, not near-zero edges.
- data
The processed data frame actually used for estimation (after non-categorical coercion and constant-column removal). Downstream functions such as catgraph_ci resample from this object. Changed from raw_data in v0.4.0 to fix an internal-consistency bug.
- raw_data
The original input data frame, for reference.
- method
Character string recording which association metric was used ("cramers_v", "cramers_v_corrected", "nmi", "ami", or "bayesian_cramers_v").
- alpha
The Dirichlet prior used, or NA when method is not "bayesian_cramers_v".
- corrected
Logical flag, TRUE when method = "cramers_v_corrected". Kept for backward compatibility.
- n_vars
Number of variables (graph vertices).
- n_pairs_total
Number of variable pairs evaluated.
- n_pairs
Number of retained graph edges (pairs with non-zero effect size).
- call
The matched call.
Details
Scope. A catgraph is a pairwise association network,
not a conditional-independence graphical model. Edges encode bivariate
dependence between two variables and do not imply that the two variables
remain dependent after controlling for the remaining variables. Interpret
centrality, community, and bridge measures accordingly. See the package
vignette for a full discussion.
All variable pairs with non-zero effect size are retained by default (no
thresholding at construction time). To remove weak or non-significant
edges, pass the object to prune_edges.
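To make the idea of thresholding after construction concrete, here is a small sketch that drops weak edges from a weighted igraph object using igraph directly. It does not call prune_edges, whose arguments are not documented on this page. The cut-off of 0.2 is an arbitrary value chosen for illustration, and the assumption that weights live in an edge attribute named weight applies to this toy graph only.

```r
library(igraph)

# Edge list copied from the summary(cg) output in the Examples section.
edges <- data.frame(
  var1   = c("Sex", "Class", "Class", "Class", "Sex", "Age"),
  var2   = c("Survived", "Sex", "Survived", "Age", "Age", "Survived"),
  weight = c(0.45560, 0.39872, 0.29412, 0.23195, 0.11101, 0.09758)
)

# Columns after the first two become edge attributes (here: weight).
g <- graph_from_data_frame(edges, directed = FALSE)

# Hypothetical cut-off, for illustration only.
min_weight <- 0.2
g_pruned <- delete_edges(g, which(E(g)$weight < min_weight))

ecount(g)         # edges before pruning
ecount(g_pruned)  # edges after pruning
```

Deleting edges leaves the vertex set unchanged, so a variable whose associations all fall below the cut-off would remain in the graph as an isolated vertex.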
Methods (by generic)
- print(catgraph): Print a concise summary of a catgraph object.
- summary(catgraph): Summarise a catgraph object, listing edges sorted by effect size.
References
Bergsma, W. (2013). A bias-correction for Cramer's V and Tschuprow's T. Journal of the Korean Statistical Society, 42(3), 323–328. doi:10.1016/j.jkss.2012.10.002
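For orientation, the bias correction proposed in Bergsma (2013) can be written as follows for an r x c table with n observations. This assumes that method = "cramers_v_corrected" follows that paper, which this page suggests but does not state explicitly.

```latex
\varphi^2 = \frac{\chi^2}{n}, \qquad
\tilde{\varphi}^2 = \max\!\left(0,\; \varphi^2 - \frac{(r-1)(c-1)}{n-1}\right)

\tilde{r} = r - \frac{(r-1)^2}{n-1}, \qquad
\tilde{c} = c - \frac{(c-1)^2}{n-1}

\tilde{V} = \sqrt{\frac{\tilde{\varphi}^2}{\min(\tilde{r}-1,\; \tilde{c}-1)}}
```

The classical statistic is recovered by dropping the correction terms: V = sqrt(phi^2 / min(r - 1, c - 1)).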
Examples
df <- expand_table(Titanic)
cg <- catgraph(df)
cg
#> catgraph object (pairwise association network)
#> Variables : 4
#> Edges : 6
#> Method : Cramer's V (classical)
#> Weights : min = 0.0976 median = 0.2630 max = 0.4556
#> Note : edges encode pairwise marginal association, not
#> conditional independence. All metrics lie on [0, 1].
#> NMI / AMI weights are not exchangeable with Cramer's V
#> weights across graph objects. See vignette
#> 'Methodological caveats'.
summary(cg)
#> catgraph summary
#> Variables : 4
#> Pairs evaluated : 6
#> Edges retained : 6
#>
#> Method : Cramer's V (classical)
#>
#> Top 6 edges by effect size:
#>
#> var1 var2 effect_size metric p_value n type
#> 1 Sex Survived 0.45560 phi 2.302e-101 2201 2x2
#> 2 Class Sex 0.39872 cramers_v 1.557e-75 2201 RxC
#> 3 Class Survived 0.29412 cramers_v 5.000e-41 2201 RxC
#> 4 Class Age 0.23195 cramers_v 1.695e-25 2201 RxC
#> 5 Sex Age 0.11101 phi 1.907e-07 2201 2x2
#> 6 Age Survived 0.09758 phi 4.701e-06 2201 2x2
cg_bc <- catgraph(df, method = "cramers_v_corrected")