The Delta framework answers two related but distinct questions about peptide-level prevalence:
deltaplot()
and deltaplot_interactive() place each peptide at its
pooled prevalence (x) and prevalence shift Δ (y), providing a global
view of which features move between groups.compute_delta() aggregates per-peptide z-scores into a
single Stouffer-type statistic and assesses significance via
subject-label permutation. forestplot() and
forestplot_interactive() then display the top and bottom
features ranked by that statistic.This vignette walks through the complete Delta workflow in phiper:
deltaplot() and
deltaplot_interactive()compute_delta()forestplot() and
forestplot_interactive()Load the bundled example dataset. It contains two patient groups
(A, B) measured at two timepoints
(T1, T2) across 1 000 simulated peptides.
pd <- load_example_data()
pd
#> ── <phip_data> ─────────────────────────────────────────────────────────────────
#>
#> counts (first 5 rows):
#> # A tibble: 5 × 9
#> sample_id subject_id group timepoint peptide_id exist counts_control
#> <chr> <chr> <chr> <chr> <chr> <int> <int>
#> 1 A_T1_1 1 A T1 10003 1 5
#> 2 A_T1_1 1 A T1 10017 1 37
#> 3 A_T1_1 1 A T1 10023 1 11
#> 4 A_T1_1 1 A T1 10062 1 0
#> 5 B_T1_1 1 B T1 10087 1 1
#> # ℹ 2 more variables: counts_hits <int>, fold_change <dbl>
#>
#> table size: 78,200 rows x 9 columns
#>
#> peptide library preview (first 5 rows):
#> # A tibble: 5 × 8
#> peptide_id Fullname species genus family order class common
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 agilent_1 Chromodomain-helicase-DNA-… Homo s… Homo Homin… Prim… Mamm… Human
#> 2 agilent_2 integral membrane protein Mycoba… Myco… Mycob… Myco… Acti… <NA>
#> 3 agilent_3 hypothetical protein (6/16… Mycoba… Myco… Mycob… Myco… Acti… <NA>
#> 4 agilent_4 envelope protein (5/8) & a… Orthof… Orth… Flavi… Amar… Flas… JEV
#> 5 agilent_5 Myosin-7 & beta-myosin hea… Homo s… Homo Homin… Prim… Mamm… Human
#> ... plus 36 more columns
#>
#> library size: 357,190 rows x 44 columns
#>
#> meta flags:
#> con: <duckdb_connection>
#> longitudinal: TRUE
#> exist: TRUE
#> fold_change: TRUE
#> raw_counts: FALSE
#> extra_cols: group, counts_control, counts_hits
#> peptide_con: <duckdb_connection>
#> materialise_table: TRUE
#> finalizer_env: <environment>
#> full_cross: FALSENote. The example data are entirely simulated and have no biological meaning. They exist solely to demonstrate the API.
The Delta plots consume prevalence tables produced by
compute_pop() (see the POP vignette for details). Compute a
group-level prevalence table first:
pop_group <- compute_pop(
pd,
rank_cols = "peptide_id",
group_cols = "group"
)
#> [08:06:38] INFO compute_pop
#> [08:06:38] INFO compute_pop
#> - ranks : peptide_id
#> - group_cols: group
#> - exist_col : exist
#> - pop_k_min : 1
#> - paired : FALSE
#> [08:06:38] INFO ranks resolved
#> - available: peptide_id
#> [08:06:38] INFO computing cohort sizes and validating binary group_cols
#> [08:06:38] INFO computing presence per sample via k-of-n rule
#> [08:06:38] INFO counting present samples per feature (pop, unpaired)
#> [08:06:39] INFO building pairwise comparisons
#> [08:06:40] OK materialized; computing Fisher p-values
#> - table: ph_pop_20260614_080639
#> [08:06:41] OK done (compute_pop, unpaired)
#> - rows : 1950
#> - ranks : peptide_id
#> - k_min : 1
#> [08:06:41] OK compute_pop - done
#> -> elapsed: 2.956s
head(pop_group)
#> rank feature group_col group1 n1 N1 prop1 percent1 group2 n2 N2
#> 1 peptide_id 10003 group A 31 38 0.8157895 81.57895 B 0 42
#> 2 peptide_id 10017 group A 6 38 0.1578947 15.78947 B 0 42
#> 3 peptide_id 10023 group A 29 38 0.7631579 76.31579 B 0 42
#> 4 peptide_id 10062 group A 11 38 0.2894737 28.94737 B 0 42
#> 5 peptide_id 10087 group A 0 38 0.0000000 0.00000 B 8 42
#> 6 peptide_id 10100 group A 0 38 0.0000000 0.00000 B 10 42
#> prop2 percent2 ratio delta_ratio p_raw n_peptides
#> 1 0.0000000 0.00000 68.52631579 33.263158 8.819969e-16 1
#> 2 0.0000000 0.00000 13.26315789 5.631579 9.186952e-03 1
#> 3 0.0000000 0.00000 64.10526316 31.052632 3.123739e-14 1
#> 4 0.0000000 0.00000 24.31578947 11.157895 1.148463e-04 1
#> 5 0.1904762 19.04762 0.06907895 -6.238095 5.758809e-03 1
#> 6 0.2380952 23.80952 0.05526316 -8.047619 1.180799e-03 1For the unpaired group comparison with compute_delta(),
each subject must appear at most once per group. Because pd
contains two timepoints per subject, restrict to a single timepoint
first:
deltaplot() — static ggplot2deltaplot() plots each peptide at:
(prop1 + prop2) / 2prop2 − prop1A dashed horizontal line marks Δ = 0. Arrows and labels indicate the
direction of enrichment. A GAM smooth (controlled by
add_smooth) summarises the trend across pooled
prevalence.
The only required input is a data frame containing
group1, group2, prop1, and
prop2 — which is exactly what compute_pop()
returns.
deltaplot(
pop_group,
group_pair_values = c("A", "B"),
group_labels = c("Group A", "Group B")
)
#> [08:06:41] INFO Preparing delta prevalence plot.Use group_pair_values to select a specific contrast when
the table contains more than one pair. group_labels
provides display names for the axis annotations.
The smooth uses a GAM with basis dimension smooth_k.
Lower values produce smoother curves; set
add_smooth = FALSE to suppress it.
deltaplot(
pop_group,
group_pair_values = c("A", "B"),
group_labels = c("Group A", "Group B"),
smooth_k = 3,
point_alpha = 0.15,
point_size = 0.4
)
#> [08:06:42] INFO Preparing delta prevalence plot.deltaplot_interactive() — plotlyThe interactive version mirrors the static API. Hovering over a point shows the feature identifier, group-level counts and percentages, and p-value.
deltaplot_interactive(
pop_group,
group_pair_values = c("A", "B"),
group_labels = c("Group A", "Group B")
)Disable the smooth if you have few peptides or want a cleaner display:
deltaplot_interactive(
pop_group,
group_pair_values = c("A", "B"),
group_labels = c("Group A", "Group B"),
add_smooth = FALSE,
point_size = 5,
point_alpha = 0.5
)compute_delta()compute_delta() performs a permutation test for a global
(antigen-wide) shift in prevalence between two groups. For each
(rank, feature) stratum it computes a per-peptide z-score,
aggregates them via a Stouffer statistic, and assesses significance by
permuting subject labels.
set.seed(42)
delta_group <- compute_delta(
pd_t1,
rank_cols = "peptide_id",
group_cols = "group",
B_permutations = 2000L
)delta_group
#> # A tibble: 1,950 × 19
#> rank feature group_col group1 group2 design n_subjects_paired
#> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 peptide_id 10003 group A B unpaired NA
#> 2 peptide_id 10017 group A B unpaired NA
#> 3 peptide_id 10023 group A B unpaired NA
#> 4 peptide_id 10062 group A B unpaired NA
#> 5 peptide_id 10087 group A B unpaired NA
#> 6 peptide_id 10100 group A B unpaired NA
#> 7 peptide_id 10108 group A B unpaired NA
#> 8 peptide_id 1011 group A B unpaired NA
#> 9 peptide_id 10121 group A B unpaired NA
#> 10 peptide_id 1013 group A B unpaired NA
#> # ℹ 1,940 more rows
#> # ℹ 12 more variables: n_peptides_used <int>, m_eff <dbl>, T_obs <dbl>,
#> # T_null_mean <dbl>, T_null_sd <dbl>, T_obs_stand <dbl>, Z_from_p <dbl>,
#> # p_perm <dbl>, b <int>, max_delta <dbl>, frac_delta_pos <dbl>,
#> # frac_delta_pos_w <dbl>| Column | Description |
|---|---|
rank |
Rank at which the feature is defined |
feature |
Feature identifier |
group_col |
Name of the grouping column |
group1, group2 |
The two groups compared |
design |
"paired" or "unpaired" |
n_subjects_paired |
Paired subjects used (or NA for unpaired) |
n_peptides_used |
Peptides contributing to the test |
m_eff |
Effective number of peptides after weighting |
T_obs |
Observed Stouffer-type test statistic |
p_perm |
Two-sided permutation p-value |
b |
Number of permutations actually used |
max_delta |
Maximum absolute peptide-level prevalence difference |
frac_delta_pos |
Unweighted fraction of positive peptide-level deltas |
frac_delta_pos_w |
Weighted fraction of positive peptide-level deltas |
stat_mode controls how each peptide’s z-score is
computed:
stat_mode |
Description |
|---|---|
"srlr" (default) |
Signed root likelihood ratio |
"diff" |
Wald-type z: Δ / SE |
"asin" |
Arcsine-transformed z (variance-stabilised) |
"score" |
Pooled score z (2×2 score test) |
"mcnemar" |
Signed McNemar z — paired designs only |
"srlr_paired" |
Signed root deviance — paired designs only |
aggregate_stat controls how peptide-level z-scores are
combined:
"stouffer" (default) — weighted Stouffer
combination"maxmean" — mean of the dominant direction"af" — absolute fractionstrat_bins bins peptides by pooled prevalence before
aggregation. The default bins c(0.002, 0.005, ..., 0.50)
stratify across the prevalence range so that low- and high-prevalence
peptides are treated separately. Set strat_bins = 0 for a
single global statistic without stratification.
# No stratification
delta_nostrat <- compute_delta(
pd,
rank_cols = "peptide_id",
group_cols = "group",
strat_bins = 0,
B_permutations = 2000L
)
# Custom bins
delta_custom <- compute_delta(
pd,
rank_cols = "peptide_id",
group_cols = "group",
strat_bins = c(0.05, 0.25, 0.50),
B_permutations = 2000L
)weight_mode controls peptide weights in the Stouffer
statistic:
"equal" (default) — all peptides weight 1"se_invvar" — inverse-SE weights (downweights noisy
peptides)"n_eff_sqrt" — square root of expected positives
(emphasises common peptides)When the same subjects are measured at two timepoints, set
paired_by to the subject-linking column and
stat_mode to a paired statistic. The function then performs
sign-flipping permutations instead of label shuffling.
set.seed(42)
delta_paired <- compute_delta(
pd,
rank_cols = "peptide_id",
group_cols = "timepoint",
paired_by = "subject_id",
stat_mode = "srlr_paired",
B_permutations = 2000L
)
delta_paired
#> # A tibble: 1,946 × 19
#> rank feature group_col group1 group2 design n_subjects_paired
#> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 peptide_id 10003 timepoint T1 T2 paired 21
#> 2 peptide_id 10017 timepoint T1 T2 paired 21
#> 3 peptide_id 10023 timepoint T1 T2 paired 21
#> 4 peptide_id 10062 timepoint T1 T2 paired 21
#> 5 peptide_id 10087 timepoint T1 T2 paired 21
#> 6 peptide_id 10100 timepoint T1 T2 paired 21
#> 7 peptide_id 10108 timepoint T1 T2 paired 21
#> 8 peptide_id 1011 timepoint T1 T2 paired 21
#> 9 peptide_id 10121 timepoint T1 T2 paired 21
#> 10 peptide_id 1013 timepoint T1 T2 paired 21
#> # ℹ 1,936 more rows
#> # ℹ 12 more variables: n_peptides_used <int>, m_eff <dbl>, T_obs <dbl>,
#> # T_null_mean <dbl>, T_null_sd <dbl>, T_obs_stand <dbl>, Z_from_p <dbl>,
#> # p_perm <dbl>, b <int>, max_delta <dbl>, frac_delta_pos <dbl>,
#> # frac_delta_pos_w <dbl>Forest plots display the most extreme positive and negative features for a chosen rank, sorted by the selected test statistic.
forestplot() — static ggplot2forestplot(
delta_group,
rank_of_interest = "peptide_id",
statistic_to_plot = "T",
n_neg_each = 10,
n_pos_each = 10,
left_label = "More in A",
right_label = "More in B"
)$plotstatistic_to_plot selects what is plotted on the
x-axis:
| Value | Description |
|---|---|
"T" |
Raw T_obs from compute_delta() |
"T_stand" |
Permutation-standardised T (requires T_obs_stand
column) |
"Z_from_p" |
Signed Z derived from permutation p-values (requires
Z_from_p) |
Set use_diverging_colors = TRUE to shade bars from blue
(negative) to red (positive):
forestplot(
delta_group,
rank_of_interest = "peptide_id",
statistic_to_plot = "T",
n_neg_each = 10,
n_pos_each = 10,
left_label = "More in A",
right_label = "More in B",
use_diverging_colors = TRUE
)$plotPass a column name to filter_significant to restrict the
plot to features below a given significance threshold. With
p_perm this retains only permutation-significant
features.
forestplot_interactive() — plotlyThe interactive version mirrors the static API and returns a plotly widget. Hovering over a bar shows the feature identifier and statistic value.
forestplot_interactive(
delta_group,
rank_of_interest = "peptide_id",
statistic_to_plot = "T",
n_neg_each = 10,
n_pos_each = 10,
left_label = "More in A",
right_label = "More in B",
use_diverging_colors = TRUE
)$plotA typical Delta analysis runs in four steps:
# 1. Compute prevalence (needed for deltaplot)
pop <- compute_pop(
pd,
rank_cols = "peptide_id",
group_cols = "group"
)
# 2. Visualise raw shift
deltaplot(pop, group_pair_values = c("A", "B"),
group_labels = c("Group A", "Group B"))
deltaplot_interactive(pop, group_pair_values = c("A", "B"),
group_labels = c("Group A", "Group B"))
# 3. Permutation test (unpaired)
set.seed(42)
delta <- compute_delta(
pd,
rank_cols = "peptide_id",
group_cols = "group",
B_permutations = 2000L
)
# Optional: paired test across timepoints
delta_paired <- compute_delta(
pd,
rank_cols = "peptide_id",
group_cols = "timepoint",
paired_by = "subject_id",
stat_mode = "srlr_paired",
B_permutations = 2000L
)
# 4. Forest plots
forestplot(delta, rank_of_interest = "peptide_id",
left_label = "More in A", right_label = "More in B")$plot
forestplot_interactive(delta, rank_of_interest = "peptide_id",
left_label = "More in A",
right_label = "More in B")$plotsessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] dplyr_1.2.1 phiper_0.4.1 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 xfun_0.58 bslib_0.11.0
#> [4] ggplot2_4.0.3 parallelDist_0.2.7 lattice_0.22-9
#> [7] vctrs_0.7.3 tools_4.6.0 generics_0.1.4
#> [10] parallel_4.6.0 tibble_3.3.1 cluster_2.1.8.2
#> [13] blob_1.3.0 pkgconfig_2.0.3 Matrix_1.7-5
#> [16] dbplyr_2.5.2 RColorBrewer_1.1-3 S7_0.2.2
#> [19] RcppParallel_5.1.11-2 lifecycle_1.0.5 compiler_4.6.0
#> [22] farver_2.1.2 codetools_0.2-20 permute_0.9-10
#> [25] htmltools_0.5.9 sys_3.4.3 buildtools_1.0.0
#> [28] sass_0.4.10 yaml_2.3.12 pillar_1.11.1
#> [31] jquerylib_0.1.4 tidyr_1.3.2 MASS_7.3-65
#> [34] cachem_1.1.0 vegan_2.7-5 nlme_3.1-169
#> [37] parallelly_1.47.0 tidyselect_1.2.1 digest_0.6.39
#> [40] Rtsne_0.17 future_1.70.0 duckdb_1.5.2
#> [43] purrr_1.2.2 showtextdb_3.0 listenv_0.10.1
#> [46] maketools_1.3.2 labeling_0.4.3 splines_4.6.0
#> [49] fastmap_1.2.0 grid_4.6.0 cli_3.6.6
#> [52] magrittr_2.0.5 utf8_1.2.6 future.apply_1.20.2
#> [55] withr_3.0.2 scales_1.4.0 showtext_0.9-8
#> [58] globals_0.19.1 sysfonts_0.8.9 otel_0.2.0
#> [61] ggsignif_0.6.4 chk_0.10.0 evaluate_1.0.5
#> [64] knitr_1.51 mgcv_1.9-4 phiperio_0.5.0
#> [67] rlang_1.2.0 Rcpp_1.1.1-1.1 glue_1.8.1
#> [70] DBI_1.3.0 jsonlite_2.0.0 R6_2.6.1