Improve `tidy.coxph()` for multistate models #1236

DanChaltiel · 2025-01-05T13:02:17Z

Feature

Hi,

When using survival::coxph() for multistate models, the summary() (and therefore the tidy()) output contains indices only.
These refer to the indices of fit$states, and are hard to read, especially when dealing with multiple models.
If these indices could be replaced by their labels, the readability would improve dramatically.

I know that {broom}'s policy is now to avoid changing its existing methods, but in this case I think it could be worth it.

This could be a `use_label = TRUE` option, which should probably only take effect for objects of class `coxphms`.

Here is an example of an expected output:

library(tidyverse)
library(survival)
#example from vignette compete.pdf 
mgus2$etime <- with(mgus2, ifelse(pstat==0, futime, ptime)) 
event <- with(mgus2, ifelse(pstat==0, 2*death, 1))
mgus2$event <- factor(event, 0:2, labels=c("censor", "pcm", "death"))
fit <- coxph(Surv(etime, event) ~ age + sex + mspike, mgus2, id=id)

tidy_ms = function(fit){
  x = broom::tidy(fit)
  s = fit$states
  x$term = x$term %>%
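    str_replace_all("(\\d+):(\\d+)", function(match) {
      # look up each state index in fit$states; \\d+ also covers 10 or more states
      parts <- str_match(match, "(\\d+):(\\d+)")
      # column indexing works whether `match` holds one match or several
      from <- as.numeric(parts[, 2])
      to <- as.numeric(parts[, 3])
      paste0(s[from], ":", s[to])
    })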
  x %>% 
    separate(term, into=c("term", "transition"), sep="_")
} 

broom::tidy(fit)
#> # A tibble: 6 × 6
#>   term       estimate std.error robust.se statistic  p.value
#>   <chr>         <dbl>     <dbl>     <dbl>     <dbl>    <dbl>
#> 1 age_1:2     0.0164    0.00837   0.00694    2.35   1.85e- 2
#> 2 sexM_1:2   -0.00503   0.188     0.188     -0.0268 9.79e- 1
#> 3 mspike_1:2  0.884     0.165     0.168      5.25   1.51e- 7
#> 4 age_1:3     0.0652    0.00365   0.00374   17.4    3.80e-68
#> 5 sexM_1:3    0.389     0.0699    0.0666     5.84   5.20e- 9
#> 6 mspike_1:3 -0.0593    0.0639    0.0620    -0.956  3.39e- 1
tidy_ms(fit)
#> # A tibble: 6 × 7
#>   term   transition estimate std.error robust.se statistic  p.value
#>   <chr>  <chr>         <dbl>     <dbl>     <dbl>     <dbl>    <dbl>
#> 1 age    (s0):pcm    0.0164    0.00837   0.00694    2.35   1.85e- 2
#> 2 sexM   (s0):pcm   -0.00503   0.188     0.188     -0.0268 9.79e- 1
#> 3 mspike (s0):pcm    0.884     0.165     0.168      5.25   1.51e- 7
#> 4 age    (s0):death  0.0652    0.00365   0.00374   17.4    3.80e-68
#> 5 sexM   (s0):death  0.389     0.0699    0.0666     5.84   5.20e- 9
#> 6 mspike (s0):death -0.0593    0.0639    0.0620    -0.956  3.39e- 1

Created on 2025-01-05 with reprex v2.1.1

Of course, this code could be improved.
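As one possible direction, here is a minimal base R sketch of the relabelling step on its own. The helper name `split_ms_term()` is hypothetical (it is not part of {broom} or {survival}), and it assumes the terms follow the `name_from:to` pattern shown above and that `states` is in the same order as `fit$states`. Unlike the version above, it splits on the last underscore before the indices, so covariate names that contain underscores are kept whole.

```r
# Hypothetical helper, not part of {broom} or {survival}.
# Splits terms such as "age_1:2" into a covariate name and a labelled transition.
split_ms_term <- function(term, states) {
  # The greedy (.*) keeps underscores inside covariate names;
  # [0-9]+ allows models with 10 or more states.
  pattern <- "^(.*)_([0-9]+):([0-9]+)$"
  m <- regmatches(term, regexec(pattern, term))
  ok <- lengths(m) == 4  # terms that do not match are left untouched

  out <- data.frame(term = term, transition = NA_character_,
                    stringsAsFactors = FALSE)
  from <- as.integer(vapply(m[ok], `[`, character(1), 3))
  to <- as.integer(vapply(m[ok], `[`, character(1), 4))
  out$term[ok] <- vapply(m[ok], `[`, character(1), 2)
  out$transition[ok] <- paste0(states[from], ":", states[to])
  out
}

# Same labels and order as fit$states in the example above
states <- c("(s0)", "pcm", "death")
res <- split_ms_term(c("age_1:2", "sexM_1:3", "my_var_1:3"), states)
print(res)
```

The `my_var_1:3` term is made up, only to illustrate the underscore case.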

Separating term from transition allows for better filtering and arranging.
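To illustrate, here is a small base R sketch that rebuilds part of the `tidy_ms(fit)` output by hand (only the `term`, `transition` and `estimate` columns, with values copied from the table above) and then filters and sorts it. With {dplyr}, `filter()` and `arrange()` would do the same job.

```r
# Hand-built copy of three columns of the tidy_ms(fit) output shown above
x <- data.frame(
  term = rep(c("age", "sexM", "mspike"), times = 2),
  transition = rep(c("(s0):pcm", "(s0):death"), each = 3),
  estimate = c(0.0164, -0.00503, 0.884, 0.0652, 0.389, -0.0593)
)

# Keep only the coefficients for the transition to death
to_death <- x[x$transition == "(s0):death", ]

# Put the same covariate side by side across transitions
by_term <- x[order(x$term, x$transition), ]

print(to_death)
print(by_term)
```

Neither operation is convenient while the covariate and the transition are fused into a single string such as `age_1:3`.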
