Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

estimate_grouplevel(): improvements for 1.0? BLUPs, random effects etc #412

Closed
3 of 5 tasks
DominiqueMakowski opened this issue Feb 21, 2025 · 3 comments · Fixed by #429
Closed
3 of 5 tasks

estimate_grouplevel(): improvements for 1.0? BLUPs, random effects etc #412

DominiqueMakowski opened this issue Feb 21, 2025 · 3 comments · Fixed by #429
Labels
Consistency 🍏 🍎 Expected output across functions or API design could be more similar Enhancement 💥 Implemented features can be improved or revised
Milestone

Comments

@DominiqueMakowski
Copy link
Member

DominiqueMakowski commented Feb 21, 2025

While writing a few lines for #405

I thought I'd so some brainstorming

Name

currently we refer in the vignette and the docstrings as "group-level parameters" as opposed to "population-level parameters". But would it be clearer to name them individual-level parameters? This kind of assumes that the groups are individuals, but on the other hand "group-level" might mean "at the level of the whole group [of participants]". I'm just having this kind of late semantic doubt, don't hesitate to reassure me that what we have is good enough

Arguments

we have estimate_grouplevel(..., type="random") (ranef) or type="total" (BLUPs, random + fixed)

to make in line with our 1.0 vision, maybe we could think of something better?

  • estimate="relative" (relative deviation from the fixed effect) vs. "absolute" [not sure that's much of an improvement on top of random/total tho]
  • ...

I'd be happy to keep things as is, but also it's worth doing some brainstorming now before 1.0

Other

  • Example in docstrings mentions outdated deviation=TRUE argument
  • Function throws some error with glmmTMB it seems
  • Better clean up of names for brms
  • Add the indices="all" and group="all" filtering capabilities (available in reshape_grouplevel()) as part of the main function too
  • type argument doesn't work for brms
DominiqueMakowski added a commit that referenced this issue Feb 21, 2025
@DominiqueMakowski
Copy link
Member Author

Here are some examples (we should probably add tests)

data <- iris
data$Group <- as.factor(rep(c("G1", "G2", "G3"), each = 50))

# LME4
model <- lme4::lmer(Sepal.Width ~ Petal.Width + (Petal.Width | Group), data = data)

modelbased::estimate_grouplevel(model, type = "random")
#> Group | Level | Parameter   | Coefficient |   SE |         95% CI
#> -----------------------------------------------------------------
#> Group | G1    | (Intercept) |        1.09 | 0.06 | [ 0.98,  1.20]
#> Group | G1    | Petal.Width |   -7.59e-04 | 0.15 | [-0.30,  0.30]
#> Group | G2    | (Intercept) |       -0.53 | 0.18 | [-0.88, -0.18]
#> Group | G2    | Petal.Width |        0.09 | 0.13 | [-0.16,  0.35]
#> Group | G3    | (Intercept) |       -0.56 | 0.23 | [-1.01, -0.11]
#> Group | G3    | Petal.Width |       -0.09 | 0.11 | [-0.31,  0.13]
modelbased::estimate_grouplevel(model, type = "total")
#> Group | Level | Parameter   | Coefficient
#> -----------------------------------------
#> Group | G1    | (Intercept) |        3.23
#> Group | G1    | Petal.Width |        0.78
#> Group | G2    | (Intercept) |        1.61
#> Group | G2    | Petal.Width |        0.88
#> Group | G3    | (Intercept) |        1.58
#> Group | G3    | Petal.Width |        0.69

# Alternative estimation
modelbased::estimate_grouplevel(model, type = "total") |> 
  datawizard::data_filter(Parameter == "Petal.Width")
#> Group | Level | Parameter   | Coefficient
#> -----------------------------------------
#> Group | G1    | Petal.Width |        0.78
#> Group | G2    | Petal.Width |        0.88
#> Group | G3    | Petal.Width |        0.69
modelbased::estimate_slopes(model, trend="Petal.Width", by=c("Group"))
#> Estimated Marginal Effects
#> 
#> Group | Slope |   SE |       95% CI |    t |      p
#> ---------------------------------------------------
#> G1    |  0.78 | 0.16 | [0.47, 1.10] | 4.86 | < .001
#> G2    |  0.88 | 0.16 | [0.56, 1.19] | 5.44 | < .001
#> G3    |  0.69 | 0.16 | [0.37, 1.01] | 4.29 | < .001
#> 
#> Marginal effects estimated for Petal.Width
#> Type of slope was dY/dX


# GLMMTMB
model <- glmmTMB::glmmTMB(Sepal.Width ~ Petal.Width + (Petal.Width | Group), data = data)

modelbased::estimate_grouplevel(model, type = "random")
#> Group | Level | Parameter   | Coefficient |   SE |        95% CI |   Component
#> ------------------------------------------------------------------------------
#> Group | G1    | (Intercept) |        1.08 | 0.47 | [ 0.15, 2.00] | conditional
#> Group | G1    | Petal.Width |   -5.19e-03 | 0.29 | [-0.58, 0.57] | conditional
#> Group | G2    | (Intercept) |       -0.40 | 0.57 | [-1.51, 0.71] | conditional
#> Group | G2    | Petal.Width |        0.01 | 0.29 | [-0.56, 0.59] | conditional
#> Group | G3    | (Intercept) |       -0.68 | 0.55 | [-1.75, 0.40] | conditional
#> Group | G3    | Petal.Width |   -9.56e-03 | 0.21 | [-0.42, 0.40] | conditional
modelbased::estimate_grouplevel(model, type = "total")
#> Warning in random[random$Parameter == p, cols] + fixed[fixed$Parameter == :
#> longer object length is not a multiple of shorter object length
#> Warning in random[random$Parameter == p, cols] + fixed[fixed$Parameter == :
#> longer object length is not a multiple of shorter object length
#> Group | Level | Parameter   | Coefficient | Component  
#> -------------------------------------------------------
#> Group | G1    | (Intercept) |        5.41 | conditional
#> Group | G1    | Petal.Width |        0.74 | conditional
#> Group | G2    | (Intercept) |        0.20 | conditional
#> Group | G2    | Petal.Width |        0.76 | conditional
#> Group | G3    | (Intercept) |        3.66 | conditional
#> Group | G3    | Petal.Width |        0.73 | conditional

# Alternative estimation
modelbased::estimate_grouplevel(model, type = "total") |> 
  datawizard::data_filter(Parameter == "Petal.Width")
#> Warning in random[random$Parameter == p, cols] + fixed[fixed$Parameter == :
#> longer object length is not a multiple of shorter object length
#> Warning in random[random$Parameter == p, cols] + fixed[fixed$Parameter == :
#> longer object length is not a multiple of shorter object length
#> Group | Level | Parameter   | Coefficient | Component  
#> -------------------------------------------------------
#> Group | G1    | Petal.Width |        0.74 | conditional
#> Group | G2    | Petal.Width |        0.76 | conditional
#> Group | G3    | Petal.Width |        0.73 | conditional
modelbased::estimate_slopes(model, trend="Petal.Width", by=c("Group"))
#> Estimated Marginal Effects
#> 
#> Group | Slope |   SE |       95% CI |    z |      p
#> ---------------------------------------------------
#> G1    |  0.74 | 0.17 | [0.40, 1.08] | 4.25 | < .001
#> G2    |  0.76 | 0.17 | [0.42, 1.10] | 4.36 | < .001
#> G3    |  0.73 | 0.17 | [0.39, 1.07] | 4.22 | < .001
#> 
#> Marginal effects estimated for Petal.Width
#> Type of slope was dY/dX


# BRMS
model <- brms::brm(Sepal.Width ~ Petal.Width + (Petal.Width | Group), data = data, refresh=0)
#> Compiling Stan program...
#> Start sampling
#> Warning: There were 321 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: There were 34 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
#> https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess

modelbased::estimate_grouplevel(model, type = "random")
#> Group | Level                   | Parameter                        
#> -------------------------------------------------------------------
#> Group | (Intercept)             | sd_Group__Intercept              
#> Group | Group.G1                | r_Group[G1,Intercept]            
#> Group | Group.G1                | r_Group[G1,Petal.Width]          
#> Group | Group.G2                | r_Group[G2,Intercept]            
#> Group | Group.G2                | r_Group[G2,Petal.Width]          
#> Group | Group.G3                | r_Group[G3,Intercept]            
#> Group | Group.G3                | r_Group[G3,Petal.Width]          
#> Group | Intercept ~ Petal.Width | cor_Group__Intercept__Petal.Width
#> Group | Petal.Width             | sd_Group__Petal.Width            
#> 
#> Group | Component   |   Median |        95% CI |     pd |  Rhat |     ESS
#> -------------------------------------------------------------------------
#> Group | conditional |     1.46 | [ 0.56, 4.28] |   100% | 1.006 | 1032.00
#> Group | conditional |     0.96 | [-0.76, 3.00] | 91.47% | 1.017 |  732.00
#> Group | conditional | 4.58e-04 | [-0.75, 0.94] | 50.12% | 1.017 |  319.00
#> Group | conditional |    -0.67 | [-2.51, 1.29] | 78.22% | 1.022 |  359.00
#> Group | conditional |     0.07 | [-0.60, 1.09] | 63.20% | 1.042 |  183.00
#> Group | conditional |    -0.63 | [-2.39, 1.39] | 76.78% | 1.023 |  418.00
#> Group | conditional |    -0.07 | [-0.92, 0.68] | 65.62% | 1.046 |  147.00
#> Group | conditional |     0.08 | [-0.94, 0.93] | 53.67% | 1.021 |  255.00
#> Group | conditional |     0.40 | [ 0.02, 2.77] |   100% | 1.106 |   35.00
modelbased::estimate_grouplevel(model, type = "total")
#> Group | Level                   | Parameter                        
#> -------------------------------------------------------------------
#> Group | (Intercept)             | sd_Group__Intercept              
#> Group | Group.G1                | r_Group[G1,Intercept]            
#> Group | Group.G1                | r_Group[G1,Petal.Width]          
#> Group | Group.G2                | r_Group[G2,Intercept]            
#> Group | Group.G2                | r_Group[G2,Petal.Width]          
#> Group | Group.G3                | r_Group[G3,Intercept]            
#> Group | Group.G3                | r_Group[G3,Petal.Width]          
#> Group | Intercept ~ Petal.Width | cor_Group__Intercept__Petal.Width
#> Group | Petal.Width             | sd_Group__Petal.Width            
#> 
#> Group | Component   |   Median |     pd |  Rhat |     ESS
#> ---------------------------------------------------------
#> Group | conditional |     1.46 |   100% | 1.006 | 1032.00
#> Group | conditional |     0.96 | 91.47% | 1.017 |  732.00
#> Group | conditional | 4.58e-04 | 50.12% | 1.017 |  319.00
#> Group | conditional |    -0.67 | 78.22% | 1.022 |  359.00
#> Group | conditional |     0.07 | 63.20% | 1.042 |  183.00
#> Group | conditional |    -0.63 | 76.78% | 1.023 |  418.00
#> Group | conditional |    -0.07 | 65.62% | 1.046 |  147.00
#> Group | conditional |     0.08 | 53.67% | 1.021 |  255.00
#> Group | conditional |     0.40 |   100% | 1.106 |   35.00

# Alternative estimation
modelbased::estimate_grouplevel(model, type = "total") 
#> Group | Level                   | Parameter                        
#> -------------------------------------------------------------------
#> Group | (Intercept)             | sd_Group__Intercept              
#> Group | Group.G1                | r_Group[G1,Intercept]            
#> Group | Group.G1                | r_Group[G1,Petal.Width]          
#> Group | Group.G2                | r_Group[G2,Intercept]            
#> Group | Group.G2                | r_Group[G2,Petal.Width]          
#> Group | Group.G3                | r_Group[G3,Intercept]            
#> Group | Group.G3                | r_Group[G3,Petal.Width]          
#> Group | Intercept ~ Petal.Width | cor_Group__Intercept__Petal.Width
#> Group | Petal.Width             | sd_Group__Petal.Width            
#> 
#> Group | Component   |   Median |     pd |  Rhat |     ESS
#> ---------------------------------------------------------
#> Group | conditional |     1.46 |   100% | 1.006 | 1032.00
#> Group | conditional |     0.96 | 91.47% | 1.017 |  732.00
#> Group | conditional | 4.58e-04 | 50.12% | 1.017 |  319.00
#> Group | conditional |    -0.67 | 78.22% | 1.022 |  359.00
#> Group | conditional |     0.07 | 63.20% | 1.042 |  183.00
#> Group | conditional |    -0.63 | 76.78% | 1.023 |  418.00
#> Group | conditional |    -0.07 | 65.62% | 1.046 |  147.00
#> Group | conditional |     0.08 | 53.67% | 1.021 |  255.00
#> Group | conditional |     0.40 |   100% | 1.106 |   35.00
modelbased::estimate_slopes(model, trend="Petal.Width", by=c("Group"))
#> Estimated Marginal Effects
#> 
#> Group | Median |       95% CI |     pd |          ROPE | % in ROPE
#> ------------------------------------------------------------------
#> G1    |   0.80 | [0.20, 1.48] | 99.33% | [-0.10, 0.10] |        0%
#> G2    |   0.91 | [0.56, 1.34] |   100% | [-0.10, 0.10] |        0%
#> G3    |   0.68 | [0.36, 0.95] |   100% | [-0.10, 0.10] |        0%
#> 
#> Marginal effects estimated for Petal.Width
#> Type of slope was dY/dX

Created on 2025-02-21 with reprex v2.1.1

@strengejacke
Copy link
Member

Maybe that's a psycho-thing. In epidemiology, the groups are usually other units than a (repeated sample of) individuals, but rather countries/regions, or other group-characteristics.

but on the other hand "group-level" might mean "at the level of the whole group [of participants]".

That would be my argument in case of, e.g. RCTs, where you indeed have individuals.

But I don't mind, "individual" could also refer to an "individual group member", which can be a patient, a country, a school/class, whatever... so we either have individual or group here (in teaching, I initially say that whenever I refer to variables used as random effects (intercepts), I call them "group factor" or "group level factor").

@bwiernik
Copy link
Contributor

I think "group" makes the most sense — the random effect variable is a "grouping variable". Depending on the field, it might be participants, subgroups, schools, families, plots, etc. The generic unifying feature is that these are variables that group multiple rows/cases together.

I think "random" and "total" are probably the most clear. "Random" is also applicable in cases where a fixed component isn't included in the model.

@strengejacke strengejacke added this to the 1.0.0 milestone Feb 23, 2025
@strengejacke strengejacke added Consistency 🍏 🍎 Expected output across functions or API design could be more similar Enhancement 💥 Implemented features can be improved or revised labels Feb 23, 2025
@DominiqueMakowski DominiqueMakowski linked a pull request Mar 1, 2025 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Consistency 🍏 🍎 Expected output across functions or API design could be more similar Enhancement 💥 Implemented features can be improved or revised
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants