get_predicted(): zero-inflation options #413

bwiernik · 2021-08-03T19:12:57Z

Currently, for zero-inflated models, get_predicted() and its downstream functions like modelbased::estimate_expectation() always return the equivalent of type = "conditional" (predicted values assuming no structural zero). It would be good to let users request other quantities, such as unconditional response predictions (incorporating both parts of the model) or predictions from the zero-inflation part alone.

I think it would also be good to make the default equivalent to type = "response" (incorporating both model parts).

See the type argument in predict.glmmTMB():

type
Denoting mu as the mean of the conditional distribution and p as the zero-inflation probability, the possible choices are:

"link"
conditional mean on the scale of the link function, or equivalently the linear predictor of the conditional model
"response"
expected value; this is mu*(1-p) for zero-inflated models and mu otherwise
"conditional"
mean of the conditional response; mu for all models (i.e., synonymous with "response" in the absence of zero-inflation)
"zprob"
the probability of a structural zero (gives an error for non-zero-inflated models)
"zlink"
predicted zero-inflation probability on the scale of the logit link function
"disp"
dispersion parameter however it is defined for that particular family as described in sigma.glmmTMB
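
To make the relationship between these types concrete, here is a small base-R sketch. The values of `eta` and `zeta` are made up for illustration (they do not come from a fitted model), and the sketch assumes a log link for the conditional part and a logit link for the zero-inflation part:

```r
# Hypothetical linear predictors for three observations (illustrative values only)
eta  <- c(0.2, 1.0, -0.5)  # conditional model, log link      -> type = "link"
zeta <- c(-1.0, 0.0, 2.0)  # zero-inflation model, logit link -> type = "zlink"

mu <- exp(eta)      # type = "conditional": mean given no structural zero
p  <- plogis(zeta)  # type = "zprob": probability of a structural zero

response <- mu * (1 - p)  # type = "response": overall expected value

round(cbind(link = eta, conditional = mu, zlink = zeta, zprob = p, response), 3)
```

Because p lies strictly between 0 and 1, the "response" value is always smaller than the "conditional" value for a zero-inflated model.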


DominiqueMakowski · 2021-08-04T00:09:49Z

do you have a reproducible example with a model that has these components so I can play around?

bwiernik · 2021-08-04T00:38:24Z

Away from computer. The second example in ?glmmTMB

DominiqueMakowski · 2021-08-04T00:46:51Z

library(glmmTMB)

m <- glmmTMB(count ~ spp + mined + (1|site),
             zi=~spp + mined,
             family=nbinom2, data=Salamanders)


head(insight::get_predicted(m))
#> [1] 0.5387752 1.0768783 0.3554236 2.4701755 2.4950498 2.1819828
head(insight::get_predicted(m, type = "zprob"))
#> [1] 2.040119 2.040119 2.040119 1.174339 1.174339 1.174339

Created on 2021-08-04 by the reprex package (v2.0.0)

custom types should work after the latest commit.

So now it becomes a question of whether we want to change or add to the behaviour of our main predict argument (this is also what would drive easystats/modelbased#136). We could have predict = "dispersion" or something like that, but then again I'm not very familiar with these models, so I don't know.

bwiernik · 2021-08-04T13:33:08Z

Yes, I think the predict argument should have options for zero inflated and dispersion parameters

bwiernik · 2021-08-04T15:53:15Z

I think we should revert the type argument. Our predict argument fills the same role, and it's confusing to have two.

Instead, I think we add options to predict to include these:

Existing

"link"
- linear predictor on the link scale (the conditional part for zero-inflated models)
- with confidence intervals (uncertainty intervals on linear prediction)
- same as glmmTMB's type = "link"

Existing labels, need adjusted behavior

"expectation"
- expected value (mean) on the response scale, including both the zero-inflated and conditional parts
- with confidence intervals (uncertainty intervals on the conditional mean)
- currently, it ignores the zero-inflated part
- should be mu*(1-p) for zero-inflated models and mu otherwise
- This would be equivalent to glmmTMB's type = "response"
"prediction"
- expected value (mean) on the response scale, including both the zero-inflated and conditional parts
- with prediction intervals (uncertainty intervals on the individual cases)
"response"
- "prediction", but classifying probabilities in binomial models to 0-1
- uncertainty intervals would generally become [0, 1] for bernoulli models, but could be a range of integers for binomial models with multiple trials

New labels

"conditional"
- expected value (mean) on the response scale, conditional on non-structural zero
- with confidence intervals (uncertainty intervals on the conditional mean)
- what is currently returned by predict = "expectation"
"zprob"
- expected probability of a structural zero
- with confidence intervals (uncertainty intervals on the expected probability)
- effectively the "expectation" for the zero-inflation part of the model
- should give an error or message for non-zero-inflated models
"zlink"
- linear predictor for a structural zero on the link scale
- with confidence intervals (uncertainty intervals on the linear prediction)
- effectively the "link" for the zero-inflation part of the model
- should give an error or message for non-zero-inflated models
"dispersion"
- expected value for the dispersion parameter for the model (e.g., conditional SD for a normal linear model)
- with confidence intervals (uncertainty intervals on the conditional dispersion)
- For gaussian models with type = "disp", glmmTMB returns estimates/CI on the sigma (SD) scale; I don't see a need to offer other scales (variance or log-variance [this is what is actually modeled])

For predict = "prediction", we should include the dispersion parameter in the prediction intervals. @DominiqueMakowski If you could add a placeholder for that, I can fill in the necessary extractions from glmmTMB objects. How do we currently handle dispersion for things like Poisson models where there is a variance term in the model?
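
As a rough illustration of why the dispersion parameter matters for predict = "prediction", here is a simulation sketch in base R for a single observation from a zero-inflated negative binomial model. The values of `mu`, `p`, and `theta` are hypothetical (not taken from a fitted model), and the sketch ignores uncertainty in the parameter estimates, so it shows only the outcome-level variability that a prediction interval would need to add on top of a confidence interval:

```r
set.seed(123)

# Hypothetical parameter values for one observation (not from a fitted model)
mu    <- 2.5  # conditional mean of the negative binomial part
p     <- 0.3  # probability of a structural zero
theta <- 1.2  # nbinom2-style dispersion: Var = mu + mu^2 / theta

n_sim <- 10000
structural_zero <- rbinom(n_sim, size = 1, prob = p)  # 1 = structural zero
counts <- rnbinom(n_sim, size = theta, mu = mu)       # conditional counts
y_sim  <- ifelse(structural_zero == 1, 0, counts)     # zero-inflated outcome

expectation   <- mu * (1 - p)                          # what "expectation" would return
pred_interval <- quantile(y_sim, c(0.025, 0.975))      # spread of individual cases

expectation
pred_interval
```

A smaller `theta` (more overdispersion) widens the simulated interval while leaving the expectation unchanged, which is why the interval cannot be built from the mean alone.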

DominiqueMakowski · 2021-08-05T01:46:16Z

The needed steps to add/edit I believe are:

extending the list of possible arguments here:

insight/R/get_predicted.R

Line 275 in 1540c06

predict = c("expectation", "link", "prediction", "response", "relation"),

editing the logic here for the type:

insight/R/get_predicted.R

Lines 645 to 652 in 1540c06

    
    # Type (that's for the initial call to stats::predict)
    if (!is.null(type) && all(type == "auto")) {
      if (info$is_linear) {
        type <- "response"
      } else {
        type <- "link"
      }
    }

and here for the CI

insight/R/get_predicted.R

Lines 633 to 643 in 1540c06

    
    # Prediction and CI type
    if (predict == "link") {
      ci_type <- "confidence"
      scale <- "link"
    } else if (predict == "expectation") {
      ci_type <- "confidence"
      scale <- "response"
    } else if (predict %in% c("prediction", "response")) {
      ci_type <- "prediction"
      scale <- "response"
    }

(essentially since we have one "master" argument predict, it is then passed to the .get_predicted_args() helper that assigns the traditional arguments)
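
One way the mapping could be extended to the options proposed earlier in the thread is sketched below. This is only an illustration: `resolve_predict_args()` is a hypothetical helper, not the actual .get_predicted_args() code in insight, the `type` element assumes the glmmTMB type names quoted at the top of the issue, and the existing "relation" option is left out because its mapping is not described here.

```r
# Hypothetical helper, NOT the insight implementation: it only illustrates how
# the existing if/else mapping could grow branches for the proposed options.
resolve_predict_args <- function(predict = c("expectation", "link", "prediction",
                                             "response", "conditional", "zprob",
                                             "zlink", "dispersion"),
                                 zero_inflated = TRUE) {
  predict <- match.arg(predict)

  # The zero-inflation options only make sense for zero-inflated models
  if (predict %in% c("zprob", "zlink") && !zero_inflated) {
    stop("`predict = \"", predict, "\"` requires a zero-inflated model.")
  }

  switch(predict,
    link        = list(ci_type = "confidence", scale = "link",     type = "link"),
    expectation = list(ci_type = "confidence", scale = "response", type = "response"),
    prediction  = ,  # falls through to the "response" branch
    response    = list(ci_type = "prediction", scale = "response", type = "response"),
    conditional = list(ci_type = "confidence", scale = "response", type = "conditional"),
    zprob       = list(ci_type = "confidence", scale = "response", type = "zprob"),
    zlink       = list(ci_type = "confidence", scale = "link",     type = "zlink"),
    dispersion  = list(ci_type = "confidence", scale = "response", type = "disp")
  )
}

str(resolve_predict_args("zprob"))
```

Keeping the mapping in one place like this preserves the idea of a single "master" predict argument, with the traditional arguments derived from it.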

strengejacke · 2021-08-09T08:13:08Z

@DominiqueMakowski can insight be submitted, or is there anything that needs to be addressed for modelbased?

DominiqueMakowski · 2021-08-09T08:15:57Z

I wouldn't say that this issue of adding more options for glmmTMB is urgent so probably not a blocker for a CRAN update

strengejacke · 2022-01-31T23:40:33Z

I think this will be closed in #501

strengejacke · 2022-02-10T12:58:33Z

should be resolved in #501 to #503

bwiernik mentioned this issue Aug 3, 2021

estimate_dispersion() (and maybe estimate_zeroinflation()?) easystats/modelbased#136

Open

strengejacke added the Enhancement 💥 Implemented features can be improved or revised label Aug 4, 2021

vincentarelbundock mentioned this issue Oct 11, 2021

get_predicted.glmmTMB tests and not transformations #449

Closed

strengejacke added the get_predicted Function specific issues label Jan 10, 2022

strengejacke mentioned this issue Jan 31, 2022

Update get_predicted.R #501

Merged

strengejacke closed this as completed Feb 10, 2022
