# A tibble: 26,044 × 993
otu_id `100259` `100262` `100267` `100274` `100275` `100277` `100291`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 54 0 0 0 0 0 0 0
2 94 0 0 0 0 0 0 0
3 113 0 0 0 0 0 0 0
4 117 0 0 0 0 0 0 0
5 145 0 0 0 0 0 0 0
6 202 0 0 0 0 0 0 0
7 217 0 0 0 0 0 0 0
8 237 0 0 0 0 0 0 0
9 245 0 0 0 0 0 0 0
10 248 0 0 0 0 0 0 0
# ℹ 26,034 more rows
# ℹ 985 more variables: `100292` <dbl>, `100293` <dbl>, `100294` <dbl>,
# `100298` <dbl>, `100303` <dbl>, `100317` <dbl>, `100320` <dbl>,
# `100322` <dbl>, `100341` <dbl>, `100353` <dbl>, `100356` <dbl>,
# `100361` <dbl>, `100365` <dbl>, `100395` <dbl>, `100401` <dbl>,
# `100403` <dbl>, `100437` <dbl>, `100457` <dbl>, `100462` <dbl>,
# `100470` <dbl>, `100476` <dbl>, `100489` <dbl>, `100491` <dbl>, …
3.5 Comments
In Part 1, we fitted four models: quasi-Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial. The AIC values for these models are as follows:
In Part 2, we calculated the Type I error rate for the four models. The median and IQR of the Type I error rate for each model are as follows:
These results show that the zero-inflated negative binomial model has the lowest AIC for both the Genus and Species data, and it also performs relatively better in terms of Type I error rate. It therefore appears to be a good model for these data. However, zero-inflated negative binomial regression is computationally expensive, and the model may fail to converge for some species, which produces NA values in the coefficients or p-values.
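As a minimal sketch of how an AIC comparison like this could be set up for a single taxon, the R code below fits the four model families to simulated counts and returns `NA` when a fit fails. The simulated data, the `group` covariate, and the `safe_aic` helper are assumptions for illustration, not the data or code behind the results above; `glm.nb` comes from MASS and `zeroinfl` from pscl.

```r
library(MASS)  # glm.nb
library(pscl)  # zeroinfl

set.seed(1)
n <- 200
# Hypothetical two-level covariate
group <- factor(rep(c("A", "B"), each = n / 2))
# Simulated stand-in for one taxon: negative binomial counts with extra zeros
counts <- rnbinom(n, mu = 5, size = 1.5) * rbinom(n, 1, 0.6)
dat <- data.frame(counts = counts, group = group)

# Hypothetical helper: return the AIC of a fit, or NA if fitting throws an error
safe_aic <- function(fit_fun) {
  tryCatch(AIC(fit_fun()), error = function(e) NA_real_)
}

aic_values <- c(
  quasi_poisson = safe_aic(function() glm(counts ~ group, family = quasipoisson, data = dat)),
  neg_binomial  = safe_aic(function() glm.nb(counts ~ group, data = dat)),
  zip           = safe_aic(function() zeroinfl(counts ~ group | 1, data = dat, dist = "poisson")),
  zinb          = safe_aic(function() zeroinfl(counts ~ group | 1, data = dat, dist = "negbin"))
)

print(aic_values)
```

In base R, `AIC()` returns `NA` for a `quasipoisson` fit because quasi-likelihood models have no full likelihood, so that entry is expected to be missing in this sketch.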
For a reliable Type I error rate and fewer NA issues, the zero-inflated Poisson model is a safer choice. It has a higher AIC than the zero-inflated negative binomial model, but it is more robust and less computationally expensive, which makes it a good alternative. Given the importance of reliability in published results, I would recommend the zero-inflated Poisson model for these data: it balances fit (though not the best AIC) against robustness in Type I error rate estimation.
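A Type I error summary of this kind could be computed along the following lines. This sketch assumes the rate is estimated by shuffling group labels, testing every taxon, and recording the fraction of p-values below 0.05 in each replicate; that procedure, the simulated counts, and the `group_pvalue` helper are illustrative assumptions rather than the method used above. A quasi-Poisson fit is used only because it needs no extra packages; any of the four models could be swapped in.

```r
set.seed(42)
n_samples <- 40
n_taxa    <- 50
n_reps    <- 20
alpha     <- 0.05

# Simulated overdispersed counts with no true group effect (hypothetical data)
counts <- matrix(rnbinom(n_samples * n_taxa, mu = 10, size = 2),
                 nrow = n_samples, ncol = n_taxa)
group <- rep(c("A", "B"), each = n_samples / 2)

# Hypothetical helper: p-value of the group term from a quasi-Poisson fit
group_pvalue <- function(y, g) {
  fit <- glm(y ~ g, family = quasipoisson)
  coef(summary(fit))["gB", "Pr(>|t|)"]
}

# One replicate: shuffle labels, test every taxon, record the rejection rate
type1_rates <- replicate(n_reps, {
  shuffled <- sample(group)
  pvals <- apply(counts, 2, group_pvalue, g = shuffled)
  mean(pvals < alpha, na.rm = TRUE)
})

# Median and interquartile range across replicates
summary_stats <- c(median = median(type1_rates), IQR = IQR(type1_rates))
print(summary_stats)
```

Because the labels are shuffled, every rejection is a false positive, so a well-calibrated model should give a median rate near `alpha`.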