Beta Regression Trees

Description

Fit beta regression trees via model-based recursive partitioning.

Usage

betatree(formula, partition,
  data, subset = NULL, na.action = na.omit, weights, offset, cluster,
  link = "logit", link.phi = "log", control = betareg.control(),
  ...)

Arguments

formula symbolic description of the model of type y ~ x or y ~ x | z, specifying the variables influencing mean and precision of y, respectively. For details see betareg.
partition symbolic description of the partitioning variables, e.g., ~ p1 + p2. The argument partition can be omitted if formula is a three-part formula of type y ~ x | z | p1 + p2.
data, subset, na.action, weights, offset, cluster arguments controlling data/model processing passed to mob.
link character specification of the link function in the mean model (mu). Currently, "logit", "probit", "cloglog", "cauchit", "log", "loglog" are supported. Alternatively, an object of class "link-glm" can be supplied.
link.phi character specification of the link function in the precision model (phi). Currently, "identity", "log", "sqrt" are supported. Alternatively, an object of class "link-glm" can be supplied.
control a list of control arguments for the beta regression specified via betareg.control.
... further control arguments for the recursive partitioning passed to mob_control.
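As noted for the partition argument, the partitioning variables can be given either separately or folded into a three-part formula. A minimal sketch of this equivalence, assuming the ReadingSkills data also used in the Examples below:

```r
library("betareg")

data("ReadingSkills", package = "betareg")

## equivalent specifications: a separate `partition` argument vs.
## a three-part formula of type y ~ x | z | p1 + p2
bt_a <- betatree(accuracy ~ iq | iq, ~ dyslexia, data = ReadingSkills)
bt_b <- betatree(accuracy ~ iq | iq | dyslexia, data = ReadingSkills)

## both calls fit the same tree with the same node coefficients
all.equal(coef(bt_a), coef(bt_b))
```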

Details

Beta regression trees are an application of model-based recursive partitioning (implemented in mob, see Zeileis et al. 2008) to beta regression (implemented in betareg, see Cribari-Neto and Zeileis 2010). See also Grün et al. (2012) for more details.

Various methods are provided for "betatree" objects, most of which inherit their behavior from "mob" objects (e.g., print, summary, coef, etc.). The plot method employs the node_bivplot panel-generating function.
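A brief sketch of these inherited methods, assuming a tree fitted on the ReadingSkills data as in the Examples below:

```r
library("betareg")

data("ReadingSkills", package = "betareg")
bt <- betatree(accuracy ~ iq | iq | dyslexia, data = ReadingSkills)

print(bt)     # tree structure with per-node beta regression coefficients
coef(bt)      # coefficient matrix across terminal nodes
predict(bt, newdata = ReadingSkills, type = "node")  # terminal node IDs
plot(bt)      # node_bivplot panels in the terminal nodes
```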

Value

betatree() returns an object of S3 class "betatree" which inherits from "modelparty".

References

Cribari-Neto, F., and Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24. doi:10.18637/jss.v034.i02

Grün, B., Kosmidis, I., and Zeileis, A. (2012). Extended Beta Regression in R: Shaken, Stirred, Mixed, and Partitioned. Journal of Statistical Software, 48(11), 1–25. doi:10.18637/jss.v048.i11

Zeileis, A., Hothorn, T., and Hornik, K. (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514. doi:10.1198/106186008X319331

See Also

betareg, betareg.fit, mob

Examples

library("betareg")

options(digits = 4)
suppressWarnings(RNGversion("3.5.0"))

## data with two groups of dyslexic and non-dyslexic children
data("ReadingSkills", package = "betareg")
## additional random noise (not associated with reading scores)
set.seed(1071)
ReadingSkills$x1 <- rnorm(nrow(ReadingSkills))
ReadingSkills$x2 <- runif(nrow(ReadingSkills))
ReadingSkills$x3 <- factor(rnorm(nrow(ReadingSkills)) > 0)

## fit beta regression tree: in each node
##   - accuracy's mean and precision depend on iq
##   - partitioning is done by dyslexia and the noise variables x1, x2, x3
## only dyslexia is correctly selected for splitting
bt <- betatree(accuracy ~ iq | iq, ~ dyslexia + x1 + x2 + x3,
  data = ReadingSkills, minsize = 10)
plot(bt)

## inspect result
coef(bt)
  (Intercept)       iq (phi)_(Intercept) (phi)_iq
2      1.6565  1.46571             1.273    2.048
3      0.3809 -0.08623             4.808    0.826
if(require("strucchange")) sctest(bt)
$`1`
           dyslexia     x1     x2     x3
statistic 2.269e+01 8.5251 5.5699 1.0568
p.value   5.848e-04 0.9095 0.9987 0.9999

$`2`
          dyslexia     x1     x2     x3
statistic        0 6.4116 4.5170 4.2308
p.value         NA 0.8412 0.9752 0.7566

$`3`
NULL
## IGNORE_RDIFF_BEGIN
summary(bt, node = 2)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.495 -0.437  0.210  0.953  1.090 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.657      0.286    5.78  7.3e-09 ***
iq             1.466      0.248    5.92  3.2e-09 ***

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.273      0.307    4.15  3.4e-05 ***
iq             2.048      0.331    6.19  5.9e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 39.4 on 4 Df
Pseudo R-squared: 0.149
Number of iterations: 17 (BFGS) + 2 (Fisher scoring) 
summary(bt, node = 3)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.426 -0.631 -0.067  0.778  1.555 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   0.3809     0.0486    7.83  4.8e-15 ***
iq           -0.0862     0.0549   -1.57     0.12    

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    4.808      0.414   11.61   <2e-16 ***
iq             0.826      0.395    2.09    0.036 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 27.3 on 4 Df
Pseudo R-squared: 0.0391
Number of iterations: 16 (BFGS) + 2 (Fisher scoring) 
## IGNORE_RDIFF_END

## add a numerical variable with relevant information for splitting
ReadingSkills$x4 <- rnorm(nrow(ReadingSkills), c(-1.5, 1.5)[ReadingSkills$dyslexia])

bt2 <- betatree(accuracy ~ iq | iq, ~ x1 + x2 + x3 + x4,
  data = ReadingSkills, minsize = 10)
plot(bt2)

## inspect result
coef(bt2)
  (Intercept)      iq (phi)_(Intercept) (phi)_iq
2      1.7060 1.47402             1.293   2.0841
3      0.5048 0.03391             3.131  -0.7684
if(require("strucchange")) sctest(bt2)
$`1`
              x1     x2     x3       x4
statistic 8.5251 5.5699 1.0568 19.94405
p.value   0.9095 0.9987 0.9999  0.03485

$`2`
              x1     x2     x3     x4
statistic 8.9467 3.5888 3.5677 4.7049
p.value   0.5964 0.9985 0.9197 0.9848

$`3`
              x1     x2     x3     x4
statistic 5.5413 1.2373 4.8649 4.9921
p.value   0.6595 0.9997 0.7619 0.7432
## IGNORE_RDIFF_BEGIN
summary(bt2, node = 2)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.583 -0.393  0.177  0.923  1.054 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.706      0.292    5.85  4.9e-09 ***
iq             1.474      0.248    5.95  2.7e-09 ***

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.293      0.312    4.14  3.4e-05 ***
iq             2.084      0.333    6.25  4.0e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 38.6 on 4 Df
Pseudo R-squared: 0.163
Number of iterations: 17 (BFGS) + 1 (Fisher scoring) 
summary(bt2, node = 3)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.070 -0.584 -0.156  0.639  2.188 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   0.5048     0.1245    4.05    5e-05 ***
iq            0.0339     0.0998    0.34     0.73    

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    3.131      0.370    8.45   <2e-16 ***
iq            -0.768      0.359   -2.14    0.032 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 22.4 on 4 Df
Pseudo R-squared: 0.0378
Number of iterations: 16 (BFGS) + 1 (Fisher scoring) 
## IGNORE_RDIFF_END