Beta Regression Trees

Description

Fit beta regression trees via model-based recursive partitioning.

Usage

betatree(formula, partition,
  data, subset = NULL, na.action = na.omit, weights, offset, cluster,
  link = "logit", link.phi = "log", control = betareg.control(),
  ...)

Arguments

formula symbolic description of the model of type y ~ x or y ~ x | z, specifying the variables influencing mean and precision of y, respectively. For details see betareg.
partition symbolic description of the partitioning variables, e.g., ~ p1 + p2. The argument partition can be omitted if formula is a three-part formula of type y ~ x | z | p1 + p2.
data, subset, na.action, weights, offset, cluster arguments controlling data/model processing passed to mob.
link character specification of the link function in the mean model (mu). Currently, "logit", "probit", "cloglog", "cauchit", "log", "loglog" are supported. Alternatively, an object of class "link-glm" can be supplied.
link.phi character specification of the link function in the precision model (phi). Currently, "identity", "log", "sqrt" are supported. Alternatively, an object of class "link-glm" can be supplied.
control a list of control arguments for the beta regression specified via betareg.control.
... further control arguments for the recursive partitioning passed to mob_control.
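As noted for the partition argument, the partitioning variables can be given either separately or folded into a three-part formula. A minimal sketch of this equivalence, assuming the ReadingSkills data also used in the Examples below:

```r
library("betareg")

data("ReadingSkills", package = "betareg")

## equivalent specifications: a separate `partition` argument vs.
## a three-part formula of type y ~ x | z | p1 + p2
bt_a <- betatree(accuracy ~ iq | iq, ~ dyslexia, data = ReadingSkills)
bt_b <- betatree(accuracy ~ iq | iq | dyslexia, data = ReadingSkills)

## both calls fit the same tree with the same node coefficients
all.equal(coef(bt_a), coef(bt_b))
```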

Details

Beta regression trees are an application of model-based recursive partitioning (implemented in mob, see Zeileis et al. 2008) to beta regression (implemented in betareg, see Cribari-Neto and Zeileis 2010). See also Grün et al. (2012) for more details.

Various methods are provided for "betatree" objects, most of which inherit their behavior from "mob" objects (e.g., print, summary, coef, etc.). The plot method employs the node_bivplot panel-generating function.
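A brief sketch of these inherited methods, assuming a tree fitted on the ReadingSkills data as in the Examples below:

```r
library("betareg")

data("ReadingSkills", package = "betareg")
bt <- betatree(accuracy ~ iq | iq | dyslexia, data = ReadingSkills)

print(bt)     # tree structure with per-node beta regression coefficients
coef(bt)      # coefficient matrix across terminal nodes
predict(bt, newdata = ReadingSkills, type = "node")  # terminal node IDs
plot(bt)      # node_bivplot panels in the terminal nodes
```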

Value

betatree() returns an object of S3 class "betatree" which inherits from "modelparty".

References

Cribari-Neto, F., and Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24. doi:10.18637/jss.v034.i02

Grün, B., Kosmidis, I., and Zeileis, A. (2012). Extended Beta Regression in R: Shaken, Stirred, Mixed, and Partitioned. Journal of Statistical Software, 48(11), 1–25. doi:10.18637/jss.v048.i11

Zeileis, A., Hothorn, T., and Hornik, K. (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514. doi:10.1198/106186008X319331

See Also

betareg, betareg.fit, mob

Examples

library("betareg")

options(digits = 4)
suppressWarnings(RNGversion("3.5.0"))

## data with two groups of dyslexic and non-dyslexic children
data("ReadingSkills", package = "betareg")
## additional random noise (not associated with reading scores)
set.seed(1071)
ReadingSkills$x1 <- rnorm(nrow(ReadingSkills))
ReadingSkills$x2 <- runif(nrow(ReadingSkills))
ReadingSkills$x3 <- factor(rnorm(nrow(ReadingSkills)) > 0)

## fit beta regression tree: in each node
##   - accuracy's mean and precision depend on iq
##   - partitioning is done by dyslexia and the noise variables x1, x2, x3
## only dyslexia is correctly selected for splitting
bt <- betatree(accuracy ~ iq | iq, ~ dyslexia + x1 + x2 + x3,
  data = ReadingSkills, minsize = 10)
plot(bt)

## inspect result
coef(bt)
  (Intercept)       iq (phi)_(Intercept) (phi)_iq
2      1.6565  1.46571             1.273    2.048
3      0.3809 -0.08623             4.808    0.826
if(require("strucchange")) sctest(bt)
$`1`
           dyslexia     x1     x2     x3
statistic 2.269e+01 8.5251 5.5699 1.0568
p.value   5.848e-04 0.9095 0.9987 0.9999

$`2`
          dyslexia     x1     x2     x3
statistic        0 6.4116 4.5170 4.2308
p.value         NA 0.8412 0.9752 0.7566

$`3`
NULL
## IGNORE_RDIFF_BEGIN
summary(bt, node = 2)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.495 -0.437  0.210  0.953  1.090 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.657      0.286    5.78  7.3e-09 ***
iq             1.466      0.248    5.92  3.2e-09 ***

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.273      0.307    4.15  3.4e-05 ***
iq             2.048      0.331    6.19  5.9e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 39.4 on 4 Df
Pseudo R-squared: 0.149
Number of iterations: 17 (BFGS) + 2 (Fisher scoring) 
summary(bt, node = 3)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.426 -0.631 -0.067  0.778  1.555 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   0.3809     0.0486    7.83  4.8e-15 ***
iq           -0.0862     0.0549   -1.57     0.12    

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    4.808      0.414   11.61   <2e-16 ***
iq             0.826      0.395    2.09    0.036 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 27.3 on 4 Df
Pseudo R-squared: 0.0391
Number of iterations: 16 (BFGS) + 2 (Fisher scoring) 
## IGNORE_RDIFF_END

## add a numerical variable with relevant information for splitting
ReadingSkills$x4 <- rnorm(nrow(ReadingSkills), c(-1.5, 1.5)[ReadingSkills$dyslexia])

bt2 <- betatree(accuracy ~ iq | iq, ~ x1 + x2 + x3 + x4,
  data = ReadingSkills, minsize = 10)
plot(bt2)

## inspect result
coef(bt2)
  (Intercept)      iq (phi)_(Intercept) (phi)_iq
2      1.7060 1.47402             1.293   2.0841
3      0.5048 0.03391             3.131  -0.7684
if(require("strucchange")) sctest(bt2)
$`1`
              x1     x2     x3       x4
statistic 8.5251 5.5699 1.0568 19.94405
p.value   0.9095 0.9987 0.9999  0.03485

$`2`
              x1     x2     x3     x4
statistic 8.9467 3.5888 3.5677 4.7049
p.value   0.5964 0.9985 0.9197 0.9848

$`3`
              x1     x2     x3     x4
statistic 5.5413 1.2373 4.8649 4.9921
p.value   0.6595 0.9997 0.7619 0.7432
## IGNORE_RDIFF_BEGIN
summary(bt2, node = 2)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.583 -0.393  0.177  0.923  1.054 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.706      0.292    5.85  4.9e-09 ***
iq             1.474      0.248    5.95  2.7e-09 ***

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.293      0.312    4.14  3.4e-05 ***
iq             2.084      0.333    6.25  4.0e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 38.6 on 4 Df
Pseudo R-squared: 0.163
Number of iterations: 17 (BFGS) + 1 (Fisher scoring) 
summary(bt2, node = 3)

Call:
betatree(formula = accuracy ~ iq | iq, data = ReadingSkills)

Quantile residuals:
   Min     1Q Median     3Q    Max 
-2.070 -0.584 -0.156  0.639  2.188 

Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   0.5048     0.1245    4.05    5e-05 ***
iq            0.0339     0.0998    0.34     0.73    

Phi coefficients (precision model with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    3.131      0.370    8.45   <2e-16 ***
iq            -0.768      0.359   -2.14    0.032 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 22.4 on 4 Df
Pseudo R-squared: 0.0378
Number of iterations: 16 (BFGS) + 1 (Fisher scoring) 
## IGNORE_RDIFF_END