Robust covariance matrix estimation: sandwich 3.0-0, web page, JSS paper

Version 3.0-0 of the R package 'sandwich' for robust covariance matrix estimation (HC, HAC, clustered, panel, and bootstrap) is now available from CRAN, accompanied by a new web page and a paper in the Journal of Statistical Software (JSS).

CRAN release of version 3.0-0

The sandwich package provides model-robust covariance matrix estimators for cross-sectional, time series, clustered, panel, and longitudinal data. Its object-oriented design makes the implementation modular and supports many model objects, including lm, glm, survreg, coxph, mlogit, polr, hurdle, zeroinfl, and others.
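To illustrate the object-oriented interface, here is a minimal sketch that applies the same estimator to an lm and a glm fit. It assumes the sandwich package is installed; the built-in mtcars data and the two model formulas are illustrative choices only, not taken from the package documentation.

```r
library(sandwich)

# Two different model classes fitted to the built-in mtcars data
# (illustrative models only)
m_lm  <- lm(mpg ~ wt + hp, data = mtcars)
m_glm <- glm(am ~ wt, data = mtcars, family = binomial)

# The same generic call works for both classes
vc_lm  <- vcovHC(m_lm,  type = "HC3")
vc_glm <- vcovHC(m_glm, type = "HC0")

# Heteroscedasticity-consistent standard errors next to the classical ones
cbind(classical = sqrt(diag(vcov(m_lm))),
      HC3       = sqrt(diag(vc_lm)))

# The basic sandwich estimator coincides with HC0 for the linear model
all.equal(sandwich(m_lm), vcovHC(m_lm, type = "HC0"))
```

The resulting matrices can be passed to inference functions that accept a user-supplied covariance matrix, for example coeftest() from the lmtest package.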

The release of version 3.0-0 on CRAN (Comprehensive R Archive Network) completes the substantial updates and improvements started in the 2.4-x and 2.5-x releases, especially for clustered, panel, and bootstrap covariances. In addition to the new pkgdown web page and the paper in the Journal of Statistical Software (JSS), both described below, the new release includes smaller improvements to some equations in the vignettes (suggested by Bettina Grün and Yves Croissant), the kernel weights function kweights() (suggested by Christoph Hanck), the formula handling (suggested by David Hugh-Jones), and the bread() method for weighted mlm objects (suggested by James Pustejovsky). The full list of changes can be seen in the package’s NEWS.

New web page

The package now comes with a dedicated pkgdown website on R-Forge. It includes a nice logo, kindly provided by Reto Stauffer.

[Image: New sandwich logo]

The web page essentially reuses the existing content of the package (documentation, vignettes, NEWS) but also adds a nice overview of the package to help new users “Get started”.

JSS paper

Zeileis A, Köll S, Graham N (2020). “Various Versatile Variances: An Object-Oriented Implementation of Clustered Covariances in R.” Journal of Statistical Software, 95(1), 1-36. doi:10.18637/jss.v095.i01.

Clustered covariances or clustered standard errors are very widely used to account for correlated or clustered data, especially in economics, political sciences, and other social sciences. They are employed to adjust the inference following estimation of a standard least-squares regression or generalized linear model estimated by maximum likelihood. Although many publications just refer to “the” clustered standard errors, there is a surprisingly wide variety of clustered covariances, particularly due to different flavors of bias corrections. Furthermore, while the linear regression model is certainly the most important application case, the same strategies can be employed in more general models (e.g., for zero-inflated, censored, or limited responses).
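The variety of bias corrections can be made concrete with a short sketch. It assumes the sandwich package is installed and uses the PetersenCL data shipped with it; the model y ~ x and the choice of firm as the clustering variable are illustrative.

```r
library(sandwich)

# Benchmark panel data shipped with the package (variables: firm, year, x, y)
data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)

# Clustered standard errors under four bias-correction flavors,
# all clustered by firm
types <- c("HC0", "HC1", "HC2", "HC3")
se <- sapply(types, function(tp) {
  sqrt(diag(vcovCL(m, cluster = ~ firm, type = tp)))
})

# One column per flavor, one row per coefficient
round(se, 4)
```

With many clusters the flavors tend to differ only slightly; the differences usually matter more when the number of clusters is small.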

In R, functions for covariances in clustered or panel models have been somewhat scattered or available only for certain modeling functions, notably the (generalized) linear regression model. In contrast, an object-oriented approach to “robust” covariance matrix estimation (applicable beyond lm() and glm()) is available in the sandwich package but has been limited to the case of cross-section or time series data. Starting with sandwich 2.4.0, this shortcoming has been corrected: based on methods for two generic functions (estfun() and bread()), clustered and panel covariances are provided in vcovCL(), vcovPL(), and vcovPC(). Moreover, clustered bootstrap covariances are provided in vcovBS(), using model update() on bootstrap samples. These are directly applicable to models from packages including MASS, pscl, countreg, and betareg, among many others. Some empirical illustrations are provided as well as an assessment of the methods’ performance in a simulation study.
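The building blocks named in the abstract can be tried out with a minimal sketch. It assumes the sandwich package is installed and again uses the PetersenCL data shipped with it; the model, the clustering variables, the seed, and the number of bootstrap replications (R = 100) are illustrative choices.

```r
library(sandwich)

data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)

# The two generics that the covariance estimators build on
ef <- estfun(m)  # n x k matrix of observation-wise score contributions
br <- bread(m)   # k x k "bread" matrix
dim(ef)
dim(br)

# Clustered covariance, clustering by firm
vc_cl <- vcovCL(m, cluster = ~ firm)

# Panel covariance: the first variable gives the clusters,
# the second the time ordering
vc_pl <- vcovPL(m, cluster = ~ firm + year)

# Clustered bootstrap covariance (seed fixed for reproducibility)
set.seed(1)
vc_bs <- vcovBS(m, cluster = ~ firm, R = 100)

# Standard errors side by side
cbind(classical = sqrt(diag(vcov(m))),
      clustered = sqrt(diag(vc_cl)),
      panel     = sqrt(diag(vc_pl)),
      bootstrap = sqrt(diag(vc_bs)))
```

Because the estimators only rely on estfun() and bread() methods (plus update() for the bootstrap), the same calls should carry over to other model classes that provide these methods.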