Package 'arm'

Title:	Data Analysis Using Regression and Multilevel/Hierarchical Models
Description:	Functions to accompany A. Gelman and J. Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2007.
Authors:	Andrew Gelman [aut], Yu-Sung Su [aut, cre] , Masanao Yajima [ctb], Jennifer Hill [ctb], Maria Grazia Pittau [ctb], Jouni Kerman [ctb], Tian Zheng [ctb], Vincent Dorie [ctb]
Maintainer:	Yu-Sung Su <[email protected]>
License:	GPL (>= 2)
Version:	1.14-4
Built:	2025-04-01 05:32:39 UTC
Source:	https://github.com/suyusung/arm

Help Index

Functions to compute the balance statistics
Bayesian generalized linear models.
Bayesian Ordered Logistic or Probit Regression
Binned Residual Plot
Generic Function for Making Coefficient Plot
Contrast Matrices
Correlation Plot
Histogram for Discrete Distributions
Functions for Processing lm, glm, mer, polr and svyglm Output
Extract AIC and DIC from a ‘mer’ model
Formating the Rounding of Numbers
Function to Recall Last Source File
Logistic and Inverse logistic functions
Lalonde Dataset
Single Nearest Neighborhood Matching
Generic Function to Run ‘mcmcsamp()’ in lme4
Construct Design Matrices
Multiple Comparison Plot
Function to read data by columns
Function for Standardizing by Centering and Dividing by 2 sd's
residual plot for the observed values
Extract Standard Errors of Model Coefficients
Extract Residual Errors
Functions to Get Posterior Distributions
Function for Standardizing Regression Predictors by Centering and Dividing by 2 sd's
Trace plot of ‘bugs’ object
Triangle Plot

Functions to compute the balance statistics

Description

This function computes the balance statistics before and after matching.

Usage

balance(rawdata, treat, matched, estimand="ATT")

## S3 method for class 'balance'
print(x, ..., combined = FALSE, digits = 2)

## S3 method for class 'balance'
plot(x, longcovnames=NULL, which.covs="mixed",
    v.axis=TRUE, cex.main=1, cex.vars=1, cex.pts=1,
    mar=c(4, 3, 5.1, 2), plot=TRUE, x.max = NULL, ...)
balance(rawdata, treat, matched, estimand="ATT")

## S3 method for class 'balance'
print(x, ..., combined = FALSE, digits = 2)

## S3 method for class 'balance'
plot(x, longcovnames=NULL, which.covs="mixed",
    v.axis=TRUE, cex.main=1, cex.vars=1, cex.pts=1,
    mar=c(4, 3, 5.1, 2), plot=TRUE, x.max = NULL, ...)

Arguments

`rawdata`	The full covariate dataset
`treat`	the vector of treatment assignments for the full dataset
`matched`	vector of weights to apply to the full dataset to create the restructured data: for matching without replacement these will all be 0's and 1's; for one-to-one matching with replacement these will all be non-negative integers; for IPTW or more complicated matching methods these could be any non-negative numbers
`estimand`	can either be `ATT`, `ATC`, or `ATE`, default is `ATT`
`x`	an object return by the balance function.
`combined`	default is `FALSE`
`digits`	minimal number of significant digits, default is 2.
`longcovnames`	long covariate names. If not provided, plot will use covariate variable name by default
`which.covs`	`mixed` then it plots all as std diffs; `binary` it only plots binary and as abs unstd diffs; `cont` it only plots non-binary and as abs std diffs
`v.axis`	default is `TRUE`, which shows the top axis–axis(3).
`cex.main`	font size of main title
`cex.vars`	font size of variabel names
`cex.pts`	point size of the estimates
`mar`	A numerical vector of the form `c(bottom, left, top, right)` which gives the number of lines of margin to be specified on the four sides of the plot. The default is `c(0,3,5.1,2)`.
`plot`	default is `TRUE`, which will plot the plot.
`x.max`	set the max of the `xlim`, default is `NULL`
`...`	other plot options may be passed to this function

Details

This function plots the balance statistics before and after matching. The open circle dots represent the unmatched balance statistics. The solid dots represent the matched balance statistics. The closer the value of the estimates to the zero, the better the treated and control groups are balanced after matching.

Note

The function does not work with predictors that contain factor(x), log(x) or all other data transformation. Create new objects for these variables. Attach them into the original dataset before doing the matching procedure.

Author(s)

Jennifer Hill [email protected]; Yu-Sung Su [email protected]

References

Andrew Gelman and Jennifer Hill. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. (Chapter 10)

Examples

# matching first
old.par <- par(no.readonly = TRUE)
data(lalonde)
attach(lalonde)
fit <- glm(treat ~ re74 + re75 + age + factor(educ) + 
            black + hisp + married + nodegr + u74 + u75, 
            family=binomial(link="logit"))
pscores <- predict(fit, type="link")
matches <- matching(z=lalonde$treat, score=pscores)
matched <- matches$cnts

# balance check
b.stats <- balance(lalonde, treat, matched, estimand = "ATT")
print(b.stats)
plot(b.stats)
par(old.par)
# matching first
old.par <- par(no.readonly = TRUE)
data(lalonde)
attach(lalonde)
fit <- glm(treat ~ re74 + re75 + age + factor(educ) + 
            black + hisp + married + nodegr + u74 + u75, 
            family=binomial(link="logit"))
pscores <- predict(fit, type="link")
matches <- matching(z=lalonde$treat, score=pscores)
matched <- matches$cnts

# balance check
b.stats <- balance(lalonde, treat, matched, estimand = "ATT")
print(b.stats)
plot(b.stats)
par(old.par)

Bayesian generalized linear models.

Description

Bayesian functions for generalized linear modeling with independent normal, t, or Cauchy prior distribution for the coefficients.

Usage

bayesglm (formula, family = gaussian, data,
    weights, subset, na.action,
    start = NULL, etastart, mustart,
    offset, control = list(...),
    model = TRUE, method = "glm.fit",
    x = FALSE, y = TRUE, contrasts = NULL,
    drop.unused.levels = TRUE,
    prior.mean = 0,
    prior.scale = NULL,
    prior.df = 1,
    prior.mean.for.intercept = 0,
    prior.scale.for.intercept = NULL,
    prior.df.for.intercept = 1,
    min.prior.scale=1e-12,
    scaled = TRUE, keep.order=TRUE,
    drop.baseline=TRUE,
    maxit=100, 
    print.unnormalized.log.posterior=FALSE,
    Warning=TRUE,...)

bayesglm.fit (x, y, weights = rep(1, nobs),
    start = NULL, etastart = NULL,
    mustart = NULL, offset = rep(0, nobs), family = gaussian(),
    control = list(), intercept = TRUE,
    prior.mean = 0,
    prior.scale = NULL,
    prior.df = 1,
    prior.mean.for.intercept = 0,
    prior.scale.for.intercept = NULL,
    prior.df.for.intercept = 1,
    min.prior.scale=1e-12, scaled = TRUE,
    print.unnormalized.log.posterior=FALSE, Warning=TRUE)
bayesglm (formula, family = gaussian, data,
    weights, subset, na.action,
    start = NULL, etastart, mustart,
    offset, control = list(...),
    model = TRUE, method = "glm.fit",
    x = FALSE, y = TRUE, contrasts = NULL,
    drop.unused.levels = TRUE,
    prior.mean = 0,
    prior.scale = NULL,
    prior.df = 1,
    prior.mean.for.intercept = 0,
    prior.scale.for.intercept = NULL,
    prior.df.for.intercept = 1,
    min.prior.scale=1e-12,
    scaled = TRUE, keep.order=TRUE,
    drop.baseline=TRUE,
    maxit=100, 
    print.unnormalized.log.posterior=FALSE,
    Warning=TRUE,...)

bayesglm.fit (x, y, weights = rep(1, nobs),
    start = NULL, etastart = NULL,
    mustart = NULL, offset = rep(0, nobs), family = gaussian(),
    control = list(), intercept = TRUE,
    prior.mean = 0,
    prior.scale = NULL,
    prior.df = 1,
    prior.mean.for.intercept = 0,
    prior.scale.for.intercept = NULL,
    prior.df.for.intercept = 1,
    min.prior.scale=1e-12, scaled = TRUE,
    print.unnormalized.log.posterior=FALSE, Warning=TRUE)

Arguments

`formula`	a symbolic description of the model to be fit. The details of model specification are given below.
`family`	a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See `family` for details of family functions.)
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `glm` is called.
`weights`	an optional vector of weights to be used in the fitting process. Should be `NULL` or a numeric vector.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The “factory-fresh” default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	starting values for the vector of means.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length either one or equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if both are specified their sum is used. See `model.offset`.
`control`	a list of parameters for controlling the fitting process. See the documentation for `glm.control` for details.
`model`	a logical value indicating whether model frame should be included as a component of the returned value.
`method`	the method to be used in fitting the model. The default method `"glm.fit"` uses iteratively reweighted least squares (IWLS). The only current alternative is `"model.frame"` which returns the model frame and does no fitting.
`x`, `y`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
`drop.unused.levels`	default TRUE, if FALSE, it interpolates the intermediate values if the data have integer levels.
`intercept`	logical. Should an intercept be included in the null model?
`prior.mean`	prior mean for the coefficients: default is 0. Can be a vector of length equal to the number of predictors (not counting the intercept, if any). If it is a scalar, it is expanded to the length of this vector.
`prior.scale`	prior scale for the coefficients: default is NULL; if is NULL, for a logit model, prior.scale is 2.5; for a probit model, prior scale is 2.5*1.6. Can be a vector of length equal to the number of predictors (not counting the intercept, if any). If it is a scalar, it is expanded to the length of this vector.
`prior.df`	prior degrees of freedom for the coefficients. For t distribution: default is 1 (Cauchy). Set to Inf to get normal prior distributions. Can be a vector of length equal to the number of predictors (not counting the intercept, if any). If it is a scalar, it is expanded to the length of this vector.
`prior.mean.for.intercept`	prior mean for the intercept: default is 0. See ‘Details’.
`prior.scale.for.intercept`	prior scale for the intercept: default is NULL; for a logit model, prior scale for intercept is 10; for probit model, prior scale for intercept is rescaled as 10*1.6.
`prior.df.for.intercept`	prior degrees of freedom for the intercept: default is 1.
`min.prior.scale`	Minimum prior scale for the coefficients: default is 1e-12.
`scaled`	scaled=TRUE, the scales for the prior distributions of the coefficients are determined as follows: For a predictor with only one value, we just use prior.scale. For a predictor with two values, we use prior.scale/range(x). For a predictor with more than two values, we use prior.scale/(2sd(x)). If the response is Gaussian, prior.scale is also multiplied by 2 sd(y). Default is TRUE
`keep.order`	a logical value indicating whether the terms should keep their positions. If `FALSE` the terms are reordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on. Effects of a given order are kept in the order specified. Default is TRUE.
`drop.baseline`	Drop the base level of categorical x's, default is TRUE.
`maxit`	integer giving the maximal number of IWLS iterations, default is 100. This can also be controlled by `control`.
`print.unnormalized.log.posterior`	display the unnormalized log posterior likelihood for bayesglm, default=FALSE
`Warning`	default is TRUE, which will show the error messages of not convergence and separation.
`...`	further arguments passed to or from other methods.

Details

The program is a simple alteration of glm() that uses an approximate EM algorithm to update the betas at each step using an augmented regression to represent the prior information.

We use Student-t prior distributions for the coefficients. The prior distribution for the constant term is set so it applies to the value when all predictors are set to their mean values.

If scaled=TRUE, the scales for the prior distributions of the coefficients are determined as follows: For a predictor with only one value, we just use prior.scale. For a predictor with two values, we use prior.scale/range(x). For a predictor with more than two values, we use prior.scale/(2*sd(x)).

We include all the glm() arguments but we haven't tested that all the options (e.g., offsets, contrasts, deviance for the null model) all work.

The new arguments here are: prior.mean, prior.scale, prior.scale.for.intercept, prior.df, prior.df.for.interceptand scaled.

Value

See glm for details.

`prior.mean`	prior means for the coefficients and the intercept.
`prior.scale`	prior scales for the coefficients
`prior.df`	prior dfs for the coefficients.
`prior.scale.for.intercept`	prior scale for the intercept
`prior.df.for.intercept`	prior df for the intercept

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]; Daniel Lee [email protected]; Aleks Jakulin [email protected]

References

Andrew Gelman, Aleks Jakulin, Maria Grazia Pittau and Yu-Sung Su. (2009). “A Weakly Informative Default Prior Distribution For Logistic And Other Regression Models.” The Annals of Applied Statistics 2 (4): 1360–1383. http://www.stat.columbia.edu/~gelman/research/published/priors11.pdf

Examples

  n <- 100
  x1 <- rnorm (n)
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))

  M1 <- glm (y ~ x1 + x2, family=binomial(link="logit"))
  display (M1)

  M2 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=Inf, prior.df=Inf)
  display (M2)  # just a test:  this should be identical to classical logit

  M3 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"))
    # default Cauchy prior with scale 2.5
  display (M3)

  M4 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=1)
    # Same as M3, explicitly specifying Cauchy prior with scale 2.5
  display (M4)

  M5 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=7)   # t_7 prior with scale 2.5
  display (M5)

  M6 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=Inf)  # normal prior with scale 2.5
  display (M6)

# Create separation:  set y=1 whenever x2=1
# Now it should blow up without the prior!

  y <- ifelse (x2==1, 1, y)

  M1 <- glm (y ~ x1 + x2, family=binomial(link="logit"))
  display (M1)

  M2 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=Inf, prior.scale.for.intercept=Inf) # Same as M1
  display (M2)

  M3 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"))
  display (M3)

  M4 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.scale.for.intercept=10)  # Same as M3
  display (M4)

  M5 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=7)
  display (M5)

  M6 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=Inf)
  display (M6)

  # bayesglm with gaussian family (bayes lm)
  sigma <- 5
  y2 <- rnorm (n, b0+b1*x1+b2*x2, sigma)
  M7 <- bayesglm (y2 ~ x1 + x2, prior.scale=Inf, prior.df=Inf)
  display (M7)


  # bayesglm with categorical variables
  z1 <- trunc(runif(n, 4, 9))
  levels(factor(z1))
  z2 <- trunc(runif(n, 15, 19))
  levels(factor(z2))

  ## drop the base level (R default)
  M8 <- bayesglm (y ~ x1 + factor(z1) + factor(z2),
    family=binomial(link="logit"), prior.scale=2.5, prior.df=Inf)
  display (M8)

  ## keep all levels with the intercept, keep the variable order
  M9 <- bayesglm (y ~ x1 + x1:x2 + factor(z1) + x2 + factor(z2),
    family=binomial(link="logit"),
    prior.mean=rep(0,12),
    prior.scale=rep(2.5,12),
    prior.df=rep(Inf,12),
    prior.mean.for.intercept=0,
    prior.scale.for.intercept=10,
    prior.df.for.intercept=1,
    drop.baseline=FALSE, keep.order=TRUE)
  display (M9)

  ## keep all levels without the intercept
  M10 <- bayesglm (y ~ x1 + factor(z1) + x1:x2 + factor(z2)-1,
    family=binomial(link="logit"),
    prior.mean=rep(0,11),
    prior.scale=rep(2.5,11),
    prior.df=rep(Inf,11),
    drop.baseline=FALSE)
  display (M10)

n <- 100
  x1 <- rnorm (n)
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))

  M1 <- glm (y ~ x1 + x2, family=binomial(link="logit"))
  display (M1)

  M2 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=Inf, prior.df=Inf)
  display (M2)  # just a test:  this should be identical to classical logit

  M3 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"))
    # default Cauchy prior with scale 2.5
  display (M3)

  M4 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=1)
    # Same as M3, explicitly specifying Cauchy prior with scale 2.5
  display (M4)

  M5 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=7)   # t_7 prior with scale 2.5
  display (M5)

  M6 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=Inf)  # normal prior with scale 2.5
  display (M6)

# Create separation:  set y=1 whenever x2=1
# Now it should blow up without the prior!

  y <- ifelse (x2==1, 1, y)

  M1 <- glm (y ~ x1 + x2, family=binomial(link="logit"))
  display (M1)

  M2 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=Inf, prior.scale.for.intercept=Inf) # Same as M1
  display (M2)

  M3 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"))
  display (M3)

  M4 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.scale.for.intercept=10)  # Same as M3
  display (M4)

  M5 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=7)
  display (M5)

  M6 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"),
    prior.scale=2.5, prior.df=Inf)
  display (M6)

  # bayesglm with gaussian family (bayes lm)
  sigma <- 5
  y2 <- rnorm (n, b0+b1*x1+b2*x2, sigma)
  M7 <- bayesglm (y2 ~ x1 + x2, prior.scale=Inf, prior.df=Inf)
  display (M7)


  # bayesglm with categorical variables
  z1 <- trunc(runif(n, 4, 9))
  levels(factor(z1))
  z2 <- trunc(runif(n, 15, 19))
  levels(factor(z2))

  ## drop the base level (R default)
  M8 <- bayesglm (y ~ x1 + factor(z1) + factor(z2),
    family=binomial(link="logit"), prior.scale=2.5, prior.df=Inf)
  display (M8)

  ## keep all levels with the intercept, keep the variable order
  M9 <- bayesglm (y ~ x1 + x1:x2 + factor(z1) + x2 + factor(z2),
    family=binomial(link="logit"),
    prior.mean=rep(0,12),
    prior.scale=rep(2.5,12),
    prior.df=rep(Inf,12),
    prior.mean.for.intercept=0,
    prior.scale.for.intercept=10,
    prior.df.for.intercept=1,
    drop.baseline=FALSE, keep.order=TRUE)
  display (M9)

  ## keep all levels without the intercept
  M10 <- bayesglm (y ~ x1 + factor(z1) + x1:x2 + factor(z2)-1,
    family=binomial(link="logit"),
    prior.mean=rep(0,11),
    prior.scale=rep(2.5,11),
    prior.df=rep(Inf,11),
    drop.baseline=FALSE)
  display (M10)

Bayesian Ordered Logistic or Probit Regression

Description

Bayesian functions for ordered logistic or probit modeling with independent normal, t, or Cauchy prior distribution for the coefficients.

Usage

bayespolr(formula, data, weights, start,  ...,
    subset, na.action, contrasts = NULL,
    Hess = TRUE, model = TRUE,
    method = c("logistic", "probit", "cloglog", "cauchit"),
    drop.unused.levels=TRUE,
    prior.mean = 0,
    prior.scale = 2.5,
    prior.df = 1,
    prior.counts.for.bins = NULL,
    min.prior.scale=1e-12,
    scaled = TRUE,
    maxit = 100,
    print.unnormalized.log.posterior = FALSE)
bayespolr(formula, data, weights, start,  ...,
    subset, na.action, contrasts = NULL,
    Hess = TRUE, model = TRUE,
    method = c("logistic", "probit", "cloglog", "cauchit"),
    drop.unused.levels=TRUE,
    prior.mean = 0,
    prior.scale = 2.5,
    prior.df = 1,
    prior.counts.for.bins = NULL,
    min.prior.scale=1e-12,
    scaled = TRUE,
    maxit = 100,
    print.unnormalized.log.posterior = FALSE)

Arguments

`formula`	a formula expression as for regression models, of the form `response ~ predictors`. The response should be a factor (preferably an ordered factor), which will be interpreted as an ordinal response, with levels ordered as in the factor. A proportional odds model will be fitted. The model must have an intercept: attempts to remove one will lead to a warning and be ignored. An offset may be used. See the documentation of `formula` for other details.
`data`	an optional data frame in which to interpret the variables occurring in `formula`.
`weights`	optional case weights in fitting. Default to 1.
`start`	initial values for the parameters. This is in the format `c(coefficients, zeta)`
`...`	additional arguments to be passed to `optim`, most often a `control` argument.
`subset`	expression saying which subset of the rows of the data should be used in the fit. All observations are included by default.
`na.action`	a function to filter missing data.
`contrasts`	a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.
`Hess`	logical for whether the Hessian (the observed information matrix) should be returned.
`model`	logical for whether the model matrix should be returned.
`method`	logistic or probit or complementary log-log or cauchit (corresponding to a Cauchy latent variable and only available in R >= 2.1.0).
`drop.unused.levels`	default `TRUE`, if `FALSE`, it interpolates the intermediate values if the data have integer levels.
`prior.mean`	prior mean for the coefficients: default is 0. Can be a vector of length equal to the number of predictors (not counting the intercepts). If it is a scalar, it is expanded to the length of this vector.
`prior.scale`	prior scale for the coefficients: default is 2.5. Can be a vector of length equal to the number of predictors (not counting the intercepts). If it is a scalar, it is expanded to the length of this vector.
`prior.df`	for t distribution: default is 1 (Cauchy). Set to `Inf` to get normal prior distributions. Can be a vector of length equal to the number of predictors (not counting the intercepts). If it is a scalar, it is expanded to the length of this vector.
`prior.counts.for.bins`	default is `NULL`, which will augment the data by giving each cut point a `1/levels(y)`. To use a noninformative prior, assign prior.counts.for.bins = 0. If it is a scalar, it is expanded to the number of levels of y.
`min.prior.scale`	Minimum prior scale for the coefficients: default is 1e-12.
`scaled`	if `scaled = TRUE`, then the prior distribution is rescaled. Can be a vector of length equal to the number of cutpoints (intercepts). If it is a scalar, it is expanded to the length of this vector.
`maxit`	integer giving the maximal number of IWLS iterations, default is 100. This can also be controlled by `control`.
`print.unnormalized.log.posterior`	display the unnormalized log posterior likelihood for bayesglm fit, default=`FALSE`

Details

The program is a simple alteration of polr in VR version 7.2-31 that augments the loglikelihood with the log of the t prior distributions for the coefficients.

We use Student-t prior distributions for the coefficients. The prior distributions for the intercepts (the cutpoints) are set so they apply to the value when all predictors are set to their mean values.

If scaled=TRUE, the scales for the prior distributions of the coefficients are determined as follows: For a predictor with only one value, we just use prior.scale. For a predictor with two values, we use prior.scale/range(x). For a predictor with more than two values, we use prior.scale/(2*sd(x)).

Value

See polr for details.

`prior.mean`	prior means for the cofficients.
`prior.scale`	prior scales for the cofficients.
`prior.df`	prior dfs for the cofficients.
`prior.counts.for.bins`	prior counts for the cutpoints.

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]; Maria Grazia Pittau [email protected]

Examples

    M1 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
    display (M1)

    M2 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=Inf, prior.df=Inf) # Same as M1
    display (M2)

    M3 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
    display (M3)

    M4 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=2.5, prior.df=1)  # Same as M3
    display (M4)

    M5 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=2.5, prior.df=7)
    display (M5)

    M6 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=2.5, prior.df=Inf)
    display (M6)

    # Assign priors
    M7 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.mean=rep(0,6), prior.scale=rep(2.5,6), prior.df=c(1,1,1,7,7,7))
    display (M7)


    #### Another example
    y <- factor (rep (1:10,1:10))
    x <- rnorm (length(y))
    x <- x - mean(x)

    M8 <- polr (y ~ x)
    display (M8)

    M9 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=0)
    display (M9) # same as M1

    M10 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=10000)
    display (M10)


    #### Another example

    y <- factor (rep (1:3,1:3))
    x <- rnorm (length(y))
    x <- x - mean(x)

    M11 <- polr (y ~ x)
    display (M11)

    M12 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=0)
    display (M12) # same as M1

    M13 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=1)
    display (M13)

    M14 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=10)
    display (M14)

M1 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
    display (M1)

    M2 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=Inf, prior.df=Inf) # Same as M1
    display (M2)

    M3 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
    display (M3)

    M4 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=2.5, prior.df=1)  # Same as M3
    display (M4)

    M5 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=2.5, prior.df=7)
    display (M5)

    M6 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.scale=2.5, prior.df=Inf)
    display (M6)

    # Assign priors
    M7 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
        prior.mean=rep(0,6), prior.scale=rep(2.5,6), prior.df=c(1,1,1,7,7,7))
    display (M7)


    #### Another example
    y <- factor (rep (1:10,1:10))
    x <- rnorm (length(y))
    x <- x - mean(x)

    M8 <- polr (y ~ x)
    display (M8)

    M9 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=0)
    display (M9) # same as M1

    M10 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=10000)
    display (M10)


    #### Another example

    y <- factor (rep (1:3,1:3))
    x <- rnorm (length(y))
    x <- x - mean(x)

    M11 <- polr (y ~ x)
    display (M11)

    M12 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=0)
    display (M12) # same as M1

    M13 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=1)
    display (M13)

    M14 <- bayespolr (y ~ x,  prior.scale=Inf, prior.df=Inf, prior.counts.for.bins=10)
    display (M14)

Binned Residual Plot

Description

A function that plots averages of y versus averages of x and can be useful to plot residuals for logistic regression.

Usage

 binnedplot(x ,y, nclass=NULL, 
    xlab="Expected Values", ylab="Average residual", 
    main="Binned residual plot", 
    cex.pts=0.8, col.pts=1, col.int="gray", ...)
binnedplot(x ,y, nclass=NULL, 
    xlab="Expected Values", ylab="Average residual", 
    main="Binned residual plot", 
    cex.pts=0.8, col.pts=1, col.int="gray", ...)

Arguments

`x`	The expected values from the logistic regression.
`y`	The residuals values from logistic regression (observed values minus expected values).
`nclass`	Number of categories (bins) based on their fitted values in which the data are divided. Default=NULL and will take the value of nclass according to the $n$ such that if $n >=100$, nclass=floor(sqrt(length(x))); if $10<n<100$, nclass=10; if $n<10$, nclass=floor(n/2).
`xlab`	a label for the x axis, default is "Expected Values".
`ylab`	a label for the y axis, default is "Average residual".
`main`	a main title for the plot, default is "Binned residual plot". See also `title`.
`cex.pts`	The size of points, default=0.8.
`col.pts`	color of points, default is black
`col.int`	color of intervals, default is gray
`...`	Graphical parameters to be passed to methods

Details

In logistic regression, as with linear regression, the residuals can be defined as observed minus expected values. The data are discrete and so are the residuals. As a result, plots of raw residuals from logistic regression are generally not useful. The binned residuals plot instead, after dividing the data into categories (bins) based on their fitted values, plots the average residual versus the average fitted value for each bin.

Value

A plot in which the gray lines indicate plus and minus 2 standard-error bounds, within which one would expect about 95% of the binned residuals to fall, if the model were actually true.

Note

There is typically some arbitrariness in choosing the number of bins: each bin should contain enough points so that the averaged residuals are not too noisy, but it helps to have also many bins so as to see more local patterns in the residuals (see Gelman and Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, pag 97).

Author(s)

M. Grazia Pittau [email protected]; Yu-Sung Su [email protected]

References

Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2006.

Examples

old.par <- par(no.readonly = TRUE)
 data(lalonde)
 attach(lalonde)
 fit <- glm(treat ~ re74 + re75 + educ + black + hisp + married 
               + nodegr + u74 + u75, family=binomial(link="logit"))
 x <- predict(fit)
 y <- resid(fit)
 binnedplot(x,y)
par(old.par)
old.par <- par(no.readonly = TRUE)
 data(lalonde)
 attach(lalonde)
 fit <- glm(treat ~ re74 + re75 + educ + black + hisp + married 
               + nodegr + u74 + u75, family=binomial(link="logit"))
 x <- predict(fit)
 y <- resid(fit)
 binnedplot(x,y)
par(old.par)

Generic Function for Making Coefficient Plot

Description

Functions that plot the coefficients plus and minus 1 and 2 sd from a lm, glm, bugs, and polr fits.

Usage

coefplot(object,...)

## Default S3 method:
coefplot(coefs, sds, CI=2, 
                lower.conf.bounds, upper.conf.bounds,
                varnames=NULL, vertical=TRUE, 
                v.axis=TRUE, h.axis=TRUE,
                cex.var=0.8, cex.pts=0.9, 
                col.pts=1, pch.pts=20, var.las=2,
                main=NULL, xlab=NULL, ylab=NULL, mar=c(1,3,5.1,2),
                plot=TRUE, add=FALSE, offset=.1, ...)

## S4 method for signature 'bugs'
coefplot(object, var.idx=NULL, varnames=NULL, 
            CI=1, vertical=TRUE,
            v.axis=TRUE, h.axis=TRUE, 
            cex.var=0.8, cex.pts=0.9, 
            col.pts=1, pch.pts=20, var.las=2, 
            main=NULL, xlab=NULL, ylab=NULL, 
            plot=TRUE, add=FALSE, offset=.1,
            mar=c(1,3,5.1,2), ...)

## S4 method for signature 'numeric'
coefplot(object, ...)
## S4 method for signature 'lm'
coefplot(object, varnames=NULL, intercept=FALSE, ...)
## S4 method for signature 'glm'
coefplot(object, varnames=NULL, intercept=FALSE, ...)
## S4 method for signature 'polr'
coefplot(object, varnames=NULL, ...)
coefplot(object,...)

## Default S3 method:
coefplot(coefs, sds, CI=2, 
                lower.conf.bounds, upper.conf.bounds,
                varnames=NULL, vertical=TRUE, 
                v.axis=TRUE, h.axis=TRUE,
                cex.var=0.8, cex.pts=0.9, 
                col.pts=1, pch.pts=20, var.las=2,
                main=NULL, xlab=NULL, ylab=NULL, mar=c(1,3,5.1,2),
                plot=TRUE, add=FALSE, offset=.1, ...)

## S4 method for signature 'bugs'
coefplot(object, var.idx=NULL, varnames=NULL, 
            CI=1, vertical=TRUE,
            v.axis=TRUE, h.axis=TRUE, 
            cex.var=0.8, cex.pts=0.9, 
            col.pts=1, pch.pts=20, var.las=2, 
            main=NULL, xlab=NULL, ylab=NULL, 
            plot=TRUE, add=FALSE, offset=.1,
            mar=c(1,3,5.1,2), ...)

## S4 method for signature 'numeric'
coefplot(object, ...)
## S4 method for signature 'lm'
coefplot(object, varnames=NULL, intercept=FALSE, ...)
## S4 method for signature 'glm'
coefplot(object, varnames=NULL, intercept=FALSE, ...)
## S4 method for signature 'polr'
coefplot(object, varnames=NULL, ...)

Arguments

`object`	fitted objects-lm, glm, bugs and polr, or a vector of coefficients.
`...`	further arguments passed to or from other methods.
`coefs`	a vector of coefficients.
`sds`	a vector of sds of coefficients.
`CI`	confidence interval, default is 2, which will plot plus and minus 2 sds or 95% CI. If CI=1, plot plus and minus 1 sds or 50% CI instead.
`lower.conf.bounds`	lower bounds of confidence intervals.
`upper.conf.bounds`	upper bounds of confidence intervals.
`varnames`	a vector of variable names, default is NULL, which will use the names of variables; if specified, the length of varnames must be equal to the length of predictors, including the intercept.
`vertical`	orientation of the plot, default is TRUE which will plot variable names in the 2nd axis. If FALSE, plot variable names in the first axis instead.
`v.axis`	default is TRUE, which shows the bottom axis–axis(1).
`h.axis`	default is TRUE, which shows the left axis–axis(2).
`cex.var`	The fontsize of the varible names, default=0.8.
`cex.pts`	The size of data points, default=0.9.
`col.pts`	color of points and segments, default is black.
`pch.pts`	symbol of points, default is solid dot.
`var.las`	the orientation of variable names against the axis, default is 2. see the usage of `las` in `par`.
`main`	The main title (on top) using font and size (character expansion) `par("font.main")` and color `par("col.main")`.
`xlab`	X axis label using font and character expansion `par("font.lab")` and color `par("col.lab")`.
`ylab`	Y axis label, same font attributes as `xlab`.
`mar`	A numerical vector of the form `c(bottom, left, top, right)` which gives the number of lines of margin to be specified on the four sides of the plot. The default is `c(1,3,5.1,2)`.
`plot`	default is TRUE, plot the estimates.
`add`	if add=TRUE, plot over the existing plot. default is FALSE.
`offset`	add extra spaces to separate from the existing dots. default is 0.1.
`var.idx`	the index of the variables of a bugs object, default is NULL which will plot all the variables.
`intercept`	If TRUE will plot intercept, default=FALSE to get better presentation.

Details

This function plots coefficients from bugs, lm, glm and polr with 1 sd and 2 sd interval bars.

Value

Plot of the coefficients from a bugs, lm or glm fit. You can add the intercept, the variable names and the display the result of the fitted model.

Author(s)

Yu-Sung Su [email protected]

References

Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2006.

Examples

old.par <- par(no.readonly = TRUE)

 y1 <- rnorm(1000,50,23)
 y2 <- rbinom(1000,1,prob=0.72)
 x1 <- rnorm(1000,50,2) 
 x2 <- rbinom(1000,1,prob=0.63) 
 x3 <- rpois(1000, 2) 
 x4 <- runif(1000,40,100) 
 x5 <- rbeta(1000,2,2) 
 
 longnames <- c("a long name01","a long name02","a long name03",
                "a long name04","a long name05")

 fit1 <- lm(y1 ~ x1 + x2 + x3 + x4 + x5)
 fit2 <- glm(y2 ~ x1 + x2 + x3 + x4 + x5, 
            family=binomial(link="logit"))
 op <- par()
 # plot 1
 par (mfrow=c(2,2))
 coefplot(fit1)
 coefplot(fit2, col.pts="blue")
 
 # plot 2
 longnames <- c("(Intercept)", longnames) 
 coefplot(fit1, longnames, intercept=TRUE, CI=1)
 
 # plot 3
 coefplot(fit2, vertical=FALSE, var.las=1, frame.plot=TRUE)
 
 # plot 4: comparison to show bayesglm works better than glm
 n <- 100
 x1 <- rnorm (n)
 x2 <- rbinom (n, 1, .5)
 b0 <- 1
 b1 <- 1.5
 b2 <- 2
 y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
 y <- ifelse (x2==1, 1, y)
 x1 <- rescale(x1)
 x2 <- rescale(x2, "center")
 
 M1 <- glm (y ~ x1 + x2, family=binomial(link="logit"))
       display (M1)
 M2 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"))
       display (M2)

#=================== 
#    stacked plot
#===================
  coefplot(M2, xlim=c(-1,5), intercept=TRUE)
  coefplot(M1, add=TRUE, col.pts="red")
  
#==================== 
# arrayed plot       
#====================
  par(mfrow=c(1,2))
  x.scale <- c(0, 7.5) # fix x.scale for comparison
  coefplot(M1, xlim=x.scale, main="glm", intercept=TRUE)
  coefplot(M2, xlim=x.scale, main="bayesglm", intercept=TRUE)

# plot 5: the ordered logit model from polr
 M3 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
 coefplot(M3, main="polr")
   
 M4 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
 coefplot(M4, main="bayespolr", add=TRUE, col.pts="red")

## plot 6: plot bugs & lmer
# par <- op
# M5 <- lmer(Reaction ~ Days + (1|Subject), sleepstudy)
# M5.sim <- mcsamp(M5)
# coefplot(M5.sim, var.idx=5:22, CI=1, ylim=c(18,1), main="lmer model")


# plot 7: plot coefficients & sds vectors
 coef.vect <- c(0.2, 1.4, 2.3, 0.5)
 sd.vect <- c(0.12, 0.24, 0.23, 0.15)
 longnames <- c("var1", "var2", "var3", "var4")
 coefplot (coef.vect, sd.vect, varnames=longnames, main="Regression Estimates")
 coefplot (coef.vect, sd.vect, varnames=longnames, vertical=FALSE, 
    var.las=1, main="Regression Estimates")
    
par(old.par)
old.par <- par(no.readonly = TRUE)

 y1 <- rnorm(1000,50,23)
 y2 <- rbinom(1000,1,prob=0.72)
 x1 <- rnorm(1000,50,2) 
 x2 <- rbinom(1000,1,prob=0.63) 
 x3 <- rpois(1000, 2) 
 x4 <- runif(1000,40,100) 
 x5 <- rbeta(1000,2,2) 
 
 longnames <- c("a long name01","a long name02","a long name03",
                "a long name04","a long name05")

 fit1 <- lm(y1 ~ x1 + x2 + x3 + x4 + x5)
 fit2 <- glm(y2 ~ x1 + x2 + x3 + x4 + x5, 
            family=binomial(link="logit"))
 op <- par()
 # plot 1
 par (mfrow=c(2,2))
 coefplot(fit1)
 coefplot(fit2, col.pts="blue")
 
 # plot 2
 longnames <- c("(Intercept)", longnames) 
 coefplot(fit1, longnames, intercept=TRUE, CI=1)
 
 # plot 3
 coefplot(fit2, vertical=FALSE, var.las=1, frame.plot=TRUE)
 
 # plot 4: comparison to show bayesglm works better than glm
 n <- 100
 x1 <- rnorm (n)
 x2 <- rbinom (n, 1, .5)
 b0 <- 1
 b1 <- 1.5
 b2 <- 2
 y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
 y <- ifelse (x2==1, 1, y)
 x1 <- rescale(x1)
 x2 <- rescale(x2, "center")
 
 M1 <- glm (y ~ x1 + x2, family=binomial(link="logit"))
       display (M1)
 M2 <- bayesglm (y ~ x1 + x2, family=binomial(link="logit"))
       display (M2)

#=================== 
#    stacked plot
#===================
  coefplot(M2, xlim=c(-1,5), intercept=TRUE)
  coefplot(M1, add=TRUE, col.pts="red")
  
#==================== 
# arrayed plot       
#====================
  par(mfrow=c(1,2))
  x.scale <- c(0, 7.5) # fix x.scale for comparison
  coefplot(M1, xlim=x.scale, main="glm", intercept=TRUE)
  coefplot(M2, xlim=x.scale, main="bayesglm", intercept=TRUE)

# plot 5: the ordered logit model from polr
 M3 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
 coefplot(M3, main="polr")
   
 M4 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
 coefplot(M4, main="bayespolr", add=TRUE, col.pts="red")

## plot 6: plot bugs & lmer
# par <- op
# M5 <- lmer(Reaction ~ Days + (1|Subject), sleepstudy)
# M5.sim <- mcsamp(M5)
# coefplot(M5.sim, var.idx=5:22, CI=1, ylim=c(18,1), main="lmer model")


# plot 7: plot coefficients & sds vectors
 coef.vect <- c(0.2, 1.4, 2.3, 0.5)
 sd.vect <- c(0.12, 0.24, 0.23, 0.15)
 longnames <- c("var1", "var2", "var3", "var4")
 coefplot (coef.vect, sd.vect, varnames=longnames, main="Regression Estimates")
 coefplot (coef.vect, sd.vect, varnames=longnames, vertical=FALSE, 
    var.las=1, main="Regression Estimates")
    
par(old.par)

Contrast Matrices

Description

Return a matrix of contrasts used in bayesglm.

Usage

contr.bayes.unordered(n, base = 1, contrasts = TRUE)
contr.bayes.ordered (n, scores = 1:n, contrasts = TRUE)
contr.bayes.unordered(n, base = 1, contrasts = TRUE)
contr.bayes.ordered (n, scores = 1:n, contrasts = TRUE)

Arguments

`n`	a vector of levels for a factor, or the number of levels.
`base`	an integer specifying which group is considered the baseline group. Ignored if `contrasts` is `FALSE`.
`contrasts`	a logical indicating whether contrasts should be computed.
`scores`	the set of values over which orthogonal polynomials are to be computed.

Details

These functions are adapted from contr.treatment and contr.poly in stats package. The purpose for these functions are to keep the baseline levels of categorical variables and thus to suit the use of bayesglm.

contr.bayes.unordered is equivalent to contr.treatment whereas contr.bayes.ordered is equivalent to contr.poly.

Author(s)

Yu-Sung Su [email protected]

Examples

 cat.var <- rep(1:3, 5)
 dim(contr.bayes.unordered(cat.var))
 # 15*15 baseline level kept!
 dim(contr.treatment(cat.var))
 # 15*14
cat.var <- rep(1:3, 5)
 dim(contr.bayes.unordered(cat.var))
 # 15*15 baseline level kept!
 dim(contr.treatment(cat.var))
 # 15*14

Correlation Plot

Description

Function for making a correlation plot starting from a data matrix

Usage

corrplot (data, varnames=NULL, cutpts=NULL,
    abs=TRUE, details=TRUE, 
    n.col.legend=5, cex.col=0.7, 
    cex.var=0.9, digits=1, color=FALSE)
corrplot (data, varnames=NULL, cutpts=NULL,
    abs=TRUE, details=TRUE, 
    n.col.legend=5, cex.col=0.7, 
    cex.var=0.9, digits=1, color=FALSE)

Arguments

`data`	a data matrix
`varnames`	variable names of the data matrix, if not provided use default variable names
`abs`	if TRUE, transform all correlation values into positive values, default=TRUE.
`cutpts`	a vector of cutting points for color legend, default is NULL. The function will decide the cutting points if cutpts is not assigned.
`details`	show more than one digits correlaton values. Default is TRUE. FALSE is suggested to get readable output.
`n.col.legend`	number of legend for the color thermometer.
`cex.col`	font size of the color thermometer.
`cex.var`	font size of the variable names.
`digits`	number of digits shown in the text of the color theromoeter.
`color`	color of the plot, default is FALSE, which uses gray scale.

Details

The function adapts the R function for Figure 8 in Tian Zheng, Matthew Salganik, and Andrew Gelman, 2006, "How many people do you know in prison?: using overdispersion in count data to estimate social structure in networks", Journal of the American Statistical Association, Vol.101, N0. 474: p.409-23.

Value

A correlation plot.

Author(s)

Tian Zheng [email protected]; Yu-Sung Su [email protected]

References

Tian Zheng, Matthew Salganik, and Andrew Gelman, 2006, "How many people do you know in prison?: using overdispersion in count data to estimate social structure in networks", Journal of the American Statistical Association, Vol.101, N0. 474: p.409-23

Examples

old.par <- par(no.readonly = TRUE)

 x1 <- rnorm(1000,50,2) 
 x2 <- rbinom(1000,1,prob=0.63) 
 x3 <- rpois(1000, 2) 
 x4 <- runif(1000,40,100) 
 x5 <- rnorm(1000,100,30)
 x6 <- rbeta(1000,2,2) 
 x7 <- rpois(1000,10) 
 x8 <- rbinom(1000,1,prob=0.4) 
 x9 <- rbeta(1000,5,4) 
 x10 <- runif(1000,-10,-1) 

 test.data <- data.matrix(cbind(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10))
 test.names <- c("a short name01","a short name02","a short name03",
                 "a short name04","a short name05","a short name06",
                 "a short name07","a short name08","a short name09",
                 "a short name10")
 
 # example 1
 corrplot(test.data)
 
 # example 2
 corrplot(test.data,test.names, abs=FALSE, n.col.legend=7)
 corrplot(test.data,test.names, abs=TRUE, n.col.legend=7)
 
 # example 3
 data(lalonde)
 corrplot(lalonde, details=FALSE, color=TRUE)
 corrplot(lalonde, cutpts=c(0,0.25,0.5,0.75), color=TRUE, digits=2)
 
par(old.par)
old.par <- par(no.readonly = TRUE)

 x1 <- rnorm(1000,50,2) 
 x2 <- rbinom(1000,1,prob=0.63) 
 x3 <- rpois(1000, 2) 
 x4 <- runif(1000,40,100) 
 x5 <- rnorm(1000,100,30)
 x6 <- rbeta(1000,2,2) 
 x7 <- rpois(1000,10) 
 x8 <- rbinom(1000,1,prob=0.4) 
 x9 <- rbeta(1000,5,4) 
 x10 <- runif(1000,-10,-1) 

 test.data <- data.matrix(cbind(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10))
 test.names <- c("a short name01","a short name02","a short name03",
                 "a short name04","a short name05","a short name06",
                 "a short name07","a short name08","a short name09",
                 "a short name10")
 
 # example 1
 corrplot(test.data)
 
 # example 2
 corrplot(test.data,test.names, abs=FALSE, n.col.legend=7)
 corrplot(test.data,test.names, abs=TRUE, n.col.legend=7)
 
 # example 3
 data(lalonde)
 corrplot(lalonde, details=FALSE, color=TRUE)
 corrplot(lalonde, cutpts=c(0,0.25,0.5,0.75), color=TRUE, digits=2)
 
par(old.par)

Histogram for Discrete Distributions

Description

Creates a prettier histogram for discrete distributions

Usage

discrete.histogram (x, prob, prob2=NULL, prob3=NULL,
    xlab="x", xaxs.label=NULL, yaxs.label=NULL, bar.width=NULL,
    freq=FALSE, prob.col="blue", prob2.col="red", prob3.col="gray", ...)
discrete.histogram (x, prob, prob2=NULL, prob3=NULL,
    xlab="x", xaxs.label=NULL, yaxs.label=NULL, bar.width=NULL,
    freq=FALSE, prob.col="blue", prob2.col="red", prob3.col="gray", ...)

Arguments

`x`	The vector of x's
`prob`	The probabilities for the x's
`prob2`	A second vector of probabilities of the x's
`prob3`	A third vector of probabilities of the x's
`xlab`	Label for the x axis
`xaxs.label`	Label for the x's
`yaxs.label`	Label for the y axis
`bar.width`	Width of the bars
`freq`	If TRUE, shows a frequency histogram as opposed to probability.
`prob.col`	The color of the first set of histogram bars.
`prob2.col`	The color of the second set of histogram bars.
`prob3.col`	The color of the third set of histogram bars.
`...`	Additional arguments passed to function `plot`

Details

This function displays a histogram for discrete probability distributions.

Examples

a <- c(3,4,0,0,5,1,1,1,1,0)
discrete.histogram (a)

x <- c(0,1,3,4,5)
p <- c(.3,.4,.1,.1,.1)
discrete.histogram (x,p)

x <- c(0,1,3,4,5)
y <- c(3,4,1,1,1)
discrete.histogram (x,y)
a <- c(3,4,0,0,5,1,1,1,1,0)
discrete.histogram (a)

x <- c(0,1,3,4,5)
p <- c(.3,.4,.1,.1,.1)
discrete.histogram (x,p)

x <- c(0,1,3,4,5)
y <- c(3,4,1,1,1)
discrete.histogram (x,y)

Functions for Processing lm, glm, mer, polr and svyglm Output

Description

This generic function gives a clean printout of lm, glm, mer, polr and svyglm objects.

Usage

display (object, ...)

## S4 method for signature 'lm'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'bayesglm'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'glm'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'merMod'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'polr'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'svyglm'
display(object, digits=2, detail=FALSE)
display (object, ...)

## S4 method for signature 'lm'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'bayesglm'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'glm'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'merMod'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'polr'
display(object, digits=2, detail=FALSE)
## S4 method for signature 'svyglm'
display(object, digits=2, detail=FALSE)

Arguments

`object`	The output of a call to lm, glm, mer, polr, svyglm or related regressions function with n data points and k predictors.
`...`	further arguments passed to or from other methods.
`digits`	number of significant digits to display.
`detail`	defaul is `FALSE`, if `TRUE`, display p-values or z-values

Details

This generic function gives a clean printout of lm, glm, mer and polr objects, focusing on the most pertinent pieces of information: the coefficients and their standard errors, the sample size, number of predictors, residual standard deviation, and R-squared. Note: R-squared is automatically displayed to 2 digits, and deviances are automatically displayed to 1 digit, no matter what.

Value

Coefficients and their standard errors, the sample size, number of predictors, residual standard deviation, and R-squared

Note

Output are the model, the regression coefficients and standard errors, and the residual sd and R-squared (for a linear model), or the null deviance and residual deviance (for a generalized linear model).

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]; Maria Grazia Pittau [email protected]

References

Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2006.

Examples

                          
# Here's a simple example of a model of the form, y = a + bx + error, 
# with 10 observations in each of 10 groups, and with both the 
# intercept and the slope varying by group.  First we set up the model and data.
   group <- rep(1:10, rep(10,10))
   group2 <- rep(1:10, 10)
   mu.a <- 0
   sigma.a <- 2
   mu.b <- 3
   sigma.b <- 4
   rho <- 0.56
   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b, 
                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
   sigma.y <- 1
   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
   a <- ab[,1]
   b <- ab[,2]
   d <- rnorm(10)

   x <- rnorm (100)
   y1 <- rnorm (100, a[group] + b*x, sigma.y)
   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))
   y3 <- rnorm (100, a[group] + b[group]*x + d[group2], sigma.y)
   y4 <- rbinom(100, 1, prob=invlogit(a[group] + b*x + d[group2]))


# display a simple linear model

   M1 <- lm (y1 ~ x)
   display (M1)
   M1.sim <- sim(M1, n.sims=2)  

# display a simple logit model

   M2 <- glm (y2 ~ x, family=binomial(link="logit"))
   display (M2)
   M2.sim <- sim(M2, n.sims=2)  

# Then fit and display a simple varying-intercept model:
 
   M3 <- lmer (y1 ~ x + (1|group))
   display (M3)
   M3.sim <- sim(M3, n.sims=2)  


# Then the full varying-intercept, varying-slope model:
 
   M4 <- lmer (y1 ~ x + (1 + x |group))
   display (M4)
   M4.sim <- sim(M4, n.sims=2)  

 
# Then the full varying-intercept, logit model:
 
   M5 <- glmer (y2 ~ x + (1|group), family=binomial(link="logit"))
   display (M5)
   M5.sim <- sim(M5, n.sims=2)  

 
# Then the full varying-intercept, varying-slope logit model:
 
   M6 <- glmer (y2 ~ x + (1|group) + (0 + x |group), 
        family=binomial(link="logit"))
   display (M6)
   M6.sim <- sim(M6, n.sims=2)  


# Then non-nested varying-intercept, varying-slop model:
   
   M7 <- lmer (y3 ~ x + (1 + x |group) + (1|group2))
   display(M7)
   M7.sim <- sim(M7, n.sims=2)  

 
# Then the ordered logit model from polr

   M8 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
   display(M8)
   
   M9 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
   display(M9) 
   

# Here's a simple example of a model of the form, y = a + bx + error, 
# with 10 observations in each of 10 groups, and with both the 
# intercept and the slope varying by group.  First we set up the model and data.
   group <- rep(1:10, rep(10,10))
   group2 <- rep(1:10, 10)
   mu.a <- 0
   sigma.a <- 2
   mu.b <- 3
   sigma.b <- 4
   rho <- 0.56
   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b, 
                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
   sigma.y <- 1
   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
   a <- ab[,1]
   b <- ab[,2]
   d <- rnorm(10)

   x <- rnorm (100)
   y1 <- rnorm (100, a[group] + b*x, sigma.y)
   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))
   y3 <- rnorm (100, a[group] + b[group]*x + d[group2], sigma.y)
   y4 <- rbinom(100, 1, prob=invlogit(a[group] + b*x + d[group2]))


# display a simple linear model

   M1 <- lm (y1 ~ x)
   display (M1)
   M1.sim <- sim(M1, n.sims=2)  

# display a simple logit model

   M2 <- glm (y2 ~ x, family=binomial(link="logit"))
   display (M2)
   M2.sim <- sim(M2, n.sims=2)  

# Then fit and display a simple varying-intercept model:
 
   M3 <- lmer (y1 ~ x + (1|group))
   display (M3)
   M3.sim <- sim(M3, n.sims=2)  


# Then the full varying-intercept, varying-slope model:
 
   M4 <- lmer (y1 ~ x + (1 + x |group))
   display (M4)
   M4.sim <- sim(M4, n.sims=2)  

 
# Then the full varying-intercept, logit model:
 
   M5 <- glmer (y2 ~ x + (1|group), family=binomial(link="logit"))
   display (M5)
   M5.sim <- sim(M5, n.sims=2)  

 
# Then the full varying-intercept, varying-slope logit model:
 
   M6 <- glmer (y2 ~ x + (1|group) + (0 + x |group), 
        family=binomial(link="logit"))
   display (M6)
   M6.sim <- sim(M6, n.sims=2)  


# Then non-nested varying-intercept, varying-slop model:
   
   M7 <- lmer (y3 ~ x + (1 + x |group) + (1|group2))
   display(M7)
   M7.sim <- sim(M7, n.sims=2)  

 
# Then the ordered logit model from polr

   M8 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
   display(M8)
   
   M9 <- bayespolr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
   display(M9)

Extract AIC and DIC from a ‘mer’ model

Description

Computes the (generalized) Akaike *A*n *I*nformation *C*riterion and *D*eviance *I*nformation *C*riterion for a mer model.

Usage

extractDIC(fit,...)
## S3 method for class 'merMod'
extractDIC(fit,...)

extractDIC(fit,...)
## S3 method for class 'merMod'
extractDIC(fit,...)

Arguments

`fit`	fitted `merMod` mode, usually the result of a fiiter like `merMod`.
`...`	further arguments (currently unused).

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]

Examples

fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
extractAIC(fm1)
extractDIC(fm1)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
extractAIC(fm1)
extractDIC(fm1)

Formating the Rounding of Numbers

Description

fround rounds the values in its first argument to the specified number of decimal places with surrounding quotes.

pfround rounds the values in its first argument to the specified number of decimal places without surrounding quotes.

Usage

fround(x, digits)
pfround(x, digits)
fround(x, digits)
pfround(x, digits)

Arguments

`x`	a numeric vector.
`digits`	integer indicating the precision to be used.

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]

Examples

    x <- rnorm(1)
    fround(x, digits=2)
    pfround(x, digits=2)
x <- rnorm(1)
    fround(x, digits=2)
    pfround(x, digits=2)

Function to Recall Last Source File

Description

A function that like source() but recalls the last source file names by default.

Usage

go(..., add=FALSE, timer=FALSE)
go(..., add=FALSE, timer=FALSE)

Arguments

`...`	list of filenames as character strings.
`add`	add these names to the current list; if replace, then `FALSE`.
`timer`	time the execution time of `go()`.

Author(s)

Jouni Kerman [email protected] [email protected]

Examples

    go('myprog')            # will run source('myprog.r')
    go()                    # will run source('myprog.r') again
    go('somelib',add=TRUE)  # will run source('myprog.r') and source('somelib.r')
    go('myprog','somelib')  # same as above
    go('mytest')            # will run source('mytest') only
    go()                    # runs source('mytest') again
    G                       # short cut to call go()
 go('myprog')            # will run source('myprog.r')
    go()                    # will run source('myprog.r') again
    go('somelib',add=TRUE)  # will run source('myprog.r') and source('somelib.r')
    go('myprog','somelib')  # same as above
    go('mytest')            # will run source('mytest') only
    go()                    # runs source('mytest') again
    G                       # short cut to call go()

Logistic and Inverse logistic functions

Description

Inverse-logit function, transforms continuous values to the range (0, 1)

Usage

logit(x)
invlogit(x)
logit(x)
invlogit(x)

Arguments

`x`	A vector of continuous values

Details

The Inverse-logit function defined as: $logit^-1(x) = e^x/(1+e^x)$ transforms continuous values to the range (0, 1), which is necessary, since probabilities must be between 0 and 1 and maps from the linear predictor to the probabilities

Value

A vector of estimated probabilities

Author(s)

Andrew Gelman [email protected], M.Grazia Pittau [email protected]

References

Andrew Gelman and Jennifer Hill. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Examples

 data(frisk)
 n <- 100
 x1 <- rnorm (n)
 x2 <- rbinom (n, 1, .5)
 b0 <- 1
 b1 <- 1.5
 b2 <- 2
 Inv.logit <- invlogit(b0+b1*x1+b2*x2)
 plot(b0+b1*x1+b2*x2, Inv.logit)
data(frisk)
 n <- 100
 x1 <- rnorm (n)
 x2 <- rbinom (n, 1, .5)
 b0 <- 1
 b1 <- 1.5
 b2 <- 2
 Inv.logit <- invlogit(b0+b1*x1+b2*x2)
 plot(b0+b1*x1+b2*x2, Inv.logit)

Lalonde Dataset

Description

Dataset used by Dehejia and Wahba (1999) to evaluate propensity score matching.

Usage

data(lalonde)data(lalonde)

Format

A data frame with 445 observations on the following 12 variables.

age: age in years.
educ: years of schooling.
black: indicator variable for blacks.
hisp: indicator variable for Hispanics.
married: indicator variable for martial status.
nodegr: indicator variable for high school diploma.
re74: real earnings in 1974.
re75: real earnings in 1975.
re78: real earnings in 1978.
u74: indicator variable for earnings in 1974 being zero.
u75: indicator variable for earnings in 1975 being zero.
treat: an indicator variable for treatment status.

Details

Two demos are provided which use this dataset. The first, DehejiaWahba, replicates one of the models from Dehejia and Wahba (1999). The second demo, AbadieImbens, replicates the models produced by Abadie and Imbens https://scholar.harvard.edu/imbens/scholar_software/matching-estimators. Many of these models are found to produce good balance for the Lalonde data.

Note

This documentation is adapted from Matching package.

References

Dehejia, Rajeev and Sadek Wahba. 1999.“Causal Effects in Non-Experimental Studies: Re-Evaluating the Evaluation of Training Programs.” Journal of the American Statistical Association 94 (448): 1053-1062.

LaLonde, Robert. 1986. “Evaluating the Econometric Evaluations of Training Programs.” American Economic Review 76:604-620.

Examples

data(lalonde)
data(lalonde)

Single Nearest Neighborhood Matching

Description

Function for processing matching with propensity score

Usage

matching(z, score, replace=FALSE)
matching(z, score, replace=FALSE)

Arguments

`z`	vector of indicators for treatment or control.
`score`	vector of the propensity scores in the same order as z.
`replace`	whether the control units could be reused for matching, default is `FALSE`.

Details

Function for matching each treatment unit in turn the control unit (not previously chosen) with the closest propensity score

Value

The function returns a vector of indices that the corresponding unit is matched to. 0 means matched to nothing.

Author(s)

Jeniffer Hill [email protected]; Yu-Sung Su [email protected]

References

Andrew Gelman and Jennifer Hill. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Examples

# matching first
data(lalonde)
attach(lalonde)
fit <- glm(treat ~ re74 + re75 + age + factor(educ) + 
            black + hisp + married + nodegr + u74 + u75, 
            family=binomial(link="logit"))
pscores <- predict(fit, type="response")
matches <- matching(z=lalonde$treat, score=pscores)
matched <- matches$cnts

# balance check!
b.stats <- balance(lalonde, treat, matched)
print(b.stats)
plot(b.stats)
# matching first
data(lalonde)
attach(lalonde)
fit <- glm(treat ~ re74 + re75 + age + factor(educ) + 
            black + hisp + married + nodegr + u74 + u75, 
            family=binomial(link="logit"))
pscores <- predict(fit, type="response")
matches <- matching(z=lalonde$treat, score=pscores)
matched <- matches$cnts

# balance check!
b.stats <- balance(lalonde, treat, matched)
print(b.stats)
plot(b.stats)

Generic Function to Run ‘mcmcsamp()’ in lme4

Description

The quick function for MCMC sampling for lmer and glmer objects and convert to Bugs objects for easy display.

Usage

## Default S3 method:
mcsamp(object, n.chains=3, n.iter=1000, n.burnin=floor(n.iter/2), 
    n.thin=max(1, floor(n.chains * (n.iter - n.burnin)/1000)), 
    saveb=TRUE, deviance=TRUE, make.bugs.object=TRUE)
## S4 method for signature 'merMod'
 mcsamp(object, ...)
## Default S3 method:
mcsamp(object, n.chains=3, n.iter=1000, n.burnin=floor(n.iter/2), 
    n.thin=max(1, floor(n.chains * (n.iter - n.burnin)/1000)), 
    saveb=TRUE, deviance=TRUE, make.bugs.object=TRUE)
## S4 method for signature 'merMod'
 mcsamp(object, ...)

Arguments

`object`	`mer` objects from `lme4`
`n.chains`	number of MCMC chains
`n.iter`	number of iteration for each MCMC chain
`n.burnin`	number of burnin for each MCMC chain, Default is `n.iter/2`, that is, discarding the first half of the simulations.
`n.thin`	keep every kth draw from each MCMC chain. Must be a positive integer. Default is `max(1, floor(n.chains * (n.iter-n.burnin) / 1000))` which will only thin if there are at least 2000 simulations.
`saveb`	if 'TRUE', causes the values of the random effects in each sample to be saved.
`deviance`	compute deviance for `mer` objects. Only works for `lmer` object
`make.bugs.object`	tranform the output into bugs object, default is TRUE
`...`	further arguments passed to or from other methods.

Details

This function generates a sample from the posterior distribution of the parameters of a fitted model using Markov Chain Monte Carlo methods. It automatically simulates multiple sequences and allows convergence to be monitored. The function relies on mcmcsamp in lme4.

Value

An object of (S3) class '"bugs"' suitable for use with the functions in the "R2WinBUGS" package.

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]

References

Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2006.

Douglas Bates and Deepayan Sarkar, lme4: Linear mixed-effects models using S4 classes.

Examples

## Here's a simple example of a model of the form, y = a + bx + error, 
## with 10 observations in each of 10 groups, and with both the intercept 
## and the slope varying by group.  First we set up the model and data.
##   
#   group <- rep(1:10, rep(10,10))
#   group2 <- rep(1:10, 10)
#   mu.a <- 0
#   sigma.a <- 2
#   mu.b <- 3
#   sigma.b <- 4
#   rho <- 0.56
#   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b, 
#                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
#   sigma.y <- 1
#   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
#   a <- ab[,1]
#   b <- ab[,2]
#   d <- rnorm(10)
#
#   x <- rnorm (100)
#   y1 <- rnorm (100, a[group] + b*x, sigma.y)
#   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))
#   y3 <- rnorm (100, a[group] + b[group]*x + d[group2], sigma.y)
#   y4 <- rbinom(100, 1, prob=invlogit(a[group] + b*x + d[group2]))
#
## 
## Then fit and display a simple varying-intercept model:
# 
#   M1 <- lmer (y1 ~ x + (1|group))
#   display (M1)
#   M1.sim <- mcsamp (M1)
#   print (M1.sim)
#   plot (M1.sim)
## 
## Then the full varying-intercept, varying-slope model:
## 
#   M2 <- lmer (y1 ~ x + (1 + x |group))
#   display (M2)
#   M2.sim <- mcsamp (M2)
#   print (M2.sim)
#   plot (M2.sim)
## 
## Then the full varying-intercept, logit model:
## 
#   M3 <- lmer (y2 ~ x + (1|group), family=binomial(link="logit"))
#   display (M3)
#   M3.sim <- mcsamp (M3)
#   print (M3.sim)
#   plot (M3.sim)
## 
## Then the full varying-intercept, varying-slope logit model:
## 
#   M4 <- lmer (y2 ~ x + (1|group) + (0+x |group), 
#        family=binomial(link="logit"))
#   display (M4)
#   M4.sim <- mcsamp (M4)
#   print (M4.sim)
#   plot (M4.sim)
#   
##
## Then non-nested varying-intercept, varying-slop model:
##
#   M5 <- lmer (y3 ~ x + (1 + x |group) + (1|group2))
#   display(M5)
#   M5.sim <- mcsamp (M5)
#   print (M5.sim)
#   plot (M5.sim)
      
 ## Here's a simple example of a model of the form, y = a + bx + error, 
## with 10 observations in each of 10 groups, and with both the intercept 
## and the slope varying by group.  First we set up the model and data.
##   
#   group <- rep(1:10, rep(10,10))
#   group2 <- rep(1:10, 10)
#   mu.a <- 0
#   sigma.a <- 2
#   mu.b <- 3
#   sigma.b <- 4
#   rho <- 0.56
#   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b, 
#                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
#   sigma.y <- 1
#   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
#   a <- ab[,1]
#   b <- ab[,2]
#   d <- rnorm(10)
#
#   x <- rnorm (100)
#   y1 <- rnorm (100, a[group] + b*x, sigma.y)
#   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))
#   y3 <- rnorm (100, a[group] + b[group]*x + d[group2], sigma.y)
#   y4 <- rbinom(100, 1, prob=invlogit(a[group] + b*x + d[group2]))
#
## 
## Then fit and display a simple varying-intercept model:
# 
#   M1 <- lmer (y1 ~ x + (1|group))
#   display (M1)
#   M1.sim <- mcsamp (M1)
#   print (M1.sim)
#   plot (M1.sim)
## 
## Then the full varying-intercept, varying-slope model:
## 
#   M2 <- lmer (y1 ~ x + (1 + x |group))
#   display (M2)
#   M2.sim <- mcsamp (M2)
#   print (M2.sim)
#   plot (M2.sim)
## 
## Then the full varying-intercept, logit model:
## 
#   M3 <- lmer (y2 ~ x + (1|group), family=binomial(link="logit"))
#   display (M3)
#   M3.sim <- mcsamp (M3)
#   print (M3.sim)
#   plot (M3.sim)
## 
## Then the full varying-intercept, varying-slope logit model:
## 
#   M4 <- lmer (y2 ~ x + (1|group) + (0+x |group), 
#        family=binomial(link="logit"))
#   display (M4)
#   M4.sim <- mcsamp (M4)
#   print (M4.sim)
#   plot (M4.sim)
#   
##
## Then non-nested varying-intercept, varying-slop model:
##
#   M5 <- lmer (y3 ~ x + (1 + x |group) + (1|group2))
#   display(M5)
#   M5.sim <- mcsamp (M5)
#   print (M5.sim)
#   plot (M5.sim)

Construct Design Matrices

Description

model.matrixBayes creates a design matrix.

Usage

model.matrixBayes(object, data = environment(object),
    contrasts.arg = NULL, xlev = NULL, keep.order = FALSE, drop.baseline=FALSE,...)

model.matrixBayes(object, data = environment(object),
    contrasts.arg = NULL, xlev = NULL, keep.order = FALSE, drop.baseline=FALSE,...)

Arguments

`object`	an object of an appropriate class. For the default method, a model formula or terms object.
`data`	a data frame created with `model.frame`. If another sort of object, `model.frame` is called first.
`contrasts.arg`	A list, whose entries are contrasts suitable for input to the `contrasts` replacement function and whose names are the names of columns of `data` containing `factor`s.
`xlev`	to be used as argument of `model.frame` if `data` has no `"terms"` attribute.
`keep.order`	a logical value indicating whether the terms should keep their positions. If `FALSE` the terms are reordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on. Effects of a given order are kept in the order specified.
`drop.baseline`	Drop the base level of categorical Xs, default is TRUE.
`...`	further arguments passed to or from other methods.

Details

model.matrixBayes is adapted from model.matrix in the stats pacakge and is designed for the use of bayesglm. It is designed to keep baseline levels of all categorical varaibles and keep the variable names unodered in the output. The design matrices created by model.matrixBayes are unidentifiable using classical regression methods, though; they can be identified using bayesglm.

Author(s)

Yu-Sung Su [email protected]

References

Examples

ff <- log(Volume) ~ log(Height) + log(Girth)
str(m <- model.frame(ff, trees))
(model.matrix(ff, m))
class(ff) <- c("bayesglm", "terms", "formula")
(model.matrixBayes(ff, m))
ff <- log(Volume) ~ log(Height) + log(Girth)
str(m <- model.frame(ff, trees))
(model.matrix(ff, m))
class(ff) <- c("bayesglm", "terms", "formula")
(model.matrixBayes(ff, m))

Multiple Comparison Plot

Description

Plots significant difference of simulated array.

Usage

multicomp.plot(object, alpha = 0.05, main = "Multiple Comparison Plot",
  label = NULL, shortlabel = NULL, show.pvalue = FALSE,
  label.as.shortlabel = FALSE, label.on.which.axis = 3,
  col.low = "lightsteelblue", col.same = "white", col.high = "lightslateblue",
  vertical.line = TRUE, horizontal.line = FALSE,
  vertical.line.lty = 1, horizontal.line.lty = 1, mar=c(3.5,3.5,3.5,3.5))
multicomp.plot(object, alpha = 0.05, main = "Multiple Comparison Plot",
  label = NULL, shortlabel = NULL, show.pvalue = FALSE,
  label.as.shortlabel = FALSE, label.on.which.axis = 3,
  col.low = "lightsteelblue", col.same = "white", col.high = "lightslateblue",
  vertical.line = TRUE, horizontal.line = FALSE,
  vertical.line.lty = 1, horizontal.line.lty = 1, mar=c(3.5,3.5,3.5,3.5))

Arguments

`object`	Simulated array of coefficients, columns being different variables and rows being simulated result.
`alpha`	Level of significance to compare.
`main`	Main label.
`label`	Labels for simulated parameters.
`shortlabel`	Short labels to put into the plot.
`show.pvalue`	Default is FALSE, if set to TRUE replaces short label with Bayesian p value.
`label.as.shortlabel`	Default is FALSE, if set to TRUE takes first 2 character of label and use it as short label.
`label.on.which.axis`	default is the 3rd (top) axis.
`col.low`	Color of significantly low coefficients.
`col.same`	Color of not significant difference.
`col.high`	Color of significantly high coefficients.
`vertical.line`	Default is TRUE, if set to FALSE does not draw vertical line.
`horizontal.line`	Default is FALSE, if set to TRUE draws horizontal line.
`vertical.line.lty`	Line type of vertical line.
`horizontal.line.lty`	Line type of horizontal line.
`mar`	A numerical vector of the form `c(bottom, left, top, right)` which gives the number of lines of margin to be specified on the four sides of the plot. The default is `c(3.5,3.5,3.5,3.5)`.

Value

`pvalue`	Array of Bayesian p value.
`significant`	Array of significance.

Author(s)

Masanao Yajima [email protected], Andrew Gelman [email protected]

References

Andrew Gelman and Jennifer Hill. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Examples

old.par <- par(no.readonly = TRUE)

# example 1
simulation.array <- data.frame(coef1=rnorm(100,10,2), coef2=rnorm(100,5,2),
                      coef3=rnorm(100,0,1), coef4=rnorm(100,-5,3),
                      coef5=rnorm(100,-2,1))
short.lab <- c("c01", "c02", "c03", "c04", "c05")
multicomp.plot(simulation.array[,1:4], label.as.shortlabel=TRUE)

# wraper for multicomp.plot
mcplot(simulation.array, shortlabel = short.lab)

# example 2
data(lalonde)
M1 <- lm(re78 ~ treat + re74 + re75 + age + educ + u74 + u75, data=lalonde)
M1.sim <- sim(M1)
lm.sim <- coef(M1.sim)[,-1]
multicomp.plot(lm.sim, label.as.shortlabel=TRUE, label.on.which.axis=2)

par(old.par)
old.par <- par(no.readonly = TRUE)

# example 1
simulation.array <- data.frame(coef1=rnorm(100,10,2), coef2=rnorm(100,5,2),
                      coef3=rnorm(100,0,1), coef4=rnorm(100,-5,3),
                      coef5=rnorm(100,-2,1))
short.lab <- c("c01", "c02", "c03", "c04", "c05")
multicomp.plot(simulation.array[,1:4], label.as.shortlabel=TRUE)

# wraper for multicomp.plot
mcplot(simulation.array, shortlabel = short.lab)

# example 2
data(lalonde)
M1 <- lm(re78 ~ treat + re74 + re75 + age + educ + u74 + u75, data=lalonde)
M1.sim <- sim(M1)
lm.sim <- coef(M1.sim)[,-1]
multicomp.plot(lm.sim, label.as.shortlabel=TRUE, label.on.which.axis=2)

par(old.par)

Function to read data by columns

Description

A function read data by columns

Usage

read.columns(filename, columns)
read.columns(filename, columns)

Arguments

`filename`	user specified file name including path of the file
`columns`	which columns of the data to be read

Author(s)

Andrew Gelman [email protected]

Function for Standardizing by Centering and Dividing by 2 sd's

Description

This function standardizes a variable by centering and dividing by 2 sd's with exceptions for binary variables.

Usage

rescale(x, binary.inputs="center")
rescale(x, binary.inputs="center")

Arguments

`x`	a vector
`binary.inputs`	options for standardizing binary variables, default is `center`; `0/1` keeps original scale; `-0.5,0.5` rescales 0 as -0.5 and 1 as 0.5; `center` substracts the mean; and `full` substracts the mean and divids by 2 sd.

Value

the standardized vector

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]

References

Andrew Gelman. (2008). “Scaling regression inputs by dividing by two standard deviations”. Statistics in Medicine 27: 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf

Examples

  # Set up the fake data
  n <- 100
  x <- rnorm (n, 2, 1)
  x1 <- rnorm (n)
  x1 <- (x1-mean(x1))/(2*sd(x1))   # standardization
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
  rescale(x, "full")
  rescale(y, "center")
# Set up the fake data
  n <- 100
  x <- rnorm (n, 2, 1)
  x1 <- rnorm (n)
  x1 <- (x1-mean(x1))/(2*sd(x1))   # standardization
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
  rescale(x, "full")
  rescale(y, "center")

residual plot for the observed values

Description

Plots the residual of observed variable.

Usage

residual.plot(Expected, Residuals, sigma, main = deparse(substitute(Expected)), 
  col.pts = "blue", col.ctr = "red", col.sgm = "black", cex = 0.5, 
  gray.scale = FALSE, xlab = "Predicted", ylab = "Residuals", ...)
residual.plot(Expected, Residuals, sigma, main = deparse(substitute(Expected)), 
  col.pts = "blue", col.ctr = "red", col.sgm = "black", cex = 0.5, 
  gray.scale = FALSE, xlab = "Predicted", ylab = "Residuals", ...)

Arguments

`Expected`	Expected value.
`Residuals`	Residual value.
`sigma`	Standard error.
`main`	main for the plot. See `plot` for detail.
`col.pts`	Color of the points.
`col.ctr`	Color of the line at zero.
`col.sgm`	Color of standard error line.
`cex`	A numerical value giving the amount by which plotting text and symbols should be magnified relative to the default. See par for detail.
`gray.scale`	If `TRUE`, makes the plot into black and white. This option overwrites the color specification. Default is FALSE.
`xlab`	Label for x axis.
`ylab`	Label for y axis.
`...`	Additional parameters passed to `plot` function.

Value

Plot to visualize pattern of residulal value for the expected value.

Author(s)

Masanao Yajima [email protected], M.Grazia Pittau [email protected]

Examples

old.par <- par(no.readonly = TRUE)

x <- rnorm(100)
y <- rnorm(100)
fit <- lm(y~x)
y.hat <- fitted(fit)
u <- resid(fit)
sigma <- sigma.hat(fit)
residual.plot(y.hat, u, sigma)

par(old.par)
old.par <- par(no.readonly = TRUE)

x <- rnorm(100)
y <- rnorm(100)
fit <- lm(y~x)
y.hat <- fitted(fit)
u <- resid(fit)
sigma <- sigma.hat(fit)
residual.plot(y.hat, u, sigma)

par(old.par)

Extract Standard Errors of Model Coefficients

Description

These functions extract standard errors of model coefficients from objects returned by modeling functions.

Usage

se.coef (object, ...)
se.fixef (object)
se.ranef (object)

## S4 method for signature 'lm'
se.coef(object)
## S4 method for signature 'glm'
se.coef(object)
## S4 method for signature 'merMod'
se.coef(object)
se.coef (object, ...)
se.fixef (object)
se.ranef (object)

## S4 method for signature 'lm'
se.coef(object)
## S4 method for signature 'glm'
se.coef(object)
## S4 method for signature 'merMod'
se.coef(object)

Arguments

`object`	object of `lm`, `glm` and `merMod` fit
`...`	other arguments

Details

se.coef extracts standard errors from objects returned by modeling functions. se.fixef extracts standard errors of the fixed effects from objects returned by lmer and glmer functions. se.ranef extracts standard errors of the random effects from objects returned by lmer and glmer functions.

Value

se.coef gives lists of standard errors for coef, se.fixef gives a vector of standard errors for fixef and se.ranef gives a list of standard errors for ranef.

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]

References

Andrew Gelman and Jennifer Hill. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Examples

# Here's a simple example of a model of the form, y = a + bx + error,
# with 10 observations in each of 10 groups, and with both the
# intercept and the slope varying by group.  First we set up the model and data.

   group <- rep(1:10, rep(10,10))
   mu.a <- 0
   sigma.a <- 2
   mu.b <- 3
   sigma.b <- 4
   rho <- 0
   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b,
                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
   sigma.y <- 1
   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
   a <- ab[,1]
   b <- ab[,2]
#
   x <- rnorm (100)
   y1 <- rnorm (100, a[group] + b[group]*x, sigma.y)
   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))

#  lm fit
   M1 <- lm (y1 ~ x)
   se.coef (M1)

#  glm fit
   M2 <- glm (y2 ~ x)
   se.coef (M2)

#  lmer fit
   M3 <- lmer (y1 ~ x + (1 + x |group))
   se.coef (M3)
   se.fixef (M3)
   se.ranef (M3)

#  glmer fit
   M4 <- glmer (y2 ~ 1 + (0 + x |group), family=binomial(link="logit"))
   se.coef (M4)
   se.fixef (M4)
   se.ranef (M4)
# Here's a simple example of a model of the form, y = a + bx + error,
# with 10 observations in each of 10 groups, and with both the
# intercept and the slope varying by group.  First we set up the model and data.

   group <- rep(1:10, rep(10,10))
   mu.a <- 0
   sigma.a <- 2
   mu.b <- 3
   sigma.b <- 4
   rho <- 0
   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b,
                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
   sigma.y <- 1
   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
   a <- ab[,1]
   b <- ab[,2]
#
   x <- rnorm (100)
   y1 <- rnorm (100, a[group] + b[group]*x, sigma.y)
   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))

#  lm fit
   M1 <- lm (y1 ~ x)
   se.coef (M1)

#  glm fit
   M2 <- glm (y2 ~ x)
   se.coef (M2)

#  lmer fit
   M3 <- lmer (y1 ~ x + (1 + x |group))
   se.coef (M3)
   se.fixef (M3)
   se.ranef (M3)

#  glmer fit
   M4 <- glmer (y2 ~ 1 + (0 + x |group), family=binomial(link="logit"))
   se.coef (M4)
   se.fixef (M4)
   se.ranef (M4)

Extract Residual Errors

Description

This generic function extracts residual errors from a fitted model.

Usage

sigma.hat(object,...)

## S3 method for class 'lm'
sigma.hat(object,...)
## S3 method for class 'glm'
sigma.hat(object,...)
## S3 method for class 'merMod'
sigma.hat(object,...)
## S3 method for class 'sim'
sigma.hat(object,...)
## S3 method for class 'sim.merMod'
sigma.hat(object,...)

sigma.hat(object,...)

## S3 method for class 'lm'
sigma.hat(object,...)
## S3 method for class 'glm'
sigma.hat(object,...)
## S3 method for class 'merMod'
sigma.hat(object,...)
## S3 method for class 'sim'
sigma.hat(object,...)
## S3 method for class 'sim.merMod'
sigma.hat(object,...)

Arguments

`object`	any fitted model object of `lm`, `glm` and `merMod` class
`...`	other arguments

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]

Examples

   group <- rep(1:10, rep(10,10))
   mu.a <- 0
   sigma.a <- 2
   mu.b <- 3
   sigma.b <- 4
   rho <- 0
   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b,
                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
   sigma.y <- 1
   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
   a <- ab[,1]
   b <- ab[,2]

   x <- rnorm (100)
   y1 <- rnorm (100, a[group] + b[group]*x, sigma.y)
   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))

   M1 <- lm (y1 ~ x)
   sigma.hat(M1)

   M2 <- bayesglm (y1 ~ x, prior.scale=Inf, prior.df=Inf)
   sigma.hat(M2) # should be same to sigma.hat(M1)

   M3 <- glm (y2 ~ x, family=binomial(link="logit"))
   sigma.hat(M3)

   M4 <- lmer (y1 ~ (1+x|group))
   sigma.hat(M4)

   M5 <- glmer (y2 ~ (1+x|group), family=binomial(link="logit"))
   sigma.hat(M5)

group <- rep(1:10, rep(10,10))
   mu.a <- 0
   sigma.a <- 2
   mu.b <- 3
   sigma.b <- 4
   rho <- 0
   Sigma.ab <- array (c(sigma.a^2, rho*sigma.a*sigma.b,
                    rho*sigma.a*sigma.b, sigma.b^2), c(2,2))
   sigma.y <- 1
   ab <- mvrnorm (10, c(mu.a,mu.b), Sigma.ab)
   a <- ab[,1]
   b <- ab[,2]

   x <- rnorm (100)
   y1 <- rnorm (100, a[group] + b[group]*x, sigma.y)
   y2 <- rbinom(100, 1, prob=invlogit(a[group] + b*x))

   M1 <- lm (y1 ~ x)
   sigma.hat(M1)

   M2 <- bayesglm (y1 ~ x, prior.scale=Inf, prior.df=Inf)
   sigma.hat(M2) # should be same to sigma.hat(M1)

   M3 <- glm (y2 ~ x, family=binomial(link="logit"))
   sigma.hat(M3)

   M4 <- lmer (y1 ~ (1+x|group))
   sigma.hat(M4)

   M5 <- glmer (y2 ~ (1+x|group), family=binomial(link="logit"))
   sigma.hat(M5)

Functions to Get Posterior Distributions

Description

This generic function gets posterior simulations of sigma and beta from a lm object, or simulations of beta from a glm object, or simulations of beta from a merMod object

Usage

sim(object, ...)

## S4 method for signature 'lm'
sim(object, n.sims = 100)
## S4 method for signature 'glm'
sim(object, n.sims = 100)
## S4 method for signature 'polr'
sim(object, n.sims = 100)
## S4 method for signature 'merMod'
sim(object, n.sims = 100)

## S3 method for class 'sim'
coef(object,...)
## S3 method for class 'sim.polr'
coef(object, slot=c("ALL", "coef", "zeta"),...)
## S3 method for class 'sim.merMod'
coef(object,...)
## S3 method for class 'sim.merMod'
fixef(object,...)
## S3 method for class 'sim.merMod'
ranef(object,...)
## S3 method for class 'sim.merMod'
fitted(object, regression,...)


sim(object, ...)

## S4 method for signature 'lm'
sim(object, n.sims = 100)
## S4 method for signature 'glm'
sim(object, n.sims = 100)
## S4 method for signature 'polr'
sim(object, n.sims = 100)
## S4 method for signature 'merMod'
sim(object, n.sims = 100)

## S3 method for class 'sim'
coef(object,...)
## S3 method for class 'sim.polr'
coef(object, slot=c("ALL", "coef", "zeta"),...)
## S3 method for class 'sim.merMod'
coef(object,...)
## S3 method for class 'sim.merMod'
fixef(object,...)
## S3 method for class 'sim.merMod'
ranef(object,...)
## S3 method for class 'sim.merMod'
fitted(object, regression,...)

Arguments

`object`	the output of a call to `lm` with n data points and k predictors.
`slot`	return which slot of `sim.polr`, available options are `coef, zeta, ALL`.
`...`	further arguments passed to or from other methods.
`n.sims`	number of independent simulation draws to create.
`regression`	the orginial mer model

Value

`coef`	matrix (dimensions n.sims x k) of n.sims random draws of coefficients.
`zeta`	matrix (dimensions n.sims x k) of n.sims random draws of zetas (cut points in polr).
`fixef`	matrix (dimensions n.sims x k) of n.sims random draws of coefficients of the fixed effects for the `merMod` objects. Previously, it is called `unmodeled`.
`sigma`	vector of n.sims random draws of sigma (for `glm`'s, this just returns a vector of 1's or else of the square root of the overdispersion parameter if that is in the model)

Author(s)

Andrew Gelman [email protected]; Yu-Sung Su [email protected]; Vincent Dorie [email protected]

References

Andrew Gelman and Jennifer Hill. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Examples

#Examples of "sim"
 set.seed (1)
 J <- 15
 n <- J*(J+1)/2
 group <- rep (1:J, 1:J)
 mu.a <- 5
 sigma.a <- 2
 a <- rnorm (J, mu.a, sigma.a)
 b <- -3
 x <- rnorm (n, 2, 1)
 sigma.y <- 6
 y <- rnorm (n, a[group] + b*x, sigma.y)
 u <- runif (J, 0, 3)
 y123.dat <- cbind (y, x, group)
# Linear regression
 x1 <- y123.dat[,2]
 y1 <- y123.dat[,1]
 M1 <- lm (y1 ~ x1)
 display(M1)
 M1.sim <- sim(M1)
 coef.M1.sim <- coef(M1.sim)
 sigma.M1.sim <- sigma.hat(M1.sim)
 ## to get the uncertainty for the simulated estimates
 apply(coef(M1.sim), 2, quantile)
 quantile(sigma.hat(M1.sim))

# Logistic regression
 u.data <- cbind (1:J, u)
 dimnames(u.data)[[2]] <- c("group", "u")
 u.dat <- as.data.frame (u.data)
 y <- rbinom (n, 1, invlogit (a[group] + b*x))
 M2 <- glm (y ~ x, family=binomial(link="logit"))
 display(M2)
 M2.sim <- sim (M2)
 coef.M2.sim <- coef(M2.sim)
 sigma.M2.sim <- sigma.hat(M2.sim)

# Ordered Logistic regression
house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
display(house.plr)
M.plr <- sim(house.plr)
coef.sim <- coef(M.plr, slot="coef")
zeta.sim <- coef(M.plr, slot="zeta")
coefall.sim <- coef(M.plr)

# Using lmer:
# Example 1
 E1 <- lmer (y ~ x + (1 | group))
 display(E1)
 E1.sim <- sim (E1)
 coef.E1.sim <- coef(E1.sim)
 fixef.E1.sim <- fixef(E1.sim)
 ranef.E1.sim <- ranef(E1.sim)
 sigma.E1.sim <- sigma.hat(E1.sim)
 yhat <- fitted(E1.sim, E1)

# Example 2
 u.full <- u[group]
 E2 <- lmer (y ~ x + u.full + (1 | group))
 display(E2)
 E2.sim <- sim (E2)
 coef.E2.sim <- coef(E2.sim)
 fixef.E2.sim <- fixef(E2.sim)
 ranef.E2.sim <- ranef(E2.sim)
 sigma.E2.sim <- sigma.hat(E2.sim)
 yhat <- fitted(E2.sim, E2)

# Example 3
 y <- rbinom (n, 1, invlogit (a[group] + b*x))
 E3 <- glmer (y ~ x + (1 | group), family=binomial(link="logit"))
 display(E3)
 E3.sim <- sim (E3)
 coef.E3.sim <- coef(E3.sim)
 fixef.E3.sim <- fixef(E3.sim)
 ranef.E3.sim <- ranef(E3.sim)
 sigma.E3.sim <- sigma.hat(E3.sim)
 yhat <- fitted(E3.sim, E3)
#Examples of "sim"
 set.seed (1)
 J <- 15
 n <- J*(J+1)/2
 group <- rep (1:J, 1:J)
 mu.a <- 5
 sigma.a <- 2
 a <- rnorm (J, mu.a, sigma.a)
 b <- -3
 x <- rnorm (n, 2, 1)
 sigma.y <- 6
 y <- rnorm (n, a[group] + b*x, sigma.y)
 u <- runif (J, 0, 3)
 y123.dat <- cbind (y, x, group)
# Linear regression
 x1 <- y123.dat[,2]
 y1 <- y123.dat[,1]
 M1 <- lm (y1 ~ x1)
 display(M1)
 M1.sim <- sim(M1)
 coef.M1.sim <- coef(M1.sim)
 sigma.M1.sim <- sigma.hat(M1.sim)
 ## to get the uncertainty for the simulated estimates
 apply(coef(M1.sim), 2, quantile)
 quantile(sigma.hat(M1.sim))

# Logistic regression
 u.data <- cbind (1:J, u)
 dimnames(u.data)[[2]] <- c("group", "u")
 u.dat <- as.data.frame (u.data)
 y <- rbinom (n, 1, invlogit (a[group] + b*x))
 M2 <- glm (y ~ x, family=binomial(link="logit"))
 display(M2)
 M2.sim <- sim (M2)
 coef.M2.sim <- coef(M2.sim)
 sigma.M2.sim <- sigma.hat(M2.sim)

# Ordered Logistic regression
house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
display(house.plr)
M.plr <- sim(house.plr)
coef.sim <- coef(M.plr, slot="coef")
zeta.sim <- coef(M.plr, slot="zeta")
coefall.sim <- coef(M.plr)

# Using lmer:
# Example 1
 E1 <- lmer (y ~ x + (1 | group))
 display(E1)
 E1.sim <- sim (E1)
 coef.E1.sim <- coef(E1.sim)
 fixef.E1.sim <- fixef(E1.sim)
 ranef.E1.sim <- ranef(E1.sim)
 sigma.E1.sim <- sigma.hat(E1.sim)
 yhat <- fitted(E1.sim, E1)

# Example 2
 u.full <- u[group]
 E2 <- lmer (y ~ x + u.full + (1 | group))
 display(E2)
 E2.sim <- sim (E2)
 coef.E2.sim <- coef(E2.sim)
 fixef.E2.sim <- fixef(E2.sim)
 ranef.E2.sim <- ranef(E2.sim)
 sigma.E2.sim <- sigma.hat(E2.sim)
 yhat <- fitted(E2.sim, E2)

# Example 3
 y <- rbinom (n, 1, invlogit (a[group] + b*x))
 E3 <- glmer (y ~ x + (1 | group), family=binomial(link="logit"))
 display(E3)
 E3.sim <- sim (E3)
 coef.E3.sim <- coef(E3.sim)
 fixef.E3.sim <- fixef(E3.sim)
 ranef.E3.sim <- ranef(E3.sim)
 sigma.E3.sim <- sigma.hat(E3.sim)
 yhat <- fitted(E3.sim, E3)

Function for Standardizing Regression Predictors by Centering and Dividing by 2 sd's

Description

Numeric variables that take on more than two values are each rescaled to have a mean of 0 and a sd of 0.5; Binary variables are rescaled to have a mean of 0 and a difference of 1 between their two categories; Non-numeric variables that take on more than two values are unchanged; Variables that take on only one value are unchanged

Usage

## S4 method for signature 'lm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'glm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'merMod'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'polr'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'lm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'glm'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'merMod'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")
## S4 method for signature 'polr'
standardize(object, unchanged = NULL, 
    standardize.y = FALSE, binary.inputs = "center")

Arguments

`object`	an object of class `lm` or `glm`
`unchanged`	vector of names of parameters to leave unstandardized
`standardize.y`	if TRUE, the outcome variable is standardized also
`binary.inputs`	options for standardizing binary variables

Details

"0/1" (rescale so that the lower value is 0 and the upper is 1) "-0.5/0.5" (rescale so that the lower value is -0.5 and upper is 0.5) "center" (rescale so that the mean of the data is 0 and the difference between the two categories is 1) "full" (rescale by subtracting the mean and dividing by 2 sd's) "leave.alone" (do nothing)

Author(s)

Andrew Gelman [email protected] Yu-Sung Su [email protected]

References

Andrew Gelman. (2008). “Scaling regression inputs by dividing by two standard deviations.” Statistics in Medicine 27: 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf

Examples

  # Set up the fake data
  n <- 100
  x <- rnorm (n, 2, 1)
  x1 <- rnorm (n)
  x1 <- (x1-mean(x1))/(2*sd(x1))   # standardization
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
  y2 <- sample(1:5, n, replace=TRUE)
  M1 <- glm (y ~ x, family=binomial(link="logit"))
  display(M1)
  M1.1 <- glm (y ~ rescale(x), family=binomial(link="logit"))
  display(M1.1)
  M1.2 <- standardize(M1)
  display(M1.2)
  # M1.1 & M1.2 should be the same
  M2 <- polr(ordered(y2) ~ x)
  display(M2)
  M2.1 <- polr(ordered(y2) ~ rescale(x))
  display(M2.1)
  M2.2 <- standardize(M2.1)
  display(M2.2)
  # M2.1 & M2.2 should be the same
# Set up the fake data
  n <- 100
  x <- rnorm (n, 2, 1)
  x1 <- rnorm (n)
  x1 <- (x1-mean(x1))/(2*sd(x1))   # standardization
  x2 <- rbinom (n, 1, .5)
  b0 <- 1
  b1 <- 1.5
  b2 <- 2
  y <- rbinom (n, 1, invlogit(b0+b1*x1+b2*x2))
  y2 <- sample(1:5, n, replace=TRUE)
  M1 <- glm (y ~ x, family=binomial(link="logit"))
  display(M1)
  M1.1 <- glm (y ~ rescale(x), family=binomial(link="logit"))
  display(M1.1)
  M1.2 <- standardize(M1)
  display(M1.2)
  # M1.1 & M1.2 should be the same
  M2 <- polr(ordered(y2) ~ x)
  display(M2)
  M2.1 <- polr(ordered(y2) ~ rescale(x))
  display(M2.1)
  M2.2 <- standardize(M2.1)
  display(M2.2)
  # M2.1 & M2.2 should be the same

Trace plot of ‘bugs’ object

Description

Displays a plot of iterations vs. sampled values for each variable in the chain, with a separate plot per variable.

Usage

## S4 method for signature 'bugs'
traceplot( x, mfrow = c( 1, 1 ), varname = NULL,
  match.head = TRUE, ask = TRUE,
  col = rainbow( x$n.chains ),
  lty = 1, lwd = 1, ...)
## S4 method for signature 'bugs'
traceplot( x, mfrow = c( 1, 1 ), varname = NULL,
  match.head = TRUE, ask = TRUE,
  col = rainbow( x$n.chains ),
  lty = 1, lwd = 1, ...)

Arguments

`x`	A bugs object
`mfrow`	graphical parameter (see `par`)
`varname`	vector of variable names to plot
`match.head`	matches the variable names by the beginning of the variable names in bugs object
`ask`	logical; if `TRUE`, the user is asked before each plot, see `par(ask=.)`.
`col`	graphical parameter (see `par`)
`lty`	graphical parameter (see `par`)
`lwd`	graphical parameter (see `par`)
`...`	further graphical parameters

Author(s)

Masanao Yajima [email protected]. Yu-Sung Su [email protected]

Triangle Plot

Description

Function for making a triangle plot from a square matrix

Usage

triangleplot (x, y=NULL, cutpts=NULL, details=TRUE, 
            n.col.legend=5, cex.col=0.7, 
            cex.var=0.9, digits=1, color=FALSE)
triangleplot (x, y=NULL, cutpts=NULL, details=TRUE, 
            n.col.legend=5, cex.col=0.7, 
            cex.var=0.9, digits=1, color=FALSE)

Arguments

`x`	a square matrix.
`y`	a vector of names that corresponds to each element of the square matrix x.
`cutpts`	a vector of cutting points for color legend, default is `NULL`. The function will decide the cutting points if cutpts is not assigned.
`details`	show more than one digits correlaton values. Default is `TRUE`. `FALSE` is suggested to get readable output.
`n.col.legend`	number of legend for the color thermometer
`cex.col`	font size of the color thermometer.
`cex.var`	font size of the variable names.
`digits`	number of digits shown in the text of the color theromoeter.
`color`	color of the plot, default is FALSE, which uses gray scale.

Details

The function makes a triangle plot from a square matrix, e.g., the correlation plot, see corrplot. If a square matrix contains missing values, the cells of missing values will be marked x.

Author(s)

Yu-Sung Su [email protected]

Examples

old.par <- par(no.readonly = TRUE)

 # create a square matrix
 x <- matrix(runif(1600, 0, 1), 40, 40)
 
 # fig 1
 triangleplot(x)
 
 # fig 2 assign cutting points
 triangleplot(x, cutpts=c(0,0.25,0.5,0.75,1), digits=2)
 
 # fig 3 if x contains missing value
 x[12,13] <- x[13,12] <- NA
 x[25,27] <- x[27,25] <- NA
 triangleplot(x)
 
par(old.par)

#
#library(RColorBrewer)
#cormat <-  cor(iris[,-5])
#triangleplot2(cormat,color = brewer.pal( 5, "RdBu" ),
# n.col.legend=5, cex.col=0.7, cex.var=0.5)

old.par <- par(no.readonly = TRUE)

 # create a square matrix
 x <- matrix(runif(1600, 0, 1), 40, 40)
 
 # fig 1
 triangleplot(x)
 
 # fig 2 assign cutting points
 triangleplot(x, cutpts=c(0,0.25,0.5,0.75,1), digits=2)
 
 # fig 3 if x contains missing value
 x[12,13] <- x[13,12] <- NA
 x[25,27] <- x[27,25] <- NA
 triangleplot(x)
 
par(old.par)

#
#library(RColorBrewer)
#cormat <-  cor(iris[,-5])
#triangleplot2(cormat,color = brewer.pal( 5, "RdBu" ),
# n.col.legend=5, cex.col=0.7, cex.var=0.5)

Package 'arm'

Help Index

Functions to compute the balance statistics

Description

Usage

Arguments

Details

Note

Author(s)

References

See Also

Examples

Bayesian generalized linear models.

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Bayesian Ordered Logistic or Probit Regression

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

Binned Residual Plot

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Examples

Generic Function for Making Coefficient Plot

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Contrast Matrices

Description

Usage

Arguments

Details

Author(s)

See Also

Examples

Correlation Plot

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Histogram for Discrete Distributions

Description

Usage

Arguments

Details

Examples

Functions for Processing lm, glm, mer, polr and svyglm Output

Description

Usage

Arguments