Standard error of regression coefficient without raw data

Is it possible to derive the standard error of a regression coefficient from summary data alone?

E.g., assume we are given the following variance-covariance matrix:

$$\begin{bmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X,Y) \\ \mathrm{Cov}(X,Y) & \mathrm{Var}(Y) \end{bmatrix}$$

We can derive the regression coefficient

$$\beta_{XY} = \mathrm{Cov}(X,Y)/\mathrm{Var}(X).$$
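As a toy illustration (the numbers below are made up, and Python is used purely for illustration), the slope follows directly from two entries of that matrix:

```python
# Made-up summary statistics, for illustration only.
var_x = 4.0   # Var(X)
cov_xy = 3.0  # Cov(X,Y)

# Slope of the regression of Y on X, from summary data alone:
beta = cov_xy / var_x
print(beta)  # -> 0.75
```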

Given a specific $n$, is it possible to derive the standard error of $\beta$ as well? If so, which formula is being used? It appears that all the formulas for regression standard errors that I could find assume that you know the variance of the residuals of the regression, which we don't know from summary data alone,

where I know that

$$\sigma^2(b) = \sigma^2 (X'X)^{-1} = \sigma^2 \big(\mathrm{Var}(X)(n-1)\big)^{-1},$$

but the typical matrix formulas for the standard error of $b$ then note that $\sigma^2$, the unknown variance of the errors, is estimated with the MSE.

Based on comments by whuber, the residual sum of squares is

$$Y'Y - bX'Y,$$

which according to Greg can be spelled out as

$$\mathrm{Var}(Y)(n-1) - b\,\mathrm{Cov}(X,Y)(n-1).$$
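As a sanity check on that identity (made-up data, Python used purely for illustration): the residual sum of squares computed directly from the fitted line should match $\mathrm{Var}(Y)(n-1) - b\,\mathrm{Cov}(X,Y)(n-1)$.

```python
# Small made-up data set; nothing here comes from the question itself.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)                     # Var(X) * (n-1)
syy = sum((yi - my) ** 2 for yi in y)                     # Var(Y) * (n-1)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # Cov(X,Y) * (n-1)

b = sxy / sxx    # slope
a = my - b * mx  # intercept

# Residual sum of squares, computed two ways:
rss_direct = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
rss_summary = syy - b * sxy

assert abs(rss_direct - rss_summary) < 1e-12
```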

And to go from here to the MSE, I think (?) all that is left is to divide by $n-2$, so

$$\mathrm{MSE} = \big(\mathrm{Var}(Y)(n-1) - b\,\mathrm{Cov}(X,Y)(n-1)\big)/(n-2).$$

And plugging this all in yields:

$$\sigma^2(b) = \Big(\big(\mathrm{Var}(Y)(n-1) - b\,\mathrm{Cov}(X,Y)(n-1)\big)/(n-2)\Big) \times \big(\mathrm{Var}(X)(n-1)\big)^{-1}$$
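Putting the whole chain together on a small made-up data set (Python, just to check the algebra): the standard error obtained from the summary statistics alone should agree with the textbook standard error computed from the raw residuals.

```python
# Made-up raw data; we only use it to produce the summary statistics
# (Var, Cov, n) and, separately, a reference answer from the residuals.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.1, 1.8, 2.9, 3.6, 5.2, 5.9]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
var_x = sum((xi - mx) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - my) ** 2 for yi in y) / (n - 1)
cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

# Standard error from the summary statistics alone:
b = cov_xy / var_x
mse = (var_y * (n - 1) - b * cov_xy * (n - 1)) / (n - 2)
se_summary = (mse / (var_x * (n - 1))) ** 0.5

# Reference: the textbook standard error from the raw residuals.
a = my - b * mx
rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_raw = (rss / (n - 2) / (var_x * (n - 1))) ** 0.5

assert abs(se_summary - se_raw) < 1e-12
```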

I will try to see if this works out in R, and will report back. If anyone sees something blatantly wrong, I'd appreciate the heads up.

The R code below works only from the data summaries `m` and `v` (together with `n` and `p`) and checks the results against `lm`:
n <- 24
p <- 3
beta <- seq(-p, p, length.out=p)  # The model
set.seed(17)
x <- matrix(rnorm(n*p), ncol=p) # Independent variables
y <- x %*% beta + rnorm(n)      # Dependent variable plus error
#
# Compute the first and second order data summaries.
#
m <- rep(0, p+1)                # Default means
m <- colMeans(cbind(x,y))       # If means are available--comment out otherwise
v <- cov(cbind(x,y))            # All variances and covariances
# 
# From this point on, only the summaries `m` and `v` are used for the calculations
# (along with `n` and `p`, of course).
#
m <- m * n                      # Compute column sums
v <- v * (n-1)                  # Recover centered sums of squares and products
v <- v + outer(m, m)/n          # Adjust to obtain the sums of squares
v <- rbind(c(n, m), cbind(m, v))# Border with the sums and the data count
xx <- v[-(p+2), -(p+2)]         # Extract X'X
xy <- v[-(p+2), p+2]            # Extract X'Y
yy <- v[p+2, p+2]               # Extract Y'Y
b <- solve(xx, xy)              # Compute the coefficient estimates
s2 <- (yy - b %*% xy) / (n-p-1) # Compute the residual variance estimate
#
# Compare to `lm`.
#
fit <- summary(lm(y ~ x))
(rbind(Correct=coef(fit)[, "Estimate"], From.summary=b))    # Coeff. estimates
(c(Correct=fit$sigma, From.summary=sqrt(s2)))               # Residual SE
#
# The SE of the intercept will be incorrect unless true means are provided.
#
se <- sqrt(diag(solve(xx) * c(s2))) # Remove `diag` to compute the full var-covar matrix
(rbind(Correct=coef(fit)[, "Std. Error"], From.summary=se)) # Coeff. SEs




