Wednesday, September 30, 2009

Homework assignment, due Tuesday, Oct. 6

1. Exercise 4.1: Logarithmic transformation and regression: consider the following regression:

log(weight) = −3.5 + 2.0 log(height) + error,

with errors that have standard deviation 0.25. Weights are in pounds and heights
are in inches.

(a) Fill in the blanks: approximately 68% of the persons will have weights within
a factor of ______ and ______ of their predicted values from the regression.

(b) Draw the regression line and scatterplot of log(weight) versus log(height) that
make sense and are consistent with the fitted model. Be sure to label the axes
of your graph.
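
For part (a), one way to reason about it: an error standard deviation on the log scale turns into a multiplicative factor on the original scale. The sketch below assumes the errors are roughly normal (so about 68% fall within one standard deviation) and that "log" means the natural logarithm; the variable names are illustrative only.

```python
import math

# Residual standard deviation on the log scale, as given in the exercise.
sigma_log = 0.25

# Assuming natural logs: +/- 1 sd on the log scale corresponds to
# multiplying or dividing the predicted weight by exp(sigma_log).
upper_factor = math.exp(sigma_log)
lower_factor = math.exp(-sigma_log)

print(f"lower factor: {lower_factor:.2f}")
print(f"upper factor: {upper_factor:.2f}")
```

The two factors are reciprocals of each other, which is why the interval is symmetric on the log scale but not on the pound scale.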

2. Exercise 4.2: The folder earnings has data from the Work, Family, and Well-Being Survey. Pull out the data on earnings, sex, height, and weight.

(a) In R, check the dataset and clean any unusually coded data.

(b) Fit a linear regression model predicting earnings from height. What transformation should you perform in order to interpret the intercept from this model as average earnings for people with average height?

(c) Fit some regression models with the goal of predicting earnings from some combination of sex, height, and weight. Be sure to try various transformations and interactions that might make sense. Choose your preferred model and justify.

(d) Interpret all model coefficients.
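
As a hint for part (b), the effect of centering a predictor can be checked on a toy example. The sketch below is in Python rather than R, uses made-up numbers (not the survey data), and fits the line with a small hypothetical helper `fit_line` instead of a library routine.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up illustrative values, NOT taken from the earnings folder.
height = [60, 63, 66, 69, 72, 75]                    # inches
earnings = [18000, 21000, 20000, 26000, 30000, 29000]  # dollars

# Center height at its sample mean before fitting.
height_centered = [h - mean(height) for h in height]

raw_intercept, raw_slope = fit_line(height, earnings)
c_intercept, c_slope = fit_line(height_centered, earnings)

print("raw intercept:     ", round(raw_intercept, 1))
print("centered intercept:", round(c_intercept, 1))
print("mean earnings:     ", round(mean(earnings), 1))
```

In this toy fit the slope is unchanged by centering, while the intercept moves from the (meaningless) prediction at height zero to the mean of the outcome.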

3. Exercise 4.3: Plotting linear and nonlinear regressions: we downloaded data with weight (in pounds) and age (in years) from a random sample of American adults. We first created new variables: age10 = age/10 and age10.sq = (age/10)^2, and indicators age18.29, age30.44, age45.64, and age65up for four age categories. We then fit some regressions, with the following results:

lm(formula = weight ~ age10)
            coef.est coef.se
(Intercept)    161.0     7.3
age10            2.6     1.6
n = 2009, k = 2
residual sd = 119.7, R-Squared = 0.00

lm(formula = weight ~ age10 + age10.sq)
            coef.est coef.se
(Intercept)     96.2    19.3
age10           33.6     8.7
age10.sq        -3.2     0.9
n = 2009, k = 3
residual sd = 119.3, R-Squared = 0.01

lm(formula = weight ~ age30.44 + age45.64 + age65up)
             coef.est coef.se
(Intercept)     157.2     5.4
age30.44TRUE     19.1     7.0
age45.64TRUE     27.2     7.6
age65upTRUE       8.5     8.7
n = 2009, k = 4
residual sd = 119.4, R-Squared = 0.01

(a) On a graph of weights versus age (that is, weight on y-axis, age on x-axis), draw the fitted regression line from the first model.

(b) On the same graph, draw the fitted regression line from the second model.

(c) On another graph with the same axes and scale, draw the fitted regression line from the third model. (It will be discontinuous.)
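
Before drawing, it can help to tabulate the three fitted curves at a few ages. The sketch below plugs the reported coefficients into plain Python functions; the function names are hypothetical, and the category boundaries (30, 45, 65, with ages 18 to 29 as the baseline) are an assumption read off the indicator names.

```python
def linear_fit(age):
    """First model: weight ~ age10."""
    return 161.0 + 2.6 * (age / 10)

def quadratic_fit(age):
    """Second model: weight ~ age10 + age10.sq."""
    age10 = age / 10
    return 96.2 + 33.6 * age10 - 3.2 * age10 ** 2

def step_fit(age):
    """Third model: indicators for age categories (18-29 is the baseline).

    Boundaries are assumed from the variable names age30.44, age45.64, age65up.
    """
    if age >= 65:
        return 157.2 + 8.5
    if age >= 45:
        return 157.2 + 27.2
    if age >= 30:
        return 157.2 + 19.1
    return 157.2

# Fitted weights (in pounds) at a few illustrative ages.
for age in (20, 40, 60, 80):
    print(age,
          round(linear_fit(age), 1),
          round(quadratic_fit(age), 1),
          round(step_fit(age), 1))
```

The step function jumps at each category boundary, which is the discontinuity mentioned in part (c); the quadratic rises and then falls, peaking where 33.6 = 2 × 3.2 × age10.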
