**14.79** Measuring the height of a California redwood tree is very difficult because these trees grow to heights over 300 feet. People familiar with these trees understand that the height of a California redwood tree is related to other characteristics of the tree, including the diameter of the tree at the breast height of a person (in inches) and the thickness of the bark of the tree (in inches). The file Redwood contains the height, diameter at breast height of a person, and bark thickness for a sample of 21 California redwood trees.

a. State the multiple regression equation that predicts the height of a tree, based on the tree’s diameter at breast height and the thickness of the bark.

b. Interpret the meaning of the slopes in this equation.

c. Predict the mean height for a tree that has a breast height diameter of 25 inches and a bark thickness of 2 inches.

d. Interpret the meaning of the coefficient of multiple determination in this problem.

e. Perform a residual analysis on the results and determine whether the regression assumptions are valid.

**14.81** A baseball analytics specialist wants to determine which variables are important in predicting a team’s wins in a given season. He has collected data related to wins, earned run average (ERA), and runs scored for the 2012 season (stored in BB2012). Develop a model to predict the number of wins based on ERA and runs scored.

a. State the multiple regression equation.

b. Interpret the meaning of the slopes in this equation.

c. Predict the mean number of wins for a team that has an ERA of 4.50 and has scored 750 runs.

d. Perform a residual analysis on the results and determine whether the regression assumptions are valid.

e. Is there a significant relationship between the number of wins and the two independent variables (ERA and runs scored) at the 0.05 level of significance?

f. Determine the p-value in (e) and interpret the meaning.

g. Interpret the meaning of the coefficient of multiple determination in this problem.

h. Determine the adjusted r².

i. At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the regression model. Indicate the most appropriate regression model for this set of data.

j. Determine the p-values in (i) and interpret their meaning.

k. Construct a 95% confidence interval estimate of the population slope between wins and ERA.

l. Compute and interpret the coefficients of partial determination.

m. Which is more important in predicting wins: pitching, as measured by ERA, or offense, as measured by runs scored? Explain.

n. Perform an influence analysis on your results and determine whether any observations should be deleted from the model. If necessary, reanalyze the regression model after deleting these observations and compare your results to those of the original model.

**Chapter 15 problems: 15.29, 15.37. For each problem, compute the coefficient of partial determination for each predictor, including its contribution to the overall model, as well as the statistical significance of each subset.**
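One way to compute the coefficient of partial determination for each predictor is to compare the full model with the model that omits that predictor. Since SSR(Xj | others) = SSE(reduced) − SSE(full), the coefficient equals (SSE(reduced) − SSE(full)) / SSE(reduced). The sketch below assumes ordinary least squares with an intercept. The helper names `sse` and `partial_determination` are hypothetical, and the data are made-up placeholders.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from an OLS fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def partial_determination(X, y):
    """For each predictor j, return (partial r2, partial F statistic).

    partial r2 = (SSE_reduced - SSE_full) / SSE_reduced
    partial F  = (SSE_reduced - SSE_full) / (SSE_full / (n - k - 1))
    where the reduced model drops predictor j and keeps the others.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    sse_full = sse(X, y)
    results = []
    for j in range(k):
        sse_reduced = sse(np.delete(X, j, axis=1), y)
        contribution = sse_reduced - sse_full        # SSR(Xj | all others)
        partial_r2 = contribution / sse_reduced
        partial_f = contribution / (sse_full / (n - k - 1))
        results.append((partial_r2, partial_f))
    return results

# Made-up illustrative values (assumption), not a textbook data file.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y_demo = [3, 5, 4, 8, 9, 12]
for j, (r2_j, f_j) in enumerate(partial_determination(X_demo, y_demo), start=1):
    print(f"X{j}: partial r2 = {r2_j:.4f}, partial F = {f_j:.3f}")
```

Each partial F statistic has 1 and n − k − 1 degrees of freedom. Compare it with the critical F value at the chosen level of significance to judge whether that predictor contributes significantly.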

**Time series.**

**15.29** A specialist in baseball analytics has expanded his analysis, presented in problem 14.81 on page 586, of which variables are important in predicting a team’s wins in a given baseball season. He has collected data in BB2012 related to wins, ERA, saves, runs scored, hits allowed, and errors for the 2012 season.

a. Develop the most appropriate multiple regression model to predict a team’s wins. Be sure to include a thorough residual and influence analysis. In addition, provide a detailed explanation of the results.

b. Develop the most appropriate multiple regression model to predict a team’s ERA on the basis of hits allowed, walks allowed, errors, and saves. Be sure to include a thorough residual and influence analysis. In addition, provide a detailed explanation of the results.

**15.37** Financial analysts engage in business valuation to determine a company’s value. A standard approach uses the multiple of earnings method: You multiply a company’s profits by a certain value (average or median) to arrive at a final value. More recently, regression analysis has been demonstrated to consistently deliver more accurate predictions. A valuator has been given the assignment of valuing a drug company. He obtained financial data on 72 drug companies…

Develop the most appropriate multiple regression model to predict the price-to-book value ratio. Be sure to perform thorough residual and influence analysis. In addition, provide a detailed explanation of your results.

**Chapter 16 problems: 16.5, 16.11, 16.13, 16.59.**

**16.5** The following data, stored in Spills, provide the number of oil spills in the Gulf of Mexico from 1996 to 2012.

a. Plot the time series.

b. Fit a three-year moving average to the data and plot the results.

c. Using a smoothing coefficient of W=0.50, exponentially smooth the series and plot the result.

d. What is your exponentially smoothed forecast for 2013?

e. Repeat (c) and (d), using W=0.25.

f. Compare the results of (d) and (e).

g. What conclusions can you reach concerning the number of oil spills in the Gulf of Mexico?
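The two smoothing methods in parts (b) through (e) can be sketched as below. This assumes a centered moving average with an odd window and the usual exponential smoothing recursion E1 = Y1, Ei = W·Yi + (1 − W)·Ei−1, with the last smoothed value serving as the forecast for the next period. The function names are hypothetical, and the counts are made-up placeholders, not the Spills data.

```python
def moving_average(series, window=3):
    """Centered moving average for an odd window.

    Positions without a full window (the first and last window // 2
    entries) are returned as None.
    """
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out

def exponential_smooth(series, w):
    """Exponentially smoothed series: E1 = Y1, Ei = w*Yi + (1 - w)*E(i-1)."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed

# Made-up illustrative counts (assumption): NOT the Spills data.
spills = [5, 7, 4, 6, 9, 8, 3, 5]

print(moving_average(spills, window=3))
for w in (0.50, 0.25):
    smoothed = exponential_smooth(spills, w)
    # The forecast for the next period is the last smoothed value.
    print(f"W = {w}: next-period forecast = {smoothed[-1]:.2f}")
```

A larger W gives more weight to recent observations, so the W = 0.50 series tracks short-term movements more closely than the W = 0.25 series.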

**16.11** The linear trend forecasting equation for an annual time series containing 42 values (from 1972 to 2013) on net sales (in $billions) is

Ŷᵢ = 1.2 + 0.5Xᵢ

a. Interpret the Y intercept, b0.

From the equation, b0 = 1.2 is the Y intercept. It is the fitted trend value of net sales, $1.2 billion, for the base year, 1972.

b. Interpret the slope, b1.

c. What is the fitted trend value for the tenth year?

d. What is the fitted trend value for the most recent year?

e. What is the projected trend forecast two years after the last value?
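Parts (c) through (e) come down to choosing the right coded X value and substituting it into the equation. The sketch below assumes the usual coding in which the base year 1972 has X = 0, so the tenth year (1981) has X = 9, the most recent year (2013) has X = 41, and two years after the last value (2015) has X = 43. The function name `trend_value` is hypothetical.

```python
def trend_value(year, base_year=1972, b0=1.2, b1=0.5):
    """Fitted trend value in $billions, assuming X = year - base_year."""
    x = year - base_year
    return b0 + b1 * x

# Tenth year of the series: 1981, X = 9
print(f"{trend_value(1981):.1f}")   # 5.7
# Most recent year: 2013, X = 41
print(f"{trend_value(2013):.1f}")   # 21.7
# Two years after the last value: 2015, X = 43
print(f"{trend_value(2015):.1f}")   # 22.7
```

If the series were coded with X = 1 for the first year instead, every X value would shift by one and the fitted values would change accordingly, so state the coding used.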

**16.13** Gross domestic product (GDP) is a major indicator of a nation’s overall economic activity. It consists of personal consumption expenditures, gross domestic investment, net exports of goods and services, and government consumption expenditures. The file GDP contains the GDP (in billions of current dollars) for the United States from 1980 to 2012.

a. Plot the data.

b. Compute a linear trend forecasting equation and plot the trend line.

c. What are your forecasts for 2013 and 2014?

d. What conclusions can you reach concerning the trend in GDP?
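For parts (b) and (c), a linear trend can be fitted by least squares on coded years. The sketch below assumes X = 0 for the first year of the series, so that for 1980 to 2012 (33 values) the forecasts for 2013 and 2014 use X = 33 and X = 34. The function names are hypothetical, and the series is a made-up placeholder, not the GDP file.

```python
import numpy as np

def fit_linear_trend(values):
    """Least-squares linear trend on coded time X = 0, 1, ..., n - 1.

    Returns (b0, b1) for the equation Y-hat = b0 + b1 * X.
    """
    y = np.asarray(values, dtype=float)
    x = np.arange(len(y))
    b1, b0 = np.polyfit(x, y, 1)   # polyfit returns the highest power first
    return float(b0), float(b1)

def trend_forecast(b0, b1, n_obs, steps_ahead):
    """Trend forecast steps_ahead periods after the last observation."""
    return b0 + b1 * (n_obs - 1 + steps_ahead)

# Made-up illustrative series (assumption): NOT the GDP data.
series = [2.9, 3.2, 3.3, 3.6, 4.0, 4.3, 4.6, 4.9]

b0, b1 = fit_linear_trend(series)
print(f"Y-hat = {b0:.3f} + {b1:.3f} X")
print(f"one step ahead:  {trend_forecast(b0, b1, len(series), 1):.3f}")
print(f"two steps ahead: {trend_forecast(b0, b1, len(series), 2):.3f}")
```

Plotting the fitted line against the actual values is a quick check on whether a linear trend is adequate or whether the series curves away from it.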