Regression Analysis
August 18, 2008

Introduction
Regression analysis and correlation analysis are two methods widely used in statistics to investigate associations between variables. Both gauge the extent of a connection among variables, in ways that are similar yet distinct. In regression analysis, Y is treated as a single dependent variable and as a function of one or more independent variables, such as X1, X2, X3, and so on. In correlation analysis, the presumption is that the values of both the independent and dependent variables are acquired in a random fashion, devoid ...
The data set originates from a study that examines a new method of measuring body composition. The study gives body fat percentage, age and gender for 18 adults aged between 23 and 61. The data used here consist of 18 observations on two variables: the age of the subject in years and the subject's percentage of body fat (Gibbons, Mazess and Peppler, 1984).
Data set:
Age (years) Percent Fat (%)
23 9.5
23 27.9
27 7.8
27 17.8
39 31.4
41 25.9
45 27.4
49 25.2
50 31.1
53 34.7
53 42
54 29.1
56 32.5
57 30.3
58 33
58 33.8
60 41.1
61 34.5
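
The sums of squares and cross-products that least-squares formulas are built from can be computed directly from the table. The following is a minimal sketch in plain Python; the variable names (age, fat, s_xx, s_xy, s_yy) are illustrative assumptions and are not notation taken from the original paper.

```python
# Age (years) and percent body fat for the 18 subjects, copied from the table.
age = [23, 23, 27, 27, 39, 41, 45, 49, 50, 53, 53, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 27.9, 7.8, 17.8, 31.4, 25.9, 27.4, 25.2, 31.1,
       34.7, 42.0, 29.1, 32.5, 30.3, 33.0, 33.8, 41.1, 34.5]

n = len(age)
mean_age = sum(age) / n
mean_fat = sum(fat) / n

# Sums of squared deviations and cross-products about the sample means.
# (Illustrative names; the source's own notation is not known.)
s_xx = sum((x - mean_age) ** 2 for x in age)
s_yy = sum((y - mean_fat) ** 2 for y in fat)
s_xy = sum((x - mean_age) * (y - mean_fat) for x, y in zip(age, fat))

print(f"n = {n}, mean age = {mean_age:.2f}, mean percent fat = {mean_fat:.2f}")
print(f"S_xx = {s_xx:.1f}, S_xy = {s_xy:.1f}, S_yy = {s_yy:.1f}")
```

The sketch confirms that the table holds 18 observations with ages running from 23 to 61, as the description states.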
Regression Analysis
Figure 1 represents the estimated value of body fat for a person of x years. The estimated value is also referred to as ŷ. The equation for estimated percent fat as a function of age by regression is equation (1). Residual error, arising from either an inappropriate sample or extremes in the sample, is accounted for by the ε in equation (1). The R2 value implies that 63% of the variation in percent fat is explained by age. Using the values in table 2, the correlation coefficient r is obtained from equation (2).

From a graphical perspective, the coefficient of determination R2 can be explained as the ratio between SSR (the sum of squared distances between the estimator and the mean) and SST (the sum of squared distances between the data and the mean). This is expressed by equation (3). The R2 value is the same whether one uses the correlative method and squares equation (2) or the graphical method and solves equation (3).

However, the true R2 value may be slightly higher or lower. Both equations (2) and (3) assume that the regression is represented by equation (1), yet some randomness or inappropriateness may exist in the sample. Interval estimates for the true regression can be constructed using a t distribution. When the estimated regression is given by equation (1), the true regression is given by equation (4).

The value for b1 is obtained from equation (5); b1 is 0.548 in this example. The value for b0 is determined by substituting b1 and the sample means of x and y into equation (1) and solving, because the least-squares line passes through the point of means (an arbitrary data point generally does not lie on the line). The value of b0 is 3.22 in this example. Under the assumption that the data are normally distributed, the equation for B1 incorporates a t distribution and is given by equation (6), which includes the standard error of b1.
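
The numbered equations are not reproduced in this copy, so the following sketch uses the standard textbook forms of simple linear regression as a stand-in. These may differ in notation from the author's equations. It computes the slope as S_xy / S_xx, the intercept from the sample means, R2 both as the squared correlation coefficient and as SSR / SST, and t-based intervals for the slope and intercept. The 95% confidence level and the hard-coded critical value are assumptions, since the text does not state the level used.

```python
import math

# Age (years) and percent body fat for the 18 subjects in the data set.
age = [23, 23, 27, 27, 39, 41, 45, 49, 50, 53, 53, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 27.9, 7.8, 17.8, 31.4, 25.9, 27.4, 25.2, 31.1,
       34.7, 42.0, 29.1, 32.5, 30.3, 33.0, 33.8, 41.1, 34.5]

n = len(age)
x_bar = sum(age) / n
y_bar = sum(fat) / n
s_xx = sum((x - x_bar) ** 2 for x in age)
s_yy = sum((y - y_bar) ** 2 for y in fat)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(age, fat))

# Least-squares estimates; the fitted line passes through (x_bar, y_bar).
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# R^2 computed two ways: squared correlation, and SSR / SST.
r = s_xy / math.sqrt(s_xx * s_yy)
fitted = [b0 + b1 * x for x in age]
ssr = sum((f - y_bar) ** 2 for f in fitted)   # fitted values vs. the mean
sst = s_yy                                    # data vs. the mean
r2_from_r = r ** 2
r2_from_ss = ssr / sst

# Residual standard deviation on n - 2 degrees of freedom,
# then the standard errors of the slope and intercept.
sse = sum((y - f) ** 2 for y, f in zip(fat, fitted))
s_e = math.sqrt(sse / (n - 2))
se_b1 = s_e / math.sqrt(s_xx)
se_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)

# ASSUMPTION: 95% confidence level. 2.120 is the two-sided t critical
# value for 16 degrees of freedom (n - 2 = 16), taken from a t table.
T_CRIT = 2.120
slope_ci = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)
intercept_ci = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)

print(f"b1 = {b1:.3f}, b0 = {b0:.2f}")
print(f"R^2 (from r) = {r2_from_r:.3f}, R^2 (SSR/SST) = {r2_from_ss:.3f}")
print(f"slope interval: {slope_ci[0]:.3f} to {slope_ci[1]:.3f}")
print(f"intercept interval: {intercept_ci[0]:.2f} to {intercept_ci[1]:.2f}")
```

Under these assumptions the sketch reproduces the slope of about 0.548, the intercept of about 3.22 and the R2 of about 0.63 quoted in the text. The slope interval runs from roughly 0.32 to 0.77, and the intercept interval spans zero.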
The equation for the standard error is equation (7). The equation for B0 is equation (8). Since the R2 value is not very high and the confidence interval is very wide, the data imply that although age is an indicator of percent fat, age cannot be used with a high degree of certainty to predict percent fat.

Interpretation of Results
The regression correlation is...