Multi Regression.xls

Sheet1

Page 1
Regression # 1 with Demand vs. price, advertising, Income
Demand Price Advertising Income
42100 11.77 46100 38000
55500 9.96 47200 39100
71100 12.36 60900 40100
63200 12.49 55600 44200
77200 10.68 64400 41800
70900 12.07 60700 44800
55600 11.97 52100 39900
70700 11.23 57900 43600
71400 11.26 55600 41700
79400 9.79 60100 41200
60600 12.29 50700 44000
50800 12.7 46500 43300
61800 12.33 58600 41000
40500 10.88 42800 38300
85300 10.14 64800 42100
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9745062006
R Square 0.9496623349
Adjusted R Square 0.9359338808
Standard Error 3333.092314398
Observations 15
ANOVA
df SS MS F Significance F
Regression 3 2305491451.86071 768497150.620236 69.1747466484 0.0000002002
Residual 11 122204548.139292 11109504.3762993
Total 14 2427696000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept -33301.7280624811 17898.4143542384 -1.8605965536 0.0897240189 -72695.8923688769 6092.4362439147
Price -4041.5338282552 1040.6404220588 -3.883698675 0.0025470799 -6331.9691125829 -1751.0985439276
Advertising 1.4543896909 0.1516168655 9.5925323722 0.0000011186 1.120683051 1.7880963308
Income 1.527891689 0.5128017369 2.9794978821 0.0125307541 0.3992221052 2.6565612727
Regression # 2 with Demand vs. price
Demand Price
42100 11.77
55500 9.96
71100 12.36
63200 12.49
77200 10.68
70900 12.07
55600 11.97
70700 11.23
71400 11.26 Page 2
79400 9.79
60600 12.29
50800 12.7
61800 12.33
40500 10.88
85300 10.14
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.350203681
R Square 0.1226426182
Adjusted R Square 0.0551535888
Standard Error 12800.1111966511
Observations 15
ANOVA
df SS MS F Significance F
Regression 1 297738993.593787 297738993.593787 1.8172230261 0.2006653343
Residual 13 2129957006.40621 163842846.646632
Total 14 2427696000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 117762.55345144 40210.8157751149 2.9286288075 0.0117442613 30892.3841521681 204632.722750712
Price -4713.4615040228 3496.5186485085 -1.3480441484 0.2006653343 -12267.2293423258 2840.3063342802
Regression # 3 with Demand vs. advertising, Income
Demand Advertising Income
42100 46100 38000
55500 47200 39100
71100 60900 40100
63200 55600 44200
77200 64400 41800
70900 60700 44800
55600 52100 39900
70700 57900 43600
71400 55600 41700
79400 60100 41200
60600 50700 44000
50800 46500 43300
61800 58600 41000
40500 42800 38300
85300 64800 42100
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9384240597
R Square 0.8806397157
Adjusted R Square 0.860746335
Standard Error 4914.0146915052
Observations 15 Page 3
ANOVA
df SS MS F Significance F
Regression 2 2137925515.34005 1068962757.67003 44.2679767993 0.0000028917
Residual 12 289770484.659947 24147540.388329
Total 14 2427696000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept -53658.5380752887 25230.920508409 -2.1266975994 0.0548767297 -108631.990443247 1314.9142926693
Advertising 1.6733636428 0.2074992554 8.0644320378 0.0000034631 1.2212616108 2.1254656748
Income 0.6132666177 0.671586564 0.9131609394 0.3791392279 -0.8499947791 2.0765280145
Regression # 4 with Demand vs. price, advertising
Demand Price Advertising
42100 11.77 46100
55500 9.96 47200
71100 12.36 60900
63200 12.49 55600
77200 10.68 64400
70900 12.07 60700
55600 11.97 52100
70700 11.23 57900
71400 11.26 55600
79400 9.79 60100
60600 12.29 50700
50800 12.7 46500
61800 12.33 58600
40500 10.88 42800
85300 10.14 64800
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9534348294
R Square 0.9090379738
Adjusted R Square 0.8938776362
Standard Error 4289.7955181119
Observations 15
ANOVA
df SS MS F Significance F
Regression 2 2206867852.95345 1103433926.47672 59.9615913769 0.0000005664
Residual 12 220828147.046554 18402345.5872128
Total 14 2427696000
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 817.610310666 17704.0238072218 0.0461821742 0.9639247114 -37756.1432376096 39391.3638589416
Price -2617.5932710115 1189.7435071254 -2.200132428 0.0481253769 -5209.8216440211 -25.364898002
Advertising 1.6915685443 0.1660766479 10.1854689741 0.0000002935 1.3297186194 2.0534184692
Page 4
Month Ads Cost # of Ads Sales
1 13.9 12 43.6
2 12 11 38
3 9.3 9 30.1
4 9.7 7 35.3
5 12.3 12 46.4
6 11.4 8 34.2
7 9.3 6 30.2
8 14.3 13 40.7
9 10.2 8 38.5
10 8.4 6 22.6
11 11.2 8 37.6
12 11.1 10 35.2
Cost and Sales Data in thousands.
Regression Run # 1 Sales vs. # of Ads
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.7808328324
R Square 0.6096999122
Adjusted R Square 0.5706699034
Standard Error 4.2056960876
Observations 12
ANOVA
df SS MS F Significance F
Regression 1 276.3078708551 276.3078708551 15.6213111689 0.0027197818
Residual 10 176.8787958115 17.6878795812
Total 11 453.1866666667
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 16.9369109948 4.9818265158 3.3997392204 0.00677404 5.8367078603 28.0371141293
# of Ads 2.0832460733 0.5270864238 3.9523804433 0.0027197818 0.9088241309 3.2576680157
Regression Run # 2 Sales vs. Cost of Ads
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.8204249527
R Square 0.6730971031
Adjusted R Square 0.6404068134
Standard Error 3.849000314
Observations 12
Page 5
ANOVA
df SS MS F Significance F
Regression 1 305.038632494 305.038632494 20.590123534 0.0010786286
Residual 10 148.1480341726 14.8148034173
Total 11 453.1866666667
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 4.1727002232 7.1087903141 0.5869775361 0.5702325128 -11.6666744073 20.0120748536
Ads Cost 2.8724838266 0.6330355736 4.537634134 0.0010786286 1.4619924264 4.2829752268
Regression Run # 3 Sales vs. Cost of Ads and # of Ads
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.8270511549
R Square 0.6840136128
Adjusted R Square 0.6137944156
Standard Error 3.9888848001
Observations 12
ANOVA
df SS MS F Significance F
Regression 2 309.9858491336 154.9929245668 9.7411198143 0.0056041233
Residual 9 143.200817533 15.9112019481
Total 11 453.1866666667
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 6.5835783983 8.5421553852 0.7707163007 0.4606279385 -12.7401343195 25.907291116
Ads Cost 2.1388637996 1.4701498969 1.4548610343 0.1796798793 -1.1868488547 5.4645764539
# of Ads 0.6246753408 1.1202769287 0.5576079671 0.5907093723 -1.9095690692 3.1589197509
Analysis of Residuals (Autocorrelation)
Sex Months Employed Base Salary Months Employed Gender
Men 6 7.5 6 0
Men 10 8.6 10 0
Men 12 9.1 12 0
Men 18 10.3 18 0
Men 30 13 30 0
Women 5 6.2 5 1
Women 13 8.7 13 1
Women 15 9.4 15 1
Women 21 9.8 21 1
Page 6
Regression of Salary vs. Months Employed
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9622940283
R Square 0.926009797
Adjusted R Square 0.915439768
Standard Error 0.5493933476
Observations 9
ANOVA
df SS MS F Significance F
Regression 1 26.4427242027 26.4427242027 87.6071197878 0.0000330326
Residual 7 2.1128313528 0.3018330504
Total 8 28.5555555556
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 5.8092778793 0.4038020095 14.3864511385 0.000001867 4.8544385379 6.7641172208
Months Employed 0.2332038391 0.0249152928 9.3598675091 0.0000330326 0.1742885758 0.2921191025
RESIDUAL OUTPUT
Observation Predicted Base Salary Residuals
1 7.2085009141 0.2914990859
2 8.1413162706 0.4586837294
3 8.6077239488 0.4922760512
4 10.0069469835 0.2930530165
5 12.805393053 0.194606947
6 6.975297075 -0.775297075
7 8.8409277879 -0.1409277879
8 9.3073354662 0.0926645338
9 10.7065585009 -0.9065585009
Note that the residuals are not random. Most of these are negative for Women
and mostly positive for men. This is perhaps the most important item to
analyze in a regression. If the model is good then the residuals would
be scattered randomly around the regression line. This causes us to
conclude that something else is going on which must be analyzed.
The obvious thing is to test for possible discrimination.
To analyze this we will add a variable called Gender and give men a value of 0
and Women a value of 1. We will now do a multiple regression with both
gender and months employed as independent variables.
Page 7
Regression of Salary vs. Months Employed and Gender
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9868191554
R Square 0.9738120456
Adjusted R Square 0.9650827274
Standard Error 0.3530372002
Observations 9
ANOVA
df SS MS F Significance F
Regression 2 27.8077439674 13.9038719837 111.5564845782 0.0000179599
Residual 6 0.7478115882 0.1246352647
Total 8 28.5555555556
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 6.2484785327 0.2914501108 21.4392731454 0.0000006718 5.5353252812 6.9616317843
Months Employed 0.2270737807 0.016117228 14.0888855955 0.0000079815 0.1876363158 0.2665112457
Gender -0.7889745727 0.2384042577 -3.3093979969 0.0162168909 -1.3723292028 -0.2056199427
RESIDUAL OUTPUT
Observation Predicted Base Salary Residuals
1 7.6109212172 -0.1109212172
2 8.5192163401 0.0807836599
3 8.9733639016 0.1266360984
4 10.3358065861 -0.0358065861
5 13.060691955 -0.060691955
6 6.5948728637 -0.3948728637
7 8.4114631096 0.2885368904
8 8.8656106711 0.5343893289
9 10.2280533556 -0.4280533556
Note that the regression equation now has
* Better Coefficient of Correlation
* Higher F Value
* All Variables pass T Test meaning all are significant
* Lower Standard Error
* Residuals are random
In every category tested it is a better model.
Also note that Gender passed T Test. Our Null was no discrimination
and we rejected this hypothesis. It means that we have proven
discrimination from a legal perspective.
Page 8
Nonlinear Regression RESIDUAL OUTPUT
Batch Size # Defect Observation Predicted # Defect Residuals
100 5 1 -11.1875482377 16.1875482377
125 10 2 -2.0092616414 12.0092616414
125 6 3 -2.0092616414 8.0092616414
125 7 4 -2.0092616414 9.0092616414
150 6 5 7.169024955 -1.169024955
150 7 6 7.169024955 -0.169024955
175 17 7 16.3473115513 0.6526884487
175 15 8 16.3473115513 -1.3473115513
200 24 9 25.5255981477 -1.5255981477
200 21 10 25.5255981477 -4.5255981477
200 22 11 25.5255981477 -3.5255981477
225 26 12 34.703884744 -8.703884744
225 29 13 34.703884744 -5.703884744
225 25 14 34.703884744 -9.703884744
250 34 15 43.8821713404 -9.8821713404
250 37 16 43.8821713404 -6.8821713404
250 41 17 43.8821713404 -2.8821713404
250 34 18 43.8821713404 -9.8821713404
275 49 19 53.0604579367 -4.0604579367
300 53 20 62.2387445331 -9.2387445331
300 54 21 62.2387445331 -8.2387445331
325 69 22 71.4170311294 -2.4170311294
350 82 23 80.5953177258 1.4046822742
350 81 24 80.5953177258 0.4046822742
350 84 25 80.5953177258 3.4046822742
375 92 26 89.7736043221 2.2263956779
375 96 27 89.7736043221 6.2263956779
375 97 28 89.7736043221 7.2263956779
400 109 29 98.9518909184 10.0481090816
400 112 30 98.9518909184 13.0481090816
SUMMARY OUTPUT
Note that residuals have positive signs
Regression Statistics followed by negative signs followed by
Multiple R 0.9764238729 positive signs. This is a clear sign that
R Square 0.9534035796 we are trying to fit a straight line to a set
Adjusted R Square 0.9517394217 data points which may be quadratic.
Standard Error 7.5601037854 We will add a new variable called Batch
Observations 30 squared and try a multiple regression.
ANOVA
df SS MS F Significance F
Regression 1 32744.4552611268 32744.4552611268 572.9045280313 3.47597579694458E-20
Residual 28 1600.3447388732 57.1551692455
Total 29 34344.8
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept -47.9006946231 4.111558341 -11.6502529335 0 -56.3228495503 -39.4785396959
Batch Size 0.3671314639 0.0153384148 23.9354241247 3.47597579694456E-20 0.3357121101 0.3985508176
RESIDUAL OUTPUT Page 9
# Defect Batch Size Batch Sq. Observation Predicted # Defect Residuals
5 100 10000 1 4.3827161756 0.6172838244
10 125 15625 2 6.7213072886 3.2786927114
6 125 15625 3 6.7213072886 -0.7213072886
7 125 15625 4 6.7213072886 0.2786927114
6 150 22500 5 10.2468217618 -4.2468217618
7 150 22500 6 10.2468217618 -3.2468217618
17 175 30625 7 14.959259595 2.040740405
15 175 30625 8 14.959259595 0.040740405
24 200 40000 9 20.8586207885 3.1413792115
21 200 40000 10 20.8586207885 0.1413792115
22 200 40000 11 20.8586207885 1.1413792115
26 225 50625 12 27.944905342 -1.944905342
29 225 50625 13 27.944905342 1.055094658
25 225 50625 14 27.944905342 -2.944905342
34 250 62500 15 36.2181132557 -2.2181132557
37 250 62500 16 36.2181132557 0.7818867443
41 250 62500 17 36.2181132557 4.7818867443
34 250 62500 18 36.2181132557 -2.2181132557
49 275 75625 19 45.6782445296 3.3217554704
53 300 90000 20 56.3252991635 -3.3252991635
54 300 90000 21 56.3252991635 -2.3252991635
69 325 105625 22 68.1592771577 0.8407228423
82 350 122500 23 81.1801785119 0.8198214881
81 350 122500 24 81.1801785119 -0.1801785119
84 350 122500 25 81.1801785119 2.8198214881
92 375 140625 26 95.3880032263 -3.3880032263
96 375 140625 27 95.3880032263 0.6119967737
97 375 140625 28 95.3880032263 1.6119967737
109 400 160000 29 110.7827513008 -1.7827513008
112 400 160000 30 110.7827513008 1.2172486992
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.9976895195
R Square 0.9953843772
Adjusted R Square 0.9950424793
Standard Error 2.4230577581
Observations 30
ANOVA
df SS MS F Significance F
Regression 2 34186.2773597291 17093.1386798645 2911.3490891121 2.93162976922614E-32
Residual 27 158.5226402709 5.8712088989
Total 29 34344.8
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 6.8975853249 3.7368943622 1.8458068803 0.075912814 -0.7698834287 14.5650540785
Batch Size -0.1201025603 0.0314780261 -3.8154412804 0.0007196857 -0.1846900916 -0.055515029
Batch Sq. 0.0009495387 0.0000605927 15.670832634 0 0.0008252127 0.0010738646

Sheet1

1
Quantity
Predicted Quantity


You Need a Professional Writer To Work On Your Paper?