Homework 4

SOLUTIONS CS&SS 508 Winter 2012 Assigned: Feb. 17 Due: Mar. 2 Please submit code and output for all questions except those marked with an asterisk (*). For questions marked with an asterisk submit code only.

Lecture 6: Programming in R 1) Load the warpbreaks data set and attach it.* This data set gives the number of warp breaks per loom, where a loom corresponds to a fixed length of yarn. The variables are: [,1] breaks numeric The number of breaks [,2] wool factor The type of wool (A or B) [,3] tension factor The level of tension (L, M, H) There are measurements on 9 looms for each of the six types of warp (AL, AM, AH, BL, BM, BH). > data(warpbreaks) > attach(warpbreaks) 2) Use the tapply() function to compute the mean number of breaks separately for each level of tension. > tapply(breaks, tension, mean) L M H 36.38889 26.38889 21.66667 3) Use the tapply() function to compute the mean number of breaks separately for each type of wool. > tapply(breaks, wool, mean) A B 31.03704 25.25926 4 a) Again using the warpbreaks data set, create a factor variable which indicates the combination of tension and wool for each observation. You should do this with the paste() function. The paste() function can be used to paste together two character strings into a single character string. By default it adds a space between the two character strings. Example: > vec1 = "red" > vec2 = "head"

> vec3=paste(vec1,vec2, sep="") > vec3 [1] "redhead" Use the paste() function to create your factor variable, which takes the values “AL”, “AM”, “AH”, “BL”, “BM”, “BH”. > wooltension = as.factor( paste(wool, tension, sep="")) > table(wooltension) wooltension AH AL AM BH BL BM 9 9 9 9 9 9 4 b) Use the tapply() function to compute the mean number of breaks separately for each combination of wool type and tension level. > tapply(breaks, wooltension, mean) AH AL AM BH BL BM 24.55556 44.55556 24.00000 18.77778 28.22222 28.77778 > 5) The following code computes bootstrapped standard errors for a regression model from the swiss data set. Run the code line by line to figure out how each line works and what it is doing. Add a comment to each line of code describing exactly what it’s doing. ## Here is the regression model with regular (non-bootstrapped) ## standard errors: mod1=lm(Examination~Agriculture, data=swiss) summary(mod1) ## The following code bootstraps the standard errors: nboot = 1000 ## specify number of bootstrap simulations boot.est = matrix(nrow=nboot,ncol=2) ## Create an empty matrix with two columns and with one row for each ## simulation. for (b in 1:nboot) { ## Repeat this loop nboot times. boot.sample = sample(1:dim(swiss)[1], replace=T) ## sample the indices of the swiss data set WITH REPLACEMENT. ## So some indices may be sampled twice, and others not at all. boot.dat = swiss[boot.sample,] ## Subset the data set to include the rows sampled in the ## previous step. This is the “bootstrapped data set” mod=lm(Examination~Agriculture, data=boot.dat) ## Compute linear regression coefficients for the bootstrapped ## data set. boot.est[b,] = mod$coef ## save these coefficients to the matrix boot.est. ## The intercept is saved in the first column, the slope in the ## second.

} apply(boot.est, 2, sd) ## Compute the standard deviation of all bootstrapped coefficient ## estimates (intercept and slope) apply(boot.est, 2, sd) ## ditto. (This duplicate line was a typo.) apply(boot.est, 2, quantile, 0.025) ## Compute the 2.5% quantile of the bootstrapped coefficient ## estimates (intercept and slope). This gives the lower bound for the ## 95% bootstrap confidence interval apply(boot.est, 2, quantile, 0.975) ## Compute the 97.5% quantile of the bootstrapped coefficient ## estimates (intercept and slope). This gives the upper bound for the ## 95% bootstrap confidence interval

How do the bootstrapped standard errors compare to the regular standard errors? Regular standard errors: > summary(mod1)$coef[,2] (Intercept) Agriculture 2.11000959 0.03807406 Bootstrapped standard errors: > apply(boot.est, 2, sd) [1] 2.25346577 0.03680473 They are quite close!!

Lecture 7 6) We’ll use variables in the swiss data set to explain fertility levels. Here’s a summary description of the data set and variables available: Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888. [,1] Fertility Ig, ‘common standardized fertility measure’ [,2] Agriculture % of males involved in agriculture as occupation [,3] Examination % draftees receiving highest mark on army examination [,4] Education % education beyond primary school for draftees. [,5] Catholic % ‘catholic’ (as opposed to ‘protestant’). [,6] Infant.Mortality live births who live less than 1 year.

a) Perform a linear regression of Fertility on the percentage of draftees achieving the highest mark on the army Examination. Display summary output from the regression model, including standard errors and p-values. > > > >

data(swiss) attach(swiss) mod=lm(Fertility~Examination) summary(mod)

Call: lm(formula = Fertility ~ Examination) Residuals: Min 1Q -25.9375 -6.0044

Median -0.3393

3Q 7.9239

Max 19.7399

Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 86.8185 3.2576 26.651 < 2e-16 *** Examination -1.0113 0.1782 -5.675 9.45e-07 *** --Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 9.642 on 45 degrees of freedom Multiple R-squared: 0.4172, Adjusted R-squared: 0.4042 F-statistic: 32.21 on 1 and 45 DF, p-value: 9.45e-07 b) Is the Examination variable statistically significant? What is its p-value? Yes. Its p-value is 9.45e-07, which is substantially less than 0.05. c) For an increase of one percentage point in Examination, how many units do we expect the Fertility measure to increase? We expect it to decrease by 1.0113 units. d) Make a scatter plot of fitted values on the x-axis and residuals on the y-axis. Add a horizontal line at y=0.

e) What does your plot in (d) tell you? The constant variance assumption appears to be met. f) Make a QQ normal plot of the residuals from your model. Add a QQ line. What does this plot tell you? > qqnorm(resid(mod)) > qqline(resid(mod)) The normality assumption is met:

g) Make a scatter plot of Fertility (y-axis) against Examination (x-axis) and add the regression line in red.

> plot(Fertility~Examination) > abline(lm(Fertility~Examination),col="red")

Homework 3

Lecture 6: Programming in R. 1) Load the warpbreaks data set and attach it.* This data set gives the number of warp breaks per loom, where a loom corresponds ...

162KB Sizes 2 Downloads 351 Views

Recommend Documents

Homework 3.pdf
Page 1 of 3. Homework 3 參考解答. 1. First, let's clarify what the function max (݂(݊), ݃(݊)) is. Let's define the. function h(݊) = max (݂(݊), ݃(݊)). Then. h(݊) = ൜݂(݊) if ݂(݊) ≥ ݃(݊),. ݃(݊) if ݃(݊) < ݂(݊). Since ݂(݊) a

Operating Systems Homework #3 - GitHub
May 14, 2015 - (Integer) Number indicating scheduling algorithm. 27 int policy;. 28. // Enum values for policy. 29 const short FCFS=0, SJF=1, PRIOR=2, RR=3;.

Homework 3. Identifying Bacteria
Virus. Genetic material. Morphology. Related disease. Influenza virus. RNA virus. Spherical flu a. Name of the microbe: b. Name of the disease: c. Domain: d.

HOMEWORK
Homework​​is one​​of​​the​​options​​parents/legal​​guardians​​have​​to​​support​​their​​child's learning.

Read [PDF] Math 2009 Homework Workbook Grade 3 Full Books
Math 2009 Homework Workbook Grade 3 Download at => https://pdfkulonline13e1.blogspot.com/0328341762 Math 2009 Homework Workbook Grade 3 pdf download, Math 2009 Homework Workbook Grade 3 audiobook download, Math 2009 Homework Workbook Grade 3 read

Lesson 3 Homework Practice (Triangles).pdf
There was a problem previewing this document. Retrying... Download. Connect more apps... Try one of the apps below to open or edit this item. Lesson 3 ...

Packet A Section 3 Homework 2017.pdf
What is the “growth factor” according to the equation? ______. b. What was the percent decrease (or percent loss) per day?______. c. How are “growth factor” ...

8-3- Intro to Geometry Homework #3 (Angles).pdf
8-3- Intro to Geometry Homework #3 (Angles).pdf. 8-3- Intro to Geometry Homework #3 (Angles).pdf. Open. Extract. Open with. Sign In. Main menu. Displaying ...

Homework 1
Question 3. As you have learnt, a number of hosts using Ethernet, share a single channel and each collision decreases throughput. If hosts on a 6-host 10Mbps 80m Ethernet LAN send frames 64byte long, a. what is the scenario that maximizes throughput

Homework #1 - GitHub
Please select either bus 2 or 3 as reference bus, calculate the line flows, and check if the results are the same. 2. Come up with a case study of LP, and solve it in ...

Homework Hotline.pdf
Whoops! There was a problem loading more pages. Whoops! There was a problem previewing this document. Retrying... Download. Connect more apps... Try one of the apps below to open or edit this item. Homework Hotline.pdf. Homework Hotline.pdf. Open. Ex

Homework 12 - Magnetism - GitHub
region containing a constant magnetic field B = 2.6T aligned with the positive ... With what speed v did the particle enter the region containing the magnetic field?

Homework Practice - WinningWithLinningMath
of the girls have blue eyes, what fraction of the girls in class have neither blue or brown eyes? 16. PIE Ubi made a banana cream pie. His brother ate. 1. −. 3.

homework
ventus and I was in San Siro Stadium when Inter won the. 'scudetto' with a 5-0 win. 2003 was the first year Valletta qualified for the UEFA. Champions League and the fist game was Valletta versus AC. Milan. Ta' Qali was in a craze when Massimo scored

Homework #4
Returned: Friday, March 25, 2016. 1. Growth of structure with neutrinos. [30 points; 5 points each]. In most of our studies of the matter-dominated era in class, we ...

homework-help.pdf
Page 1 of 22. Alternative Homework Ideas. We compiled this list from ideas collected from the Love and Logic Insider Club. members. Thank you to everyone ...

Homework 2
Recall the unitary group U1 = {z ∈ C | |z| = 1}. Find all the one dimensional continuous complex representations of U1. 5. Let G be a finite group. Let ρ : G −→ GLn(R) be an n-dimensional real representation. Prove that there exists a matrix Ï

Gr5M5 Homework Helper.pdf
Gr5M5 Homework Helper.pdf. Gr5M5 Homework Helper.pdf. Open. Extract. Open with. Sign In. Main menu. Displaying Gr5M5 Homework Helper.pdf. Page 1 of ...

Homework 2.pdf
Self-checkout lines at grocery stores are fairly commonplace these days. Describe what happened to. the demand for cashiers when these devices became ...