34 questions
School administrators collect data on students attending the school. Which of the following variables is quantitative?
Class
Whether the student is in AP classes
Whether the student has taken the SAT
GPA
None of these
A professor has kept records on grades that students have earned in his class. If he wants to examine the percentage of students earning the grades A, B, C, D, and F during the most recent term, which kind of plot would he make?
boxplot
timeplot
pie chart
histogram
dotplot
Two sections of a class took the same quiz. Section A had 15 students with a mean score of 80, and Section B had 20 students with a mean score of 90. Overall, what was the approximate mean score for all of the students on the quiz?
85.7
84.3
85
None of these
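One way to check the arithmetic for the two-section quiz question is to weight each section's mean by its number of students. The sketch below uses only the figures given in the question; the variable names are illustrative.

```python
# Pooled (weighted) mean: weight each section mean by its class size.
sizes = [15, 20]   # students in Section A and Section B
means = [80, 90]   # mean quiz score in each section

total_points = sum(n * m for n, m in zip(sizes, means))  # 15*80 + 20*90
overall_mean = total_points / sum(sizes)                 # divide by all 35 students

print(round(overall_mean, 1))  # 85.7
```

The result sits closer to 90 than to 80 because the larger section pulls the overall mean toward its own.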
Suppose that a normal model described student scores in a history class. Parker has a standardized score (z-score) of +2.5. This means that Parker:
is 2.5 points above average for the class
has a standard deviation of 2.5
has a score that is 2.5 times the average for the class
is 2.5 standard deviations above average for the class
none of these
Your Stats teacher tells you that your test score was in the 3rd quartile for the class. Which is true?
I) You got 75% on the test
II) You can't really tell what this means without knowing the standard deviation
III) You can't really tell what this means unless the class distribution is nearly normal
I only
III only
II and III
II only
None of these
The advantage of making a stem and leaf display instead of a dotplot is that a stem and leaf display:
satisfies the area principle
preserves the individual data values
shows the shape of the distribution better than a dotplot
is for quantitative data, while a dotplot is for categorical data
none of these
The 5 number summary of credit hours for 24 students in a statistics class is:
Min:13
Q1:15
Median: 16.5
Q3: 18
Max: 22
From this information, we know that:
there is at least one low outlier in the data
there is at least one high outlier in the data
there are both low and high outliers in the data
there are no outliers in the data
none of these
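The credit-hours question can be checked with a short sketch, assuming the usual 1.5 × IQR boxplot rule is the intended outlier criterion (the question does not name a rule, so that is an assumption). Only the minimum and maximum need testing, since any outlier would lie at or beyond them.

```python
# Five-number summary from the question.
minimum, q1, median, q3, maximum = 13, 15, 16.5, 18, 22

# Assumed criterion: a value is an outlier if it lies more than
# 1.5 * IQR below Q1 or above Q3.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

print(lower_fence, upper_fence)           # 10.5 22.5
print(has_low_outlier, has_high_outlier)  # False False
```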
Which of the following data summaries are changed by adding a constant to each data value?
I) the mean
II) the median
III) the standard deviation
I and III
I only
III only
I, II, and III
I and II
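A quick experiment can illustrate the effect of adding a constant. The data set and the shift of 7 below are made up for illustration; the sketch uses Python's standard `statistics` module.

```python
import statistics

values = [2, 4, 4, 5, 10]           # hypothetical data
shift = 7                           # hypothetical constant added to every value
shifted = [v + shift for v in values]

# Measures of center move with the data.
mean_change = statistics.mean(shifted) - statistics.mean(values)
median_change = statistics.median(shifted) - statistics.median(values)

# Measures of spread depend only on distances between values.
sd_change = statistics.stdev(shifted) - statistics.stdev(values)

print(mean_change, median_change, sd_change)
```

The mean and median each move by the constant, while the standard deviation stays the same.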
If we want to discuss gaps and clusters in a data set, which of the following should NOT be chosen to display the data set?
dotplot
boxplot
stem and leaf plot
histogram
any of these would work
All BUT one of these statements contain a mistake. Which one is true?
The correlation between a football player's weight and the position he plays is .54
The correlation between a car's length and its fuel efficiency is .71 mpg
There is a correlation of .63 between gender and political party
There is a high correlation of 1.09 between height of a corn stalk and its age in weeks
The correlation between the amount of fertilizer used and the yield of beans is .42
Which statement about influential points is true?
I) removal of an influential point changes the regression line
II) data points that are outliers in the horizontal direction are more likely to be influential points than points that are outliers in the vertical direction
III) influential points have large residuals
I only
I, II, and III
I and III
II and III
I and II
Residuals are:
data collected from individuals that is not consistent with the rest of the group
the difference between observed responses and values predicted by the model
variation in the data that is explained by the model
possible models not explored by the researcher
none of these
Which is true?
I) random scatter in the residuals indicates a model with high predictive power
II) if 2 variables are very strongly associated, then the correlation between them will be near +1 or -1
III) the higher the correlation between 2 variables the more likely the association is based on cause and effect
II only
I and II
I, II, and III
I only
None of these
A company's sales increase by the same amount each year. This growth is:
logarithmic
quadratic
power
exponential
linear
It's easy to measure the circumference of a tree's trunk, but not so easy to measure its height. Foresters developed a model for ponderosa pines that they use to predict the tree's height (in feet) from the circumference of its trunk (in inches): A lumberjack finds a tree with a circumference of 60"; how tall does this model estimate the tree to be?
11'
83'
19'
93'
5'
2 variables that are actually NOT related to each other may nonetheless have a very high correlation because they both result from some other, possibly hidden, factor. This is an example of:
leverage
an outlier
regression
extrapolation
a lurking variable
A correlation of 0 between 2 quantitative variables means that:
re-expressing the data will guarantee a linear association between the 2 variables
there is no association between the 2 variables
there is no linear association between the 2 variables
we have done something wrong with our calculations
none of these
A residual plot is useful because:
I) it will help us see whether our model was appropriate
II) it might show a pattern in the data that was hard to see in the original scatterplot
III) it will clearly identify the influential points
II only
I and II
I, II, and III
I only
I and III
Which of the following is NOT a goal of re-expressing data?
make the spread of several groups more alike
make the scatter in the scatterplot spread out evenly rather than following a fan shape
make the form of a scatterplot more nearly linear
make the distribution of a variable more symmetric
all of these are goals of re-expressing data
The correlation coefficient between the hours that a person is awake during a 24 hr period and the hours that same person is asleep during a 24 hr period is most likely to be:
exactly -1
near +.8
exactly +1
near -.8
near 0
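For the awake/asleep question, hours asleep are fully determined by hours awake (asleep = 24 - awake), which is an exact linear relationship with negative slope. The sketch below computes Pearson's r by hand for some made-up awake hours; `pearson_r` is an illustrative helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

awake = [14, 15.5, 16, 17, 18]      # hypothetical hours awake for five people
asleep = [24 - h for h in awake]    # the rest of the 24-hour day

r = pearson_r(awake, asleep)
print(round(r, 6))  # -1.0, up to floating-point rounding
```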
A regression analysis of students' college GPAs and their high school GPAs found R² = 31.1%. Which of these is true?
I) high school GPA accounts for 31.1% of college GPA
II) 31.1% of college GPAs can be correctly predicted with this model
III) 31.1% of the variance in college GPA can be accounted for by this model
I only
III only
I and II
II only
none of these
When using midterm exam scores to predict a student's final grade in the class, the student would prefer to have a:
residual equal to 0, because that means the student's final grade is exactly what we would predict with the model
negative residual, because that means the student's final grade is higher than what we would predict with the model
negative residual, because that means the student's final grade is lower than what we would predict with the model
positive residual, because that means the student's final grade is higher than what we would predict with the model
positive residual, because that means the student's final grade is lower than what we would predict with the model
A company sponsoring a new internet search engine wants to collect data on the ease of using it. Which is the best way to collect the data?
observational study
experiment
sample survey
census
simulation
More dogs are being diagnosed with thyroid problems than have been diagnosed in the past. A researcher identifies 50 puppies without thyroid problems and keeps records of their diets for several years to see if any develop thyroid problems. This is a(n):
randomized experiment
prospective study
blocked experiment
survey
retrospective study
A chemistry professor who teaches a large lecture class surveys the students who attend his class about how he can make the class more interesting, hoping he can get more students to attend. This survey method suffers from:
undercoverage
response bias
voluntary response bias
nonresponse bias
none of these
Double-blinding in experiments is important so that...
I) the evaluators do not know which treatment group the participants are in
II) the participants don't know which treatment group they're in
III) no one knows which treatment any of the participants are getting
I and II
III only
I only
II only
I, II, and III
Placebos are a tool for:
control
randomization
blinding
sampling
blocking
Which of the following is NOT required in an experimental design?
control
blocking
randomization
replication
all are required in an experimental design
Which statement about bias is true?
I) bias results from random variation and will always be present
II) bias results from a sampling method likely to produce samples that do not represent the population
III) bias is usually reduced when sample size is bigger
I only
II only
II and III
III only
I and III
In an experiment the primary purpose of blinding is to reduce:
randomness
variation
bias
undercoverage
confounding
A bicycle shop equips 60% of its bikes with a water bottle holder, 55% of the bikes it sells have a kickstand attached, and 34% of the bikes sold have both features. What is the probability that a randomly selected bicycle will have a kickstand OR a water bottle holder?
34%
56.7%
61.8%
81%
none of these
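The bicycle question is an application of the general addition rule, P(A or B) = P(A) + P(B) - P(A and B). A minimal sketch using the percentages from the question (variable names are illustrative):

```python
p_bottle = 0.60     # P(water bottle holder)
p_kickstand = 0.55  # P(kickstand)
p_both = 0.34       # P(both features)

# Subtract the overlap once so bikes with both features are not counted twice.
p_either = p_bottle + p_kickstand - p_both

print(f"{p_either:.0%}")  # 81%
```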
Six Republicans and four Democrats have applied for 2 open positions on a planning committee. Since all the applicants are qualified to serve, the City Council decides to pick the 2 new members randomly. What is the probability that BOTH come from the SAME party?
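One way to work the committee question is to count pairs: the number of same-party pairs divided by the number of all possible pairs. The sketch below assumes every pair of applicants is equally likely, as "randomly" suggests, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction
from math import comb

republicans, democrats, seats = 6, 4, 2

# Pairs drawn entirely from one party: C(6,2) + C(4,2) = 15 + 6.
same_party_pairs = comb(republicans, seats) + comb(democrats, seats)

# All possible pairs from the 10 applicants: C(10,2) = 45.
all_pairs = comb(republicans + democrats, seats)

p_same_party = Fraction(same_party_pairs, all_pairs)
print(p_same_party, round(float(p_same_party), 3))  # 7/15 0.467
```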
Political analysts estimate the probability that Hillary Clinton will run for president in 2020 is 45%, and the probability that NY's Governor George Pataki will run is 20%. If their political decisions are independent, what is the probability that ONLY Hillary runs for president?
9%
11%
25%
36%
45%
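Because the question states that the two decisions are independent, the probability that only Clinton runs is the product P(Clinton runs) × P(Pataki does not run). A minimal sketch with the figures from the question:

```python
p_clinton = 0.45  # P(Clinton runs)
p_pataki = 0.20   # P(Pataki runs)

# Independence lets us multiply: she runs AND he does not.
p_only_clinton = p_clinton * (1 - p_pataki)

print(f"{p_only_clinton:.0%}")  # 36%
```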
Some marathons allow 2 runners to "split" the marathon by each running a half marathon. Alice and Sharon plan to split a marathon. Alice's half marathon times average 92 mins with a st. dev. of 4 mins., and Sharon's half marathon times average 96 mins with a st. dev. of 2 mins. The expected time for Alice and Sharon to complete a full marathon is 92+96=188 mins. What is the st. dev. of their total time?
2 mins
4.5 mins
6 mins
20 mins
It can't be determined
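If the two runners' times are assumed to be independent (the question does not say so explicitly, so this is an assumption), variances add and the standard deviation of the total is the square root of the summed variances. Standard deviations themselves do not add. A sketch under that assumption:

```python
import math

sd_alice = 4   # st. dev. of Alice's half marathon time, in minutes
sd_sharon = 2  # st. dev. of Sharon's half marathon time, in minutes

# ASSUMPTION: the two times are independent, so Var(A + S) = Var(A) + Var(S).
var_total = sd_alice ** 2 + sd_sharon ** 2
sd_total = math.sqrt(var_total)

print(round(sd_total, 1))  # 4.5
```

Without independence, the covariance between the two times would also be needed, and the figures given would not be enough.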