- Response Variable: measures the outcome of a study (a.k.a. dependent variable)
- Explanatory Variable: tries to explain the observed outcome or response (a.k.a. independent variable)
- Scatterplots are the most effective way to display the relationship between these two variables when both are quantitative.
- When examining a scatterplot, look for the overall pattern (form, direction, and strength) and for striking deviations from it, such as potential outliers, and make a note of these characteristics.
3.2 Correlations
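- Correlation: measures the direction and strength of the linear relationship between the two variables in a scatterplot and is written as "r"
- When r > 0 there is a positive association, and when r < 0 there is a negative association. r always lies between -1 and 1: values near 0 indicate a weak linear relationship, and values near -1 or 1 indicate a strong one.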
- A positive
correlation means that high values of one variable are associated with
high values of a second variable. The relationship between height and
weight, between IQ scores and achievement test scores, and between
self-concept and grades are examples of positive correlation.
- A negative
correlation or relationship means that high values of one variable are
associated with low values of a second variable. Examples of negative
correlations include those between exercise and heart failure, between
successful test performance and feelings of incompetence, and between
absence from school and school achievement.
[Figure: scatterplots showing weak and strong positive correlation, and weak and strong negative correlation]
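As a rough illustration of how r captures direction and strength, here is a minimal Python sketch that computes the correlation from the deviations about the means. The function name `correlation` and the absence/score numbers are made up for this example; they are not data from the notes.

```python
import math

def correlation(xs, ys):
    """Return the sample correlation r between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: days absent from school vs. achievement score
days_absent = [0, 2, 4, 6, 8]
score = [95, 88, 80, 75, 62]

r = correlation(days_absent, score)
print(round(r, 3))  # negative and close to -1: a strong negative association
```

The sign of r gives the direction of the association, and how close r is to -1 or 1 gives the strength.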
3.3 Least-Squares Regression
- The regression line is a straight line that describes how y changes as x changes, which can also help us predict y given x
- LSRL (least-squares regression line): the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
[Figure: LSRL]
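The "as small as possible" idea can be checked numerically. The sketch below assumes the usual formulas for the LSRL y-hat = a + b*x (slope b = Sxy / Sxx, intercept a = mean of y minus b times mean of x); the helper names and the five data points are invented for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances from the points to the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
# Any other line, e.g. one with a slightly steeper slope, gives a larger sum
worse = sum_squared_residuals(xs, ys, a, b + 0.1)

predicted_y_at_6 = a + b * 6   # using the line to predict y for a given x
print(a, b, best, worse, predicted_y_at_6)
```

Changing either the slope or the intercept away from the least-squares values increases the sum of squared vertical distances.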
- Residual plot: a scatterplot of the regression residuals against the explanatory variable that helps us assess the fit of the regression line.
- Outliers: observations that lie outside the overall pattern of the other observations.
[Figure: scatterplot in which one can clearly see an outlier at (390, 14)]
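A residual is the observed y minus the y predicted by the regression line, and these are the values a residual plot displays against x. The following Python sketch computes them and picks out the point with the largest residual. The data are hypothetical (they are not the data behind the outlier mentioned above), and the helper name `least_squares_line` is an assumption of this example.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data: the point (5, 15.0) sits well above the overall pattern
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 15.0, 12.1]

a, b = least_squares_line(xs, ys)
# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Index of the observation farthest (vertically) from the line
largest = max(range(len(xs)), key=lambda i: abs(residuals[i]))

for x, res in zip(xs, residuals):
    print(x, round(res, 3))   # the (x, residual) pairs a residual plot would show
print("largest residual at", (xs[largest], ys[largest]))
```

For a least-squares line the residuals always sum to zero, so a good fit shows up as residuals scattered around zero with no clear pattern.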
- Influential point: differs from an outlier in that removing an influential point from the plot would significantly change the results of your calculations, such as the slope of the regression line.
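One way to see whether a point is influential is to fit the line twice, with and without the point, and compare the slopes. This is only a sketch under assumed, made-up data in which the last point lies far out in the x direction; the helper name `least_squares_slope` is hypothetical.

```python
def least_squares_slope(xs, ys):
    """Return the slope b of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Made-up data: the first five points follow y = x + 1 exactly;
# the last point (20, 2) is far from the others in the x direction
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 3, 4, 5, 6, 2]

slope_with = least_squares_slope(xs, ys)
slope_without = least_squares_slope(xs[:-1], ys[:-1])

print("slope with the point:   ", round(slope_with, 3))
print("slope without the point:", round(slope_without, 3))
```

Here removing the single point changes the slope from slightly negative to exactly 1, which is the kind of large change that marks a point as influential.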