QCE General Mathematics - Unit 3 - Bivariate data analysis 1
Associations Between Two Numerical Variables | QCE General Mathematics
Learn how to read scatterplots, identify explanatory and response variables, describe direction, form and strength, and interpret correlation in QCE General Mathematics.
Updated 2026-05-18 - 7 min read
QCAA official coverage - General Mathematics 2025 v1.3
Exact syllabus points covered
- Identify the explanatory variable and the response variable.
- Construct and use a scatterplot to identify the association between two numerical variables.
- Describe an association between two numerical variables in terms of direction (positive/negative), form (linear/non-linear) and strength (strong/moderate/weak).
- Calculate Pearson's correlation coefficient, $r$, from raw data using technology, and interpret it to quantify the strength of a linear association.
- Calculate the coefficient of determination, $R^2$, from raw data using technology, and interpret it to assess the strength of a linear association in terms of the explained variation.
- Use the correlation coefficient, $r$, to determine the coefficient of determination, $R^2$, and vice versa.
Numerical bivariate data uses two measured quantities for each item. Examples include hours studied and test result, advertising spend and sales, or age and reaction time. A scatterplot is the first tool because it lets you see the relationship before calculating anything.
Original Sylligence diagram for general scatterplot correlation.
Explanatory and response variables
The explanatory variable is the variable used to explain or predict changes in the other variable. It is usually placed on the horizontal axis. The response variable is the variable being predicted or described, so it usually goes on the vertical axis.
In a study of hours of revision and exam mark, hours of revision is the explanatory variable and exam mark is the response variable. The decision is contextual. If the context changes, the axis choice can change too.
Describing the association
Describe an association using three features: direction, form and strength.

| Feature | What to look for | Example wording |
|---|---|---|
| Direction | Does $y$ tend to increase or decrease as $x$ increases? | positive association |
| Form | Does the pattern look straight or curved? | approximately linear |
| Strength | How tightly do the points follow the pattern? | strong association |
An exam response should be complete but not wordy: "The scatterplot shows a strong, positive, approximately linear association between weekly training time and race speed."
Correlation and explained variation
Pearson's correlation coefficient $r$ measures the strength and direction of a linear association. It ranges from $-1$ to $1$.
| $r$ value | Interpretation |
|---:|---|
| close to $1$ | strong positive linear association |
| close to $0$ | little or no linear association |
| close to $-1$ | strong negative linear association |
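The coefficient of determination is $R^2=r^2$. It is interpreted as the proportion of the variation in the response variable that is explained by the linear association with the explanatory variable, and it is usually reported as a percentage. For example, $r=0.9$ gives $R^2=0.81$, so about $81\%$ of the variation in the response variable is explained.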
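The conversion between $r$ and $R^2$ can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names and the value $r=-0.8$ are hypothetical. Going from $R^2$ back to $r$ needs the direction from the scatterplot, because squaring discards the sign.

```python
import math


def r_to_r_squared(r: float) -> float:
    """Return the coefficient of determination R^2 for a given Pearson's r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return r * r


def r_squared_to_r(r_squared: float, direction: str) -> float:
    """Recover r from R^2. The direction ('positive' or 'negative')
    must be read from the scatterplot, since R^2 carries no sign."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must lie between 0 and 1")
    magnitude = math.sqrt(r_squared)
    if direction == "positive":
        return magnitude
    if direction == "negative":
        return -magnitude
    raise ValueError("direction must be 'positive' or 'negative'")


# Hypothetical value, chosen only for illustration.
r = -0.8
r_squared = r_to_r_squared(r)
print(f"R^2 = {r_squared:.2f}")  # prints "R^2 = 0.64"
print(f"r = {r_squared_to_r(r_squared, 'negative'):.2f}")  # prints "r = -0.80"
```

Here $R^2=0.64$ would be read as about $64\%$ of the variation in the response variable being explained by the linear association.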
Worked example
Common traps
Remember that $R^2$ cannot be negative. Squaring removes the sign, so take the direction from $r$ and the explained variation from $R^2$.
Scatterplot construction details
A scatterplot does not have to start at zero. Choose scales that show the pattern clearly without exaggerating it. Label both axes with the variable and units. The explanatory variable normally goes on the horizontal axis because it is the value used to predict or explain the response variable.
When reading a scatterplot, first ask whether there is any visible relationship at all. If the points look like a random cloud, it is usually more honest to say there is little or no association. If there is a pattern, then describe its direction, form and strength.
Non-linear patterns
Correlation is only a measure of linear association. A scatterplot can show a strong curved pattern even when $r$ is not close to $1$ or $-1$.
| Pattern | What it means |
|---|---|
| Positive linear | values tend to rise in an approximately straight pattern |
| Negative linear | values tend to fall in an approximately straight pattern |
| Non-linear | values are related, but not well described by a straight line |
| No clear association | no useful pattern is visible |
Technology and manual correlation setup
QCE questions usually allow technology for $r$, but you should understand what is being measured. Pearson's $r$ compares how far each $x$ value is from $\bar x$ with how far each $y$ value is from $\bar y$. A typical technology table may include $x$, $y$, $x-\bar x$, $y-\bar y$ and products of deviations.
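The deviation table described above corresponds to the standard formula $r=\dfrac{\sum (x-\bar x)(y-\bar y)}{\sqrt{\sum (x-\bar x)^2 \sum (y-\bar y)^2}}$. The sketch below builds the same columns in Python. The function name `pearson_r` and the revision data are hypothetical, made up only to show the steps.

```python
import math


def pearson_r(xs, ys):
    """Compute Pearson's r from paired data using deviations from the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Deviation columns: x - x_bar and y - y_bar.
    dev_x = [x - x_bar for x in xs]
    dev_y = [y - y_bar for y in ys]
    # Products-of-deviations column, then the two squared-deviation totals.
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_sq_x = sum(dx * dx for dx in dev_x)
    sum_sq_y = sum(dy * dy for dy in dev_y)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_products / math.sqrt(sum_sq_x * sum_sq_y)


# Hypothetical data: hours of revision (explanatory) and exam mark (response).
hours = [1, 2, 3, 4, 5]
marks = [52, 58, 61, 70, 74]
print(round(pearson_r(hours, marks), 3))  # prints 0.99
```

For this made-up data the sum of products is $56$ and the squared-deviation totals are $10$ and $320$, so $r=56/\sqrt{3200}\approx 0.99$, a strong positive linear association.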
For interpretation, the exact decimal is less important than the meaning:
| $\lvert r \rvert$ size | Typical description |
|---:|---|
| $0$ to about $0.3$ | weak or little linear association |
| about $0.3$ to $0.7$ | moderate linear association |
| about $0.7$ to $1$ | strong linear association |
These boundaries are guides, not laws. Always let the context and scatterplot support your wording.
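As a rough sketch, the guide can be turned into a small helper that produces wording from a value of $r$. The function name is hypothetical, and the cut-offs $0.3$ and $0.7$ are the approximate guide values from the table, not fixed rules, so the result should still be checked against the scatterplot.

```python
def describe_linear_association(r: float) -> str:
    """Suggest wording for a value of r using the approximate guide cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear association"
    size = abs(r)
    # Guide boundaries only (assumed to sit at exactly 0.3 and 0.7 here).
    if size < 0.3:
        strength = "weak"
    elif size < 0.7:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} linear association"


print(describe_linear_association(-0.783))  # strong negative linear association
print(describe_linear_association(0.45))    # moderate positive linear association
```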
Depth: constructing and reading scatterplots
A scatterplot should have the explanatory variable on the horizontal axis when the context suggests one variable may help predict the other. For example, hours studied would normally go on the horizontal axis and test score on the vertical axis. If neither variable is clearly explanatory, choose the axes that make the interpretation easiest and state the variables clearly.
Good scatterplot construction includes:
- labelled axes with units
- a scale that covers the full data range without compressing the points unnecessarily
- plotted points that preserve the ordered pairs
- a title or context statement when the graph is separated from the question
The graph is then described using direction, form and strength. Direction can be positive, negative or unclear. Form can be roughly linear, curved or clustered. Strength describes how tightly the points follow the form.
Clusters, gaps and unusual points
Not every scatterplot is a neat cloud around a straight line. Exam questions often expect comments on unusual structure.
| Feature | What it looks like | What to say |
|---|---|---|
| Cluster | several points grouped together | there may be subgroups in the data |
| Gap | an empty interval between groups of points | the data may not cover all values evenly |
| Outlier | one point far from the main pattern | it may affect the correlation and fitted line |
| Curved pattern | points follow a bend rather than a line | linear correlation may understate the relationship |
If a point is unusual, describe it using both coordinates and context. "The point at about $(2,85)$ is unusual" is more useful than "there is an outlier".
Correlation coefficient interpretation
The correlation coefficient $r$ measures the direction and strength of a linear association. It always lies between $-1$ and $1$.
| Value of $r$ | Interpretation |
|---:|---|
| close to $1$ | strong positive linear association |
| close to $-1$ | strong negative linear association |
| close to $0$ | little or no linear association |
The phrase "linear association" matters. A curved relationship can have an $r$ value that is not close to $1$ or $-1$, even when the variables are clearly related.
Technology and rounding
If technology gives $r=-0.782641$, report a sensible rounded value such as $r=-0.783$ unless the question requests more detail. Keep enough digits during working, but write final interpretations in context.
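The short sketch below shows the difference between rounding for the final answer and rounding too early. It uses the value of $r$ quoted above. The variable names are hypothetical, and the comparison with a two-decimal value is only an illustration.

```python
r = -0.782641  # value from technology, kept in full during working

# Report r to three decimal places in the final answer.
print(f"r = {r:.3f}")  # prints "r = -0.783"

# Squaring the full value and squaring an early-rounded value
# give slightly different percentages of explained variation.
r_squared_full = r ** 2
r_squared_early = round(r, 2) ** 2
print(f"R^2 from full r:    {r_squared_full:.1%}")   # prints 61.3%
print(f"R^2 from rounded r: {r_squared_early:.1%}")  # prints 60.8%
```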
When entering data into technology, check that each $x$ value remains paired with its correct $y$ value. Sorting one list without sorting the other destroys the bivariate data.
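The pairing issue can be illustrated with a short sketch using hypothetical data. Sorting the ordered pairs together keeps each $x$ with its $y$, while sorting one list on its own produces pairs that never existed in the data.

```python
# Hypothetical data entered in the order it was collected.
hours = [4, 1, 3, 2]
marks = [70, 52, 61, 58]

# Safe: sort the ordered pairs together, then split into lists again.
pairs = sorted(zip(hours, marks))
sorted_hours = [h for h, _ in pairs]
sorted_marks = [m for _, m in pairs]
print(pairs)  # [(1, 52), (2, 58), (3, 61), (4, 70)]

# Unsafe: sorting only one list breaks the pairing.
broken_pairs = list(zip(sorted(hours), marks))
print(broken_pairs)  # [(1, 70), (2, 52), (3, 61), (4, 58)]
```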