Introduction to statistical inference

Lecture 1

Iain R. Moodie

BIOC13 - Ekologi

Department of Biology, Lund University

2025-09-30

Introduction

Introduction

Aim of this lecture

  • To explain the foundational ideas of statistical inference
    • Focus on computational (instead of mathematical) statistical techniques
    • Focus on applying these ideas to answer ecological questions

Introduction

Your focus should be on

  • The broad ideas specific formulae
  • Statistics as a tool to answer interesting ecological questions

Populations and Samples

Populations and Samples

The how and the why of statistics

  • We are often limited by how much data we can collect
  • We want to say something more general than the data we collect
  • Example:
    • Measured the species diversity in 20 spruce plantations across Sweden
    • Want to say something about species diversity in all spruce plantations (in Sweden)

Populations and Samples

The how and the why of statistics

  • Cannot measure species diversity in every forest in Sweden
  • Instead, we collect a representative sample of data
  • Use the sample to draw conclusions about the population
  • Statistics allows us to approximate properties (e.g. mean species diversity) of entire populations from (usually) a single sample

Populations and Samples

The how and the why of statistics

Population

The totality of individual observations about which inferences are to be made, existing anywhere in the world or at least within a definitely specified sampling area limited in space and time (Sokal and Rohlf 1995).

Sample

A collection of individual observations selected by a specified procedure (Sokal and Rohlf 1995).

Populations and Samples

The how and the why of statistics

Populations

  • All the spruce (gran) trees in Skåne
  • All the blue tits (blåmes) in Sweden
  • All the genes in the common fruit fly (Drosophila melanogaster)
  • All the herring (sill) in the Baltic sea

Samples

  • 50 spruce trees in a forest in Skåne
  • 100 caught blue tits from nest boxes in Sweden
  • 20 genes from the Drosophila melanogaster genome
  • 1000 herring caught by a fishing boat off the coast of Karlskrona

Populations and Samples

The how and the why of statistics

Inference

Inference

Using the sample to infer population parameters

03:00

For this example, our population will be all the fish in a small pond. The body length (mm) and sex of the fish are shown below:

  • Males:
    • 96, 98, 112, 101, 101, 114, 104, 93, 95, 106
  • Females:
    • 115, 104, 105, 101, 93, 121, 106, 77, 104, 94

Task

In pairs, take a random sample (n=3) of males and females and calculate a mean for each. Subtract the means from each other (\(\bar{x}_{male}-\bar{x}_{female}\)).

Inference

Example

I rolled a 10 sided dice 6 times:

  • select the 2nd, 1st and 6th male:
    • 98, 96, 114 (\(\bar{x}_{male} = \frac{98+96+114}{3} = 102.67\))
  • select the 3rd, 9th and 8th female:
    • 105, 104, 77 (\(\bar{x}_{female} = \frac{105+104+77}{3} = 95.33\))
  • \(\bar{x}_{male}-\bar{x}_{female}=102.67-95.33=7.34\)

Inference

Building a sampling distribution

Inference

The sampling distribution

Inference

The sampling distribution

  • The distribution of observed statistics (e.g. difference in means) obtained through repeatedly sampling the population
  • Shows the range of statistics we can observe through sampling
  • Its central tendency gives us an idea about where the true population parameter value is
  • The sampling distribution is at the foundation of all of statistics!

Inference

The sampling distribution

Inference

The sampling distribution

Inference

The utility of the sampling distribution

Confidence intervals

  • If we collect another sample of the same size, how much would the observed statistic vary?

Inference

The utility of the sampling distribution

Hypothesis testing

  • Is our result compatible with a specific hypothesis?

Inference

The sampling distribution

  • But we only (usually) collect a single sample!
  • A sampling distribution by definition requires many samples!
  • How can we get a sampling distribution?
  • Great question!

Bootstrap resampling

Bootstrap resampling

How to construct a sampling distribution from a single sample

Sample of 6 males from the fish in the pond

  • Sample:
    • 96, 98, 112, 101, 114, 106
  • Observed mean:
    • \(\bar{x} = \frac{96 + 98 + 112 + 101 + 114}{5} = 104.2\)
  • I want to show the range of observed means would also be plausible if I collected another sample
    • But I can’t / don’t want to collect another sample!

Bootstrap resampling

How to construct a sampling distribution from a single sample

  • If we assume that my sample is representative of the populations
    • collected randomly and without bias
  • We can resample the sample with replacement to create more “samples”
    • Process known as “Bootstrapping
    • From English phrase: “To pull one’s self up by their own bootstraps”
    • Seems impossible?

Bootstrap resampling

Generating a bootstrap sample

03:00
  • Original sample:
    • 96, 98, 112, 101, 114, 106

Task

In pairs, resample with replacement (n=6) values from my sample to get a new “bootstrap sample”, then calculate the mean.

  • Example:
    • Roll a dice 6 times: 4 2 1 4 2 3
    • Bootstrap sample: 101, 98, 96, 101, 98, 112
    • Bootstrap sample mean: 101

Bootstrap resampling

Generating a bootstrap sampling distibution

Bootstrap resampling

Generating a bootstrap sampling distibution

  • Means from 1000 bootstrap samples
  • Observed mean (mean of original sample) in red
  • The observed mean is still our best guess at the true mean
  • Bootstrap sampling distribution allows us to quantify our uncertainty in the mean

Bootstrap resampling

The bootstrap sampling distibution

  • If we use the bootstrap sampling distibution in place of the real sampling distribution, we can can use it to infer things about the original population.
  • The bootstrap method can be applied to any statistic you can calculate from a sample!

Bootstrap resampling

The bootstrap sampling distibution

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = \sqrt{\frac{\sum_{i}^n (x_i - \bar{x})^2}{n-1}} \]

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = \sqrt{\frac{ (96 - 104.2)^2 + (98 - 104.2)^2 + (112 - 104.2)^2 + \newline (101 - 104.2)^2 + (114 - 104.2)^2 + (106 - 104.2)^2 }{6-1}} \]

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = \sqrt{\frac{(-8.2)^2 + (-6.2)^2 + (7.8)^2 + (-3.2)^2 + (9.8)^2 + (1.8)^2}{5}} \]

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = \sqrt{\frac{67.24 + 38.44 + 60.84 + 10.24 + 96.04 + 3.24}{5}} \]

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = \sqrt{\frac{276.04}{5}} \]

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = \sqrt{55.208} \]

Bootstrap resampling

Works for any statistic!

  • Original sample:
    • 96, 98, 112, 101, 114, 106
  • Standard deviation is a measure of spread:

\[ s = 7.43 \text{mm} \]

Bootstrap resampling

Works for any statistic!

  • Standard deviations from 1000 bootstrap samples
  • Observed SD (SD of original sample) in red
  • The observed SD is still our best guess at the true SD
  • Bootstrap sampling distribution allows us to quantify our uncertainty in the SD

Bootstrap resampling

Works for any statistic!

Quantifying sampling error

If we collected another sample of the same size, how much would our test statistic be likely to vary?

Quantifying sampling error

What range of observed statistics is plausible?

  • If we took another sample, it is unlikely that we would get exactly the same observed statistics
  • We want to quantify this (sampling error)
  • Problem: we (usually) only ever collect one sample
  • Solution: generate more samples using bootstrap resampling!

Quantifying sampling error

The standard error (SE)

  • The standard error is the standard deviation of the sampling distribution
  • Use the bootstrap generated sampling distribution as our sampling distribution
  • Assumptions: your sampling distribution is approximately normally distributed (bell-curve)
  • If calculated with a bootstrap sampling distribution often written as \(SE_{boot}=2.7\)

Quantifying sampling error

The 95% confidence interval

  • If we repeated our experiment many times and calculated a 95% CI each time, the 95% CI’s would include the “true” population value 95% of the time
  • Many methods to calculate:
    • Percentile method: Middle 95% of the sampling distribution
    • No assumptions, but requires a large (>30 >>14) sample size to be accurate

Quantifying sampling error

The 95% confidence interval

  • Often reported after the observed statistic:
    • Mean = 104 mm (95%CI: 99 - 110 mm)
  • Less strict definition: where we expect the true population parameter to be

Confidence intervals

Using a bootstrap sampling distribution

Confidence intervals

An example

02:00

“Breakthrough Drug Reduces COVID-19 Hospitalizations by 12%!”

  • A new antiviral drug reduced COVID-19 hospitalizations by 12% in a clinical trial involving 1,000 patients.
  • A new antiviral drug reduced COVID-19 hospitalizations by 12% (95% CI: -11 to 32%) in a clinical trial involving 1,000 patients.

Task

Think for 30 seconds, then discuss in pairs or small groups for 1.5 minutes about the effect of including the confidence interval in the statement.

Hypothesis testing

  • Is our sample compatible with a certain hypothesis?
  • If we collected another sample, could the conclusion have been different?

Hypothesis testing

Is our sample compatible with a certain hypothesis?

  • More stricly: a method of statistical inference used to decide whether our sample provides sufficient evidence to reject a particular hypothesis
  • Used to answer the questions:
    • Could our results be just due to chance?
    • If we collected another sample, could the conclusion have been different?

Hypothesis testing

One part of one way of conducting science

  1. Make observations
  2. Formulate a hypothesis
  3. Design experiment (discuss)
  4. Conduct the experiment (obtain data)
  5. Analyze the data (is your data compatible with your hypothesis?)
  6. Formulate conclusion
  7. Synthesize results with other studies, and determine next steps

Hypothesis testing

Back to the fish

  • While watching a nature documentary, I hear that in many fish species, females are often larger than males.
    • I wonder if this is true of the fish in my pond (from the previous example)?
    • I hypothesise that females will be bigger than males

Hypothesis testing

Back to the fish

  • Sample of males:
    • 96, 112, 101, 101, 114, 104
    • Mean = 104.67 mm
  • Sample of females:
    • 115, 104, 105, 101, 93, 121
    • Mean = 106.5 mm
  • Observed difference in means = 104.67 - 106.5 = -1.83 mm
  • In my sample, I observe that female fish are on average larger
  • Hypothesis confirmed?

Hypothesis testing

Back to the fish

  • Not so fast!
  • Only sampled 12 fish in total
  • How probable would it be to collect a sample with such a difference in mean sizes if there was actually no difference in the population?
  • Do I have sufficient evidence against my null hypothesis
    • There is no difference in mean size between males and females
  • Solution: need to compare my observed value against a sampling distribution of data collected assuming the null hypothesis is true
    • Null distribution

Hypothesis testing

Back to the fish

  • If null hypothesis is true:
    • Sex column should have no predictive value for the Body Length column
    • Randomly shuffling (permuting) Sex column should produce samples similar to observed sample
Sex Body Length (mm)
Male 96
Male 112
Male 101
Male 101
Male 114
Male 104
Female 115
Female 104
Female 105
Female 101
Female 93
Female 121

Hypothesis testing

Back to the fish

  • If null hypothesis is false:
    • Sex column has predictive value on Body Length column
    • Permuting breaks the special arrangement that has predictive value
    • Observed should be very different to values from permuting
Sex Body Length (mm)
Male 96
Male 112
Male 101
Male 101
Male 114
Male 104
Female 115
Female 104
Female 105
Female 101
Female 93
Female 121

Hypothesis testing

Back to the fish

  • Null hypothesis:
    • There is no difference in body lengths between males and females
  • Observed difference in means = -1.83
Sex Body Length (mm)
Male 96
Male 112
Male 101
Male 101
Male 114
Male 104
Female 115
Female 104
Female 105
Female 101
Female 93
Female 121

Hypothesis testing

Back to the fish

  • Null hypothesis:
    • There is no difference in body lengths between males and females
  • Observed difference in means = -1.83
  • Data simulated under the null hypothesis
    • Permutated difference in means = -1.17
Sex Body Length (mm)
Female 96
Male 112
Female 101
Male 101
Female 114
Male 104
Male 115
Female 104
Male 105
Female 101
Male 93
Female 121

Hypothesis testing

Generating a null distribution

  • We need to do this permuting procedure many times to generate enough samples to make a null distribution
    • Sampling distribution from a model of the null hypothesis
03:00

Task

On the paper you have been given, randomly assign 6 of the rows male and 6 female, and calculate the mean difference in body length.

Hypothesis testing

Generating a null distribution

Hypothesis testing

Generating a null distribution

Hypothesis testing

Comparing our observed statistic against the null distribution

Hypothesis testing

Comparing our observed statistic against the null distribution

Hypothesis testing

Comparing our observed statistic against the null distribution

  • p = 0.337
  • Probability of observing our original statistic, or one more extreme, assuming the null hypothesis is true
01:30

Task

Think for 30 seconds, then discuss for 1 minute. What should my conclusion be?

Hypothesis testing

A framework

Take home points

  • Statistics is the practise of using a sample to infer things about a population
  • The distribution of observed statistics obtained through repeatedly sampling the population is called the sampling distribution
  • Since we often only take one sample, we can generate a sampling distribution using data from our sample via bootstrapping
  • Confidence intervals (the middle X% of the sampling distribution) answer the question: if we collected another sample of the same size, how much would our test statistic be likely to vary?
  • Hypothesis testing (comparing observed to a null distribution) answer the question: how compatible is the observed with our null hypothesis

Exercises

Exercises

  • We will use R, a free statistics focused programming language to explore these ideas further
  • At the end you will have a set of tools you can apply to your own data!
  • Bring a computer running Windows, macOS or Linux
    • Borrow from IT dept if you want to

Referenced materials

Sokal, R. R., and F. J. Rohlf. 1995. Biometry. https://lubcat.lub.lu.se/cgi-bin/koha/opac-detail.pl?biblionumber=2483009.