Make a new folder in your course folder for the exercise (e.g. biob11/exercise_9)
Open RStudio
If you haven’t closed RStudio since the last exercise, I recommend you do so and then re-open it. If it asks if you want to save your R Session data, choose no.
Set your working directory by going to Session -> Set working directory -> Choose directory, then navigate to the folder you just made for this exercise.
Create a new R Markdown document (File -> New File -> R Markdown...). Give it a clear title.
Please ensure you have followed the steps above before you start!
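If you want to double-check the setup from the Console, a minimal sketch is shown below. The path is hypothetical (it assumes your course folder sits in your home directory), so replace it with wherever you created the exercise folder.

```r
# Hypothetical path: replace with the folder you made for this exercise
exercise_dir <- "~/biob11/exercise_9"

# Only switch folder if it exists, so a wrong path does not throw an error
if (dir.exists(exercise_dir)) {
  setwd(exercise_dir)
}

getwd()       # prints the current working directory
list.files()  # lists the files R can see in that folder
```

Note that when you knit an R Markdown document, file paths are by default resolved relative to the folder the document is saved in, so saving the document inside the exercise folder is the simplest option.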
Bergmann’s rule and fiddler crabs
The Atlantic marsh fiddler crab, Minuca pugnax, lives in salt marshes along the eastern coast of the United States. Historically, M. pugnax was distributed from northern Florida to Cape Cod, Massachusetts, but, like other species, it has expanded its range northward due to ocean warming.
The pie_crab.csv dataset is from a study by Johnson and colleagues at the Plum Island Ecosystem Long Term Ecological Research site.
13 marshes were sampled on the Atlantic coast of the United States in summer 2016
Spanning > 12 degrees of latitude, from northeast Florida to northeast Massachusetts
Between 25 and 37 adult male fiddler crabs were collected at each marsh, and their carapace size (mm) was recorded
The dataset was collected to test Bergmann’s rule:
One of the best-known patterns in biogeography is Bergmann’s rule. It predicts that organisms at higher latitudes are larger than ones at lower latitudes. Many organisms follow Bergmann’s rule, including insects, birds, snakes, marine invertebrates, and terrestrial and marine mammals (Johnson et al. 2019).
Analysis
General
In your opinion, what (statistical) population are the researchers trying to make inferences about?
Data handling and plotting
Ensure you have loaded the tidyverse and infer packages.
Import the dataset using read_csv().
Check the data for mistakes.
Make illustrative plot(s) of the key variables to address “Bergmann’s rule” in the dataset using ggplot().
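The data handling steps above could be sketched roughly as follows. This assumes pie_crab.csv is saved in your working directory and that the columns are called latitude, size and site; those names are my assumption, so check them against the output of glimpse() and adjust if needed.

```r
library(tidyverse)
library(infer)

# Assumes pie_crab.csv is in the working directory
crabs <- read_csv("pie_crab.csv")

# Look at column names and types, then at ranges and missing values
glimpse(crabs)
summary(crabs)
crabs |>
  summarise(across(everything(), \(x) sum(is.na(x))))

# Assumed column name: site. Counts per marsh should fall between 25 and 37
crabs |>
  count(site)

# Assumed column names: latitude and size
ggplot(crabs, aes(x = latitude, y = size)) +
  geom_point(alpha = 0.4) +
  labs(x = "Latitude (degrees)", y = "Carapace size (mm)")
```

A scatter plot is only one option here. Boxplots of size per site, ordered by latitude, would show the same variables in a different way.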
Does Minuca pugnax follow Bergmann’s rule?
State the null and alternative hypotheses.
What method and test statistic(s) will you use? Why?
Hint
I think this is one of the examples where you could argue for using either a correlation analysis or a regression analysis. It depends on whether you think it would be meaningful to make the statement: “For each degree increase in latitude, Minuca pugnax will be on average X bigger”. If yes, and the relationship appears to be linear, then use linear regression. If you would feel safer saying “There is a positive/negative correlation between Minuca pugnax size and latitude”, without ascribing causality or a strict rule, then you should use a correlation approach. You could also answer the question in a completely different way, for example by testing whether crabs from the highest latitude differ from those at the lowest.
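To make the two options in the hint concrete, here is a minimal sketch of how each observed statistic could be calculated with infer. The crabs tibble below is a simulated stand-in with made-up values, included only so the sketch runs on its own; use the data you imported instead. The column names latitude and size are assumptions.

```r
library(tidyverse)
library(infer)

# Hypothetical simulated stand-in (made-up values, not the real data);
# replace with the data you imported from pie_crab.csv
set.seed(2016)
crabs <- tibble(
  latitude = rep(seq(30, 42, by = 1), each = 30),
  size     = rnorm(13 * 30, mean = 2 + 0.4 * latitude, sd = 3)
)

# Regression view: the slope is the average change in size per degree latitude
obs_slope <- crabs |>
  specify(size ~ latitude) |>
  calculate(stat = "slope")

# Correlation view: strength and direction of the linear association
obs_cor <- crabs |>
  specify(size ~ latitude) |>
  calculate(stat = "correlation")

obs_slope
obs_cor
```

The slope is in the units of the data (mm per degree), while the correlation is unitless and always lies between -1 and 1. Which one you report should follow from the statement you want to be able to make.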
Decide how you will generate datasets under your null hypothesis to use in your null distribution.
Fit the linear regression to each generated dataset to build your null distribution.
Plot the null distribution and the observed statistic. Calculate the p-value(s) for your null hypothesis.
Code hint
    ______ |>
      visualise() +                                             # (1)
      shade_p_value(obs_stat = ______, direction = "______") +  # (2)
      labs(x = "______")                                        # (3)

    _____ |>                                                    # (4)
      get_p_value(obs_stat = ______, direction = "______")

(1) Pipe your null distribution object into visualise().
(2) Plot your observed statistic(s), and specify the direction of your hypothesis.
(3) You can change the axis labels to make the plot clearer.
(4) Your null distribution.
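For orientation, the whole sequence from generating null datasets to the p-value might look like the sketch below. It assumes you chose the regression slope as your test statistic, a permutation null (shuffling size relative to latitude) and a one-sided alternative. Those are illustrative choices, not the only defensible ones. The crabs tibble is again a simulated stand-in with made-up values and assumed column names.

```r
library(tidyverse)
library(infer)

# Hypothetical simulated stand-in (made-up values, not the real data)
set.seed(2016)
crabs <- tibble(
  latitude = rep(seq(30, 42, by = 1), each = 30),
  size     = rnorm(13 * 30, mean = 2 + 0.4 * latitude, sd = 3)
)

# Observed slope from the data as collected
obs_slope <- crabs |>
  specify(size ~ latitude) |>
  calculate(stat = "slope")

# Null distribution: permuting breaks any link between size and latitude,
# then the slope is recalculated for each permuted dataset
null_dist <- crabs |>
  specify(size ~ latitude) |>
  hypothesise(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "slope")

# Plot the null distribution with the observed slope marked
null_dist |>
  visualise() +
  shade_p_value(obs_stat = obs_slope, direction = "greater") +
  labs(x = "Slope under the null hypothesis (mm per degree latitude)")

# Proportion of null slopes at least as large as the observed slope
p_val <- null_dist |>
  get_p_value(obs_stat = obs_slope, direction = "greater")
p_val
```

If none of the permuted slopes is as extreme as the observed one, get_p_value() reports 0 and warns you. In that case it is more accurate to report the p-value as smaller than 1 divided by the number of reps.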
Write a small statement that summarises your statistical methods and findings. You should:
State clearly the research question, and what your hypotheses were. Explain why these hypotheses answer your research question.
Explain your choice of test statistic/method. Relate this to your hypotheses and question.
State your observed statistic(s) and confidence intervals. Explain what these mean. Refer to a plot you made that shows the data.
State the outcome of your hypothesis test (quoting test statistic(s) and p-values). Interpret this result in terms of both your statistical hypotheses and the broad research question.
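The summary asks for confidence intervals. One way to get one, assuming a bootstrap percentile interval around the slope is what you want, is sketched below. As before, the crabs tibble is a simulated stand-in with made-up values and assumed column names, so swap in your own data.

```r
library(tidyverse)
library(infer)

# Hypothetical simulated stand-in (made-up values, not the real data)
set.seed(2016)
crabs <- tibble(
  latitude = rep(seq(30, 42, by = 1), each = 30),
  size     = rnorm(13 * 30, mean = 2 + 0.4 * latitude, sd = 3)
)

# Bootstrap distribution: resample rows with replacement (no null hypothesis
# is specified here) and recalculate the slope for each resample
boot_dist <- crabs |>
  specify(size ~ latitude) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(stat = "slope")

# 95% percentile interval from the bootstrap distribution
ci <- boot_dist |>
  get_confidence_interval(level = 0.95, type = "percentile")
ci

# Optional: show the interval on top of the bootstrap distribution
boot_dist |>
  visualise() +
  shade_confidence_interval(endpoints = ci) +
  labs(x = "Bootstrapped slope (mm per degree latitude)")
```

The same code works for a correlation by changing stat to "correlation". Whichever statistic you use, keep it the same as in your hypothesis test so the interval and the p-value describe the same quantity.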