QK Box 6.7 & 6.11

Keough and Raimondi (1995) set up an experiment to examine the response of serpulid (polychaete worms) larvae to four types of biofilms on hard substrata in shallow marine waters. The four treatments were: sterile substrata, biofilms developed in the field with a net (to keep invertebrates), biofilms developed in the lab, and lab biofilms with a covering net (as a control for the presence of a net). The substrata were left for one week, and then the newly settled worms identified and counted. To control for small numbers of larvae passing through the netting during the conditioning period, they used an additional treatment, which was netted, and returned to the laboratory after one week and censused. The values of this treatment were used to adjust the numbers in the treatment that started in the field.

Serpulid polychaete worm. Photo M. Keough

The paper is here and the data file (also used in first edition) is here

Keough, M. J. & Raimondi, P. T. (1995). Responses of settling invertebrate larvae to bioorganic films: effects of different types of films. Journal of Experimental Marine Biology and Ecology, 185, 235-53.

Preliminaries

First, load the required packages

Import serpulid data file (serpulid.csv)

serpulid <- read.csv("data/serpulid.csv")
head(serpulid,10)

   film serp lserp spir bugula
1    SL   94 1.973    0      4
2    SL   82 1.914    2      2
3    SL   74 1.869    2      8
4    SL   74 1.869    0      2
5    SL  122 2.086    2     14
6    SL  112 2.049    2      6
7    SL   60 1.778    0     12
8    UL  142 2.152    2      4
9    UL   80 1.903    0      6
10   UL  100 2.000    0      2

Make film a factor

serpulid$film <- factor(serpulid$film)

Fit model for log(serpulids)

serpulid.aov <- aov(lserp~film, data=serpulid)

Check diagnostics

plot(serpulid.aov)

Generate overall anova table

summary(serpulid.aov)

            Df Sum Sq Mean Sq F value  Pr(>F)   
film         3 0.2458 0.08192   6.015 0.00331 **
Residuals   24 0.3269 0.01362                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Look at the linear model summary and note the F (field) is the reference level (ie Intercept)

summary.lm(serpulid.aov)


Call:
aov(formula = lserp ~ film, data = serpulid)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.22929 -0.06500  0.01843  0.07054  0.19557 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.11343    0.04411  47.912  < 2e-16 ***
filmNL       0.06843    0.06238   1.097  0.28356    
filmSL      -0.17943    0.06238  -2.876  0.00831 ** 
filmUL       0.01886    0.06238   0.302  0.76504    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1167 on 24 degrees of freedom
Multiple R-squared:  0.4292,    Adjusted R-squared:  0.3578 
F-statistic: 6.015 on 3 and 24 DF,  p-value: 0.003314

Get treatment means and compare to the above lm summary

aggregate(serpulid$lserp, list(serpulid$film), FUN=mean)

  Group.1        x
1       F 2.113429
2      NL 2.181857
3      SL 1.934000
4      UL 2.132286

Without any thinking, get pairwise tests

TukeyHSD(serpulid.aov)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = lserp ~ film, data = serpulid)

$film
             diff         lwr          upr     p adj
NL-F   0.06842857 -0.10365813  0.240515270 0.6947523
SL-F  -0.17942857 -0.35151527 -0.007341872 0.0387848
UL-F   0.01885714 -0.15322956  0.190943842 0.9901571
SL-NL -0.24785714 -0.41994384 -0.075770444 0.0029594
UL-NL -0.04957143 -0.22165813  0.122515270 0.8561572
UL-SL  0.19828571  0.02619902  0.370372413 0.0197120

Generate planned comparisons

We’re doing this by defining contrasts and refitting the model using this contrast. The planned comparison then appears as the first effect when we look at the model fitting (i.e., film1).

UL vs NL

# Default
contrasts(serpulid$film)

   NL SL UL
F   0  0  0
NL  1  0  0
SL  0  1  0
UL  0  0  1

# Change it
contrasts(serpulid$film) <- c(0,1,0,-1)
contrasts(serpulid$film)

   [,1]       [,2]       [,3]
F     0 -0.5000000 -0.7071068
NL    1 -0.1666667  0.4714045
SL    0  0.8333333 -0.2357023
UL   -1 -0.1666667  0.4714045

# Refit the model with new contrasts
serpulid.aov <- aov(lserp~film, data=serpulid)
summary(serpulid.aov)

            Df Sum Sq Mean Sq F value  Pr(>F)   
film         3 0.2458 0.08192   6.015 0.00331 **
Residuals   24 0.3269 0.01362                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary.lm(serpulid.aov)


Call:
aov(formula = lserp ~ film, data = serpulid)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.22929 -0.06500  0.01843  0.07054  0.19557 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.09039    0.02206  94.780  < 2e-16 ***
film1        0.02479    0.03119   0.795  0.43461    
film2       -0.16407    0.04411  -3.720  0.00107 ** 
film3        0.08344    0.04411   1.892  0.07067 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1167 on 24 degrees of freedom
Multiple R-squared:  0.4292,    Adjusted R-squared:  0.3578 
F-statistic: 6.015 on 3 and 24 DF,  p-value: 0.003314

F vs average (NL & UL)

contrasts(serpulid$film) <- c(2,-1,0,-1)
contrasts(serpulid$film)

   [,1]       [,2]        [,3]
F     2 -0.2867757  0.03306064
NL   -1 -0.3677574 -0.66939360
SL    0  0.8603272 -0.09918192
UL   -1 -0.2057940  0.73551488

# Refit the model with new contrasts
serpulid.aov <- aov(lserp~film, data=serpulid)
summary(serpulid.aov)

            Df Sum Sq Mean Sq F value  Pr(>F)   
film         3 0.2458 0.08192   6.015 0.00331 **
Residuals   24 0.3269 0.01362                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary.lm(serpulid.aov)


Call:
aov(formula = lserp ~ film, data = serpulid)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.22929 -0.06500  0.01843  0.07054  0.19557 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.09039    0.02206  94.780  < 2e-16 ***
film1       -0.01455    0.01801  -0.808 0.427118    
film2       -0.18341    0.04411  -4.158 0.000353 ***
film3       -0.01414    0.04411  -0.321 0.751322    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1167 on 24 degrees of freedom
Multiple R-squared:  0.4292,    Adjusted R-squared:  0.3578 
F-statistic: 6.015 on 3 and 24 DF,  p-value: 0.003314

SL vs average (F & NL & UL)

contrasts(serpulid$film) <- c(-1,-1,3,-1)
contrasts(serpulid$film)

   [,1]        [,2]       [,3]
F    -1 -0.67052910 -0.4658942
NL   -1  0.73874075 -0.3477481
SL    3  0.00000000  0.0000000
UL   -1 -0.06821164  0.8136423

# Refit the model with new contrasts
serpulid.aov <- aov(lserp~film, data=serpulid)
serpulid.aov

Call:
   aov(formula = lserp ~ film, data = serpulid)

Terms:
                     film Residuals
Sum of Squares  0.2457707 0.3268840
Deg. of Freedom         3        24

Residual standard error: 0.1167055
Estimated effects are balanced

summary.lm(serpulid.aov)


Call:
aov(formula = lserp ~ film, data = serpulid)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.22929 -0.06500  0.01843  0.07054  0.19557 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.090393   0.022055  94.780  < 2e-16 ***
film1       -0.052131   0.012734  -4.094 0.000415 ***
film2        0.049265   0.044111   1.117 0.275117    
film3       -0.008453   0.044111  -0.192 0.849643    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1167 on 24 degrees of freedom
Multiple R-squared:  0.4292,    Adjusted R-squared:  0.3578 
F-statistic: 6.015 on 3 and 24 DF,  p-value: 0.003314

Diagnostics for untransformed data

We used log-transformed data to match the original paper, but if analysing these data from first principles, we’d look at the raw data first to decide which form of model or transformation to use.

serpraw.aov <- aov(serp~film, data=serpulid)
plot(serpraw.aov)

Information for power analysis using spirorbids, Bugula

These calculations are used for Box 6.11, where we consider data for two other invertebrate groups, spirorbid polychaetes and bryozoans in the genus Bugula, mainly B. neritina.

Spirorbid polychaete worm, c. 2mm diameter. Mick Keough

Recently metamorphosed bryozoan, Bugula. Approximately 1.5 mm high. Mick Keough

Required information

We need to run the analysis on each of these groups, to get two important pieces of information. We need estimates of the variance, and we generally use the residual mean square. We also want an estimate of a baseline for calculating a hypothetical Effect Size. In the context of this question, we’ll use the means for unfilmed surfaces, as we are thinking about the potential for our treatments to increase recruitment.

boxplot(spir~film, data=serpulid)

boxplot(bugula~film, data=serpulid)

spir.aov<-aov(spir~film, data=serpulid)
plot(spir.aov)

summary(spir.aov)

            Df Sum Sq Mean Sq F value Pr(>F)
film         3  6.507   2.169   1.678  0.198
Residuals   24 31.022   1.293

bugula.aov<-aov(bugula~film, data=serpulid)
plot(bugula.aov)

summary(bugula.aov)

            Df Sum Sq Mean Sq F value Pr(>F)
film         3   22.8   7.587   0.338  0.798
Residuals   24  538.3  22.429

spirmean<-summarySE(data=serpulid,measurevar = "spir", groupvars = "film")
spirmean

  film N      spir        sd        se        ci
1    F 7 0.5357143 1.0818442 0.4088987 1.0005390
2   NL 7 0.5714286 0.9759001 0.3688556 0.9025570
3   SL 7 1.1428571 1.0690450 0.4040610 0.9887017
4   UL 7 1.7142857 1.3801311 0.5216405 1.2764084

bugmean<-summarySE(data=serpulid,measurevar = "bugula", groupvars = "film")
bugmean

  film N   bugula       sd       se       ci
1    F 7 6.982143 5.520524 2.086562 5.105634
2   NL 7 8.857143 4.740906 1.791894 4.384607
3   SL 7 6.857143 4.740906 1.791894 4.384607
4   UL 7 6.571429 3.779645 1.428571 3.495588

Power calculations

There are two scenarios in Box 6.11. Both involve a doubling of settlement from the base treatment SL above. In the first scenario, one treatment is 6.86 and the others are 13.72. In the second scenario, treatment means are spaced evenly between 6.86 and 13.72.

alphasq1<-3*var(c(6.86, 13.72, 13.72,13.72))
alphasq2<-3*var(c(6.86, 9.14, 11.44,13.72))
msres=22.43
n=7
p=4
lambda1<-n*alphasq1/msres
lambda1

[1] 11.01484

lambda2<-n*alphasq2/msres
lambda2

[1] 8.168685

f1<-sqrt(alphasq1/p/msres)
f1

[1] 0.6272059

f2<-sqrt(alphasq2/p/msres)
f2

[1] 0.5401285

#For scenario 1, λ= 11.01 and Cohens *f* = 0.627
#For scenario 2, λ= 8.16 and Cohens *f* = 0.54
# scenario 1: power
pwr.anova.test(k=p,f=f1,sig.level=0.05, n=n)


     Balanced one-way analysis of variance power calculation 

              k = 4
              n = 7
              f = 0.6272059
      sig.level = 0.05
          power = 0.7298327

NOTE: n is number in each group

#scenario 2: power
pwr.anova.test(k=p,f=f2,sig.level=0.05, n=n)


     Balanced one-way analysis of variance power calculation 

              k = 4
              n = 7
              f = 0.5401285
      sig.level = 0.05
          power = 0.5860053

NOTE: n is number in each group

#scenario 1: required sample size
pwr.anova.test(k=p,f=f1,sig.level=0.05, power=0.8)


     Balanced one-way analysis of variance power calculation 

              k = 4
              n = 7.976459
              f = 0.6272059
      sig.level = 0.05
          power = 0.8

NOTE: n is number in each group

#scenario 2: required sample size
pwr.anova.test(k=p,f=f2,sig.level=0.05, power=0.8)


     Balanced one-way analysis of variance power calculation 

              k = 4
              n = 10.37355
              f = 0.5401285
      sig.level = 0.05
          power = 0.8

NOTE: n is number in each group