ECNPESTIMATE procedure

Calculates nonparametric estimates of species richness (D.A. Murray).

Options

`PRINT` = string token	Controls printed output (`summary`, `estimates`); default `summ`, `esti`
`GROUPS` = factor	Grouping factor for different samples
`NBOOT` = scalar	Number of bootstrap samples to take; default 100
`SEED` = scalar	Seed for random number generator; default 0

Parameters

`DATA` = variates, matrices or pointers	A variate containing abundances of species or a pointer or matrix specifying the individuals for each species for different sites/samples
`ESTIMATES` = variates or pointers	Saves the estimated species richness in a variate, or in a pointer if `GROUPS` is specified
`SE` = variates or pointers	Saves the analytic standard errors in a variate, or in a pointer if `GROUPS` is specified
`BSE` = variates or pointers	Saves the bootstrap standard errors in a variate, or in a pointer if `GROUPS` is specified

Description

Species richness is the number of species within a sample. ECNPESTIMATE provides several nonparametric estimators of true species richness: Chao 1, Chao 2, ACE, ICE, first-order jackknife, second-order jackknife and bootstrap. Chao 1 and ACE are based on the abundances within a sample, whereas the other estimators are incidence-based, using the frequencies with which species occur across a set of samples. Standard errors are calculated using analytical results where possible. In addition, for multiple samples, bootstrap standard errors are calculated by resampling with replacement.

The data are supplied using the DATA parameter. For several samples or sites, DATA can be either a matrix, whose rows contain the numbers of individuals of each species and whose columns correspond to the samples or sites, or a pointer to variates, each of which contains the numbers of individuals of each species in one sample. For a single sample or site, the numbers of individuals of each species can be supplied in a variate. The GROUPS option can supply a grouping factor to produce estimates for different groups. The estimates and standard errors can be saved using the ESTIMATES, SE (analytic standard errors) and BSE (bootstrap standard errors) parameters. If a grouping factor is supplied they are saved in pointers to variates; otherwise they are saved in variates.

The PRINT option controls printed output, with settings:

`summary`	a summary of the data,
`estimates`	the species richness estimates and standard errors.

The NBOOT option specifies how many bootstrap samples to take to calculate the bootstrap standard errors and confidence intervals (default 100). The probability level for the confidence interval can be set by the CIPROBABILITY option; by default 0.95. The SEED option specifies the seed to use in the random number generator used to construct the bootstrap samples. The default value of zero continues an existing sequence of random numbers or, if the generator has not yet been used in this run of Genstat, it initializes the generator automatically.

Options: PRINT, GROUPS, NBOOT, SEED.

Parameters: DATA, ESTIMATES, SE, BSE.

Method

The Chao 1 estimator of the absolute number of species in an assemblage is calculated by:

s(Chao 1) = S_obs + F₁² / (2 × F₂)

where S_obs is the number of species in the sample, F₁ is the number of observed species represented by a single individual (frequency of singletons), and F₂ is the number of species that have exactly two individuals (frequency of doubletons). The variance for the estimate is given by:

var(Chao 1) = F₂ × { 0.5 × (F₁ / F₂)² + (F₁ / F₂)³ + 0.25 × (F₁ / F₂)⁴ }

When F₂ equals 0 the modified bias-corrected estimate is used:

s(Chao 1) = S_obs + F₁ × (F₁ – 1) / 2

and

var(Chao 1) = {F₁ × (F₁-1) / 2} + {F₁ × (2×F₁-1)² / 4} – F₁⁴ / (4 × s(Chao 1))
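As an illustration only, the Chao 1 formulas above can be sketched in Python. This is not the Genstat implementation; the function name `chao1` is hypothetical, and the data are the first quadrat (`quad[1]`) from the Example section.

```python
from collections import Counter

def chao1(abundances):
    """Return (estimate, variance) of Chao 1 for one sample of abundances.

    Hypothetical helper written from the formulas in the text.
    """
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)              # observed number of species
    if s_obs == 0:
        return 0.0, 0.0
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]        # singletons and doubletons
    if f2 > 0:
        r = f1 / f2
        est = s_obs + f1 ** 2 / (2 * f2)
        var = f2 * (0.5 * r ** 2 + r ** 3 + 0.25 * r ** 4)
    else:
        # modified bias-corrected form, used when there are no doubletons
        est = s_obs + f1 * (f1 - 1) / 2
        var = (f1 * (f1 - 1) / 2
               + f1 * (2 * f1 - 1) ** 2 / 4
               - f1 ** 4 / (4 * est))
    return est, var

quad1 = [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8]
print(chao1(quad1))   # (8.25, 11.53125)
```

Here S_obs = 6, F₁ = 3 and F₂ = 2, so the estimate is 6 + 9/4 = 8.25.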

The Chao 2 estimator is calculated by:

s(Chao 2) = S_obs + Q₁² / (2 × Q₂)

where S_obs is the total number of species observed over all the samples, Q₁ is the number of species that occur in exactly one sample (uniques), and Q₂ is the number of species that occur in exactly two samples (duplicates). The variance for the estimate is given by:

var(Chao 2) = Q₂ × { 0.5 × (Q₁ / Q₂)² + (Q₁ / Q₂)³ + 0.25 × (Q₁ / Q₂)⁴ }

When Q₂ equals 0 the modified bias-corrected estimate is used:

s(Chao 2) = S_obs + Q₁ × (Q₁ – 1) / 2

and

var(Chao 2) = {(H – 1) / H} × Q₁ × (Q₁ – 1) / 2

+ {(H – 1) / H}² × Q₁ × (2 × Q₁ – 1)² / 4

– {(H – 1) / H}² × Q₁⁴ / (4 × s(Chao 2))

where H is the total number of samples.
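The following Python sketch works through the Chao 2 estimate and its variance for the ten quadrats in the Example section. It is an illustration under stated assumptions, not the Genstat code: the names `QUADS` and `chao2` are hypothetical, and only the case Q₂ > 0 is handled.

```python
# Ten quadrats (samples) by fourteen species, copied from the Example section.
QUADS = [
    [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8],
    [13, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 36],
    [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14],
    [14, 4, 0, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 19],
    [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3],
    [22, 1, 0, 6, 0, 1, 0, 0, 0, 0, 0, 2, 0, 22],
    [13, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6],
    [4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 8],
    [4, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 5],
    [27, 6, 0, 2, 1, 5, 0, 0, 0, 0, 2, 3, 0, 41],
]

def chao2(samples):
    """Return (estimate, variance) of Chao 2; sketch for Q2 > 0 only."""
    # number of samples in which each species occurs
    incidence = [sum(1 for x in col if x > 0) for col in zip(*samples)]
    s_obs = sum(1 for c in incidence if c > 0)
    q1 = incidence.count(1)          # uniques
    q2 = incidence.count(2)          # duplicates
    if q2 == 0:
        raise ValueError("this sketch does not cover the Q2 = 0 case")
    r = q1 / q2
    est = s_obs + q1 ** 2 / (2 * q2)
    var = q2 * (0.5 * r ** 2 + r ** 3 + 0.25 * r ** 4)
    return est, var

print(chao2(QUADS))   # (20.25, 57.03125)
```

For these data S_obs = 14, Q₁ = 5 and Q₂ = 2, giving 14 + 25/4 = 20.25.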

The first-order jackknife estimate is evaluated by:

s(jack1) = S_obs + Q₁ × (H – 1) / H

with variance

var(jack1) = {(H – 1) / H} × { ∑_j=1…S (j² × f_j) – (Q₁² / H) }

where S is the number of species, Q₁ is the number of species that occur in exactly one sample and f_j is the number of samples with j unique species.
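A minimal Python sketch of the first-order jackknife, again using the quadrat data from the Example section. The names `QUADS` and `jack1` are hypothetical. The sum ∑ j² × f_j is computed as the sum over samples of the squared number of unique species in each sample, which is the same quantity.

```python
# Ten quadrats (samples) by fourteen species, copied from the Example section.
QUADS = [
    [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8],
    [13, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 36],
    [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14],
    [14, 4, 0, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 19],
    [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3],
    [22, 1, 0, 6, 0, 1, 0, 0, 0, 0, 0, 2, 0, 22],
    [13, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6],
    [4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 8],
    [4, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 5],
    [27, 6, 0, 2, 1, 5, 0, 0, 0, 0, 2, 3, 0, 41],
]

def jack1(samples):
    """Return (estimate, variance) of the first-order jackknife."""
    h = len(samples)
    incidence = [sum(1 for x in col if x > 0) for col in zip(*samples)]
    s_obs = sum(1 for c in incidence if c > 0)
    uniques = [j for j, c in enumerate(incidence) if c == 1]
    q1 = len(uniques)
    # number of unique species found in each sample
    per_sample = [sum(1 for j in uniques if s[j] > 0) for s in samples]
    sum_j2_fj = sum(u * u for u in per_sample)
    est = s_obs + q1 * (h - 1) / h
    var = (h - 1) / h * (sum_j2_fj - q1 ** 2 / h)
    return est, var

est, var = jack1(QUADS)
print(est, round(var, 4))   # 18.5 4.05
```

Here six quadrats contain no unique species, three contain one and one contains two, so ∑ j² × f_j = 7 and the variance is 0.9 × (7 – 2.5) = 4.05.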

The second-order jackknife estimate is given by:

s(jack2) = S_obs + Q₁ × (2 × H – 3) / H – Q₂ × (H – 2)² / {H × (H – 1)}

where Q₁ is the number of species that occur in exactly one sample, and Q₂ is the number of species that occur in exactly two samples.
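The second-order jackknife can be sketched in the same way. This is an illustrative Python fragment with hypothetical names (`QUADS`, `jack2`), using the quadrat data from the Example section.

```python
# Ten quadrats (samples) by fourteen species, copied from the Example section.
QUADS = [
    [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8],
    [13, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 36],
    [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14],
    [14, 4, 0, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 19],
    [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3],
    [22, 1, 0, 6, 0, 1, 0, 0, 0, 0, 0, 2, 0, 22],
    [13, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6],
    [4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 8],
    [4, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 5],
    [27, 6, 0, 2, 1, 5, 0, 0, 0, 0, 2, 3, 0, 41],
]

def jack2(samples):
    """Return the second-order jackknife estimate (needs at least 2 samples)."""
    h = len(samples)
    incidence = [sum(1 for x in col if x > 0) for col in zip(*samples)]
    s_obs = sum(1 for c in incidence if c > 0)
    q1 = incidence.count(1)          # uniques
    q2 = incidence.count(2)          # duplicates
    return (s_obs
            + q1 * (2 * h - 3) / h
            - q2 * (h - 2) ** 2 / (h * (h - 1)))

print(round(jack2(QUADS), 4))   # 21.0778
```

With S_obs = 14, Q₁ = 5, Q₂ = 2 and H = 10 this is 14 + 8.5 – 128/90.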

The bootstrap estimate is calculated by:

s(boot) = S_obs + ∑_j=1…S (1 – p_j)^H

where p_j is the proportion of the H samples in which species j occurs. The variance is calculated using the method given in Smith & van Belle (1984).
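A short Python sketch of the bootstrap estimate follows. It assumes that p_j is the proportion of samples in which species j occurs, as in Smith & van Belle (1984), and that the sum runs over the observed species only. The names `QUADS` and `boot_estimate` are hypothetical; the data are the quadrats from the Example section.

```python
# Ten quadrats (samples) by fourteen species, copied from the Example section.
QUADS = [
    [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8],
    [13, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 36],
    [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14],
    [14, 4, 0, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 19],
    [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3],
    [22, 1, 0, 6, 0, 1, 0, 0, 0, 0, 0, 2, 0, 22],
    [13, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6],
    [4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 8],
    [4, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 5],
    [27, 6, 0, 2, 1, 5, 0, 0, 0, 0, 2, 3, 0, 41],
]

def boot_estimate(samples):
    """Return S_obs plus the sum over observed species of (1 - p_j)^H."""
    h = len(samples)
    incidence = [sum(1 for x in col if x > 0) for col in zip(*samples)]
    observed = [c for c in incidence if c > 0]   # drop species never seen
    s_obs = len(observed)
    return s_obs + sum((1 - c / h) ** h for c in observed)

print(round(boot_estimate(QUADS), 4))   # 15.9703
```

Most of the correction comes from the five species seen in only one quadrat, each contributing 0.9¹⁰ ≈ 0.349.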

The abundance-based coverage estimator (ACE) is given by:

s(ACE) = S_abund + S_rare / C_ACE + (F₁ / C_ACE) × γ²

where S_abund is the number of abundant species (those with more than 10 individuals), S_rare is the number of rare species (those with 10 individuals or fewer), F₁ is the number of singletons,

C_ACE = 1 – F₁ / N_rare

where N_rare is the total number of individuals in rare species, and

γ² = max {(S_rare / C_ACE) × ∑_i=1…10 {i × (i – 1) × F_i} / (N_rare × (N_rare – 1)) – 1, 0}

where F_i is the number of rare species represented by exactly i individuals.
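The ACE calculation can be illustrated with the Python sketch below. It assumes that the max{…, 0} expression supplies the squared coefficient of variation γ² that appears in s(ACE), and that F_i counts the rare species with exactly i individuals. The name `ace` and its `cutoff` argument are hypothetical; the data are the third quadrat (`quad[3]`) from the Example section.

```python
from collections import Counter

def ace(abundances, cutoff=10):
    """Return the ACE estimate for one sample of species abundances."""
    counts = [a for a in abundances if a > 0]
    rare = [a for a in counts if a <= cutoff]
    s_abund = len(counts) - len(rare)    # species with more than `cutoff`
    s_rare = len(rare)
    n_rare = sum(rare)                   # individuals in rare species
    freq = Counter(rare)
    f1 = freq[1]                         # singletons
    if n_rare < 2 or f1 == n_rare:
        raise ValueError("ACE is undefined when all rare species are singletons")
    c_ace = 1 - f1 / n_rare              # sample coverage of rare species
    total = sum(i * (i - 1) * freq[i] for i in range(1, cutoff + 1))
    gamma2 = max((s_rare / c_ace) * total / (n_rare * (n_rare - 1)) - 1, 0.0)
    return s_abund + s_rare / c_ace + (f1 / c_ace) * gamma2

quad3 = [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14]
print(round(ace(quad3), 4))   # 11.4107
```

For this quadrat S_abund = 2, S_rare = 7, N_rare = 17 and F₁ = 3, so C_ACE = 14/17 and γ² = 0.25.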

The incidence-based coverage estimator (ICE) is given by:

s(ICE) = S_freq + S_infr / C_ICE + (Q₁ / C_ICE) × γ²

where S_freq is the number of frequent species (those found in more than 10 samples), S_infr is the number of infrequent species (those found in 10 samples or fewer), Q₁ is the number of uniques, Q_i is the number of infrequent species that occur in exactly i samples, C_ICE = 1 – Q₁ / N_infr where N_infr is the total number of occurrences of infrequent species, and

γ² = max {(S_infr / C_ICE) × (M_infr / (M_infr – 1)) × (∑_i=1…10 {i × (i – 1) × Q_i} / N_infr²) – 1, 0}

where M_infr is the number of samples with at least one infrequent species.
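A corresponding Python sketch for ICE is given below, with the same caveats: the max{…, 0} expression is taken to be γ², Q_i is taken to be the number of infrequent species occurring in exactly i samples, and the names `QUADS`, `ice` and `cutoff` are hypothetical. The data are the quadrats from the Example section.

```python
from collections import Counter

# Ten quadrats (samples) by fourteen species, copied from the Example section.
QUADS = [
    [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8],
    [13, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 36],
    [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14],
    [14, 4, 0, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 19],
    [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3],
    [22, 1, 0, 6, 0, 1, 0, 0, 0, 0, 0, 2, 0, 22],
    [13, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6],
    [4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 8],
    [4, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 5],
    [27, 6, 0, 2, 1, 5, 0, 0, 0, 0, 2, 3, 0, 41],
]

def ice(samples, cutoff=10):
    """Return the ICE estimate for a list of samples (rows = samples)."""
    incidence = [sum(1 for x in col if x > 0) for col in zip(*samples)]
    infr = [j for j, c in enumerate(incidence) if 0 < c <= cutoff]
    s_freq = sum(1 for c in incidence if c > cutoff)
    s_infr = len(infr)
    n_infr = sum(incidence[j] for j in infr)   # occurrences of infrequent species
    q = Counter(incidence[j] for j in infr)
    q1 = q[1]                                  # uniques
    # samples containing at least one infrequent species
    m_infr = sum(1 for s in samples if any(s[j] > 0 for j in infr))
    if m_infr < 2 or q1 == n_infr:
        raise ValueError("ICE is undefined for these data")
    c_ice = 1 - q1 / n_infr
    total = sum(i * (i - 1) * q[i] for i in range(1, cutoff + 1))
    gamma2 = max((s_infr / c_ice) * (m_infr / (m_infr - 1))
                 * total / n_infr ** 2 - 1, 0.0)
    return s_freq + s_infr / c_ice + (q1 / c_ice) * gamma2

print(round(ice(QUADS), 3))   # 19.097
```

With only ten quadrats no species can occur in more than 10 samples, so all 14 species are infrequent here (S_freq = 0, N_infr = 58, M_infr = 10).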

The bootstrap standard errors are generated using the BOOTSTRAP procedure sampling with replacement, and the species richness estimates are calculated from these samples.
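The resampling idea can be sketched in Python as follows. This is not the Genstat BOOTSTRAP procedure: it simply resamples whole quadrats with replacement, recomputes one estimator (here the first-order jackknife) for each resample, and takes the standard deviation of the results. The names `QUADS`, `jack1_estimate` and `bootstrap_se` are hypothetical, and Python's random number generator differs from Genstat's, so the values will not match Genstat output even with the same seed.

```python
import random
import statistics

# Ten quadrats (samples) by fourteen species, copied from the Example section.
QUADS = [
    [0, 2, 0, 1, 0, 1, 1, 2, 0, 0, 0, 0, 0, 8],
    [13, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 36],
    [21, 4, 0, 1, 1, 2, 0, 0, 0, 1, 3, 5, 0, 14],
    [14, 4, 0, 2, 2, 1, 0, 0, 0, 0, 0, 1, 0, 19],
    [5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3],
    [22, 1, 0, 6, 0, 1, 0, 0, 0, 0, 0, 2, 0, 22],
    [13, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6],
    [4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 8],
    [4, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 5],
    [27, 6, 0, 2, 1, 5, 0, 0, 0, 0, 2, 3, 0, 41],
]

def jack1_estimate(samples):
    """First-order jackknife estimate: S_obs + Q1 * (H - 1) / H."""
    h = len(samples)
    incidence = [sum(1 for x in col if x > 0) for col in zip(*samples)]
    s_obs = sum(1 for c in incidence if c > 0)
    q1 = incidence.count(1)
    return s_obs + q1 * (h - 1) / h

def bootstrap_se(samples, estimator, nboot=100, seed=204029):
    """Standard deviation of the estimator over nboot resamples of the samples."""
    rng = random.Random(seed)        # local generator, so runs are repeatable
    h = len(samples)
    estimates = []
    for _ in range(nboot):
        # draw H samples with replacement from the original H samples
        resample = [samples[rng.randrange(h)] for _ in range(h)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)

se = bootstrap_se(QUADS, jack1_estimate)
print(round(se, 3))
```

Increasing `nboot` reduces the Monte Carlo noise in the standard error at the cost of more computation, which is the trade-off controlled by the NBOOT option.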

Action with `RESTRICT`

If the data are in a variate, the statistics are calculated using only those units included in the restriction. If the data are in a pointer or matrix, any restriction is ignored.

References

Chao, A. (1987). Estimating the population size for capture-recapture data with unequal catchability. Biometrics, 43, 783-791.

Magurran, A.E. (2003). Measuring Biological Diversity. Blackwell, Oxford.

Smith, E.P. & van Belle, G. (1984). Nonparametric estimation of species richness. Biometrics, 40, 119-129.

Example

CAPTION      'ECNPESTIMATE example',\
             'Data from Heltshe & Forrester (1983), Biometrics, pages 1-19';\
             STYLE=meta,minor
POINTER      [NVALUES=10] quad
VARIATE      [VALUES=0,2,0,1,0,1,1,2,0,0,0,0,0,8] quad[1]
VARIATE      [VALUES=13,2,1,0,0,1,0,0,1,0,0,0,0,36] quad[2]
VARIATE      [VALUES=21,4,0,1,1,2,0,0,0,1,3,5,0,14] quad[3]
VARIATE      [VALUES=14,4,0,2,2,1,0,0,0,0,0,1,0,19] quad[4]
VARIATE      [VALUES=5,1,0,0,0,0,0,0,0,0,0,0,0,3] quad[5]
VARIATE      [VALUES=22,1,0,6,0,1,0,0,0,0,0,2,0,22] quad[6]
VARIATE      [VALUES=13,1,0,0,1,0,0,0,0,0,0,0,0,6] quad[7]
VARIATE      [VALUES=4,0,1,0,0,0,0,0,0,0,0,0,1,8] quad[8]
VARIATE      [VALUES=4,1,0,1,0,1,0,0,0,0,0,0,0,5] quad[9]
VARIATE      [VALUES=27,6,0,2,1,5,0,0,0,0,2,3,0,41] quad[10]
ECNPESTIMATE [SEED=204029] quad

Updated on March 8, 2019
