Contact Information: Michael W. Belitz (mbelitz@ufl.edu)
The R package ‘phenesse’ provides functions to calculate Weibull-parameterized estimates of phenology for any percentile of a distribution except the 0th and 100th. The estimator's algorithm is described in Belitz et al. (2020) https://doi.org/10.1111/2041-210X.13448. This publication also reports detailed simulations and empirical examples documenting the efficacy of this estimator and of other commonly used phenology estimators. We found the Weibull-parameterized estimator to be especially useful for estimating the onset or offset of phenology events from presence-only data.
We provide example incidental observations from iNaturalist for four species across a small extent of the United States. These data cover 2019 through mid-October and are not scored by phenological phase. The four species are Speyeria cybele, Danaus plexippus, Rudbeckia hirta, and Asclepias syriaca.
example iNaturalist data:
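As a minimal sketch of how the example data might be loaded and inspected: the object name inat_examples and the columns scientific_name and doy come from the calls used in this document, while loading the data with data() and the summaries chosen here are assumptions.

```r
library(phenesse)

# load the example data bundled with the package (assumed to be available via data())
data(inat_examples)

# number of observations per species
table(inat_examples$scientific_name)

# range of observation dates, as day of year
range(inat_examples$doy)
```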
Estimate the onset (1st percentile, i.e. percentile = 0.01), the 10th percentile and the 50th percentile of the days of year on which Speyeria cybele was observed in 2019 across the entire extent. We recommend running at least 250 iterations to get a stable estimate. The default number of iterations is 500.
s_cybele <- subset(inat_examples, scientific_name == "Speyeria cybele")
# calculate onset
weib_percentile(observations = s_cybele$doy, percentile = 0.01, iterations = 250)
#> [1] 112.2427
# note that the Weibull-parameterized estimator cannot estimate the true 0th and 100th percentiles.
# The percentile argument is given as a quantile and must be greater than 0 and less than 1.
# calculate 10th percentile
weib_percentile(observations = s_cybele$doy, percentile = 0.1, iterations = 250)
#> [1] 152.2048
# calculate 50th percentile (uses the default number of iterations)
weib_percentile(observations = s_cybele$doy, percentile = 0.5)
#> [1] 194.284
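The same call can be pointed at the other tail of the distribution to estimate an offset. The sketch below assumes the 99th percentile (percentile = 0.99) is an acceptable stand-in for the offset; that choice is illustrative and not prescribed by the package. Because the example data stop in mid-October, an offset estimated this way is only meaningful for species whose observations had ended by then. No output is shown because the estimate varies between runs.

```r
library(phenesse)
data(inat_examples)

s_cybele <- subset(inat_examples, scientific_name == "Speyeria cybele")

# calculate offset, here taken to be the 99th percentile (an illustrative choice)
s_cybele_offset <- weib_percentile(observations = s_cybele$doy,
                                   percentile = 0.99, iterations = 250)
s_cybele_offset
```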
Estimate the onset of Speyeria cybele observations in 2019 and calculate its confidence interval (CI).
s_cybele <- subset(inat_examples, scientific_name == "Speyeria cybele")
# calculate onset and its CI
# we use very few iterations and bootstraps so that the vignette knits quickly;
# please increase both if using this for analyses
weib_percentile_ci(observations = s_cybele$doy, iterations = 10,
                   percentile = 0.01, bootstraps = 100)
#>   estimate   low_ci  high_ci
#> 1 107.8024 93.12318 140.6371
# note the warning that extreme order statistics were used as endpoints. Increase the number of bootstraps to avoid this warning.
There is a built-in option to run the bootstraps in parallel. To use it, set the parameter “parallelize” to either “multicore” or “snow” and set the number of processes used in the parallel operation with “ncpus”.
# parallelize the above calculation using multicore parallelization and 4 cores
# weib_percentile_ci(observations = s_cybele$doy, iterations = 10,
#                    percentile = 0.01, bootstraps = 100,
#                    parallelize = "multicore", ncpus = 4)
# not run because using multiple cores while building the vignette gives check_rhub warnings
Another option I have found useful when running many confidence interval calculations is to make a list of the sets of observations you want CIs for, then use mclapply (multiple-core lapply) to apply a function containing weib_percentile_ci over that list using multiple cores. I often find this to be faster than the built-in parallelization when computing many weib_percentile_ci estimates on 40 cores.
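The mclapply approach might look roughly like the following. This is a sketch rather than a recommended workflow: the wrapper onset_ci, the split by species, and the choice of two cores are all illustrative assumptions, and the low iterations and bootstraps are only there to keep it fast.

```r
library(phenesse)
library(parallel)

data(inat_examples)

# build a list with one vector of day-of-year observations per species
obs_by_species <- split(inat_examples$doy, inat_examples$scientific_name)

# hypothetical wrapper: onset (1st percentile) and CI for one vector of observations
# increase iterations and bootstraps if using for analyses
onset_ci <- function(obs) {
  weib_percentile_ci(observations = obs, iterations = 10,
                     percentile = 0.01, bootstraps = 100)
}

# mclapply relies on forking, which is unavailable on Windows, so fall back to one core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# apply the wrapper to every element of the list, spreading the work across cores
onset_cis <- mclapply(obs_by_species, onset_ci, mc.cores = n_cores)

# combine the per-species results into one data frame, one row per species
do.call(rbind, onset_cis)
```

Here each list element is handled by a single process, so the built-in parallelize option is left at its default inside the wrapper.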
Estimate the 10% and 50% phenometrics and their confidence intervals for Rudbeckia hirta using quantile estimates.
r_hirta <- subset(inat_examples, scientific_name == "Rudbeckia hirta")
# calculate 50% quantile and CIs
quantile_ci(observations = r_hirta$doy, percentile = 0.5)
#>     estimate low_ci high_ci
#> 50%    185.5    175   198.5
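The 10% phenometric can be estimated with the same call by changing the percentile argument. The sketch below mirrors the 50% example and leaves all other arguments at their defaults. No output is shown because the bootstrapped interval varies between runs.

```r
library(phenesse)
data(inat_examples)

r_hirta <- subset(inat_examples, scientific_name == "Rudbeckia hirta")

# calculate 10% quantile and CIs
r_hirta_q10 <- quantile_ci(observations = r_hirta$doy, percentile = 0.1)
r_hirta_q10
```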
Calculate the mean day of year of Rudbeckia hirta observations and the confidence interval of that estimate.