---
title: "Multiprocessing and single-core processing"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multiprocessing and single-core processing}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Use Multiprocessing?

The TAD package uses the `future` API to improve reproducibility and to make multiprocessing behave the same way across platforms.

The `future` package is not mandatory, but it is recommended (this package was tested with `HenrikBengtsson/future@1.33.2`).

You can install it with:

```R
## if you want to install version 1.33.2 (the version used to test this package)
remotes::install_github("HenrikBengtsson/future@1.33.2")
## if you want to install the current head of the default branch
remotes::install_github("HenrikBengtsson/future")
## if you want to install the latest release
remotes::install_github("HenrikBengtsson/future@*release")
```

For small datasets (< 16 variables/columns), we recommend using the [`future::sequential` strategy/plan](#run-with-single-core-processing).
Otherwise, you can use the [`future::multisession` strategy/plan](#run-with-multiprocessing) to speed up some of the processing in this package.

## Run with Single Core processing

Load the `future` library and set the `future::plan` to `future::sequential`.

```{r single core processing}
library(future)
plan(sequential)

weights <- data.frame(
  sp1 = c(1, 0),
  sp2 = c(2, 8),
  sp3 = c(0, 2)
)
aggregation_factor <- data.frame(
  plots = c("plot1", "plot2")
)
time_before <- proc.time()[[1]]

## This will run in single-process mode
str(TAD::generate_random_matrix(
  weights = weights,
  aggregation_factor = aggregation_factor,
  randomization_number = 500
))
```

## Run with Multiprocessing

Load the `future` library and set the `future::plan` to `future::multisession`.
```{r multiprocessing}
library(future)
plan(multisession)

weights <- data.frame(
  sp1 = c(1, 0),
  sp2 = c(2, 8),
  sp3 = c(0, 2)
)
aggregation_factor <- data.frame(
  plots = c("plot1", "plot2")
)
time_before <- proc.time()[[1]]

## This will run in multiprocessing mode
str(TAD::generate_random_matrix(
  weights = weights,
  aggregation_factor = aggregation_factor,
  randomization_number = 500
))
```

## When you have finished

Don't forget to reset the plan to `future::sequential` once your processing is done.
This is necessary to clean up the resources allocated by the underlying calls to the `parallel` package.

```{r cleanup}
plan(sequential)
```

## Running the TAD Analysis

```{r tad analysis}
## Run `x` under the multisession plan, then switch back to the
## sequential plan when the function exits, even if `x` fails.
with_parallelism <- function(x) {
  future::plan(future::multisession)
  on.exit(future::plan(future::sequential))
  force(x)
}

# weights <- TAD::AB[, c(5:102)]
weights <- TAD::AB[, c(5:30)]

with_parallelism(
  result <- TAD::launch_analysis_tad(
    weights = weights,
    weights_factor = TAD::AB[, c("Year", "Plot", "Treatment", "Bloc")],
    trait_data = log(TAD::trait[["SLA"]])[seq_len(ncol(weights))],
    aggregation_factor_name = c("Year", "Bloc"),
    statistics_factor_name = c("Treatment"),
    regenerate_abundance_df = TRUE,
    regenerate_weighted_moments_df = TRUE,
    regenerate_stat_per_obs_df = TRUE,
    regenerate_stat_per_rand_df = TRUE,
    randomization_number = 100,
    seed = 1312,
    significativity_threshold = c(0.05, 0.95)
  )
)

str(result)
```
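
The rule of thumb given earlier (sequential plan below 16 columns, multisession otherwise) could be combined with the `with_parallelism` pattern along the following lines. This is only a sketch: `suggest_plan_name`, `with_suggested_plan`, and the `column_threshold` argument are hypothetical names introduced here for illustration, not part of TAD or `future`. The wrapper relies on `future::plan()` returning the previous plan when a new one is set, so that plan is restored on exit rather than a hard-coded sequential plan.

```r
# Hypothetical helper (not part of TAD or future): name of the plan
# suggested by the "< 16 columns" rule of thumb.
suggest_plan_name <- function(weights, column_threshold = 16L) {
  if (ncol(weights) < column_threshold) "sequential" else "multisession"
}

# Hypothetical wrapper: set the suggested plan, evaluate `expr`, and
# restore whatever plan was active before, even if `expr` fails.
with_suggested_plan <- function(weights, expr) {
  strategy <- switch(
    suggest_plan_name(weights),
    sequential = future::sequential,
    multisession = future::multisession
  )
  old_plan <- future::plan(strategy)
  on.exit(future::plan(old_plan), add = TRUE)
  # `expr` is lazily evaluated, so it only runs once the plan is set
  force(expr)
}

small_weights <- data.frame(
  sp1 = c(1, 0),
  sp2 = c(2, 8),
  sp3 = c(0, 2)
)
suggest_plan_name(small_weights)
#> [1] "sequential"

# Only exercised when the optional future package is installed
if (requireNamespace("future", quietly = TRUE)) {
  total <- with_suggested_plan(small_weights, sum(small_weights))
}
```

Restoring the previous plan instead of always switching to `sequential` keeps the wrapper from overriding a plan the caller had already chosen. The threshold of 16 is the value recommended above and may need tuning for your own data.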