---
title: "maldipickr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{maldipickr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(maldipickr)
```

## Quickstart

The `{maldipickr}` package helps microbiologists reduce duplicate/clonal bacteria from their cultures and eventually exclude previously selected bacteria. `{maldipickr}` achieves this feat by grouping together data from MALDI Biotyper and helps choose representative bacteria from each group using user-relevant metadata -- a process known as **cherry-picking**.

`{maldipickr}` cherry-picks bacterial isolates with MALDI Biotyper:

* [using taxonomic identification report](#using-taxonomic-identification-report)
* [using spectra data](#using-spectra-data)

### Using taxonomic identification report

First make sure `{maldipickr}` is installed and loaded; otherwise, [follow the instructions to install the package](https://clavellab.github.io/maldipickr/index.html#installation).

Cherry-picking four isolates based on their taxonomic identification by the MALDI Biotyper is done in a few steps with `{maldipickr}`.

#### Get example data

We import an example Biotyper CSV report and glimpse at the table.

```{r quickstart_report_data, eval = TRUE}
report_tbl <- read_biotyper_report(
  system.file("biotyper_unknown.csv", package = "maldipickr")
)
report_tbl %>%
  dplyr::select(name, bruker_species, bruker_log) %>%
  knitr::kable()
```

#### Delineate clusters and cherry-pick

Delineate clusters from the identifications after filtering the reliable ones and cherry-pick one representative spectrum per cluster.

Unreliable identifications based on the log-score are replaced by "not reliable identification", but stay tuned, as they do not represent the same isolates!

```{r quickstart_report_filter, eval = TRUE}
report_tbl <- report_tbl %>%
  dplyr::mutate(
    bruker_species = dplyr::if_else(bruker_log >= 2, bruker_species, "not reliable identification")
  )
knitr::kable(report_tbl)
```

The chosen ones are indicated by the `to_pick` column.

```{r quickstart_report_delineate, eval = TRUE}
report_tbl %>%
  delineate_with_identification() %>%
  pick_spectra(report_tbl, criteria_column = "bruker_log") %>%
  dplyr::relocate(name, to_pick, bruker_species) %>%
  knitr::kable()
```

### Using spectra data

In parallel to taxonomic identification reports, `{maldipickr}` processes spectra data. Make sure `{maldipickr}` is installed and loaded; otherwise, [follow the instructions to install the package](https://clavellab.github.io/maldipickr/index.html#installation).

Cherry-picking six isolates from three species based on their spectra data obtained from the MALDI Biotyper is done in a few steps with `{maldipickr}`.

#### Get example data

We set up the directory location of our example spectra data; adjust it to your requirements. We import and process the spectra, which gives us a named list of three objects: spectra, peaks and metadata (more details in the Value section of `process_spectra()`).

```{r quickstart_spectra_data, eval = TRUE}
spectra_dir <- system.file("toy-species-spectra", package = "maldipickr")

processed <- spectra_dir %>%
  import_biotyper_spectra() %>%
  process_spectra()
```

#### Delineate clusters and cherry-pick

Delineate spectra clusters using cosine similarity and cherry-pick one representative spectrum per cluster. The chosen ones are indicated by the `to_pick` column.

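As background for the similarity step, here is a minimal base R sketch of what cosine similarity measures between spectra. The toy intensity matrix, its row names and its values are made up for illustration and are not part of the package; the sketch assumes that, as with `coop::tcosine()`, similarities are computed between rows.

```{r quickstart_cosine_sketch, eval = TRUE}
# Toy intensity matrix (made-up values, not package data):
# one row per spectrum, one column per peak bin.
toy_intensities <- rbind(
  isolate_a = c(1, 0, 2, 0),
  isolate_b = c(2, 0, 4, 0),
  isolate_c = c(0, 3, 0, 1)
)

# Cosine similarity between rows:
# pairwise dot products divided by the product of the row norms.
row_norms <- sqrt(rowSums(toy_intensities^2))
toy_similarity <- tcrossprod(toy_intensities) / tcrossprod(row_norms)

round(toy_similarity, 2)
```

Here `isolate_a` and `isolate_b` have proportional intensities, so their similarity is 1, while `isolate_c` shares no peak with them and scores 0. The chunk that follows applies the same measure to the processed example spectra.
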
```{r quickstart_spectra_delineate, eval = TRUE}
processed %>%
  list() %>%
  merge_processed_spectra() %>%
  coop::tcosine() %>%
  delineate_with_similarity(threshold = 0.92) %>%
  set_reference_spectra(processed$metadata) %>%
  pick_spectra() %>%
  dplyr::relocate(name, to_pick) %>%
  knitr::kable()
```

This provides only a brief overview of the features of `{maldipickr}`; browse the other vignettes to learn more about additional features.

## Session information

```{r session, eval = TRUE}
sessionInfo()
```