---
title: "Concatenating cohort records"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{a05_concatanate_cohorts}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(CohortConstructor)
library(CohortCharacteristics)
library(ggplot2)
```

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  eval = TRUE,
  message = FALSE,
  warning = FALSE,
  comment = "#>"
)

library(CDMConnector)
library(dplyr, warn.conflicts = FALSE)

# Point Eunomia at a temporary folder if no data folder is configured
if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") {
  Sys.setenv("EUNOMIA_DATA_FOLDER" = file.path(tempdir(), "eunomia"))
}
# Download the synthetic data only if the folder does not exist yet
if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) {
  dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER"))
  downloadEunomiaData()
}
```

For this example we'll use the Eunomia synthetic data from the CDMConnector package.

```{r}
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomiaDir())
cdm <- CDMConnector::cdmFromCon(con, cdmSchema = "main",
                                writeSchema = "main", writePrefix = "my_study_")
```

Let's start by creating a cohort of users of acetaminophen.

```{r}
cdm$medications <- conceptCohort(cdm = cdm,
                                 conceptSet = list("acetaminophen" = 1127433),
                                 name = "medications")
cohortCount(cdm$medications)
```

We can merge cohort records using the `collapseCohorts()` function in the CohortConstructor package. Its `gap` argument specifies the maximum number of days allowed between two cohort entries for them to be merged into a single record.

Let's first define a new cohort where records within 1095 days (roughly three years) of each other will be merged.

```{r}
cdm$medications_collapsed <- cdm$medications |>
  collapseCohorts(
    gap = 1095,
    name = "medications_collapsed"
  )
```

Let's compare how this function changes the records of a single individual.

```{r}
cdm$medications |>
  filter(subject_id == 1)
cdm$medications_collapsed |>
  filter(subject_id == 1)
```

Subject 1 initially had four records between 1971 and 1982.
After specifying that records within three years of each other are to be merged, the number of records decreases to three. The record from 1980-03-15 to 1980-03-29 and the record from 1982-09-11 to 1982-10-02 are merged to create a new record from 1980-03-15 to 1982-10-02.

Now let's look at how the cohorts have been changed.

```{r}
summary_attrition <- summariseCohortAttrition(cdm$medications_collapsed)
tableCohortAttrition(summary_attrition)
```
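
To make the merging rule behind `gap` more concrete, here is a minimal base R sketch of the idea on a toy data frame. It is not the CohortConstructor implementation: `collapse_records()` is a hypothetical helper, the toy dates are made up, and the sketch assumes that two records are merged when the next start date is at most `gap` days after the latest end date seen so far (whether the package treats that boundary as inclusive is an assumption here).

```{r}
# Hypothetical helper (not part of CohortConstructor): collapse the records
# of ONE subject, merging a record into the previous one when it starts at
# most `gap` days after the latest end date seen so far.
collapse_records <- function(records, gap) {
  records <- records[order(records$cohort_start_date), ]
  starts <- as.numeric(records$cohort_start_date)
  ends <- as.numeric(records$cohort_end_date)
  n <- length(starts)
  # Latest end date among all records up to and including each row
  running_end <- cummax(ends)
  # A new merged record begins when the distance to the previous
  # running end is larger than `gap`
  new_group <- c(TRUE, starts[-1] - running_end[-n] > gap)
  group <- cumsum(new_group)
  data.frame(
    cohort_start_date = as.Date(as.vector(tapply(starts, group, min)),
                                origin = "1970-01-01"),
    cohort_end_date = as.Date(as.vector(tapply(ends, group, max)),
                              origin = "1970-01-01")
  )
}

# Toy records with made-up dates: the first two are 10 days apart,
# the third starts more than a year later
toy <- data.frame(
  cohort_start_date = as.Date(c("2000-01-01", "2000-01-20", "2001-06-01")),
  cohort_end_date = as.Date(c("2000-01-10", "2000-02-05", "2001-06-15"))
)
collapsed <- collapse_records(toy, gap = 30)
collapsed

# Days between the two records of subject 1 that were merged above
days_between <- as.integer(as.Date("1982-09-11") - as.Date("1980-03-29"))
days_between
```

With `gap = 30`, the first two toy records collapse into one record running from 2000-01-01 to 2000-02-05, while the third stays separate. The last lines show that the two records of subject 1 discussed above are 896 days apart, which is below the 1095-day gap and explains why they were merged.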