---
title: "Use of SynthETIC to Generate Individual Claims of Realistic Features"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Use of SynthETIC to Generate Individual Claims of Realistic Features}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
This vignette aims to illustrate how the `SynthETIC` package can be used to generate a general insurance claims history with realistic distributional assumptions consistent with the experience of a specific (but anonymous) Auto Liability portfolio. The simulator is composed of 8 modelling steps (or "modules"), each of which will build on (a selection of) the output from previous steps:
1. [*Claim occurrence*](#occurrence): claim frequency, claim occurrence times
2. [*Claim size*](#size): claim size in constant dollar values i.e. without inflation
3. [*Claim notification*](#notification): notification delay (delay from occurrence to notification)
4. [*Claim closure*](#closure): settlement delay (delay from notification to closure)
5. [*Claim payment count*](#payment-count): number of partial payments
6. [*Claim payment size*](#payment-size): sizes of partial payments in constant dollar values i.e. without inflation
7. [*Claim payment time*](#payment-time): inter-partial-payment delays, partial payment times in calendar period
8. [*Claim inflation*](#inflation): sizes of inflated partial payments
In particular, this demo will output the following quantities:
| Description                     | R Object                                                   |
| ------------------------------- | ---------------------------------------------------------- |
| *N*, claim frequency            | `n_vector` = # claims for each accident period |
| *U*, claim occurrence time      | `occurrence_times[[i]]` = claim occurrence time for all claims that occurred in period *i* |
| *S*, claim size                 | `claim_sizes[[i]]` = claim size for all claims that occurred in period *i* |
| *V*, notification delay         | `notidel[[i]]` = notification delay for all claims that occurred in period *i* |
| *W*, settlement delay           | `setldel[[i]]` = settlement delay for all claims that occurred in period *i* |
| *M*, number of partial payments | `no_payments[[i]]` = number of partial payments for all claims that occurred in period *i* |
| Size of partial payments        | `payment_sizes[[i]][[j]]` = $ partial payments for claim *j* of occurrence period *i* |
| Inter-partial delays            | `payment_delays[[i]][[j]]` = inter-partial delays for claim *j* of occurrence period *i* |
| Payment times (continuous time) | `payment_times[[i]][[j]]` = payment times (in continuous time) for claim *j* of occurrence period *i* |
| Payment times (period)          | `payment_periods[[i]][[j]]` = payment times (in calendar periods) for claim *j* of occurrence period *i* |
| Actual payments (inflated)      | `payment_inflated[[i]][[j]]` = $ partial payments (inflated) for claim *j* of occurrence period *i* |
Reference
---
For a full description of `SynthETIC`'s structure and test parameters, readers should refer to:
Avanzi, B, Taylor, G, Wang, M, Wong, B (2021). `SynthETIC`: An individual insurance claim simulator with feature control. *Insurance: Mathematics and Economics* 100, 296–308. https://doi.org/10.1016/j.insmatheco.2021.06.004
The work can also be accessed via [arXiv:2008.05693](https://arxiv.org/abs/2008.05693).
To cite this package in publications, please use:
```{r, eval=FALSE}
citation("SynthETIC")
```
Set Up
---
```{r}
library(SynthETIC)
set.seed(20200131)
```
Package-wise Global Parameters
---
We introduce the reference value `ref_claim` partly as a measure of the monetary unit and/or overall claims experience. The default distributional assumptions were set up with a specific (but anonymous) Auto Liability portfolio in mind. `ref_claim` then allows users to easily simulate a synthetic portfolio with a similar claim pattern but in, for example, a different currency. We also remark that users can alternatively choose to interpret `ref_claim` as a monetary unit. For example, one can set `ref_claim <- 1000` and think of all amounts in terms of $1,000. However, in this case the default functions (as listed below) will not work and users will need to supply their own set of functions, setting the values as multiples of `ref_claim` rather than fractions as in the default setting.
We also require the user to input a `time_unit` (which should be given as **a fraction of a year**), so that the default input parameters continue to apply in contexts where the time units are no longer quarters. In the default setting the `time_unit` is 1/4, i.e. quarters.
The default input parameters update automatically with the choice of the two global variables `ref_claim` and `time_unit`, which ensures that the simulator produces sensible results in contexts other than the default setting. We remark that both `ref_claim` and `time_unit` only affect the default simulation functions; users can also choose to set up their own modelling assumptions for any of the modules to match their experience even better. In the latter case, it is the responsibility of the user to ensure that their input parameters are compatible with their time units and claims experience. For example, if the time units are quarters, then claim occurrence rates must be quarterly.
```{r}
set_parameters(ref_claim = 200000, time_unit = 1/4)
ref_claim <- return_parameters()[1]
time_unit <- return_parameters()[2]
```
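If, for instance, the experience were recorded in months rather than quarters, one could instead run the following (a hypothetical alternative set-up, not evaluated here since the rest of this demo assumes quarters):
```{r, eval=FALSE}
# hypothetical alternative: same reference claim size, monthly time units
set_parameters(ref_claim = 200000, time_unit = 1/12)
```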
The reference value `ref_claim` is used throughout the simulation process, as summarised in the table below.
| Module                 | Details                                                     |
| ---------------------- | ----------------------------------------------------------- |
| 2. Claim Size          | At `ref_claim = 200000`, by default we simulate claim sizes from $S^{0.2} \sim \mathcal{N}(\mu = 9.5, \sigma = 3)$, left truncated at 30. When the reference value changes, the claim sizes are scaled by a factor of `ref_claim / 200000`. |
| 3. Claim Notification  | By default the mean notification delay (in quarters) is set to $\min(3, \max(1, 2 - \frac{1}{3}\log(\frac{claim\_size}{0.5 \, ref\_claim})))$, which is automatically converted to the relevant `time_unit`; i.e. the mean notification delay decreases logarithmically with claim size. It has a maximum of 3 and equals 2 for a claim of size exactly `0.5 * ref_claim`. |
| 4. Claim Closure       | The default value for the mean settlement delay involves a term that defines the benchmark for a claim to be considered "small": `0.1 * ref_claim`. The default mean settlement delay increases logarithmically with claim size and equals 6 exactly at this benchmark. Furthermore, a legislative change, captured in the default mean function, impacted the settlement delays of those "small" claims. |
| 5. Claim Payment Count | The default sampling distribution requires two claim size benchmarks, as claims of different sizes are sampled from different distributions. In general a small number of partial payments suffices to settle small claims, while more extreme claims require additional payments. It is assumed that claims below `0.0375 * ref_claim` can be settled in 1 or 2 payments, claims between `0.0375 * ref_claim` and `0.075 * ref_claim` in 2 or 3 payments, and claims beyond `0.075 * ref_claim` in no less than 4 payments. |
| 6. Claim Payment Size  | We use the same proportion of `ref_claim` as in the *Claim Closure* module, namely `0.1 * ref_claim`. This benchmark is used when simulating the proportion of the claim paid in the last two payments in the default `simulate_amt_pmt` function. The mean proportion of the claim paid in the last two payments increases logarithmically with claim size and equals 75% exactly at this benchmark. |
| 8. Claim Inflation     | Two benchmark values are required here, one each for the default SI occurrence and SI payment functions. 1) A legislative change, captured by SI occurrence, reduced claim size by up to 40% for the smallest claims and impacted claims up to `0.25 * ref_claim` in size. 2) The default SI payment is 30% p.a. for the smallest claims, zero for claims exceeding `ref_claim` in size, and varies linearly for claims in between. |
The `time_unit` chosen will impact the time-related modules, specifically
* Claim Notification;
* Claim Closure;
* Claim Payment Time;
* Claim Inflation.
1. Claim Occurrence {#occurrence}
---
Unless otherwise specified, `claim_frequency()` assumes the claim frequency follows a Poisson distribution with mean equal to the product of the exposure `E` associated with period $i$, the expected claim frequency `freq` per unit exposure for that period, and the `time_unit` (since `E` and `freq` are annual quantities in the default setting). The exposure and expected frequency are allowed to vary across periods, but not within a period.
Given the claim frequency, `claim_occurrence()` samples the occurrence times of each claim from a uniform distribution. Together, the two functions assume **by default** that the arrival of claims follows a Poisson process, with potentially varying rates across different periods (see [Example 1.2](#ex1.2)).
Alternative sampling processes are discussed in [Example 1.3](#ex1.3) and [1.4](#ex1.4).
## Example 1.1: Constant exposure and frequency {#ex1.1}
### Input parameters
* `years` = number of years considered
* `I` = number of claims development periods considered (which equals the number of years divided by the `time_unit`)
* `E[i]` = exposure associated with each period *i*
* `lambda[i]` = expected claim frequency per unit exposure for period *i*
```{r}
years <- 10
I <- years / time_unit
E <- c(rep(12000, I)) # effective annual exposure rates
lambda <- c(rep(0.03, I))
```
### Implementation and Output
```{r}
# Number of claims occurring for each period i
# shorter equivalent code:
# n_vector <- claim_frequency()
n_vector <- claim_frequency(I = I, E = E, freq = lambda)
n_vector
# Occurrence time of each claim r, for each period i
occurrence_times <- claim_occurrence(frequency_vector = n_vector)
occurrence_times[[1]]
```
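As a quick sanity check (under the default Poisson specification described above), the simulated frequencies should average around `E * lambda * time_unit` = 12000 × 0.03 × 1/4 = 90 claims per quarter:
```{r}
# average simulated claim frequency vs the theoretical Poisson mean of
# E * lambda * time_unit = 12000 * 0.03 * 1/4 = 90 claims per quarter
mean(n_vector)
```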
## Example 1.2: Increasing exposure, constant frequency per unit of exposure {#ex1.2}
Note that variables named with `_tmp` are for illustration purposes only and not used in the later simulation modules of this demo.
```{r}
## input parameters
years_tmp <- 10
I_tmp <- years_tmp / time_unit
# set linearly increasing exposure, ...
E_tmp <- rep(12000, I_tmp) + seq(from = 0, by = 100, length = I_tmp)
# and constant frequency per unit of exposure
lambda_tmp <- rep(0.03, I_tmp)
## output
# Number of claims occurring for each period i
n_vector_tmp <- claim_frequency(I = I_tmp, E = E_tmp, freq = lambda_tmp)
n_vector_tmp
# Occurrence time of each claim r, for each period i
occurrence_times_tmp <- claim_occurrence(frequency_vector = n_vector_tmp)
occurrence_times_tmp[[1]]
```
## Example 1.3: Alternative claim frequency distribution {#ex1.3}
Users can choose to specify their own claim frequency distribution via `simfun`, which takes both random generation functions (`type = "r"`, the default) and cumulative distribution functions (`type = "p"`). For example, we can use the negative binomial distribution in base `R`, or the zero-truncated Poisson distribution from the `actuar` package.
```{r}
# simulate claim frequencies from negative binomial
# 1. using type-"r" specification (default)
claim_frequency(I = I, simfun = rnbinom, size = 100, mu = 100)
# 2. using type-"p" specification, equivalent to above
claim_frequency(I = I, simfun = pnbinom, type = "p", size = 100, mu = 100)
# simulate claim frequencies from zero-truncated Poisson
claim_frequency(I = I, simfun = actuar::rztpois, lambda = 90)
claim_frequency(I = I, simfun = actuar::pztpois, type = "p", lambda = 90)
```
Similar to [Example 1.2](#ex1.2), we can modify the frequency parameters to vary across periods:
```{r}
claim_frequency(I = I, simfun = actuar::rztpois, lambda = time_unit * E_tmp * lambda_tmp)
```
If one wishes to code their own sampling function (either a direct random generating function, or a proper CDF), this can be achieved by:
```{r dpi=150, fig.width=7, fig.height=4, out.width=650}
# sampling from non-homogeneous Poisson process
rnhpp.count <- function(no_periods) {
rate <- 3000
intensity <- function(x) {
# e.g. cyclical Poisson process
0.03 * (sin(x * pi / 2) / 4 + 1)
}
lambda_max <- 0.03 * (1/4 + 1)
target_num_events <- no_periods * rate * lambda_max
# simulate a homogeneous Poisson process
N <- stats::rpois(1, target_num_events) # total number of events
event_times <- sort(stats::runif(N, 0, no_periods)) # random times of occurrence
# use a thinning step to turn this into a non-homogeneous process
accept_probs <- intensity(event_times) / lambda_max
is_accepted <- (stats::runif(N) < accept_probs)
claim_times <- event_times[is_accepted]
as.numeric(table(cut(claim_times, breaks = 0:no_periods)))
}
n_vector_tmp <- claim_frequency(I = I, simfun = rnhpp.count)
plot(x = 1:I, y = n_vector_tmp, type = "l",
main = "Claim frequency simulated from a cyclical Poisson process",
xlab = "Occurrence period", ylab = "# Claims")
```
## Example 1.4: Alternative specification of the claim arrival process {#ex1.4}
We note that the `claim_occurrence()` function, which simulates claim times conditional on claim frequencies, assumes a uniform distribution; this assumption cannot be modified without replacing the module. Fortunately, the modular structure of `SynthETIC` makes it easy to unplug any one module and replace it with a version modified to one's own purpose.
For example, if one wishes to replace this uniform distribution assumption and/or the whole [Claim Occurrence](#occurrence) module, they can simply supply their own vector of claim times and easily convert to the list format consistent with the `SynthETIC` framework for smooth integration with the later modules.
```{r}
# simulate claim arrival times from a cyclical arrival process: uniform event
# times over the I periods, thinned by a sinusoidal acceptance probability
event_times_tmp <- sort(stats::runif(n = 4000, 0, I))
accept_probs_tmp <- (sin(event_times_tmp * pi / 2) + 1) / 2
is_accepted_tmp <- (stats::runif(length(event_times_tmp)) < accept_probs_tmp)
claim_times_tmp <- event_times_tmp[is_accepted_tmp]
# Number of claims occurring for each period i
# by counting the number of event times in each interval (i, i + 1)
n_vector_tmp <- as.numeric(table(cut(claim_times_tmp, breaks = 0:I)))
n_vector_tmp
# Occurrence time of each claim r, for each period i
occurrence_times_tmp <- to_SynthETIC(x = claim_times_tmp,
frequency_vector = n_vector_tmp)
occurrence_times_tmp[[1]]
```
2. Claim Size {#size}
---
## Example 2.1: Default power normal {#ex2.1}
By default `claim_size()` assumes a left truncated power normal distribution: $S^{0.2} \sim \mathcal{N}(\mu = 9.5, \sigma = 3)$, left truncated at 30. **There is no need to specify a sampling distribution if the user is happy with the default power normal.** This example is mainly to demonstrate how the default function works.
### Input parameters
We can specify the CDF to generate claim sizes from. The default distribution function can be coded as follows:
```{r}
# use a power normal S^0.2 ~ N(9.5, 3), left truncated at 30
# this is the default distribution driving the claim_size() function
S_df <- function(s) {
# truncate and rescale
if (s < 30) {
return(0)
} else {
p_trun <- pnorm(s^0.2, 9.5, 3) - pnorm(30^0.2, 9.5, 3)
p_rescaled <- p_trun/(1 - pnorm(30^0.2, 9.5, 3))
return(p_rescaled)
}
}
```
### Implementation and Output
```{r}
# shorter equivalent: claim_sizes <- claim_size(frequency_vector = n_vector)
claim_sizes <- claim_size(frequency_vector = n_vector,
simfun = S_df, type = "p", range = c(0, 1e24))
claim_sizes[[1]]
```
## Example 2.2: Alternative claim size distribution {#ex2.2}
Users can also choose any other individual claim size distribution, e.g. Weibull from base `R` or inverse Gaussian from `actuar`:
```{r dpi=150, fig.width=7, fig.height=4, out.width=650}
## weibull
# estimate the weibull parameters to achieve the mean and cv matching that of
# the built-in test claim dataset
claim_size_mean <- mean(test_claim_dataset$claim_size)
claim_size_cv <- cv(test_claim_dataset$claim_size)
weibull_shape <- get_Weibull_parameters(target_mean = claim_size_mean,
target_cv = claim_size_cv)[1]
weibull_scale <- get_Weibull_parameters(target_mean = claim_size_mean,
target_cv = claim_size_cv)[2]
# simulate claim sizes with the estimated parameters
claim_sizes_weibull <- claim_size(frequency_vector = n_vector,
simfun = rweibull,
shape = weibull_shape, scale = weibull_scale)
# plot empirical CDF
plot(ecdf(unlist(test_claim_dataset$claim_size)), xlim = c(0, 2000000),
main = "Empirical distribution of simulated claim sizes",
xlab = "Individual claim size")
plot(ecdf(unlist(claim_sizes_weibull)), add = TRUE, col = 2)
## inverse Gaussian
# modify actuar::rinvgauss (truncate below 30 and above 5,000,000 by resampling)
rinvgauss_censored <- function(n) {
s <- actuar::rinvgauss(n, mean = 180000, dispersion = 0.5e-5)
while (any(s < 30 | s > 5000000)) {
for (j in which(s < 30 | s > 5000000)) {
# for rejected values, resample
s[j] <- actuar::rinvgauss(1, mean = 180000, dispersion = 0.5e-5)
}
}
s
}
# simulate from the modified inverse Gaussian distribution
claim_sizes_invgauss <- claim_size(frequency_vector = n_vector, simfun = rinvgauss_censored)
# plot empirical CDF
plot(ecdf(unlist(claim_sizes_invgauss)), add = TRUE, col = 3)
legend.text <- c("Power normal", "Weibull", "Inverse Gaussian")
legend("bottomright", legend.text, col = 1:3, lty = 1, bty = "n")
```
## Example 2.3: Simulating claim sizes from covariates
The applications discussed above assume that the claim sizes are sampled from a single distribution for all policyholders (e.g. the default power normal, custom sampling distribution specified by `simfun`).
Suppose we instead want to simulate from a model which uses covariates to predict claim sizes. For example, consider a (theoretical) gamma GLM with log link:
\[
\begin{align*}
E(S_i) =\mu_i &=\exp(\boldsymbol{x}_i^\top \boldsymbol\beta)\\
&= \exp(\beta_0 + \beta_1 \times age_i + \beta_2 \times age_i^2)\\
&= \exp(27 - 0.768 \times age_i + 0.008 \times age_i^2)
\end{align*}
\]
```{r dpi=150, fig.width=7, fig.height=4, out.width=650}
# define the random generation function to simulate from the gamma GLM
sim_GLM <- function(n) {
# simulate covariates
age <- sample(20:70, size = n, replace = T)
mu <- exp(27 - 0.768 * age + 0.008 * age^2)
rgamma(n, shape = 10, scale = mu / 10)
}
claim_sizes_GLM <- claim_size(frequency_vector = n_vector, simfun = sim_GLM)
plot(ecdf(unlist(claim_sizes_GLM)), xlim = c(0, 2000000),
main = "Empirical distribution of claim sizes simulated from GLM",
xlab = "Individual claim size")
```
## Example 2.4: Bootstrapping from given loss data {#ex2.4}
Suppose we have an existing dataset of claim costs at hand that we wish to simulate from, e.g. `ausautoBI8999` (an Australian automobile bodily injury claim dataset) from [`CASdatasets`](http://cas.uqam.ca). We can take a bootstrap resample of the dataset and then convert it to the `SynthETIC` format with ease:
```{r eval=FALSE}
# install.packages("CASdatasets", repos = "http://cas.uqam.ca/pub/", type = "source")
library(CASdatasets)
data("ausautoBI8999")
boot <- sample(ausautoBI8999$AggClaim, size = sum(n_vector), replace = TRUE)
claim_sizes_bootstrap <- to_SynthETIC(boot, frequency_vector = n_vector)
```
Another way to code this would be to write a random generation function to perform bootstrapping, and then use `claim_size` as usual:
```{r eval=FALSE}
sim_boot <- function(n) {
sample(ausautoBI8999$AggClaim, size = n, replace = TRUE)
}
claim_sizes_bootstrap <- claim_size(frequency_vector = n_vector, simfun = sim_boot)
```
Alternatively, one can easily fit a parametric distribution to an existing dataset with the help of the `fitdistrplus` package and then simulate from the fitted parametric distribution ([Example 2.2](#ex2.2)).
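As a minimal sketch of this approach (not run, and assuming the `ausautoBI8999` dataset from [Example 2.4](#ex2.4) is loaded), one could fit a Weibull distribution by maximum likelihood and plug the fitted parameters straight into `claim_size()`:
```{r, eval=FALSE}
# a minimal sketch: fit a Weibull to the observed claim costs by MLE, ...
library(fitdistrplus)
fit <- fitdist(ausautoBI8999$AggClaim, distr = "weibull", method = "mle")
# ... then simulate claim sizes from the fitted parameters
claim_sizes_fitted <- claim_size(frequency_vector = n_vector,
                                 simfun = rweibull,
                                 shape = fit$estimate[["shape"]],
                                 scale = fit$estimate[["scale"]])
```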
3. Claim Notification {#notification}
---
`SynthETIC` assumes the (removable) dependence of notification delay on claim size and occurrence period of the claim, and thus requires the user to specify a `paramfun` (*param*eter *fun*ction) with arguments `claim_size` and `occurrence_period` (and possibly more, see [Example 3.2](#ex3.2)). The dependencies **can be removed** if the arguments are not referenced inside the function; e.g. the default notification delay function (shown [below](#ex3.1)) is independent of the individual claim's `occurrence_period`.
Other than this pre-specified dependence structure, users are free to choose *any* distribution, whether it be a pre-defined distribution in `R`, or more advanced ones from packages, or a proper user-defined function, to better match their own claim experience.
Indeed, although not recommended, users are able to **add further dependencies** in their simulation. This is illustrated in [Example 4.2](#ex4.2) of the settlement delay module.
## Example 3.1: Default Weibull {#ex3.1}
By default, `SynthETIC` samples notification delays from a Weibull distribution:
```{r}
## input
# specify the Weibull parameters as a function of claim_size and occurrence_period
notidel_param <- function(claim_size, occurrence_period) {
# NOTE: users may add to, but not remove these two arguments (claim_size,
# occurrence_period) as they are part of SynthETIC's internal structure
# specify the target mean and target coefficient of variation
target_mean <- min(3, max(1, 2-(log(claim_size/(0.50 * ref_claim)))/3))/4 / time_unit
target_cv <- 0.70
# convert to Weibull parameters
shape <- get_Weibull_parameters(target_mean, target_cv)[1]
scale <- get_Weibull_parameters(target_mean, target_cv)[2]
c(shape = shape, scale = scale)
}
## output
notidel <- claim_notification(n_vector, claim_sizes,
rfun = rweibull, paramfun = notidel_param)
```
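Since the parameter function above merely reproduces the package default, a shorter equivalent (not run) is simply:
```{r, eval=FALSE}
# shorter equivalent, relying on the default Weibull specification
notidel <- claim_notification(n_vector, claim_sizes)
```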
## Example 3.2: Alternative distribution for notification delay {#ex3.2}
`SynthETIC` does not restrict the choice of the sampling distribution. For example, we can use a transformed gamma distribution:
```{r dpi=150, fig.width=7, fig.height=4, out.width=650}
## input
# specify the transformed gamma parameters as a function of claim_size and occurrence_period
trgamma_param <- function(claim_size, occurrence_period, rate) {
c(shape1 = max(1, claim_size / ref_claim),
shape2 = 1 - occurrence_period / 200,
rate = rate)
}
## output
# simulate notification delays from the transformed gamma
notidel_trgamma <- claim_notification(n_vector, claim_sizes,
rfun = actuar::rtrgamma,
paramfun = trgamma_param, rate = 2)
# graphically compare the result with the default Weibull distribution
plot(ecdf(unlist(notidel)), xlim = c(0, 15),
main = "Empirical distribution of simulated notification delays",
xlab = "Notification delay (in quarters)")
plot(ecdf(unlist(notidel_trgamma)), add = TRUE, col = 2)
legend.text <- c("Weibull (default)", "Transformed gamma")
legend("bottomright", legend.text, col = 1:2, lty = 1, bty = "n")
```
Clearly the transformed gamma with the parameters specified above accelerates the reporting of the simulated claims.
## Example 3.3: User-defined sampling function for notification delay {#ex3.3}
One may wish to simulate from a more exotic sampling distribution that cannot be easily written as a nice pre-defined distribution function and its parameters. For example, consider a mixed distribution:
```{r}
rmixed_notidel <- function(n, claim_size) {
# consider a mixture distribution
# equal probability of sampling from x (Weibull) or y (transformed gamma)
x_selected <- sample(c(T, F), size = n, replace = TRUE)
x <- rweibull(n, shape = 2, scale = 1)
y <- actuar::rtrgamma(n, shape1 = min(1, claim_size / ref_claim), shape2 = 0.8, rate = 2)
  result <- numeric(n)
  result[x_selected] <- x[x_selected]
  result[!x_selected] <- y[!x_selected]
return(result)
}
```
In this case, we can consider `claim_size` as the "parameter" for the sampling distribution (in the same way as `shape` and `scale` are for the gamma distribution). Then we can either define a parameter function like the one below:
```{r}
rmixed_params <- function(claim_size, occurrence_period) {
# claim_size is the only "parameter" required for rmixed_notidel
c(claim_size = claim_size)
}
```
or simply run
```{r}
notidel_mixed <- claim_notification(n_vector, claim_sizes, rfun = rmixed_notidel)
```
which would give the same result as
```{r}
notidel_mixed <- claim_notification(n_vector, claim_sizes,
rfun = rmixed_notidel, paramfun = rmixed_params)
```
4. Claim Closure {#closure}
---
Claim settlement delay represents the delay from claim notification to closure. Like [notification delay](#notification), `SynthETIC` assumes the ([removable](#notification)) dependence of settlement delay on claim size and occurrence period of the claim, and thus requires the user to specify a `paramfun` (*param*eter *fun*ction) with arguments `claim_size` and `occurrence_period` (and possibly more, see [Example 3.2](#ex3.2)).
Other than this pre-specified dependence structure, users are free to choose *any* distribution by specifying their own `rfun` and/or `paramfun` (see `?claim_closure`).
Indeed, although not recommended, users are able to add further dependencies in their simulation. This is illustrated in [Example 4.2](#ex4.2).
## Example 4.1: Default Weibull {#ex4.1}
Below we show the default implementation with a Weibull distribution.
```{r}
## input
# specify the Weibull parameters as a function of claim_size and occurrence_period
setldel_param <- function(claim_size, occurrence_period) {
# NOTE: users may add to, but not remove these two arguments (claim_size,
# occurrence_period) as they are part of SynthETIC's internal structure
# specify the target Weibull mean
if (claim_size < (0.10 * ref_claim) & occurrence_period >= 21) {
a <- min(0.85, 0.65 + 0.02 * (occurrence_period - 21))
} else {
a <- max(0.85, 1 - 0.0075 * occurrence_period)
}
mean_quarter <- a * min(25, max(1, 6 + 4*log(claim_size/(0.10 * ref_claim))))
target_mean <- mean_quarter / 4 / time_unit
# specify the target Weibull coefficient of variation
target_cv <- 0.60
c(shape = get_Weibull_parameters(target_mean, target_cv)[1, ],
scale = get_Weibull_parameters(target_mean, target_cv)[2, ])
}
## output
# simulate the settlement delays from the Weibull with parameters above
setldel <- claim_closure(n_vector, claim_sizes, rfun = rweibull, paramfun = setldel_param)
setldel[[1]]
```
**There is no need to specify a sampling distribution if one is happy with the default Weibull specification.** This example is just to demonstrate some of the behind-the-scenes work of the default implementation, and at the same time, to show how one may specify and input a random sampling distribution of their choosing.
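In that case, the shorter equivalent (not run) is:
```{r, eval=FALSE}
# shorter equivalent, relying on the default Weibull specification
setldel <- claim_closure(n_vector, claim_sizes)
```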
## Example 4.2: Additional dependencies {#ex4.2}
Suppose we would like to add the dependence of settlement delay on notification delay, which is not natively included in `SynthETIC` default setting. For example, let's consider the following parameter function:
```{r}
## input
# an extended parameter function for the simulation of settlement delays
setldel_param_extd <- function(claim_size, occurrence_period, notidel) {
# specify the target Weibull mean
if (claim_size < (0.10 * ref_claim) & occurrence_period >= 21) {
a <- min(0.85, 0.65 + 0.02 * (occurrence_period - 21))
} else {
a <- max(0.85, 1 - 0.0075 * occurrence_period)
}
mean_quarter <- a * min(25, max(1, 6 + 4*log(claim_size/(0.10 * ref_claim))))
# suppose the setldel mean is linearly related to the notidel of the claim
target_mean <- (mean_quarter + notidel) / 4 / time_unit
# specify the target Weibull coefficient of variation
target_cv <- 0.60
c(shape = get_Weibull_parameters(target_mean, target_cv)[1, ],
scale = get_Weibull_parameters(target_mean, target_cv)[2, ])
}
```
As this parameter function `setldel_param_extd` is dependent on `notidel`, it should not be surprising that we need to input the simulated notification delays when calling `claim_closure`. We need to make sure that the argument names are matched exactly (`notidel` in this example) and that the input is specified as a vector of simulated quantities (not a list).
```{r}
## output
# simulate the settlement delays from the Weibull with parameters above
notidel_vect <- unlist(notidel) # convert to a vector
setldel_extd <- claim_closure(n_vector, claim_sizes, rfun = rweibull,
paramfun = setldel_param_extd,
notidel = notidel_vect)
setldel_extd[[1]]
```
5. Claim Partial Payment - Number of Partial Payments {#payment-count}
---
`claim_payment_no()` generates the number of partial payments associated with a particular claim, from a user-defined random generation function which may depend on `claim_size`.
## Example 5.1: Default mixture distribution {#ex5.1}
Below we spell out the default function in `SynthETIC` that simulates the number of partial payments (from a mixture distribution):
```{r}
## input
# the default random generating function
rmixed_payment_no <- function(n, claim_size, claim_size_benchmark_1, claim_size_benchmark_2) {
# construct the range indicators
test_1 <- (claim_size_benchmark_1 < claim_size & claim_size <= claim_size_benchmark_2)
test_2 <- (claim_size > claim_size_benchmark_2)
# if claim_size <= claim_size_benchmark_1
no_pmt <- sample(c(1, 2), size = n, replace = T, prob = c(1/2, 1/2))
# if claim_size is between the two benchmark values
no_pmt[test_1] <- sample(c(2, 3), size = sum(test_1), replace = T, prob = c(1/3, 2/3))
# if claim_size > claim_size_benchmark_2
no_pmt_mean <- pmin(8, 4 + log(claim_size/claim_size_benchmark_2))
prob <- 1 / (no_pmt_mean - 3)
no_pmt[test_2] <- stats::rgeom(n = sum(test_2), prob = prob[test_2]) + 4
no_pmt
}
```
Since the random function directly takes `claim_size` as an input, no additional parameterisation is required (unlike in [Example 3.1](#ex3.1), where we first need a `paramfun` that turns the `claim_size` into Weibull parameters). We can simply run `claim_payment_no()` without inputting a `paramfun`.
```{r}
## output
no_payments <- claim_payment_no(n_vector, claim_sizes, rfun = rmixed_payment_no,
claim_size_benchmark_1 = 0.0375 * ref_claim,
claim_size_benchmark_2 = 0.075 * ref_claim)
no_payments[[1]]
```
Note that the `claim_size_benchmark_1` and `claim_size_benchmark_2` are passed on to `rmixed_payment_no` and will not be required if we choose an [alternative sampling distribution](#ex5.2).
This mixture sampling distribution has been included as the default. **There is no need to reproduce the above code if the user is happy with this default distribution.** A simple equivalent to the above code is just
```{r, eval=FALSE}
no_payments <- claim_payment_no(n_vector, claim_sizes)
```
This example is here only to demonstrate how the default function operates. If one would like to keep the structure of this function but modify the benchmark values, they may do so via
```{r}
no_payments_tmp <- claim_payment_no(n_vector, claim_sizes,
claim_size_benchmark_2 = 0.1 * ref_claim)
```
## Example 5.2: Alternative distribution for number of partial payments {#ex5.2}
Suppose we want to use a zero truncated Poisson distribution instead, with the rate parameter as a function of `claim_size`:
```{r}
## input
paymentNo_param <- function(claim_size) {
no_pmt_mean <- pmax(4, pmin(8, 4 + log(claim_size / 15000)))
c(lambda = no_pmt_mean - 3)
}
## output
no_payments_pois <- claim_payment_no(
n_vector, claim_sizes, rfun = actuar::rztpois, paramfun = paymentNo_param)
table(unlist(no_payments_pois))
```
Interlude: Claims Dataset
---
We can use the following code to create a claims dataset containing all individual claims features that we have simulated so far:
```{r}
claim_dataset <- generate_claim_dataset(
frequency_vector = n_vector,
occurrence_list = occurrence_times,
claim_size_list = claim_sizes,
notification_list = notidel,
settlement_list = setldel,
no_payments_list = no_payments
)
str(claim_dataset)
```
`test_claim_dataset`, included as part of the package, is an example dataset of individual claims features resulting from a specific run with the default assumptions.
```{r}
str(test_claim_dataset)
```
6. Claim Partial Payment - Sizes of Partial Payments (without inflation) {#payment-size}
---
## Example 6.1: Default Distribution {#ex6.1}
The default function samples the sizes of partial payments conditional on the number of partial payments, and the size of the claim:
```{r}
## input
rmixed_payment_size <- function(n, claim_size) {
# n = number of simulations, here n should be the number of partial payments
if (n >= 4) {
# 1) Simulate the "complement" of the proportion of total claim size
# represented by the last two payments
p_mean <- 1 - min(0.95, 0.75 + 0.04*log(claim_size/(0.10 * ref_claim)))
p_CV <- 0.20
p_parameters <- get_Beta_parameters(target_mean = p_mean, target_cv = p_CV)
last_two_pmts_complement <- stats::rbeta(
1, shape1 = p_parameters[1], shape2 = p_parameters[2])
last_two_pmts <- 1 - last_two_pmts_complement
# 2) Simulate the proportion of last_two_pmts paid in the second last payment
q_mean <- 0.9
q_CV <- 0.03
q_parameters <- get_Beta_parameters(target_mean = q_mean, target_cv = q_CV)
q <- stats::rbeta(1, shape1 = q_parameters[1], shape2 = q_parameters[2])
# 3) Calculate the respective proportions of claim amount paid in the
# last 2 payments
p_second_last <- q * last_two_pmts
p_last <- (1-q) * last_two_pmts
# 4) Simulate the "unnormalised" proportions of claim amount paid
    # in the first (n - 2) payments
p_unnorm_mean <- last_two_pmts_complement/(n - 2)
p_unnorm_CV <- 0.10
p_unnorm_parameters <- get_Beta_parameters(
target_mean = p_unnorm_mean, target_cv = p_unnorm_CV)
amt <- stats::rbeta(
n - 2, shape1 = p_unnorm_parameters[1], shape2 = p_unnorm_parameters[2])
# 5) Normalise the proportions simulated in step 4
amt <- last_two_pmts_complement * (amt/sum(amt))
# 6) Attach the last 2 proportions, p_second_last and p_last
amt <- append(amt, c(p_second_last, p_last))
# 7) Multiply by claim_size to obtain the actual payment amounts
amt <- claim_size * amt
} else if (n == 2 | n == 3) {
p_unnorm_mean <- 1/n
p_unnorm_CV <- 0.10
p_unnorm_parameters <- get_Beta_parameters(
target_mean = p_unnorm_mean, target_cv = p_unnorm_CV)
amt <- stats::rbeta(
n, shape1 = p_unnorm_parameters[1], shape2 = p_unnorm_parameters[2])
# Normalise the proportions and multiply by claim_size to obtain the actual payment amounts
amt <- claim_size * amt/sum(amt)
} else {
# when there is a single payment
amt <- claim_size
}
return(amt)
}
## output
payment_sizes <- claim_payment_size(n_vector, claim_sizes, no_payments,
rfun = rmixed_payment_size)
payment_sizes[[1]][[1]]
```
As this is the default random generation function that `SynthETIC` adopts, a shorter equivalent command is to call `claim_payment_size()` without specifying an `rfun`.
```{r, eval=FALSE}
payment_sizes <- claim_payment_size(n_vector, claim_sizes, no_payments)
```
## Example 6.2: Alternative payment size distribution {#ex6.2}
Let's consider a simplistic example where we assume the partial payment sizes are (stochastically) equal. This will result in the following simulation function:
```{r}
## input
unif_payment_size <- function(n, claim_size) {
prop <- runif(n)
prop.normalised <- prop / sum(prop)
return(claim_size * prop.normalised)
}
## output
# note that we don't need to specify a paramfun as rfun is directly a function
# of claim_size
payment_sizes_unif <- claim_payment_size(n_vector, claim_sizes, no_payments,
rfun = unif_payment_size)
payment_sizes_unif[[1]][[1]]
```
7. Claim Payment Time {#payment-time}
---
The simulation of the inter-partial delays is almost identical to that of [partial payment sizes](#payment-size), except that it also depends on the claim settlement delay: the inter-partial delays must add up to the settlement delay.
Other than this, the `SynthETIC` function implementation of `claim_payment_delay()` is almost the same as `claim_payment_size()`, but of course, with a different default simulation function:
```{r}
## input
r_pmtdel <- function(n, claim_size, setldel, setldel_mean) {
result <- c(rep(NA, n))
# First simulate the unnormalised values of d, sampled from a Weibull distribution
if (n >= 4) {
# 1) Simulate the last payment delay
unnorm_d_mean <- (1 / 4) / time_unit
unnorm_d_cv <- 0.20
parameters <- get_Weibull_parameters(target_mean = unnorm_d_mean, target_cv = unnorm_d_cv)
result[n] <- stats::rweibull(1, shape = parameters[1], scale = parameters[2])
# 2) Simulate all the other payment delays
for (i in 1:(n - 1)) {
unnorm_d_mean <- setldel_mean / n
unnorm_d_cv <- 0.35
parameters <- get_Weibull_parameters(target_mean = unnorm_d_mean, target_cv = unnorm_d_cv)
result[i] <- stats::rweibull(1, shape = parameters[1], scale = parameters[2])
}
} else {
for (i in 1:n) {
unnorm_d_mean <- setldel_mean / n
unnorm_d_cv <- 0.35
parameters <- get_Weibull_parameters(target_mean = unnorm_d_mean, target_cv = unnorm_d_cv)
result[i] <- stats::rweibull(1, shape = parameters[1], scale = parameters[2])
}
}
# Normalise d such that sum(inter-partial delays) = settlement delay
# To make sure that the pmtdels add up exactly to setldel, we treat the last one separately
  result[1:(n - 1)] <- (setldel/sum(result)) * result[1:(n - 1)]
  result[n] <- setldel - sum(result[1:(n - 1)])
return(result)
}
param_pmtdel <- function(claim_size, setldel, occurrence_period) {
# mean settlement delay
if (claim_size < (0.10 * ref_claim) & occurrence_period >= 21) {
a <- min(0.85, 0.65 + 0.02 * (occurrence_period - 21))
} else {
a <- max(0.85, 1 - 0.0075 * occurrence_period)
}
mean_quarter <- a * min(25, max(1, 6 + 4*log(claim_size/(0.10 * ref_claim))))
target_mean <- mean_quarter / 4 / time_unit
c(claim_size = claim_size,
setldel = setldel,
setldel_mean = target_mean)
}
## output
payment_delays <- claim_payment_delay(
n_vector, claim_sizes, no_payments, setldel,
rfun = r_pmtdel, paramfun = param_pmtdel,
occurrence_period = rep(1:I, times = n_vector))
# payment times on a continuous time scale
payment_times <- claim_payment_time(n_vector, occurrence_times, notidel, payment_delays)
# payment times in periods
payment_periods <- claim_payment_time(n_vector, occurrence_times, notidel, payment_delays,
discrete = TRUE)
cbind(payment_delays[[1]][[1]], payment_times[[1]][[1]], payment_periods[[1]][[1]])
```
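The functions spelled out above reproduce the package defaults, so if one is happy with those defaults, a shorter equivalent (not run) for the delay simulation is:
```{r, eval=FALSE}
# shorter equivalent, relying on the default payment delay specification
payment_delays <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel)
```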
8. Claim Inflation {#inflation}
---
### Input parameters
* **Base Inflation**: `base_inflation_past` = vector of historic **quarterly** base inflation rates for the past $I$ periods, `base_inflation_future` = vector of expected **quarterly** base inflation rates for the next $I$ periods (users may also choose to simulate the future inflation rates); the lengths of these vectors might differ from $I$ when a `time_unit` other than the calendar quarter is used
* By default we assume nil base inflation (see documentation for `claim_payment_inflation`)
* **Superimposed Inflation with respect to occurrence time**: `SI_occurrence` = function of `occurrence_time` and `claim_size` that outputs the superimposed inflation index with respect to the occurrence time of the claim
* **Superimposed Inflation with respect to payment time**: `SI_payment` = function of `payment_time` and `claim_size` that outputs the superimposed inflation index with respect to payment time of the claim
```{r}
# Base inflation: a vector of quarterly rates
# In this demo we set base inflation to be at 2% p.a. constant for both past and future
# Users can choose to randomise the future rates if they wish
demo_rate <- (1 + 0.02)^(1/4) - 1
base_inflation_past <- rep(demo_rate, times = 40)
base_inflation_future <- rep(demo_rate, times = 40)
base_inflation_vector <- c(base_inflation_past, base_inflation_future)
# Superimposed inflation:
# 1) With respect to occurrence "time" (continuous scale)
SI_occurrence <- function(occurrence_time, claim_size) {
if (occurrence_time <= 20 / 4 / time_unit) {1}
else {1 - 0.4*max(0, 1 - claim_size/(0.25 * ref_claim))}
}
# 2) With respect to payment "time" (continuous scale)
# -> compounding by user-defined time unit
SI_payment <- function(payment_time, claim_size) {
period_rate <- (1 + 0.30)^(time_unit) - 1
beta <- period_rate * max(0, 1 - claim_size/ref_claim)
(1 + beta)^payment_time
}
```
### Implementation and Output
```{r}
# shorter equivalent code:
# payment_inflated <- claim_payment_inflation(
# n_vector, payment_sizes, payment_times, occurrence_times, claim_sizes,
# base_inflation_vector)
payment_inflated <- claim_payment_inflation(
n_vector,
payment_sizes,
payment_times,
occurrence_times,
claim_sizes,
base_inflation_vector,
SI_occurrence,
SI_payment
)
cbind(payment_sizes[[1]][[1]], payment_inflated[[1]][[1]])
```
Interlude: Transaction Dataset
---
Use the following code to create a transactions dataset containing full information of all the partial payments made.
```{r}
# construct a "claims" object to store all the simulated quantities
all_claims <- claims(
frequency_vector = n_vector,
occurrence_list = occurrence_times,
claim_size_list = claim_sizes,
notification_list = notidel,
settlement_list = setldel,
no_payments_list = no_payments,
payment_size_list = payment_sizes,
payment_delay_list = payment_delays,
payment_time_list = payment_times,
payment_inflated_list = payment_inflated
)
transaction_dataset <- generate_transaction_dataset(
all_claims,
adjust = FALSE # to keep the original (potentially out-of-bound) simulated payment times
)
str(transaction_dataset)
```
`test_transaction_dataset`, included as part of the package, is an example dataset showing full information of the claims features at a transaction/payment level, generated by a specific `SynthETIC` run with the default assumptions.
```{r}
str(test_transaction_dataset)
```
Output
---
`SynthETIC` includes an output function which summarises the claim payments by occurrence and development periods. The usage of the function takes the form
```{r, eval=FALSE}
claim_output(
frequency_vector = ,
payment_time_list = ,
payment_size_list = ,
aggregate_level = 1,
incremental = TRUE,
future = TRUE,
adjust = TRUE
)
```
Note that by default, we aggregate all out-of-bound transactions into the maximum development period. But if we set `adjust = FALSE`, then the function would produce a separate "tail" column to represent all payments beyond the maximum development period (see function documentation `?claim_output`).
Examples:
```{r}
# 1. Constant dollar value INCREMENTAL triangle
output <- claim_output(n_vector, payment_times, payment_sizes,
incremental = TRUE)
# 2. Constant dollar value CUMULATIVE triangle
output_cum <- claim_output(n_vector, payment_times, payment_sizes,
incremental = FALSE)
# 3. Actual (i.e. inflated) INCREMENTAL triangle
output_actual <- claim_output(n_vector, payment_times, payment_inflated,
incremental = TRUE)
# 4. Actual (i.e. inflated) CUMULATIVE triangle
output_actual_cum <- claim_output(n_vector, payment_times, payment_inflated,
incremental = FALSE)
# Aggregate at a yearly level
claim_output(n_vector, payment_times, payment_sizes, aggregate_level = 4)
```
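To instead keep a separate "tail" column for the out-of-bound payments, as described above, one would run, for example (not evaluated here):
```{r, eval=FALSE}
# separate "tail" column for payments beyond the maximum development period
claim_output(n_vector, payment_times, payment_sizes, adjust = FALSE)
```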
Note that by setting `future = FALSE` we can obtain the upper left part of the triangle (i.e. only the past claim payments). The past data can then be used to perform chain-ladder reserving analysis:
```{r}
# output the past cumulative triangle
cumtri <- claim_output(n_vector, payment_times, payment_sizes,
aggregate_level = 4, incremental = FALSE, future = FALSE)
# calculate the age to age factors
selected <- vector()
J <- nrow(cumtri)
for (i in 1:(J - 1)) {
# use volume weighted age to age factors
selected[i] <- sum(cumtri[, (i + 1)], na.rm = TRUE) / sum(cumtri[1:(J - i), i], na.rm = TRUE)
}
# complete the triangle
CL_prediction <- cumtri
for (i in 2:J) {
for (j in (J - i + 2):J) {
CL_prediction[i, j] <- CL_prediction[i, j - 1] * selected[j - 1]
}
}
CL_prediction
```
We observe that the chain-ladder analysis performs very poorly on this simulated claim dataset. This is perhaps unsurprising in view of the data features and the extent to which they breach chain ladder assumptions. Data sets such as this are useful for testing models that endeavour to represent data outside the scope of the chain-ladder.
Plot of Cumulative Claims Payments
---
Note that by default, similar to the case of `claim_output` and `claim_payment_inflation`, we will truncate the claims development such that payments that were projected to fall out of the maximum development period are forced to be paid at the exact end of the maximum development period allowed. This convention will cause some concentration of transactions at the end of development period $I$ (shown as a surge in claims in the $I$th period).
Users can set `adjust = FALSE` to see the "true" picture of claims development without this artificial adjustment. If the two plots look significantly different, this indicates that the selection of lag parameters (notification and/or settlement delays) is not well matched to the maximum number of development periods allowed, and consideration might be given to changing one or the other.
```{r dpi=150, fig.width=7, fig.height=6, out.width=650}
plot(test_claims_object)
# compare with the "full complete picture"
plot(test_claims_object, adjust = FALSE)
```
```{r dpi=150, fig.width=7, fig.height=6, out.width=650}
# plot by occurrence and development years
plot(test_claims_object, by_year = TRUE)
```
Multiple Simulation Runs
---
Once all the input parameters have been set up, we can repeat the simulation process as many times as desired through a for loop. The code below saves the transaction dataset generated by each simulation run as a component of `results_all`.
```{r, eval = FALSE}
times <- 100
results_all <- vector("list")
for (i in 1:times) {
# Module 1: Claim occurrence
n_vector <- claim_frequency(I, E, lambda)
occurrence_times <- claim_occurrence(n_vector)
# Module 2: Claim size
claim_sizes <- claim_size(n_vector, S_df, type = "p", range = c(0, 1e24))
# Module 3: Claim notification
notidel <- claim_notification(n_vector, claim_sizes, paramfun = notidel_param)
# Module 4: Claim settlement
setldel <- claim_closure(n_vector, claim_sizes, paramfun = setldel_param)
# Module 5: Claim payment count
no_payments <- claim_payment_no(n_vector, claim_sizes, rfun = rmixed_payment_no,
claim_size_benchmark_1 = 0.0375 * ref_claim,
claim_size_benchmark_2 = 0.075 * ref_claim)
# Module 6: Claim payment size
payment_sizes <- claim_payment_size(n_vector, claim_sizes, no_payments,
rfun = rmixed_payment_size)
# Module 7: Claim payment time
payment_delays <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel,
rfun = r_pmtdel, paramfun = param_pmtdel,
occurrence_period = rep(1:I, times = n_vector))
payment_times <- claim_payment_time(n_vector, occurrence_times, notidel, payment_delays)
# Module 8: Claim inflation
payment_inflated <- claim_payment_inflation(
n_vector, payment_sizes, payment_times, occurrence_times,
claim_sizes, base_inflation_vector, SI_occurrence, SI_payment)
results_all[[i]] <- generate_transaction_dataset(
claims(
frequency_vector = n_vector,
occurrence_list = occurrence_times,
claim_size_list = claim_sizes,
notification_list = notidel,
settlement_list = setldel,
no_payments_list = no_payments,
payment_size_list = payment_sizes,
payment_delay_list = payment_delays,
payment_time_list = payment_times,
payment_inflated_list = payment_inflated),
# adjust = FALSE to retain the original simulated times
adjust = FALSE)
}
```
What if we are interested in the average claims development over a large number of simulation runs? The `plot.claims` function in this package at present only works for a single `claims` object, so we would need a way to combine the `claims` objects generated by the individual runs. A much simpler alternative is to increase the exposure rates and plot the resulting `claims` object, which has the same effect as averaging over a large number of simulation runs.
This long-run average of claims development offers insights into the effects of the distributional assumptions that users have made throughout the way, and hence the reasonableness of such choices.
The code below runs only 10 simulations, yet we can already see the trend emerging, which matches the result of our single simulation run above. Increasing `times` would show a smoother trend; we refrain from producing it here because simulating this amount of data takes some time (100 simulations take around 10 minutes on a quad-core machine). We remark that the major simulation lags are caused by the `claim_payment_delay` and (less severely) `claim_payment_size` functions.
```{r dpi=150, fig.width=7, fig.height=6, out.width=650}
start.time <- proc.time()
times <- 10
# increase exposure to E*times to get the same results as the aggregation of
# multiple simulation runs
n_vector <- claim_frequency(I, E = E * times, lambda)
occurrence_times <- claim_occurrence(n_vector)
claim_sizes <- claim_size(n_vector)
notidel <- claim_notification(n_vector, claim_sizes, paramfun = notidel_param)
setldel <- claim_closure(n_vector, claim_sizes, paramfun = setldel_param)
no_payments <- claim_payment_no(n_vector, claim_sizes, rfun = rmixed_payment_no,
claim_size_benchmark_1 = 0.0375 * ref_claim,
claim_size_benchmark_2 = 0.075 * ref_claim)
payment_sizes <- claim_payment_size(n_vector, claim_sizes, no_payments, rmixed_payment_size)
payment_delays <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel,
rfun = r_pmtdel, paramfun = param_pmtdel,
occurrence_period = rep(1:I, times = n_vector))
payment_times <- claim_payment_time(n_vector, occurrence_times, notidel, payment_delays)
payment_inflated <- claim_payment_inflation(
n_vector, payment_sizes, payment_times, occurrence_times,
claim_sizes, base_inflation_vector, SI_occurrence, SI_payment)
all_claims <- claims(
frequency_vector = n_vector,
occurrence_list = occurrence_times,
claim_size_list = claim_sizes,
notification_list = notidel,
settlement_list = setldel,
no_payments_list = no_payments,
payment_size_list = payment_sizes,
payment_delay_list = payment_delays,
payment_time_list = payment_times,
payment_inflated_list = payment_inflated
)
plot(all_claims, adjust = FALSE) +
ggplot2::labs(subtitle = paste("With", times, "simulations"))
proc.time() - start.time
```
Users can also choose to plot by occurrence year, or remove the inflation by altering the arguments `by_year` and `inflated` in
```{r, eval=FALSE}
plot(claims, by_year = , inflated = , adjust = )
```
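For example, a sketch (not run) of a cumulative development plot by occurrence year, in constant dollar values:
```{r, eval=FALSE}
# cumulative claims paid by occurrence year, without inflation
plot(test_claims_object, by_year = TRUE, inflated = FALSE, adjust = TRUE)
```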