Acquiring valid norm scores fundamentally relies on the representativeness of the norm sample. Traditionally, random sampling is employed to meet this end, but even this approach may result in a sample that diverges from the population structure. The cNORM R package provides a potent solution by incorporating sampling weights into the norming process, thereby diminishing the impact of non-representative norm samples on the norm score quality. A key component of this process is raking, or iterative proportional fitting, which allows for the post-stratification of the norm sample based on stratification variables (SVs), considering their population marginals.
When a norm sample inadequately represents the target population, especially regarding pertinent stratification variables, it can compromise the quality of the resulting norm scores (Kruskal & Mosteller, 1979). An illustration of this issue is the neglect of parental educational background when determining norm scores for a children’s intelligence test. Such oversight can lead to skewed norm scores, thus distorting the child’s actual intelligence level (Hernandez et al., 2017). As norm scores commonly inform significant decisions, like school placement or the diagnosis of learning disabilities (Gary & Lenhard, 2021; Lenhard, Lenhard & Gary, 2019; Lenhard & Lenhard, 2021), any bias can potentially disadvantage those being evaluated. Therefore, techniques such as sample weighting methods become crucial to mitigate non-representativeness in norm samples.
Raking, also called iterative proportional fitting, is a post-stratification approach that aims to enhance sample representativeness with respect to two or more stratification variables. For this purpose, sample weights are computed for every case in the norm sample based on the ratio between the proportion of the corresponding stratum in the target population and its proportion in the actual norm sample (Lumley, 2011). The procedure can be described as an iterative post-stratification with respect to one variable in each step. For example, assume a target population consisting of 49% female and 51% male persons, while the norm sample contains 45% female and 55% male subjects. To enhance the representativeness of the norm sample with respect to the SV sex (female/male), every female case would be weighted with \(w_{female}=\frac{49\%}{45\%}=1.09\) and every male case with \(w_{male}=\frac{51\%}{55\%}=0.93\). To stratify a norm sample with respect to two or more variables, for example sex (female/male) and education (low/medium/high), this adjustment is applied repeatedly, to the marginals of one variable at a time. For example, if the weights are first adjusted with respect to sex, they are adjusted with respect to education in the second step. Since the weights then no longer reproduce the population marginals for sex, they are readjusted to sex in the third step, to education in the fourth step, and so on until the raking weights have converged. Finally, the weighted norm sample represents the target population with respect to the marginal proportions of the SVs used: each case is assigned a weight such that the proportions of the strata in the norm sample align with the composition of the target population.
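The iteration described above can be sketched in a few lines of base R. This is a minimal illustration, not cNORM's implementation: the function name `rake()`, the toy sample and the education marginals are all assumptions made up for this example.

```r
# Toy sample (hypothetical data, generated only for this illustration)
set.seed(1)
sample_df <- data.frame(
  sex = sample(c("female", "male"), 500, replace = TRUE, prob = c(0.45, 0.55)),
  education = sample(c("low", "medium", "high"), 500, replace = TRUE,
                     prob = c(0.20, 0.50, 0.30))
)

# Assumed population marginals (the education values are invented)
targets <- list(
  sex = c(female = 0.49, male = 0.51),
  education = c(low = 0.30, medium = 0.45, high = 0.25)
)

# Hypothetical helper: adjusts the weights to one variable at a time and
# repeats the cycle until the weights no longer change
rake <- function(data, targets, max_iter = 100, tol = 1e-8) {
  w <- rep(1, nrow(data))
  for (iter in seq_len(max_iter)) {
    w_old <- w
    for (v in names(targets)) {
      lev <- as.character(data[[v]])
      # current weighted proportion of each level of variable v
      current <- tapply(w, lev, sum) / sum(w)
      # post-stratification step: population share / weighted sample share
      w <- w * as.numeric(targets[[v]][lev]) / as.numeric(current[lev])
    }
    if (max(abs(w - w_old)) < tol) break
  }
  w
}

w <- rake(sample_df, targets)

# The weighted marginals should now reproduce the targets
round(tapply(w, sample_df$sex, sum) / sum(w), 3)
round(tapply(w, sample_df$education, sum) / sum(w), 3)
```

With a single stratification variable, the first pass of the inner loop reduces to the simple ratio shown in the sex example above.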
The integration of raking weights in cNORM is accomplished in three steps.
First, raking weights are computed from the proportions of the SVs in the target population and in the actual norm sample. Afterwards, the resulting raking weights are standardized by dividing every weight by the smallest raking weight, i.e., the smallest weight is set to 1.0, while the ratios between the weights remain the same. Consequently, underrepresented cases in the sample are weighted with a factor larger than 1.0. To compute the weights, please provide a data frame with three columns that specifies the population marginals. The first column contains the stratification variable, the second the factor level of the stratification variable and the third the proportion in the representative population. The function ‘computeWeights()’ is used to retrieve the weights. The original data and the marginals have to be passed as function parameters.
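The standardization can be illustrated with the two weights from the sex example above. This is only a sketch of the arithmetic, with variable names chosen for illustration:

```r
# Unstandardized weights from the sex example: population share / sample share
raw_weights <- c(female = 0.49 / 0.45, male = 0.51 / 0.55)

# Divide by the smallest weight so that the minimum becomes exactly 1.0
std_weights <- raw_weights / min(raw_weights)

round(std_weights, 3)
# male = 1, female is approximately 1.174; the ratio female/male is unchanged
```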
Secondly, the norm sample is ranked with respect to the raking weights using weighted percentiles. This step marks the actual start of the regression-based norming approach, and it is applied automatically in the ‘cnorm()’ function as soon as weights are specified.
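To make the idea of a weighted percentile concrete, the following sketch uses a simple mid-rank definition: the summed weight of all cases below a score plus half the weight of the tied cases, divided by the total weight. The function `weighted_percentile()` is hypothetical, and the exact formula used inside cNORM may differ.

```r
# Hypothetical helper: weighted mid-rank percentile of each raw score
weighted_percentile <- function(x, w) {
  total <- sum(w)
  vapply(seq_along(x), function(i) {
    below <- sum(w[x < x[i]])   # weight of all cases with a lower score
    tied  <- sum(w[x == x[i]])  # weight of all cases with the same score
    (below + 0.5 * tied) / total
  }, numeric(1))
}

raw <- c(10, 12, 12, 15, 20)

# Equal weights reproduce ordinary mid-rank percentiles
pr_equal <- weighted_percentile(raw, rep(1, 5))

# Upweighting the lowest-scoring case shifts the percentiles of all other cases
pr_raked <- weighted_percentile(raw, c(1.4, 1, 1, 1, 1))

round(cbind(raw, pr_equal, pr_raked), 3)
```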
Finally, the standardized raking weights are used in the weighted best-subset regression to obtain an adequate norm model. While the former steps can be regarded as data preparation, the computation of the regression-based norm model represents the actual norming process, since the resulting regression model is used for the mapping between the achieved raw score and the assigned norm score. Using the standardized raking weights in a weighted regression is intended to reduce overfitting of the regression model to overrepresented data points. This third step is also applied automatically when using the ‘cnorm()’ function.
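How case weights enter a regression can be shown with base R's `lm()`. The sketch below uses simulated data and a plain quadratic model; it illustrates weighted least squares only and does not reproduce cNORM's best-subset selection or its polynomial terms.

```r
# Simulated data (hypothetical): raw scores that grow with age
set.seed(42)
n   <- 200
age <- runif(n, 3, 16)
raw <- 20 + 12 * age - 0.3 * age^2 + rnorm(n, sd = 5)

# Hypothetical standardized weights: some cases count 1.4 times as much
w <- sample(c(1, 1.4), n, replace = TRUE)

# Unweighted and weighted least squares fits of the same model
fit_unweighted <- lm(raw ~ age + I(age^2))
fit_weighted   <- lm(raw ~ age + I(age^2), weights = w)

# Weighted fits minimize sum(w * residual^2), so cases with larger
# weights have more influence on the coefficients
rbind(unweighted = coef(fit_unweighted), weighted = coef(fit_weighted))

# Weighted root mean squared error of the weighted model
sqrt(sum(w * residuals(fit_weighted)^2) / sum(w))
```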
In the following, the usage of raking weights in regression-based norming with cNORM is illustrated in detail based on a non-representative norm sample for the German version of the Peabody Picture Vocabulary Test (PPVT-IV).
library(cNORM)
# Assign data to object norm.data
norm.data <- ppvt
head(norm.data)
#> age sex migration region raw group
#> 1 2.5971 1 0 west 120 3.160655
#> 2 2.5993 1 0 west 67 3.160655
#> 3 2.6241 1 0 west 23 3.160655
#> 4 2.8622 1 0 south 50 3.160655
#> 5 2.8764 1 0 south 44 3.160655
#> 6 2.9308 1 0 west 55 3.160655
For the post-stratification, we need population marginals for the relevant stratification variables as a data frame, with each level of each stratification variable in a row. The data frame must contain the names of the SVs (column 1), the single levels (column 2) and the corresponding proportion in the target population (column 3).
# Generate population marginals
marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
level = c(1,2,0,1),
prop = c(0.51, 0.49, 0.65, 0.35))
head(marginals)
#> var level prop
#> 1 sex 1 0.51
#> 2 sex 2 0.49
#> 3 migration 0 0.65
#> 4 migration 1 0.35
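Before computing weights, it can be worth checking that the proportions of each stratification variable sum to 1. The following is an optional sanity check in base R, not a step required by cNORM; it rebuilds the marginals from above so that it runs on its own.

```r
# Population marginals as defined above
marginals <- data.frame(var = c("sex", "sex", "migration", "migration"),
                        level = c(1, 2, 0, 1),
                        prop = c(0.51, 0.49, 0.65, 0.35))

# Sum of the proportions per stratification variable
prop_sums <- tapply(marginals$prop, marginals$var, sum)
prop_sums

# Stop with an error if any variable's proportions do not add up to 1
stopifnot(all(abs(prop_sums - 1) < 1e-8))
```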
To calculate raking weights, cNORM’s ‘computeWeights()’ function is used, with the norm sample data and the population marginals as function parameters.
weights <- computeWeights(data = norm.data, population.margins = marginals)
#> Raking converged normally after 3 iterations.
The initial weighted ranking and the actual norming process are started by calling the ‘cnorm()’ function and passing the raking weights via the ‘weights’ parameter; the fitted model is stored in the object norm.model.
The resulting model contains 22 terms with an RMSE of 3.48483.
summary(norm.model)
#> cNORM Model Summary
#> -------------------
#> Number of terms: 22
#> Adjusted R-squared: 0.9907
#> RMSE: 3.4848
#> Selection strategy: 1, largest consistent model
#> Highest consistent model: 22
#> Raw score variable: raw
#> Raw score range: 7 to 221
#> Age range: 3.160655 to 16.40039
#>
#> Regression function:
#> raw ~ 593.004651752334 + (-82.1870232774703*L1) + (3.5416649258307*L2) + (-0.0701519782499846*L3) + (0.00065635847543705*L4) + (-2.30435019905202e-06*L5) + (-141.768234215452*A1) + (5.76896983470499*A2) + (20.0330104018818*L1A1) + (-1.07381262923471*L1A2) + (0.0145431577811304*L1A3) + (-0.83401919489776*L2A1) + (0.0490333333882947*L2A2) + (-0.000812004618093347*L2A3) + (0.0162615352043706*L3A1) + (-0.000978183218445947*L3A2) + (1.67170404168208e-05*L3A3) + (-0.000147180491808169*L4A1) + (8.52621753671988e-06*L4A2) + (-1.33454728850254e-07*L4A3) + (4.86066534257422e-07*L5A1) + (-2.51019737779606e-08*L5A2) + (2.83566710828722e-10*L5A3)
#> Final solution: 22 terms (highest consistent model)
#> R-Square Adj. = 0.990684
#> Final regression model: raw ~ L1 + L2 + L3 + L4 + L5 + A1 + A2 + L1A1 + L1A2 + L1A3 + L2A1 + L2A2 + L2A3 + L3A1 + L3A2 + L3A3 + L4A1 + L4A2 + L4A3 + L5A1 + L5A2 + L5A3
#> Regression function: raw ~ 593.0046518 + (-82.18702328*L1) + (3.541664926*L2) + (-0.07015197825*L3) + (0.0006563584754*L4) + (-2.304350199e-06*L5) + (-141.7682342*A1) + (5.768969835*A2) + (20.0330104*L1A1) + (-1.073812629*L1A2) + (0.01454315778*L1A3) + (-0.8340191949*L2A1) + (0.04903333339*L2A2) + (-0.0008120046181*L2A3) + (0.0162615352*L3A1) + (-0.0009781832184*L3A2) + (1.671704042e-05*L3A3) + (-0.0001471804918*L4A1) + (8.526217537e-06*L4A2) + (-1.334547289e-07*L4A3) + (4.860665343e-07*L5A1) + (-2.510197378e-08*L5A2) + (2.835667108e-10*L5A3)
#> Raw Score RMSE = 3.48483
#> Post stratification was applied. The weights range from 1 to 1.415 (m = 1.116, sd = 0.182).
Moreover, the percentile plot reveals no signs of model violations, such as intersecting percentile curves, and the model reaches a high adjusted R2 of 0.9907.
plot(norm.model, "subset")
plot(norm.model, "norm")
We extensively simulated biased distributions and assessed whether our approach can mitigate the effects of unrepresentative samples. cNORM itself already corrects for several types of sampling error, namely deviations in specific age groups or unbalanced joint probabilities of stratification variables (with the marginals preserved). Weighted continuous norming also works well in most, but not all, use cases. Please note the following: