At the time of writing, semhelpinghands
has these groups of functions: manipulating parameter estimates tables,
comparing results from different methods, bootstrapping, and …
others.
This article will only introduce very briefly the groups. Some of them will be introduced in details in forthcoming article dedicated to them.
Let’s load the package first, and also load lavaan
.
In using lavaan
, I prefer reading the output of
parameterEstimates()
, which is compact and clear to me. I
sometimes would like to organize the rows and columns in ways meaningful
to a particular research questions. This can certainly be done using
base R or dplyr
. However, I am lazy and want to be able to
do things using just one or two functions, with just one or two
arguments.
This is a sample dataset for illustration, dvs_ivs
, with
3 predictors (x1
, x2
, and x3
), 3
outcome variables (y1
y2
, and
y3
), and a group variable (gp
).
First a single sample model:
data(dvs_ivs)
mod <-
"
y1 ~ x1 + x2 + x3
y2 ~ x1 + x3
y3 ~ y2 + x2
"
fit <- sem(model = mod,
data = dvs_ivs,
fixed.x = FALSE)
The parameter estimates tables:
est <- parameterEstimates(fit)
est
#> lhs op rhs est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 0.121 2.445 0.014 0.059 0.532
#> 8 y1 ~~ y1 0.695 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.013 0.143 7.071 0.000 0.732 1.294
#> 10 y3 ~~ y3 1.206 0.171 7.071 0.000 0.872 1.540
#> 11 y1 ~~ y3 -0.027 0.092 -0.292 0.770 -0.206 0.153
#> 12 x1 ~~ x1 0.926 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 0.129 7.071 0.000 0.659 1.165
A two-sample model is also fitted to the dataset:
This is the parameter estimates table:
est_gp <- parameterEstimates(fit_gp)
est_gp
#> lhs op rhs block group est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 1 1 0.210 0.149 1.404 0.160 -0.083 0.502
#> 2 y1 ~ x2 1 1 0.243 0.138 1.761 0.078 -0.027 0.513
#> 3 y1 ~ x3 1 1 0.246 0.144 1.710 0.087 -0.036 0.527
#> 4 y2 ~ x1 1 1 0.036 0.179 0.199 0.842 -0.315 0.386
#> 5 y2 ~ x3 1 1 0.217 0.176 1.232 0.218 -0.128 0.561
#> 6 y3 ~ y2 1 1 0.096 0.120 0.802 0.423 -0.139 0.332
#> 7 y3 ~ x2 1 1 0.372 0.135 2.754 0.006 0.107 0.637
#> 8 y1 ~~ y1 1 1 0.754 0.157 4.796 0.000 0.446 1.062
#> 9 y2 ~~ y2 1 1 1.148 0.239 4.796 0.000 0.679 1.617
#> 10 y3 ~~ y3 1 1 0.787 0.164 4.796 0.000 0.466 1.109
#> 11 y1 ~~ y3 1 1 0.010 0.114 0.085 0.932 -0.213 0.232
#> 12 x1 ~~ x1 1 1 0.798 0.166 4.796 0.000 0.472 1.125
#> 13 x1 ~~ x2 1 1 0.223 0.132 1.694 0.090 -0.035 0.481
#> 14 x1 ~~ x3 1 1 0.121 0.121 1.001 0.317 -0.116 0.358
#> 15 x2 ~~ x2 1 1 0.938 0.196 4.796 0.000 0.555 1.321
#> 16 x2 ~~ x3 1 1 0.139 0.131 1.061 0.289 -0.118 0.397
#> 17 x3 ~~ x3 1 1 0.825 0.172 4.796 0.000 0.488 1.162
#> 18 y1 ~1 1 1 -0.295 0.137 -2.148 0.032 -0.564 -0.026
#> 19 y2 ~1 1 1 0.027 0.168 0.163 0.870 -0.302 0.357
#> 20 y3 ~1 1 1 0.233 0.131 1.775 0.076 -0.024 0.490
#> 21 x1 ~1 1 1 0.282 0.132 2.141 0.032 0.024 0.540
#> 22 x2 ~1 1 1 -0.067 0.143 -0.469 0.639 -0.347 0.213
#> 23 x3 ~1 1 1 -0.123 0.134 -0.915 0.360 -0.385 0.140
#> 24 y1 ~ x1 2 2 0.258 0.102 2.530 0.011 0.058 0.458
#> 25 y1 ~ x2 2 2 0.484 0.121 3.998 0.000 0.247 0.721
#> 26 y1 ~ x3 2 2 0.121 0.104 1.165 0.244 -0.082 0.324
#> 27 y2 ~ x1 2 2 0.226 0.127 1.774 0.076 -0.024 0.475
#> 28 y2 ~ x3 2 2 0.261 0.129 2.016 0.044 0.007 0.514
#> 29 y3 ~ y2 2 2 -0.003 0.162 -0.020 0.984 -0.322 0.315
#> 30 y3 ~ x2 2 2 0.291 0.190 1.527 0.127 -0.082 0.664
#> 31 y1 ~~ y1 2 2 0.568 0.109 5.196 0.000 0.354 0.783
#> 32 y2 ~~ y2 2 2 0.882 0.170 5.196 0.000 0.549 1.214
#> 33 y3 ~~ y3 2 2 1.412 0.272 5.196 0.000 0.880 1.945
#> 34 y1 ~~ y3 2 2 0.050 0.122 0.410 0.682 -0.189 0.289
#> 35 x1 ~~ x1 2 2 1.020 0.196 5.196 0.000 0.635 1.405
#> 36 x1 ~~ x2 2 2 0.042 0.117 0.361 0.718 -0.187 0.271
#> 37 x1 ~~ x3 2 2 -0.099 0.137 -0.719 0.472 -0.367 0.170
#> 38 x2 ~~ x2 2 2 0.722 0.139 5.196 0.000 0.450 0.994
#> 39 x2 ~~ x3 2 2 0.013 0.115 0.110 0.912 -0.212 0.238
#> 40 x3 ~~ x3 2 2 0.985 0.190 5.196 0.000 0.613 1.356
#> 41 y1 ~1 2 2 0.007 0.104 0.068 0.946 -0.196 0.211
#> 42 y2 ~1 2 2 -0.050 0.129 -0.392 0.695 -0.303 0.202
#> 43 y3 ~1 2 2 -0.333 0.163 -2.042 0.041 -0.652 -0.013
#> 44 x1 ~1 2 2 0.101 0.137 0.738 0.460 -0.168 0.371
#> 45 x2 ~1 2 2 0.092 0.116 0.797 0.425 -0.134 0.319
#> 46 x3 ~1 2 2 -0.065 0.135 -0.482 0.630 -0.330 0.200
So, these are what semhelpinghands
has for now:
add_sig()
Despite the controversy over null hypothesis significance testing, we
still need to report them, for now. They are there, but I have to decode
them mentally. Here comes add_sig()
:
add_sig(est)
#> lhs op rhs est sig se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 * 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 *** 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 * 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 * 0.121 2.445 0.014 0.059 0.532
#> 8 y1 ~~ y1 0.695 *** 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.013 *** 0.143 7.071 0.000 0.732 1.294
#> 10 y3 ~~ y3 1.206 *** 0.171 7.071 0.000 0.872 1.540
#> 11 y1 ~~ y3 -0.027 0.092 -0.292 0.770 -0.206 0.153
#> 12 x1 ~~ x1 0.926 *** 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 *** 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 *** 0.129 7.071 0.000 0.659 1.165
If bootstrapping was used, it also supports using the confidence intervals, which may yield results different from the p-values when bootstrapping confidence intervals are used (but same results in this example):
add_sig(est,
use = c("pvalue", "ci"))
#> lhs op rhs est sig ci se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 * Sig 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 *** Sig 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 * Sig 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 * Sig 0.121 2.445 0.014 0.059 0.532
#> 8 y1 ~~ y1 0.695 *** Sig 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.013 *** Sig 0.143 7.071 0.000 0.732 1.294
#> 10 y3 ~~ y3 1.206 *** Sig 0.171 7.071 0.000 0.872 1.540
#> 11 y1 ~~ y3 -0.027 0.092 -0.292 0.770 -0.206 0.153
#> 12 x1 ~~ x1 0.926 *** Sig 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 *** Sig 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 *** Sig 0.129 7.071 0.000 0.659 1.165
It also works on the standardized solution:
std <- standardizedSolution(fit)
add_sig(std)
#> lhs op rhs est.std sig se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.208 * 0.087 2.394 0.017 0.038 0.378
#> 2 y1 ~ x2 0.364 *** 0.083 4.364 0.000 0.200 0.527
#> 3 y1 ~ x3 0.163 0.087 1.871 0.061 -0.008 0.333
#> 4 y2 ~ x1 0.138 0.096 1.438 0.150 -0.050 0.326
#> 5 y2 ~ x3 0.212 * 0.095 2.236 0.025 0.026 0.397
#> 6 y3 ~ y2 0.041 0.097 0.420 0.674 -0.149 0.231
#> 7 y3 ~ x2 0.237 * 0.094 2.517 0.012 0.053 0.422
#> 8 y1 ~~ y1 0.767 *** 0.074 10.369 0.000 0.622 0.912
#> 9 y2 ~~ y2 0.936 *** 0.047 19.799 0.000 0.844 1.029
#> 10 y3 ~~ y3 0.941 *** 0.046 20.649 0.000 0.852 1.031
#> 11 y1 ~~ y3 -0.029 0.100 -0.292 0.770 -0.225 0.167
#> 12 x1 ~~ x1 1.000 0.000 NA NA 1.000 1.000
#> 13 x1 ~~ x2 0.135 0.098 1.377 0.169 -0.057 0.328
#> 14 x1 ~~ x3 0.000 0.100 -0.001 0.999 -0.196 0.196
#> 15 x2 ~~ x2 1.000 0.000 NA NA 1.000 1.000
#> 16 x2 ~~ x3 0.084 0.099 0.849 0.396 -0.110 0.279
#> 17 x3 ~~ x3 1.000 0.000 NA NA 1.000 1.000
Note: But be careful about the interpretation of the
p-values in the standardized solution, which are based on the
delta method. See
vignette("standardizedSolution_boot_ci")
.
See the help page of add_sig()
for other options.
filter_by()
Its purpose is very simple, selecting rows by commonly used columns:
operators (op
), “dependent variables” (lhs
),
and “independent variables” (rhs
).
filter_by(est,
op = "~")
#> lhs op rhs est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.206 0.087 2.354 0.019 0.034 0.377
#> 2 y1 ~ x2 0.381 0.093 4.100 0.000 0.199 0.563
#> 3 y1 ~ x3 0.162 0.088 1.852 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.149 0.105 1.425 0.154 -0.056 0.354
#> 5 y2 ~ x3 0.230 0.105 2.187 0.029 0.024 0.437
#> 6 y3 ~ y2 0.044 0.106 0.420 0.675 -0.163 0.251
#> 7 y3 ~ x2 0.295 0.121 2.445 0.014 0.059 0.532
It also supports filtering by group using group labels:
filter_by(est_gp,
op = "~",
group = "gp1",
fit = fit_gp)
#> lhs op rhs block group est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 1 1 0.210 0.149 1.404 0.160 -0.083 0.502
#> 2 y1 ~ x2 1 1 0.243 0.138 1.761 0.078 -0.027 0.513
#> 3 y1 ~ x3 1 1 0.246 0.144 1.710 0.087 -0.036 0.527
#> 4 y2 ~ x1 1 1 0.036 0.179 0.199 0.842 -0.315 0.386
#> 5 y2 ~ x3 1 1 0.217 0.176 1.232 0.218 -0.128 0.561
#> 6 y3 ~ y2 1 1 0.096 0.120 0.802 0.423 -0.139 0.332
#> 7 y3 ~ x2 1 1 0.372 0.135 2.754 0.006 0.107 0.637
See the help page of filter_by()
for other options.
group_by_dvs()
and
group_by_ivs()
Sometimes I want the conventional iv-column-dv-row or
dv-column-iv-row format. That’s what group_by_dvs()
and
group_by_ivs()
are for:
group_by_dvs(est)
#> iv est_y1 est_y2 est_y3
#> x1 x1 0.206 0.149 --
#> x2 x2 0.381 -- 0.295
#> x3 x3 0.162 0.230 --
#> y2 y2 -- -- 0.044
group_by_ivs(est)
#> dv est_x1 est_x2 est_x3 est_y2
#> y1 y1 0.206 0.381 0.162 --
#> y2 y2 0.149 -- 0.230 --
#> y3 y3 -- 0.295 -- 0.044
They also supports extracting another column:
group_by_dvs(est,
col_name = "pvalue")
#> iv pvalue_y1 pvalue_y2 pvalue_y3
#> x1 x1 0.019 0.154 --
#> x2 x2 0.000 -- 0.014
#> x3 x3 0.064 0.029 --
#> y2 y2 -- -- 0.675
group_by_ivs(est,
col_name = "pvalue")
#> dv pvalue_x1 pvalue_x2 pvalue_x3 pvalue_y2
#> y1 y1 0.019 0.000 0.064 --
#> y2 y2 0.154 -- 0.029 --
#> y3 y3 -- 0.014 -- 0.675
See the help page of group_by_dvs()
and
group_by_ivs()
for other options.
group_by_groups()
In multiple sample models, one common task is comparing results
across groups. I wrote group_by_groups()
for this task, to
compare results side-by-side:
group_by_groups(est_gp)
#> lhs op rhs est_1 est_2
#> 1 y1 ~ x1 0.210 0.258
#> 2 y1 ~ x2 0.243 0.484
#> 3 y1 ~ x3 0.246 0.121
#> 4 y2 ~ x1 0.036 0.226
#> 5 y2 ~ x3 0.217 0.261
#> 6 y3 ~ x2 0.372 0.291
#> 7 y3 ~ y2 0.096 -0.003
#> 8 x1 ~~ x1 0.798 1.020
#> 9 x1 ~~ x2 0.223 0.042
#> 10 x1 ~~ x3 0.121 -0.099
#> 11 x2 ~~ x2 0.938 0.722
#> 12 x2 ~~ x3 0.139 0.013
#> 13 x3 ~~ x3 0.825 0.985
#> 14 y1 ~~ y1 0.754 0.568
#> 15 y1 ~~ y3 0.010 0.050
#> 16 y2 ~~ y2 1.148 0.882
#> 17 y3 ~~ y3 0.787 1.412
#> 18 x1 ~1 0.282 0.101
#> 19 x2 ~1 -0.067 0.092
#> 20 x3 ~1 -0.123 -0.065
#> 21 y1 ~1 -0.295 0.007
#> 22 y2 ~1 0.027 -0.050
#> 23 y3 ~1 0.233 -0.333
It also supports extracting several columns:
group_by_groups(est_gp,
col_names = c("est", "pvalue"))
#> lhs op rhs est_1 est_2 pvalue_1 pvalue_2
#> 1 y1 ~ x1 0.210 0.258 0.160 0.011
#> 2 y1 ~ x2 0.243 0.484 0.078 0.000
#> 3 y1 ~ x3 0.246 0.121 0.087 0.244
#> 4 y2 ~ x1 0.036 0.226 0.842 0.076
#> 5 y2 ~ x3 0.217 0.261 0.218 0.044
#> 6 y3 ~ x2 0.372 0.291 0.006 0.127
#> 7 y3 ~ y2 0.096 -0.003 0.423 0.984
#> 8 x1 ~~ x1 0.798 1.020 0.000 0.000
#> 9 x1 ~~ x2 0.223 0.042 0.090 0.718
#> 10 x1 ~~ x3 0.121 -0.099 0.317 0.472
#> 11 x2 ~~ x2 0.938 0.722 0.000 0.000
#> 12 x2 ~~ x3 0.139 0.013 0.289 0.912
#> 13 x3 ~~ x3 0.825 0.985 0.000 0.000
#> 14 y1 ~~ y1 0.754 0.568 0.000 0.000
#> 15 y1 ~~ y3 0.010 0.050 0.932 0.682
#> 16 y2 ~~ y2 1.148 0.882 0.000 0.000
#> 17 y3 ~~ y3 0.787 1.412 0.000 0.000
#> 18 x1 ~1 0.282 0.101 0.032 0.460
#> 19 x2 ~1 -0.067 0.092 0.639 0.425
#> 20 x3 ~1 -0.123 -0.065 0.360 0.630
#> 21 y1 ~1 -0.295 0.007 0.032 0.946
#> 22 y2 ~1 0.027 -0.050 0.870 0.695
#> 23 y3 ~1 0.233 -0.333 0.076 0.041
If the fit object is used, it can print group labels:
group_by_groups(fit_gp,
col_names = c("est", "pvalue"))
#> lhs op rhs est_gp1 est_gp2 pvalue_gp1 pvalue_gp2
#> 1 y1 ~ x1 0.210 0.258 0.160 0.011
#> 2 y1 ~ x2 0.243 0.484 0.078 0.000
#> 3 y1 ~ x3 0.246 0.121 0.087 0.244
#> 4 y2 ~ x1 0.036 0.226 0.842 0.076
#> 5 y2 ~ x3 0.217 0.261 0.218 0.044
#> 6 y3 ~ x2 0.372 0.291 0.006 0.127
#> 7 y3 ~ y2 0.096 -0.003 0.423 0.984
#> 8 x1 ~~ x1 0.798 1.020 0.000 0.000
#> 9 x1 ~~ x2 0.223 0.042 0.090 0.718
#> 10 x1 ~~ x3 0.121 -0.099 0.317 0.472
#> 11 x2 ~~ x2 0.938 0.722 0.000 0.000
#> 12 x2 ~~ x3 0.139 0.013 0.289 0.912
#> 13 x3 ~~ x3 0.825 0.985 0.000 0.000
#> 14 y1 ~~ y1 0.754 0.568 0.000 0.000
#> 15 y1 ~~ y3 0.010 0.050 0.932 0.682
#> 16 y2 ~~ y2 1.148 0.882 0.000 0.000
#> 17 y3 ~~ y3 0.787 1.412 0.000 0.000
#> 18 x1 ~1 0.282 0.101 0.032 0.460
#> 19 x2 ~1 -0.067 0.092 0.639 0.425
#> 20 x3 ~1 -0.123 -0.065 0.360 0.630
#> 21 y1 ~1 -0.295 0.007 0.032 0.946
#> 22 y2 ~1 0.027 -0.050 0.870 0.695
#> 23 y3 ~1 0.233 -0.333 0.076 0.041
See the help page of group_by_groups()
for other
options.
group_by_models()
This is inspired by the proposal Rönkkö proposed in a GitHub issue
for semTools
. I want something simple for a quick overview
and so I wrote group_by_models()
.
Suppose this is the other model fitted:
mod2 <-
"
y1 ~ x1 + x2 + x3
y2 ~ x1 + x2
y3 ~ y2 + x1
"
fit2 <- sem(model = mod2,
data = dvs_ivs,
fixed.x = FALSE)
est2 <- parameterEstimates(fit2)
est2
#> lhs op rhs est se z pvalue ci.lower ci.upper
#> 1 y1 ~ x1 0.209 0.087 2.394 0.017 0.038 0.381
#> 2 y1 ~ x2 0.387 0.093 4.171 0.000 0.205 0.569
#> 3 y1 ~ x3 0.162 0.088 1.853 0.064 -0.009 0.334
#> 4 y2 ~ x1 0.176 0.106 1.655 0.098 -0.032 0.384
#> 5 y2 ~ x2 -0.209 0.112 -1.864 0.062 -0.430 0.011
#> 6 y3 ~ y2 0.023 0.109 0.210 0.834 -0.190 0.236
#> 7 y3 ~ x1 -0.157 0.117 -1.339 0.181 -0.388 0.073
#> 8 y1 ~~ y1 0.695 0.098 7.071 0.000 0.502 0.888
#> 9 y2 ~~ y2 1.026 0.145 7.071 0.000 0.741 1.310
#> 10 y3 ~~ y3 1.254 0.177 7.071 0.000 0.906 1.601
#> 11 y1 ~~ y3 -0.028 0.093 -0.299 0.765 -0.211 0.155
#> 12 x1 ~~ x1 0.926 0.131 7.071 0.000 0.669 1.183
#> 13 x1 ~~ x2 0.118 0.088 1.339 0.180 -0.055 0.291
#> 14 x1 ~~ x3 0.000 0.092 -0.001 0.999 -0.180 0.180
#> 15 x2 ~~ x2 0.828 0.117 7.071 0.000 0.598 1.057
#> 16 x2 ~~ x3 0.073 0.087 0.840 0.401 -0.098 0.244
#> 17 x3 ~~ x3 0.912 0.129 7.071 0.000 0.659 1.165
These two models have no nested relationships. To compare the
estimates, group_by_models()
can be used:
group_by_models(list(Model1 = est,
Model2 = est2))
#> lhs op rhs est_Model1 est_Model2
#> 1 y1 ~ x1 0.206 0.209
#> 2 y1 ~ x2 0.381 0.387
#> 3 y1 ~ x3 0.162 0.162
#> 4 y2 ~ x1 0.149 0.176
#> 5 y2 ~ x2 NA -0.209
#> 6 y2 ~ x3 0.230 NA
#> 7 y3 ~ x1 NA -0.157
#> 8 y3 ~ x2 0.295 NA
#> 9 y3 ~ y2 0.044 0.023
#> 10 x1 ~~ x1 0.926 0.926
#> 11 x1 ~~ x2 0.118 0.118
#> 12 x1 ~~ x3 0.000 0.000
#> 13 x2 ~~ x2 0.828 0.828
#> 14 x2 ~~ x3 0.073 0.073
#> 15 x3 ~~ x3 0.912 0.912
#> 16 y1 ~~ y1 0.695 0.695
#> 17 y1 ~~ y3 -0.027 -0.028
#> 18 y2 ~~ y2 1.013 1.026
#> 19 y3 ~~ y3 1.206 1.254
It can also compare several columns:
group_by_models(list(Model1 = est,
Model2 = est2),
col_names = c("est", "pvalue"))
#> lhs op rhs est_Model1 est_Model2 pvalue_Model1 pvalue_Model2
#> 1 y1 ~ x1 0.206 0.209 0.019 0.017
#> 2 y1 ~ x2 0.381 0.387 0.000 0.000
#> 3 y1 ~ x3 0.162 0.162 0.064 0.064
#> 4 y2 ~ x1 0.149 0.176 0.154 0.098
#> 5 y2 ~ x2 NA -0.209 NA 0.062
#> 6 y2 ~ x3 0.230 NA 0.029 NA
#> 7 y3 ~ x1 NA -0.157 NA 0.181
#> 8 y3 ~ x2 0.295 NA 0.014 NA
#> 9 y3 ~ y2 0.044 0.023 0.675 0.834
#> 10 x1 ~~ x1 0.926 0.926 0.000 0.000
#> 11 x1 ~~ x2 0.118 0.118 0.180 0.180
#> 12 x1 ~~ x3 0.000 0.000 0.999 0.999
#> 13 x2 ~~ x2 0.828 0.828 0.000 0.000
#> 14 x2 ~~ x3 0.073 0.073 0.401 0.401
#> 15 x3 ~~ x3 0.912 0.912 0.000 0.000
#> 16 y1 ~~ y1 0.695 0.695 0.000 0.000
#> 17 y1 ~~ y3 -0.027 -0.028 0.770 0.765
#> 18 y2 ~~ y2 1.013 1.026 0.000 0.000
#> 19 y3 ~~ y3 1.206 1.254 0.000 0.000
See the help page of group_by_models()
for other
options.
sort_by()
The function sort_by()
can be used to sort rows using
(a) common fields such as "op"
, "lhs"
, and
"rhs"
, (b) operators such as "~"
and
"~~"
. It can be used on the output of some other functions
that manipulate a parameter estimates table.
out <- group_by_groups(est_gp,
col_names = c("est", "pvalue"))
out <- filter_by(out,
op = c("~", "~~"))
sort_by(out,
by = c("op", "rhs"))
#> lhs op rhs est_1 est_2 pvalue_1 pvalue_2
#> 1 y1 ~ x1 0.210 0.258 0.160 0.011
#> 2 y2 ~ x1 0.036 0.226 0.842 0.076
#> 3 y1 ~ x2 0.243 0.484 0.078 0.000
#> 4 y3 ~ x2 0.372 0.291 0.006 0.127
#> 5 y1 ~ x3 0.246 0.121 0.087 0.244
#> 6 y2 ~ x3 0.217 0.261 0.218 0.044
#> 7 y3 ~ y2 0.096 -0.003 0.423 0.984
#> 8 x1 ~~ x1 0.798 1.020 0.000 0.000
#> 9 x1 ~~ x2 0.223 0.042 0.090 0.718
#> 10 x2 ~~ x2 0.938 0.722 0.000 0.000
#> 11 x1 ~~ x3 0.121 -0.099 0.317 0.472
#> 12 x2 ~~ x3 0.139 0.013 0.289 0.912
#> 13 x3 ~~ x3 0.825 0.985 0.000 0.000
#> 14 y1 ~~ y1 0.754 0.568 0.000 0.000
#> 15 y2 ~~ y2 1.148 0.882 0.000 0.000
#> 16 y1 ~~ y3 0.010 0.050 0.932 0.682
#> 17 y3 ~~ y3 0.787 1.412 0.000 0.000
Some functions, such as group_by_models()
, will
automatically call sort_by()
to sort the results.
The default order should be acceptable in most cases, but can also be
customized. See the help page of sort_by()
on customizing
the order.
Though not officially supported, piping using |>
can
be used with most of the functions that manipulate parameter estimates
tables. For example:
est_gp |>
add_sig() |>
group_by_groups(col_names = c("est", "pvalue", "sig"),
group_first = FALSE) |>
filter_by(op = c("~"))
#> lhs op rhs est_1 pvalue_1 sig_1 est_2 pvalue_2 sig_2
#> 1 y1 ~ x1 0.210 0.160 0.258 0.011 *
#> 2 y1 ~ x2 0.243 0.078 0.484 0.000 ***
#> 3 y1 ~ x3 0.246 0.087 0.121 0.244
#> 4 y2 ~ x1 0.036 0.842 0.226 0.076
#> 5 y2 ~ x3 0.217 0.218 0.261 0.044 *
#> 6 y3 ~ x2 0.372 0.006 ** 0.291 0.127
#> 7 y3 ~ y2 0.096 0.423 -0.003 0.984
Though I believe the choice of the estimation method should be
justified, suppose we want to assess the sensitivity of the parameter
estimate results to the methods used, compare_estimators()
can be used as a quick way to compare results from different
methods.
out <- compare_estimators(fit,
estimator = c("ML", "GLS", "MLR"))
group_by_models(out,
col_names = c("se", "pvalue"))
#> lhs op rhs se_ML se_GLS se_MLR pvalue_ML pvalue_GLS pvalue_MLR
#> 1 y1 ~ x1 0.087 0.094 0.067 0.019 0.028 0.002
#> 2 y1 ~ x2 0.093 0.100 0.091 0.000 0.000 0.000
#> 3 y1 ~ x3 0.088 0.089 0.084 0.064 0.067 0.053
#> 4 y2 ~ x1 0.105 0.112 0.107 0.154 0.148 0.165
#> 5 y2 ~ x3 0.105 0.107 0.123 0.029 0.028 0.061
#> 6 y3 ~ x2 0.121 0.133 0.111 0.014 0.019 0.008
#> 7 y3 ~ y2 0.106 0.116 0.108 0.675 0.654 0.681
#> 8 x1 ~~ x1 0.131 0.129 0.136 0.000 0.000 0.000
#> 9 x1 ~~ x2 0.088 0.090 0.077 0.180 0.202 0.122
#> 10 x1 ~~ x3 0.092 0.092 0.102 0.999 0.840 0.999
#> 11 x2 ~~ x2 0.117 0.114 0.127 0.000 0.000 0.000
#> 12 x2 ~~ x3 0.087 0.089 0.091 0.401 0.414 0.423
#> 13 x3 ~~ x3 0.129 0.131 0.111 0.000 0.000 0.000
#> 14 y1 ~~ y1 0.098 0.100 0.119 0.000 0.000 0.000
#> 15 y1 ~~ y3 0.092 0.093 0.104 0.770 0.793 0.797
#> 16 y2 ~~ y2 0.143 0.139 0.141 0.000 0.000 0.000
#> 17 y3 ~~ y3 0.171 0.167 0.171 0.000 0.000 0.000
It simply refits the models for each estimator and returns the
results. They can then be treated as different “models” and processed by
group_by_models()
.
se_ratios()
is a wrapper of
group_by_models()
used to compare the standard errors by
different estimators in the output of
compare_estimators()
:
se_ratios(out,
reference = "ML")
#> lhs op rhs se_ML se_GLS se_MLR ratio_ML ratio_GLS ratio_MLR
#> 1 y1 ~ x1 0.087 0.094 0.067 1 1.072 0.765
#> 2 y1 ~ x2 0.093 0.100 0.091 1 1.073 0.977
#> 3 y1 ~ x3 0.088 0.089 0.084 1 1.013 0.958
#> 4 y2 ~ x1 0.105 0.112 0.107 1 1.068 1.027
#> 5 y2 ~ x3 0.105 0.107 0.123 1 1.011 1.165
#> 6 y3 ~ x2 0.121 0.133 0.111 1 1.101 0.918
#> 7 y3 ~ y2 0.106 0.116 0.108 1 1.099 1.020
#> 8 x1 ~~ x1 0.131 0.129 0.136 1 0.985 1.036
#> 9 x1 ~~ x2 0.088 0.090 0.077 1 1.015 0.867
#> 10 x1 ~~ x3 0.092 0.092 0.102 1 0.999 1.114
#> 11 x2 ~~ x2 0.117 0.114 0.127 1 0.974 1.084
#> 12 x2 ~~ x3 0.087 0.089 0.091 1 1.015 1.049
#> 13 x3 ~~ x3 0.129 0.131 0.111 1 1.012 0.859
#> 14 y1 ~~ y1 0.098 0.100 0.119 1 1.015 1.210
#> 15 y1 ~~ y3 0.092 0.093 0.104 1 1.015 1.134
#> 16 y2 ~~ y2 0.143 0.139 0.141 1 0.973 0.987
#> 17 y3 ~~ y3 0.171 0.167 0.171 1 0.979 1.005
See the help page of compare_estimators()
for other
options.
One issue with the standardized solution is the confidence intervals.
They are based on the delta method even if se = "boot"
is
used. For indirect effects, for which bootstrap confidence intervals are
commonly used, the confidence intervals for them in the standardized
solution are not what usually reported other tools for mediation. There
are some other powerful tools on the Internet and the [Google Group for
lavaan], see this
thread, to address this problem. I wrote
standardizedSolution_boot_ci()
not to replace them (it
obviously can’t), but to address a very specific case I usually
encounter myself: Generating the bootstrap confidence intervals for
standardized estimates based on the bootstrap estimates already
generated by se = "boot"
.
Please see vignette("standardizedSolution_boot_ci")
for
an illustration on standardizedSolution_boot_ci()
.
Another issue is examine bootstrap estimates, such as the
distribution of the estimates. The function plot_boot()
and
related functions can be used to compute bootstrap estimates and plot
them. The bootstrap estimates of free parameters (stored by
lavaan
), user-defined parameters (computed by
store_boot_def()
), and the standardized solution (computed
by store_boot_est_std()
) can be plotted.
Please see the article
on plot_boot()
on how to use this function.
lavaan::lavaan()
and its wrappers suc has
lavaan::sem()
and lavaan::cfa()
allow users to
set several options using an estimator
: ML
,
GLS
, WLSMV
, and others. However, it is not
easy to remember what the options set for an estimator. Instead of
finding them from the output of summary()
, this function
shows some of them in one table for a quick overview. This is an
example:
data(dvs_ivs)
mod <-
"
y1 ~ x1 + x2 + x3
y2 ~ x1 + x3
y3 ~ y2 + x2
"
fit_default <- sem(model = mod,
data = dvs_ivs)
show_more_options(fit_default)
#> Options Call Actual
#> Estimator(s) default ML
#> Standard Error (SE) default standard
#> Model Test Statistic(s) default standard
#> How Missing Data is Handled default listwise
#> Information Matrix (for SE) default expected
#> Information Matrix (for Model Test) default expected
#> Mean Structure default No
#> 'x' Fixed default TRUE
fit_MLR <- sem(model = mod,
data = dvs_ivs,
estimator = "MLR")
show_more_options(fit_MLR)
#> Options Call Actual
#> Estimator(s) MLR ML
#> Standard Error (SE) default robust.huber.white
#> Model Test Statistic(s) default yuan.bentler.mplus
#> How Missing Data is Handled default listwise
#> Information Matrix (for SE) default observed
#> Information Matrix (for Model Test) default observed
#> Mean Structure default No
#> 'x' Fixed default TRUE
fit_MLR_fiml <- sem(model = mod,
data = dvs_ivs,
estimator = "MLR",
missing = "fiml")
show_more_options(fit_MLR_fiml)
#> Options Call Actual
#> Estimator(s) MLR ML
#> Standard Error (SE) default robust.huber.white
#> Model Test Statistic(s) default yuan.bentler.mplus
#> How Missing Data is Handled fiml ml
#> Information Matrix (for SE) default observed
#> Information Matrix (for Model Test) default observed
#> Mean Structure default Yes
#> 'x' Fixed default TRUE
In structural equation modeling, closed-form solution is rare and
optimization (minimization) is used to find the/a solution. Out of
curiosity and teaching, I wrote a function to capture the minimization
history so I can examine it or even plot the process. The function,
record_history()
, is still in early development but should
work for now for common simple scenarios. Please refer the help page of
record_history()
to learn more about it.
In lavaan
, I rarely need to manually add covariances
between exogenous variables (defined in a loose sense: variables appear
on the right-hand side but not on the left-hand side of ~
).
However, I came into a situation in which lavaan
will not
do this (for good reasons). For example, when a covariance between a
residual term and an exogenous variable is set to free. I wrote two
simple functions, add_exo_cov()
and
auto_exo_cov()
, for this purpose. Please refer to their
help pages for further information.