
This function simulates item difficulties from a mixture of normal distributions. Each component is centered on an existing difficulty estimate and scaled by its associated standard error. New difficulties are then sampled from the resulting mixture distribution using inverse CDF sampling.
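The idea can be sketched in base R. This is a minimal illustration, not the package's implementation: the pool values are made up, the helper names `mixture_cdf` and `sample_mixture` are hypothetical, and equal component weights and root-finding via `uniroot()` are assumptions.

```r
# Hypothetical pool of difficulty estimates and standard errors (made-up values)
pool <- data.frame(
  difficulty = c(-1.0, -0.2, 0.1, 0.8),
  SE         = c(0.15, 0.10, 0.20, 0.25)
)

# CDF of an equal-weight mixture of N(difficulty_i, SE_i^2) components
# (equal weights are an assumption of this sketch)
mixture_cdf <- function(x, pool) {
  mean(pnorm(x, mean = pool$difficulty, sd = pool$SE))
}

# Inverse CDF sampling: draw u ~ Uniform(0, 1), then solve mixture_cdf(x) = u
sample_mixture <- function(n, pool) {
  # Search bounds wide enough that the CDF is numerically 0 and 1 at the ends
  lower <- min(pool$difficulty - 10 * pool$SE)
  upper <- max(pool$difficulty + 10 * pool$SE)
  u <- runif(n)
  vapply(u, function(p) {
    uniroot(function(x) mixture_cdf(x, pool) - p,
            lower = lower, upper = upper, tol = 1e-8)$root
  }, numeric(1))
}

set.seed(1)
draws <- sample_mixture(5, pool)
draws
```

Because the mixture CDF has no closed-form inverse, the sketch inverts it numerically, one uniform draw at a time.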

Usage

irw_simu_diff(
  num_items = 10,
  num_replications = 1,
  irw_names = NULL,
  difficulty_pool = NULL
)

Arguments

num_items

Number of item difficulties to simulate per replication.

num_replications

Number of replications to perform. If 1, returns a numeric vector. If >1, returns a data frame.

irw_names

Optional character vector of IRW dataset names. If supplied, only difficulties from these datasets in diff_long are used.

difficulty_pool

Optional custom data frame with columns dataset, difficulty, and SE. If provided, overrides the default IRW difficulty pool (diff_long).
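Before passing a custom pool, it may help to confirm that it has the expected columns. The helper below is a hypothetical sketch and not part of the irw package; the checks on finite difficulties and positive standard errors are assumptions about what a sensible pool looks like.

```r
# Hypothetical helper (not part of irw): check a custom pool before use
check_difficulty_pool <- function(pool) {
  required <- c("dataset", "difficulty", "SE")
  missing_cols <- setdiff(required, names(pool))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Assumption: difficulties are finite and standard errors are positive
  stopifnot(
    is.numeric(pool$difficulty), all(is.finite(pool$difficulty)),
    is.numeric(pool$SE), all(is.finite(pool$SE)), all(pool$SE > 0)
  )
  invisible(pool)
}

my_pool <- data.frame(
  dataset = "x",
  difficulty = c(-0.2, 0.1),
  SE = c(0.1, 0.2)
)
check_difficulty_pool(my_pool)
```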

Value

A numeric vector of difficulties (if num_replications = 1), or a data frame with replication and difficulty columns (if num_replications > 1).
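As a rough illustration of the multi-replication shape, the sketch below builds a data frame with replication and difficulty columns from placeholder draws. It assumes a long layout with num_items rows per replication, which the description above suggests but does not state outright; the values come from `rnorm()` and merely stand in for simulated difficulties.

```r
num_items <- 3
num_replications <- 2

set.seed(42)
# Placeholder draws standing in for simulated difficulties
draws <- rnorm(num_items * num_replications)

# Assumed long layout: one row per simulated item, labelled by replication
long <- data.frame(
  replication = rep(seq_len(num_replications), each = num_items),
  difficulty  = draws
)

# Recover one numeric vector per replication
by_rep <- split(long$difficulty, long$replication)
```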

Details

By default, the function uses diff_long, a built-in dataset included in the irw package. This dataset contains item difficulty estimates and standard errors from a curated subset of IRW datasets. You can:

  • Use the full IRW difficulty pool (diff_long)

  • Filter to specific IRW datasets via irw_names

  • Provide your own difficulty pool via difficulty_pool
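The filtering option can be pictured as a simple subset on the dataset column. The sketch below uses a made-up pool with the same columns as a custom difficulty pool; the dataset names and values are hypothetical, and the function may handle unknown names differently.

```r
# Hypothetical pool with dataset, difficulty, and SE columns (made-up values)
pool <- data.frame(
  dataset    = c("a", "a", "b", "c"),
  difficulty = c(-0.5, 0.3, 1.1, -1.2),
  SE         = c(0.10, 0.20, 0.15, 0.30)
)

keep <- c("a", "c")

# Flag requested names that do not occur in the pool
unknown <- setdiff(keep, unique(pool$dataset))
if (length(unknown) > 0) {
  warning("Unknown dataset name(s): ", paste(unknown, collapse = ", "))
}

# Keep only rows belonging to the requested datasets
filtered <- pool[pool$dataset %in% keep, , drop = FALSE]
```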

The method follows Zhang et al. (2025), who construct realistic empirical difficulty distributions by accounting for the uncertainty in item difficulty estimates.

References

Zhang, L., Liu, Y., Molenaar, D., & Domingue, B. (2025). Realistic Simulation of Item Difficulties. https://doi.org/10.31234/osf.io/jbhxy_v1

Examples

if (FALSE) { # \dontrun{
# Use all IRW data (default)
irw_simu_diff(num_items = 5)

# Filter to specific IRW datasets
irw_simu_diff(num_items = 5, irw_names = c("psychtools_epi", "psychtools_blot"))

# Use a custom difficulty pool
my_pool <- data.frame(dataset = "x",
                      difficulty = c(-0.2, 0.1),
                      SE = c(0.1, 0.2))
irw_simu_diff(num_items = 5, difficulty_pool = my_pool)

# Explore built-in IRW difficulty pool
head(diff_long)
unique(diff_long$dataset)
} # }