Returns the names of datasets in the Item Response Warehouse (IRW) that match user-specified metadata, tag values, variable presence, and license criteria.

Usage

irw_filter(
  n_responses = NULL,
  n_categories = NULL,
  n_participants = NULL,
  n_items = NULL,
  responses_per_participant = NULL,
  responses_per_item = NULL,
  density = c(0.5, 1),
  var = NULL,
  age_range = NULL,
  child_age__for_child_focused_studies_ = NULL,
  construct_type = NULL,
  construct_name = NULL,
  sample = NULL,
  measurement_tool = NULL,
  item_format = NULL,
  primary_language_s_ = NULL,
  longitudinal = NULL,
  license = NULL
)

Arguments

n_responses

Numeric vector of length 1 or 2. Filters datasets by total number of responses.

  • Length 1: exact value (e.g., n_responses = 1000)

  • Length 2: range (e.g., n_responses = c(1000, Inf))
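The two forms can be pictured with a small base-R sketch. The helper `matches_numeric()` is hypothetical and is not part of the package. It assumes range bounds are inclusive at both ends, which the description above does not state.

```r
# Hypothetical helper (not part of the package): shows how a length-1 or
# length-2 numeric filter could be applied to a metadata column.
# Assumption: range bounds are inclusive.
matches_numeric <- function(x, filter) {
  if (is.null(filter)) {
    return(rep(TRUE, length(x)))        # NULL means "no filter"
  }
  stopifnot(is.numeric(filter), length(filter) %in% c(1, 2))
  if (length(filter) == 1) {
    x == filter                         # exact value
  } else {
    x >= filter[1] & x <= filter[2]     # inclusive range
  }
}

n_responses <- c(500, 1000, 25000)
matches_numeric(n_responses, 1000)          # exact match on 1000
matches_numeric(n_responses, c(1000, Inf))  # 1000 or more
```

Using Inf as the upper bound, as in the example above, gives an open-ended "at least" filter.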

n_categories

Numeric vector of length 1 or 2. Filters by number of unique response categories.

n_participants

Numeric vector of length 1 or 2. Filters by number of unique participants (id).

n_items

Numeric vector of length 1 or 2. Filters by number of unique items.

responses_per_participant

Numeric vector of length 1 or 2. Filters by average responses per participant.

responses_per_item

Numeric vector of length 1 or 2. Filters by average responses per item.

density

Numeric vector of length 1 or 2, or NULL. Filters by matrix density.

  • Default is c(0.5, 1) to exclude sparse matrices.

  • Use NULL to disable this filter.

var

Character vector. Filters datasets by presence of variables.

  • Use exact names (e.g., "rt", "wave"), or

  • Use a prefix (e.g., "cov_") to match any variable starting with that prefix.
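As a rough illustration of exact versus prefix matching, consider the base-R sketch below. The helper `has_vars()` is hypothetical and is not part of the package. It assumes that an entry ending in an underscore is treated as a prefix and that every requested entry must be found; the actual matching rules may differ.

```r
# Hypothetical helper (not part of the package).
# Assumptions: entries ending in "_" are prefixes; all entries must match.
has_vars <- function(dataset_vars, var) {
  hit <- vapply(var, function(v) {
    if (endsWith(v, "_")) {
      any(startsWith(dataset_vars, v))  # prefix match, e.g. "cov_"
    } else {
      v %in% dataset_vars               # exact name match, e.g. "rt"
    }
  }, logical(1))
  all(hit)
}

vars <- c("id", "item", "resp", "wave", "cov_age", "cov_gender")
has_vars(vars, "rt")                # no response-time column here
has_vars(vars, c("wave", "cov_"))   # has wave and at least one covariate
```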

age_range

Character vector. Filters by participant age group (e.g., "Adult (18+)"). See irw_tag_options("age_range") for values.

child_age__for_child_focused_studies_

Character vector. Filters by child age subgroup. See irw_tag_options("child_age__for_child_focused_studies_") for values.

construct_type

Character vector. Filters by high-level construct category (e.g., "Affective/mental health"). See irw_tag_options("construct_type").

construct_name

Character vector. Filters by specific construct (e.g., "Big Five"). See irw_tag_options("construct_name").

sample

Character vector. Filters by sample type or recruitment method (e.g., "Educational", "Clinical"). See irw_tag_options("sample").

measurement_tool

Character vector. Filters by instrument type (e.g., "Survey/questionnaire"). See irw_tag_options("measurement_tool").

item_format

Character vector. Filters by item format (e.g., "Likert Scale/selected response"). See irw_tag_options("item_format").

primary_language_s_

Character vector. Filters by language used (e.g., "eng"). See irw_tag_options("primary_language_s_").

longitudinal

Logical or NULL. Filters based on presence of longitudinal structure.

  • TRUE: include only datasets with variables like wave or date

  • FALSE: exclude datasets with those variables

  • NULL (default): no filter
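The three settings can be sketched in base R as follows. The helper `keep_by_longitudinal()` is hypothetical and is not part of the package. It assumes that wave and date are the only variables that signal longitudinal structure, whereas the description above says only "variables like" these.

```r
# Hypothetical helper (not part of the package).
# Assumption: "wave" and "date" are the only longitudinal markers.
keep_by_longitudinal <- function(dataset_vars, longitudinal = NULL) {
  if (is.null(longitudinal)) {
    return(TRUE)                        # NULL: no filter applied
  }
  has_time <- any(c("wave", "date") %in% dataset_vars)
  if (isTRUE(longitudinal)) has_time else !has_time
}

keep_by_longitudinal(c("id", "item", "resp", "wave"), TRUE)   # kept
keep_by_longitudinal(c("id", "item", "resp", "wave"), FALSE)  # dropped
keep_by_longitudinal(c("id", "item", "resp"))                 # kept (no filter)
```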

license

Character vector. Filters datasets by license (e.g., "CC BY 4.0"). See irw_license_options() for available values.

Value

A sorted character vector of dataset names that match all specified filters, or character(0) if no match is found.
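One way to picture "match all specified filters" is as a sorted intersection of the names that pass each filter individually. The sketch below uses a hypothetical helper, `combine_filters()`, and made-up dataset names; it is not the package's implementation.

```r
# Hypothetical helper (not part of the package): a dataset is returned only
# if it passes every filter, and the result is sorted.
combine_filters <- function(...) {
  sets <- list(...)
  if (length(sets) == 0) {
    return(character(0))
  }
  sort(Reduce(intersect, sets))
}

by_items   <- c("ds_b", "ds_a", "ds_c")  # made-up names passing an item filter
by_license <- c("ds_c", "ds_a")          # made-up names passing a license filter

combine_filters(by_items, by_license)    # names common to both, sorted
combine_filters(by_items, "ds_z")        # nothing in common: character(0)
```

Because an empty result is character(0), calling code can check length(result) == 0 before trying to use the matches.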

Details

Filtering is based on:

  • Numeric metadata: number of responses, participants, items, etc.

  • Tag metadata: e.g., construct type, sample, measurement tool

  • Variable presence: e.g., rt, wave, cov_

  • License type: e.g., "CC BY 4.0", "CC0 1.0"

Metadata and Tag-Based Filtering

To explore available metadata:

  • summary(irw_metadata()) — numeric summaries (e.g., n_responses, density)

  • irw_tags() — full tag metadata table (1 row per dataset)

  • irw_tag_options("column_name") — valid values (with counts) for any tag column

  • irw_license_options() — available license values with frequencies

Tag-based metadata (e.g., construct_type, sample, item_format) can be passed directly as named arguments. See the Arguments section above for supported tag columns.

Examples

if (FALSE) { # \dontrun{
# Numeric filters
irw_filter(n_responses = c(1000, Inf), n_items = c(10, 50))
irw_filter(n_participants = c(500, Inf), density = c(0.3, 0.9))

# Variable presence
irw_filter(var = "rt")
irw_filter(var = c("wave", "cov_"), density = NULL)

# Tag metadata filtering
irw_filter(construct_type = "Affective/mental health", sample = "Educational")
irw_tag_options("construct_type")  # view tag values

# License filtering
irw_license_options()
irw_filter(license = "CC BY 4.0")

# Filter by response category complexity
irw_filter(n_categories = 2, density = NULL)           # binary
irw_filter(n_categories = c(3, 5), density = NULL)     # small multi-category
irw_filter(n_categories = c(10, Inf), density = NULL)  # large category sets
} # }