Packages

if (!require("dplyr"))      { install.packages("dplyr");      library(dplyr) }
if (!require("tidyr"))      { install.packages("tidyr");      library(tidyr) }
if (!require("stringr"))    { install.packages("stringr");    library(stringr) }
if (!require("ggplot2"))    { install.packages("ggplot2");    library(ggplot2) }
if (!require("knitr"))      { install.packages("knitr");      library(knitr) }
if (!require("kableExtra")) { install.packages("kableExtra"); library(kableExtra) }
if (!require("DT"))         { install.packages("DT");         library(DT) }

Environment

sessionInfo()
## R version 4.2.2 (2022-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS 14.7.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] DT_0.33          kableExtra_1.4.0 knitr_1.50       ggplot2_3.5.2   
## [5] stringr_1.5.1    tidyr_1.3.1      dplyr_1.1.4     
## 
## loaded via a namespace (and not attached):
##  [1] pillar_1.10.2      bslib_0.9.0        compiler_4.2.2     RColorBrewer_1.1-3
##  [5] jquerylib_0.1.4    tools_4.2.2        digest_0.6.37      viridisLite_0.4.2 
##  [9] jsonlite_2.0.0     evaluate_1.0.3     lifecycle_1.0.4    tibble_3.2.1      
## [13] gtable_0.3.6       pkgconfig_2.0.3    rlang_1.1.6        cli_3.6.5         
## [17] rstudioapi_0.17.1  yaml_2.3.10        xfun_0.52          fastmap_1.2.0     
## [21] xml2_1.3.8         withr_3.0.2        htmlwidgets_1.6.4  systemfonts_1.0.5 
## [25] generics_0.1.3     vctrs_0.6.5        sass_0.4.10        grid_4.2.2        
## [29] tidyselect_1.2.1   svglite_2.1.3      glue_1.8.0         R6_2.6.1          
## [33] rmarkdown_2.29     purrr_1.0.4        farver_2.1.2       magrittr_2.0.3    
## [37] scales_1.4.0       htmltools_0.5.8.1  stringi_1.8.7      cachem_1.1.0

Introduction

We asked 20 UK residents to provide feedback on the clarity of each aspect of our Hero scales. All participants were non-academics aged 18 or older. Two of them were in-laws of the first author; the rest were recruited on Prolific using the following filters:

  • Highest diploma ≤ high school diploma
  • UK resident

Data Wrangling

To inspect the code used for data wrangling, click “Show”.

Set <- read.csv("~/Desktop/We can be Heroes/Heroes and Fun/FaceValidityEvaluation.csv", sep=";", comment.char="#")
# 1. Pull out the first row (the Qualtrics question text) as character labels
labels <- as.character(Set[1, ])

# 2. Combine the question text with the existing column names,
#    e.g. "Q1: How old are you?"
new_names <- paste0(names(Set), ": ", labels)

# 3. Assign the new names
names(Set) <- new_names

# As always with Qualtrics exports, the first row below the header is metadata, not data
Set <- Set[-1, ]

# Remove attention checks
Set <- Set[, -c(39, 66)]

# Remove irrelevant variables
Set <- Set[, -c(1:7, 93:98)]


df_long <- Set %>%
  # (1) rename your date column to something simpler:
  rename(RecordedDate = `RecordedDate: Recorded Date`) %>%
  # (2) pivot every column except the date into Item/value pairs:
  pivot_longer(
    cols      = -RecordedDate,
    names_to  = "Item",
    values_to = "value"
  ) %>%
  # (3) convert the value column to numeric where possible:
  mutate(
    value = as.numeric(value)
  )

# Drop rows where the response could not be converted to a number
df_long <- subset(df_long, !is.na(value))
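
The pivot-and-coerce step above can be checked on a tiny made-up frame (the column names and values here are hypothetical, not the study data): non-numeric entries become NA under as.numeric() and are then dropped.

```r
library(dplyr)
library(tidyr)

# Toy frame mimicking the renamed Qualtrics export (hypothetical values)
toy <- data.frame(
  RecordedDate   = "2024-01-01",
  "Q1: Item one" = "7",
  "Q2: Item two" = "n/a",
  check.names = FALSE
)

toy_long <- toy %>%
  # every column except the date becomes an Item/value pair
  pivot_longer(-RecordedDate, names_to = "Item", values_to = "value") %>%
  mutate(value = as.numeric(value))   # "n/a" becomes NA (with a coercion warning)

subset(toy_long, !is.na(value))       # only the numeric response survives
```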

Main Table

The table below presents summary statistics for the clarity of each item: mean clarity (on a scale from 1, Very unclear, to 7, Very clear), SD, median, and MAD, as well as Floor (the % of respondents who picked 1) and Ceiling (the % of respondents who picked 7). It also reports how many MADs each item's mean clarity deviates from an ideal clarity score of 6.5/7 (i.e., between Quite clear and Very clear) -> See column z_Ideal.

# 4. Compute per-item diagnostics
item_stats <- df_long %>%
  group_by(Item) %>%
  summarise(
    Mean    = mean(value,   na.rm = TRUE),
    SD      = sd(value,     na.rm = TRUE),
    Median  = median(value, na.rm = TRUE),
    MAD     = mad(value,    na.rm = TRUE),
    Floor   = 100 * mean(value == 1, na.rm=TRUE), # % of people who picked 1
    Ceiling = 100 * mean(value == 7, na.rm=TRUE), # % of people who picked 7
    .groups = "drop"
  ) %>%
  mutate(
    z_median = (Mean - median(Mean)) / mad(Mean), # how many MADs an item's mean sits above (if positive) or below (if negative) the central mass of item means
    z_Ideal  = (Mean - 6.5) / mad(Mean),          # how many MADs an item's mean sits above (if positive) or below (if negative) the ideal scale point of 6.5

    IQR_flag = Mean < (quantile(Mean, .25) - 1 * IQR(Mean)) | # We use 1*IQR --> narrower fences than the conventional 1.5*IQR rule, so more items get flagged
               Mean > (quantile(Mean, .75) + 1 * IQR(Mean))
  )
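
The Floor and Ceiling columns rely on the fact that the mean of a logical vector is the proportion of TRUEs. A minimal check with made-up ratings (not the study data):

```r
# Five hypothetical clarity ratings from five raters
responses <- c(7, 7, 6, 1, 7)

# mean() of a logical vector gives the proportion of TRUEs;
# multiplying by 100 turns it into a percentage
floor_pct   <- 100 * mean(responses == 1)   # % of raters who picked 1
ceiling_pct <- 100 * mean(responses == 7)   # % of raters who picked 7

c(Floor = floor_pct, Ceiling = ceiling_pct)
```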



# show all rows, with search/filter, horizontal scrolling
datatable(
  item_stats,
  options = list(
    pageLength = nrow(item_stats),   # default to show all rows
    scrollX    = TRUE,               # allow horizontal scrolling
    autoWidth  = TRUE                # auto‐adjust column widths
  ),
  class = "stripe hover compact",
  rownames = FALSE
)

From this table, we can see several important things:

  • Most items have a satisfactory mean (M > 6) and/or a satisfactory median (median = 6.5 or 7, i.e., at least 50% of the sample gave the item the maximum score of 7).
  • 6 items are flagged as outliers by the 1*IQR rule or by sitting more than 2 MADs below the sample median (see the z-scores).
  • With the exception of 1 outlier, these items are in the General Villain evaluation.

I FLAGGED AS PROBLEMATIC (i.e., to delete or to revise) all items whose mean deviates by more than 2 MADs from the ideal mean comprehension score of 6.5.
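
As a sketch of this flagging rule, the z_Ideal column can be filtered at -2. The item means below are hypothetical stand-ins, not the study data; note that R's mad() applies the consistency constant 1.4826 by default.

```r
# Hypothetical item means standing in for the real item_stats table
item_stats <- data.frame(
  Item = c("A", "B", "C", "D", "E"),
  Mean = c(6.8, 6.6, 6.5, 6.4, 5.0)
)

# z_Ideal: MADs between each item's mean and the ideal score of 6.5
item_stats$z_Ideal <- (item_stats$Mean - 6.5) / mad(item_stats$Mean)

# Flag items sitting more than 2 MADs below the ideal
flagged <- subset(item_stats, z_Ideal < -2)
flagged$Item
```

With these toy means only the clearly low item ("E") is flagged; items slightly below 6.5 stay within the 2-MAD band.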