Examining reliability and validity of an online score (ALiEM AIR) for rating free open access medical education resources
- View All
STUDY OBJECTIVE: Since 2014, Academic Life in Emergency Medicine (ALiEM) has used the Approved Instructional Resources (AIR) score to critically appraise online content. The primary goals of this study are to determine the interrater reliability (IRR) of the ALiEM AIR rating score and determine its correlation with expert educator gestalt. We also determine the minimum number of educator-raters needed to achieve acceptable reliability. METHODS: Eight educators each rated 83 online educational posts with the ALiEM AIR scale. Items include accuracy, usage of evidence-based medicine, referencing, utility, and the Best Evidence in Emergency Medicine rating score. A generalizability study was conducted to determine IRR and rating variance contributions of facets such as rater, blogs, posts, and topic. A randomized selection of 40 blog posts previously rated through ALiEM AIR was then rated again by a blinded group of expert medical educators according to their gestalt. Their gestalt impression was subsequently correlated with the ALiEM AIR score. RESULTS: The IRR for the ALiEM AIR rating scale was 0.81 during the 6-month pilot period. Decision studies showed that at least 9 raters were required to achieve this reliability. Spearman correlations between mean AIR score and the mean expert gestalt ratings were 0.40 for recommendation for learners and 0.35 for their colleagues. CONCLUSION: The ALiEM AIR scale is a moderately to highly reliable, 5-question tool when used by medical educators for rating online resources. The score displays a fair correlation with expert educator gestalt in regard to the quality of the resources. The score displays a fair correlation with educator gestalt.
Link to Article