Journal of Applied Measurement

Pei-Chin Lu, Samantha Estrada, Steven Pulos
The Current Statistics Self-Efficacy (CSSE) scale, developed by Finney and Schraw (2003), is a 14-item instrument to assess students' statistics self-efficacy. No previous research has used Rasch measurement models to evaluate the psychometric structure of its scores at the item level, and only a few studies have applied the CSSE in a graduate school setting. A modified 30-item CSSE scale was tested on a graduate student population (N = 179). The Rasch rating scale analysis identified 26 items forming a unidimensional measure...
2018: Journal of Applied Measurement
Joseph N Njiru, Joseph T Romanoski
This article describes the development and calibration of items from the 1997 to 2006 Tertiary Entrance Exams (TEE) in Chemistry conducted by the Curriculum Council of Western Australia for the purposes of establishing a Chemistry item bank. Only items that met the strict Rasch measurement criterion of ordered thresholds were included. Item Residuals and Chi-square conformity of the items were likewise scrutinized. Further, specialist experts in chemistry were employed to ascertain the qualitative properties of the items, particularly the item wording, so as to provide accurate item descriptors...
2018: Journal of Applied Measurement
Ruth Chu-Lien Chao, Kathy Green, Kranti Dugar, Joseph Longo
Although the United States offers some of the most advanced psychological services in the world, not everyone in the U.S. shares equally in these services, and health disparities persist when assessments do not appropriately measure different populations' mental health problems. To address this assessment issue, we conducted factor and Rasch analyses to assess the psychometric characteristics of the Brief Symptom Inventory-18 (BSI-18) to evaluate whether the BSI is culturally appropriate for assessing African Americans' psychological distress...
2018: Journal of Applied Measurement
Hong Eng Goh, Ida Marais, Michael Ireland
Establishing the internal validity of psychometric instruments is an important research priority, and is especially vital for instruments that are used to collect data to guide public policy decisions. The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) is a well-established and widely used instrument for assessing individual differences in well-being. The current analyses were motivated by concerns that mental well-being items that refer to interpersonal relationships (Items 9 and 12) may operate differently for those in a relationship compared to those not in a relationship...
2018: Journal of Applied Measurement
Eli Jones, Stefanie A Wind
When selecting a design for rater-mediated assessments, one important consideration is the number of raters who rate each examinee. In balancing costs and rater-coverage, rating designs are often implemented wherein only a portion of the examinees are rated by each judge, resulting in large amounts of missing data. One drawback to these sparse rating designs is the reduced precision of examinee ability estimates they provide. When increasing the number of raters per examinee is not feasible, another option may be to increase the number of ratings provided by each rater per examinee...
2018: Journal of Applied Measurement
Qingping He, Michelle Meadows
By treating each examination as a polytomous item and a grade that a student achieved in the exam as a score on the item, the partial credit model (PCM) has been used to analyse data from examinations in 16 GCSE subjects taken by 16-year-olds in England. These examinations are provided by four different exam boards. By further treating students taking the exams testing the same subject but provided by different exam boards as different subgroups, differential category functioning (DCF) analysis was used to investigate the comparability of standards at specific grades in the examinations between the exam boards...
2018: Journal of Applied Measurement
Ickpyo Hong, Annie N Simpson, Kit N Simpson, Sandra S Brotherton, Craig A Velozo
This study compared disability levels between community-dwelling adults in the United States and South Korea using two national surveys, the United States and Korean National Health and Nutrition Examination Surveys (NHANES and KNHANES). The Rasch common-item equating method was used to create a common measurement framework and compare average disability levels. The disability levels between the two countries were estimated using the current disability estimation method (percentage of people having disability based on a single question)...
2018: Journal of Applied Measurement
Patrick U Osadebe
The study was carried out to assess the difficulty index of each item of an Economics Achievement Test with the Rasch model. The infit and outfit as well as the reliability of the test were determined. Three research questions were drawn to guide the study. A sample of 200 was randomly selected using simple random sampling by balloting and proportionate stratified random sampling. The instrument of the study was an Economics Achievement Test with 100 items. The test has face and content validity. It has a reliability coefficient of 0...
2018: Journal of Applied Measurement
Beyza Aksu Dunya, Clark McKown, Everett V Smith
Social perspective-taking (SPT), which involves the ability to infer others' intentions, is a consequential social cognitive process. The purpose of this study is to evaluate the psychometric properties of a web-based social perspective-taking (SELweb SPT) assessment designed for children in kindergarten through third grade. Data were collected from two separate samples of children. The first sample included 3224 children and the second sample included 4419 children. Data were calibrated using the dichotomous Rasch model (Rasch, 1960)...
2018: Journal of Applied Measurement
Courtney Donovan
Teachers are expected to use data and assessments to drive their instruction. This is accomplished at a classroom level via the assessment process. The teachers Knowledge and Use of Data and Assessment (tKUDA) measure was created to capture teachers' knowledge and use of this assessment process. This paper explores the measure's utility using Rasch analysis. Evidence of reliability and validity was seen for both knowledge and use factors. The scale was used as expected, and item analyses demonstrate good spread, with a few items identified for future revision...
2018: Journal of Applied Measurement
Georgios D Sideridis, Cengiz Zopluoglu
The purpose of the present study was to evaluate various analytical means to detect academic cheating in an experimental setting. The omega index was compared and contrasted given a gold criterion of academic cheating which entailed a discrepant score between two administrations using an experimental study with real test takers. Participants were 164 elementary school students who were administered a mathematics exam followed by an equivalent mock exam under conditions of strict and relaxed invigilation, respectively...
2018: Journal of Applied Measurement
Bo Hu
In linked-chain equating, equating errors may accumulate and cause scale drift. This simulation study extends the investigation of scale drift in linked-chain equating to mixed-format tests. Specifically, the impact of equating method and the characteristics of anchor test and equating chain on equating errors and scale drift in IRT true score equating is examined. To evaluate equating results, a new method is used to derive true linking coefficients. The results indicate that the characteristic curve methods produce more accurate and reliable equating results than the moment methods...
2018: Journal of Applied Measurement
W Holmes Finch, Maria Hernandez Finch, Brian F French, David E McIntosh, Lauren Moss
An important aspect of the educational and psychological evaluation of individuals is the selection of scales with appropriate evidence of reliability and validity for inferences and uses of the scores for the population of interest. One key aspect of validity is the degree to which a scale fairly assesses the construct(s) of interest for members of different subgroups within the population. Typically, this issue is addressed statistically through assessment of differential item functioning (DIF) of individual items, or differential test functioning (DTF) of sets of items within the same measure...
2018: Journal of Applied Measurement
Carolina Saskia Fellinghauer, Birgit Prodinger, Alan Tennant
Imputation has become common practice through the availability of easy-to-use algorithms and software. This study aims to determine if different imputation strategies are robust to the extent and type of missingness, local item dependencies (LID), differential item functioning (DIF), and misfit when doing a Rasch analysis. Four samples were simulated and represented a sample with good metric properties, a sample with LID, a sample with DIF, and a sample with LID and DIF. Missing values were generated in increasing proportions and were either missing at random or completely at random...
2018: Journal of Applied Measurement
Marcos Cupani, Tatiana Castro Zamparella, Gisella Piumatti, Grupo Vinculado
The calibration of item banks provides the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. This study aims to develop a bank of items to measure the level of Knowledge on Biology using the Rasch model. The sample consisted of 1219 participants who studied in different faculties of the National University of Cordoba (mean age = 21.85 years, SD = 4.66; 66.9% women). The items were organized in different forms and into separate subtests, with some common items across subtests...
2017: Journal of Applied Measurement
Oliver Prosperi
Confidence marking is increasingly used in multiple choice testing situations, but when the Rasch measurement model is applied to the data, only the binary data are used, discarding the information given by the confidence marking. This study shows how Wilson's ordered partition model (OPM), a member of the Rasch family of models, can be used to model the confidence information. The result is a model which is in strict relation to the binary Rasch model, since the Rasch ICCs are "split" into a set of curves each representing a confidence level...
2017: Journal of Applied Measurement
Dan Cloney, Cuc Nguyen, Raymond J Adams, Collette Tayler, Gordon Cleveland, Karen Thorpe
The Classroom Assessment Scoring System (CLASS) is an observational instrument assessing the nature of everyday interactions in educational settings. The instrument has strong theoretical groundings; however, prior empirical validation of the CLASS has exposed some psychometric weaknesses. Further, the instrument has not been the subject of psychometric analysis at the indicator level. Using a large dataset including observations of 993 Australian classrooms, confirmatory factor analysis is used to replicate findings from the few existing validation studies...
2017: Journal of Applied Measurement
Robert Schwartz, Elizabeth Ayers, Mark Wilson
There are different ways to conceive and measure learning progressions. The approach used by the ADMSR project followed the "four building blocks" approach outlined by the Berkeley Evaluation and Assessment Research (BEAR) Center and the BEAR Assessment System. The final building block of this approach involves the application of a measurement model. This paper focuses on the application of unidimensional and multidimensional item response theory (IRT) measurement models to the data from the ADMSR project...
2017: Journal of Applied Measurement
Lin Ma, Kelly E Green
This study explored optimization of item-attribute matrices with the linear logistic test model (Fischer, 1973), with optimal models explaining more variance in item difficulty due to identified item attributes. Data were 8th-grade mathematics test item responses of two TIMSS 2007 booklets. The study investigated three categories of attributes (content, cognitive process, and comprehensive cognitive process) at two grain levels (larger, smaller) and also compared results with random attribute matrices. The proposed attributes accounted for most of the variance in item difficulty for two assessment booklets (81% and 65%)...
2017: Journal of Applied Measurement
Milja Curcin, Ezekiel Sweiry
In scoring short constructed-response items, it may be possible to apply different rubric types depending on the trait of achievement assessed. Rating scale and partial credit Many-Facet Rasch Models (MFRM) were used to investigate whether levels-based (holistic) and hybrid (analytic) scoring rubrics functioned interchangeably when scoring short-response English reading comprehension test items. Whereas most research in similar contexts has focused solely on rater reliability, the use of MFRM in this study enabled examination of both the reliability and rating scale functioning aspects of scoring rubrics in parallel...
2017: Journal of Applied Measurement