Journal of Applied Measurement

Lin Hsiao-Hui, Yuh-Tsuen Tzeng
This study aimed to advance the Scientific Multi-Text Reading Comprehension Assessment (SMTRCA) by developing a rubric consisting of 4 subscales: information retrieval, information generalization, information interpretation, and information integration. The assessment tool included 11 closed-ended and 8 open-ended items, together with the rubric. Two texts describing opposing views in the dispute over whether to continue the Fourth Nuclear Power Plant construction in Taiwan were developed, and 1535 grade 5-9 students read these two texts in a counterbalanced order and answered the test items...
2018: Journal of Applied Measurement
Christine Kumlien, Michael Miller, Cecilia Fagerstrom, Peter Hagell
Self-management programs require a range of indicators to evaluate their outcomes. The Health Education Impact Questionnaire (heiQ) was developed to meet this need. The heiQ contains 40 items with 4 response categories, representing eight scales. We developed a Swedish version of the heiQ that was tested by cognitive interviews (n = 15) and psychometrically (n = 177) using classical test theory (CTT) and Rasch measurement theory (RMT). The Swedish heiQ was easily understood by interviewees and met CTT criteria, with supported scaling assumptions (corrected item-total correlations, 0...
2018: Journal of Applied Measurement
Chunlian Jiang, Do-Hong Kim, Chuang Wang
The purpose of this study is to examine the psychometric properties of an instrument to measure word problem solving skills in mathematics related to speed with 706 sixth grade Chinese and Singaporean students. Rasch measurement models were applied to examine the reliability, unidimensionality, rating scale functioning, item difficulty, and person difficulty. The differential item functioning (DIF) analysis was also performed to examine the differences in item difficulty estimates between Chinese and Singaporean students...
2018: Journal of Applied Measurement
Peter Hagell, Matthew Rouse, Stephen P McKenna
Alzheimer's disease (AD) is the most common form of dementia, characterized by cognitive, psychiatric and behavioral symptoms and increasing dependency. Family members typically assume increasing caregiving responsibilities, with considerable quality of life (QoL) impact. This article describes the testing of a needs-based QoL questionnaire for AD family caregivers. Initial analyses, according to Rasch measurement theory, suggested that items applied to spousal rather than non-spousal caregivers. Following removal of non-spousal responders, a 25-item questionnaire was identified that exhibited acceptable model fit, a mean (SD) person location of 0...
2018: Journal of Applied Measurement
Michael J Ireland, Hong Eng Goh, Ida Marais
The 10-item Emotion Regulation Questionnaire (ERQ) was developed to measure individual differences in the tendency to use two common emotion regulation strategies: cognitive reappraisal and suppression. The current study examined the psychometric properties of the ERQ in a heterogeneous mixed sample of 713 (64.9% female) community residents using the polytomous Rasch model. The results showed that the 10-item ERQ was multidimensional and supported the two distinct factors. The reappraisal and suppression subscales were both found to be unidimensional and fit the Rasch model...
2018: Journal of Applied Measurement
Rose E Stafford, Edward W Wolfe, Jodi M Casablanca, Tian Song
Previous research has shown that indices obtained from partial credit model (PCM) estimates can detect severity and centrality rater effects, though it remains unknown how rater effect detection is impacted by the missingness inherent in double-scoring rating designs. This simulation study evaluated the impact of missing data on rater severity and centrality detection. Data were generated for each rater effect type, which varied in rater pool quality, rater effect prevalence and magnitude, and extent of missingness...
2018: Journal of Applied Measurement
James J Thompson
Fluency may be considered as a conjoint measure of work product quality and speed. It is especially useful in educational and medical settings to evaluate expertise and/or competence. In this paper, didactic exams were used to model fluency. Binned propensity matching with question difficulty and time intensity was used to define a 'load' variable and construct fluency (sum correct / elapsed response time). Response surfaces as speed-accuracy tradeoffs resulted from the analysis. Person by load fluency matrices behaved well in Rasch analysis and warranted the definition of a person fluency variable ('skill')...
2018: Journal of Applied Measurement
Stephen N Humphrey
Aligning scales in vertical equating carries a number of challenges for practitioners in contexts such as large-scale testing. This paper examines the impact of high and low discrimination on the results of vertical equating when the Rasch model is applied. A simulation study is used to show that different levels of discrimination introduce systematic error into estimates. A second simulation study shows that for the purpose of vertical equating, items with high or low discrimination contain information about translation constants that contains systematic error...
2018: Journal of Applied Measurement
Pei-Chin Lu, Samantha Estrada, Steven Pulos
The Current Statistics Self-Efficacy (CSSE) scale, developed by Finney and Schraw (2003), is a 14-item instrument to assess students' statistics self-efficacy. No previous research has used Rasch measurement models to evaluate the psychometric structure of its scores at the item level, and only a few studies have applied the CSSE in a graduate school setting. A modified 30-item CSSE scale was tested on a graduate student population (N = 179). The Rasch rating scale analysis identified 26 items forming a unidimensional measure...
2018: Journal of Applied Measurement
Joseph N Njiru, Joseph T Romanoski
This article describes the development and calibration of items from the 1997 to 2006 Tertiary Entrance Exams (TEE) in Chemistry conducted by the Curriculum Council of Western Australia for the purposes of establishing a Chemistry item bank. Only items that met the strict Rasch measurement criterion of ordered thresholds were included. Item Residuals and Chi-square conformity of the items were likewise scrutinized. Further, specialist experts in chemistry were employed to ascertain the qualitative properties of the items, particularly the item wording, so as to provide accurate item descriptors...
2018: Journal of Applied Measurement
Ruth Chu-Lien Chao, Kathy Green, Kranti Dugar, Joseph Longo
Although the United States offers some of the most advanced psychological services in the world, not everyone in the U.S. shares equally in these services, and health disparities persist when assessments do not appropriately measure different populations' mental health problems. To address this assessment issue, we conducted factor and Rasch analyses to assess the psychometric characteristics of the Brief Symptom Inventory-18 (BSI-18) and to evaluate whether the BSI is culturally appropriate for assessing African Americans' psychological distress...
2018: Journal of Applied Measurement
Hong Eng Goh, Ida Marais, Michael Ireland
Establishing the internal validity of psychometric instruments is an important research priority, and is especially vital for instruments that are used to collect data to guide public policy decisions. The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) is a well-established and widely used instrument for assessing individual differences in well-being. The current analyses were motivated by concerns that mental well-being items that refer to interpersonal relationships (Items 9 and 12) may operate differently for those in a relationship compared to those not in a relationship...
2018: Journal of Applied Measurement
Eli Jones, Stefanie A Wind
When selecting a design for rater-mediated assessments, one important consideration is the number of raters who rate each examinee. In balancing costs and rater-coverage, rating designs are often implemented wherein only a portion of the examinees are rated by each judge, resulting in large amounts of missing data. One drawback to these sparse rating designs is the reduced precision of examinee ability estimates they provide. When increasing the number of raters per examinee is not feasible, another option may be to increase the number of ratings provided by each rater per examinee...
2018: Journal of Applied Measurement
Qingping He, Michelle Meadows
By treating each examination as a polytomous item and the grade that a student achieved in the exam as a score on the item, the partial credit model (PCM) has been used to analyse data from examinations in 16 GCSE subjects taken by 16-year-olds in England. These examinations are provided by four different exam boards. By further treating students taking exams that test the same subject but are provided by different exam boards as different subgroups, differential category functioning (DCF) analysis was used to investigate the comparability of standards at specific grades in the examinations between the exam boards...
2018: Journal of Applied Measurement
Ickpyo Hong, Annie N Simpson, Kit N Simpson, Sandra S Brotherton, Craig A Velozo
This study compared disability levels between community-dwelling adults in the United States and South Korea using two national surveys, the United States and Korean National Health and Examination Surveys (NHANES and KNHANES). The Rasch common-item equating method was used to create the same measurement framework and compare average disability levels. The disability levels between the two countries were estimated using the current disability estimation method (percentage of people having disability based on a single question)...
2018: Journal of Applied Measurement
Patrick U Osadebe
The study was carried out to assess the difficulty index of each item of an Economics Achievement test with the Rasch model. The infit and outfit as well as the reliability of the test were determined. Three research questions were drawn to guide the study. A sample of 200 was randomly selected using simple random sampling of balloting and proportionate stratified random sampling. The instrument of the study was an Economics Achievement Test with 100 items. The test has face and content validities. It has a reliability coefficient of 0...
2018: Journal of Applied Measurement
Beyza Aksu Dunya, Clark McKown, Everett V Smith
Social perspective-taking (SPT), which involves the ability to infer others' intentions, is a consequential social cognitive process. The purpose of this study is to evaluate the psychometric properties of a web-based social perspective-taking (SELweb SPT) assessment designed for children in kindergarten through third grade. Data were collected from two separate samples of children. The first sample included 3224 children and the second sample included 4419 children. Data were calibrated using the Rasch dichotomous model (Rasch, 1960)...
2018: Journal of Applied Measurement
Courtney Donovan
Teachers are expected to use data and assessments to drive their instruction. This is accomplished at a classroom level via the assessment process. The teachers Knowledge and Use of Data and Assessment (tKUDA) measure was created to capture teachers' knowledge and use of this assessment process. This paper explores the measure's utility using Rasch analysis. Evidence of reliability and validity was seen for both knowledge and use factors. The scale was used as expected, and item analyses demonstrate good spread, with a few items identified for future revision...
2018: Journal of Applied Measurement
Georgios D Sideridis, Cengiz Zopluoglu
The purpose of the present study was to evaluate various analytical means to detect academic cheating in an experimental setting. The omega index was compared and contrasted given a gold criterion of academic cheating which entailed a discrepant score between two administrations using an experimental study with real test takers. Participants were 164 elementary school students who were administered a mathematics exam followed by an equivalent mock exam under conditions of strict and relaxed invigilation, respectively...
2018: Journal of Applied Measurement
Bo Hu
In linked-chain equating, equating errors may accumulate and cause scale drift. This simulation study extends the investigation of scale drift in linked-chain equating to mixed-format tests. Specifically, the impact of equating method and the characteristics of the anchor test and equating chain on equating errors and scale drift in IRT true score equating is examined. To evaluate equating results, a new method is used to derive true linking coefficients. The results indicate that the characteristic curve methods produce more accurate and reliable equating results than the moment methods...
2018: Journal of Applied Measurement