Contingency Factors in the Effects of Rater Training on Interrater Agreement: Some Lint in the Bellybutton of Past Research

Authors:	John F. Duffy, Andrew C. Peacock

Abstract:	Researchers examining the need to train performance appraisal raters have typically used a complex rating task with several dimensions. However, results of a recent training survey (Mealiea and Duffy, 1986) showed that a number of Canadian organizations use a simple one-item, global rating of employee performance and are satisfied with their rating process. The present study looks at the influence of the rating task and the type of measurement on the relationship between rater training and rater effectiveness. Subjects were 105 patrons (approximately 50 per cent female and 50 per cent male) of a tavern in upstate New York who served as contestants in a contest of belly button beauty. Four judges, randomly selected from the audience, served over a three-week period. Interrater reliability was assessed using a version of the intraclass correlation coefficient, and the Spearman-Brown formula was used to estimate the mean reliability of the four judges. The results indicated an absence of leniency and central tendency bias, and high interrater reliability without benefit of training. During the three weeks of the study, the average interrater reliabilities were .85, .99, and .99, respectively. The results, as hypothesized, extend the rater training and measurement literature and can be most parsimoniously explained by adding a contingency factor of task/scale characteristics to the rater training effectiveness theory.

Résumé (translated from the French):	Researchers who have examined the need to train performance appraisal raters have always used a complex rating task in which several dimensions were evaluated. However, the results of a recent survey (Mealiea and Duffy, 1986) showed that a large number of Canadian companies use a global, single-component system for performance appraisal and are satisfied with this method. The present study assesses the influence of the rating task and the type of measurement on the relationship between rater training and rater effectiveness. The subjects were 105 patrons (approximately half men and half women) of a bar in New York State who took part in a belly button beauty contest. Four judges chosen at random from the audience served over a three-week period. Reliability was assessed using a version of the intraclass correlation coefficient, and the Spearman-Brown formula was used to estimate the mean reliability of the four judges. The results showed an absence of leniency and central tendency bias, and high interrater reliability. Over the three weeks of the study, the mean reliabilities were .85, .99, and .99, respectively. The results, as predicted, confirm the literature on rater training and rater evaluation and can be explained most parsimoniously by adding a contingency factor of task/scale characteristics to the theory of rater training effectiveness.
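
The abstract mentions using the Spearman-Brown formula to estimate the mean reliability of the four judges. As a rough illustration only, the standard Spearman-Brown step-up can be sketched as below. The function names and the input value 0.6 are hypothetical; the abstract does not say which version of the intraclass correlation was used or whether the reported figures are single-judge or pooled values, so this is not a reconstruction of the authors' computation.

```python
def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Reliability of the mean of n_raters ratings, given the reliability
    of a single rater (standard Spearman-Brown prophecy formula)."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


def single_rater_from_mean(mean_reliability: float, n_raters: int) -> float:
    """Inverse of the step-up: recover the single-rater reliability
    from the reliability of the mean of n_raters ratings."""
    big_r = mean_reliability
    return big_r / (n_raters - (n_raters - 1) * big_r)


if __name__ == "__main__":
    # Hypothetical single-judge reliability, NOT a value from the study.
    r_single = 0.6
    r_mean_of_four = spearman_brown(r_single, n_raters=4)
    print(f"single judge: {r_single:.2f}, mean of four judges: {r_mean_of_four:.3f}")
```

The step-up shows why pooling judges matters: with the same single-judge reliability, the reliability of the averaged rating rises as more judges are added.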

Keywords:
