Paper Title
ASSESSING TECHNOLOGICAL CREATIVITY: ENHANCING RELIABILITY AND VALIDITY
Abstract
Creativity has become a key focus in Taiwan, driven by government initiatives aimed at fostering innovation and problem-solving skills. However, measuring creativity accurately remains challenging because creative processes and outcomes are complex. Traditional methods, such as product analysis and the Torrance Tests of Creative Thinking (TTCT), are limited in rating scale reliability and ecological validity. Amabile’s (1996) consensual assessment technique offers an alternative: experienced judges independently evaluate open-ended, observable products, which preserves contextual relevance. Even so, recent research indicates that rater severity significantly affects creativity assessments, highlighting the need for improved evaluation methods. This study addresses the issue by applying a two-parameter logistic rater model (Wu, 1997) to ratings of technological creativity. The model accounts for variability in rater severity and examines intra-rater reliability, giving a more detailed picture of rater effects. Eighth-grade students in northern Taiwan were asked to design creative cell phones tailored to their peers' needs. The students first reviewed popular cell phone designs and then submitted graphical designs with functional descriptions. Three researchers independently evaluated 233 designs on a 5-point scale, rating overall performance, usefulness, creativity, and organization. The model fit the data well, captured differences in rater severity, and indicated high intra-rater reliability. These findings suggest that the two-parameter logistic rater model is a robust tool for evaluating technological creativity, yielding more precise and consistent assessments. Future studies are encouraged to adopt this model to improve rating accuracy and reduce bias. The approach contributes to creativity assessment methodology by supporting more reliable and valid measurement.
Keywords - creativity assessment, rater reliability, item response theory
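Illustrative note: the abstract does not state the functional form of the two-parameter logistic rater model, so the following is only a minimal sketch of how a rater severity parameter and a rater slope parameter could enter a 5-point rating model. It assumes a generalized partial-credit parameterization; the function names, the parameter names (theta, difficulty, severity, slope, thresholds), and the reading of the slope as a stand-in for rater consistency are all hypothetical and are not taken from Wu (1997).

```python
import math


def category_probabilities(theta, difficulty, severity, slope, thresholds):
    """Return the probabilities of each score category for one rating.

    Assumed (hypothetical) step logit for moving from category k-1 to k:
        slope * (theta - difficulty - severity - tau_k)
    theta      : latent creativity of the design
    difficulty : difficulty of the rated criterion
    severity   : harshness of the rater (larger means stricter)
    slope      : rater discrimination, read here as rater consistency
    thresholds : step thresholds tau_k (4 values for a 5-point scale)
    """
    cumulative = [0.0]  # the lowest category has a cumulative logit of zero
    for tau in thresholds:
        step = slope * (theta - difficulty - severity - tau)
        cumulative.append(cumulative[-1] + step)
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probabilities, lowest=1):
    """Expected rating when the categories are scored lowest, lowest + 1, ..."""
    return sum((lowest + k) * p for k, p in enumerate(probabilities))


# Illustrative values only; none of these numbers come from the study.
thresholds = [-1.5, -0.5, 0.5, 1.5]
lenient = category_probabilities(0.0, 0.0, 0.0, 1.0, thresholds)
severe = category_probabilities(0.0, 0.0, 1.0, 1.0, thresholds)
print(expected_score(lenient), expected_score(severe))
```

Under these assumptions, the same design receives a lower expected score from the more severe rater, and a slope near zero flattens the category probabilities, which corresponds to ratings that carry little information about the design.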