The quality of individual ratings of conformation traits is commonly evaluated by calculating inter-rater correlations and repeatability coefficients. We present an approach in which the individual rating scores are associated with the underlying horse shapes, derived from standardized images, by means of a shape regression. To this end, we analyzed the shape of 102 Lipizzan stallions from the Spanish Riding School in Vienna, defined by 246 two-dimensional landmark coordinates, using techniques from image analysis and geometric morphometrics. In addition, we examined differences in the conformation classifiers' perceptions of type traits and functional traits. In this part of the study, the rating scores of eight conformation classifiers were tested for agreement, yielding inter-rater correlations ranging from 0.30 to 0.55 and kappa coefficients ranging from 0.08 to 0.42. Of the 12 traits scored on a valuating scale, type traits, with a mean kappa coefficient (kappa) of 0.27, showed higher agreement than functional traits (kappa = 0.14). Based on 246 two-dimensional anatomical and somatometric landmarks, shape variation was analyzed using generalized orthogonal least-squares Procrustes procedures (generalized Procrustes analysis, GPA). Shape variables were regressed on the results of visually scored linear type trait classifications (shape regressions). Of the 48 shape regressions performed (eight classifiers, six traits), 42% resulted in a significant equation. In the remaining 58% of the ratings, no association between scores and the phenotype of the horses was found. Phenotypic differences of model horses along significant regression curves of mean ratings and individual ratings were visualized and compared, by way of example, using warped and averaged images.
Finally, we demonstrated that shape regression makes it possible to evaluate the association between individual ratings by expert conformation classifiers and the shapes of horses. The detected bias in classifiers' rankings has not yet been considered in breeding programs, and its impact on selection procedures needs further research.
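
The two analysis steps named above, generalized Procrustes analysis followed by a regression of shape on rating scores, can be illustrated with a minimal NumPy sketch. It is not the pipeline used in the study. The function names `align_to`, `gpa`, and `shape_regression` are hypothetical, and the permutation test is an assumption, since the text does not state how significance was assessed. Landmark configurations are assumed to be stored as an array of shape (horses, landmarks, 2).

```python
import numpy as np


def align_to(shape, ref):
    """Rotate a centered (k, 2) shape onto ref by least squares (no reflection)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    r = u @ vt
    if np.linalg.det(r) < 0:  # flip the weakest axis to keep a pure rotation
        u[:, -1] *= -1
        r = u @ vt
    return shape @ r


def gpa(shapes, n_iter=10, tol=1e-10):
    """Generalized Procrustes analysis of an (n, k, 2) landmark array."""
    x = shapes - shapes.mean(axis=1, keepdims=True)        # remove translation
    x = x / np.linalg.norm(x, axis=(1, 2), keepdims=True)  # unit centroid size
    mean = x[0]
    for _ in range(n_iter):
        x = np.array([align_to(s, mean) for s in x])       # remove rotation
        new_mean = x.mean(axis=0)
        new_mean /= np.linalg.norm(new_mean)
        shift = np.linalg.norm(new_mean - mean)
        mean = new_mean
        if shift < tol:
            break
    return x, mean


def shape_regression(aligned, scores, n_perm=999, seed=0):
    """Regress aligned shape coordinates on one score per individual.

    Returns the per-landmark slope, the share of shape variance explained,
    and a permutation p-value (assumed test, not taken from the study).
    """
    y = aligned.reshape(len(aligned), -1)
    y = y - y.mean(axis=0)
    s = np.asarray(scores, dtype=float)
    s = s - s.mean()
    total = (y ** 2).sum()

    def fit(sv):
        beta = sv @ y / (sv @ sv)  # one least-squares slope per coordinate
        return beta, (np.outer(sv, beta) ** 2).sum() / total

    beta, r2 = fit(s)
    rng = np.random.default_rng(seed)
    hits = sum(fit(rng.permutation(s))[1] >= r2 for _ in range(n_perm))
    p_value = (hits + 1) / (n_perm + 1)
    return beta.reshape(aligned.shape[1:]), r2, p_value
```

In this sketch, one call to `shape_regression` per classifier and trait would correspond to one of the 48 regressions described above. Adding multiples of the returned slope to the mean shape gives the predicted landmark configurations along the regression line, which is the kind of information a warped image would display.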