Abstract
Background: Subjective lameness evaluations are critical components of equine musculoskeletal health assessments. Objective approaches can supplement diagnosis and may be preferred for specific cases and scientific purposes. Objectives: To evaluate agreement between the subjective evaluations of two veterinarians and standard clinical interpretations of outputs from an AI-based smartphone application (Sleip; AI-SPA), the inertial measurement unit (IMU) system Equinosis Q Lameness Locator (LL), and the IMU system Equisym (ES). Study design: In vivo experiment. Methods: Twenty-five research horses (10-30 years) were evaluated on a straight-line trot. Limbs were independently graded on an ordinal scale; for the objective systems, grades were converted from system-specific data outputs. Default settings and outputs for AI-SPA and LL were utilised to grade lameness, while a manual process was developed for the ES. Pairwise agreement was calculated via weighted Cohen's κ, and agreement across rater types was calculated via Gwet's Agreement Coefficient 2 (GA2). Results: Objective evaluator agreement (GA2 = 0.84, 95% Confidence Interval (CI): 0.77-0.91) was higher than subjective evaluator agreement (GA2 = 0.73, 95% CI: 0.63-0.83) across all limbs. For all five evaluators/systems, overall forelimb agreement (GA2 = 0.82, 95% CI: 0.69-0.95) was greater than overall hindlimb agreement (GA2 = 0.73, 95% CI: 0.59-0.87). Pairwise agreement scores between the objective systems were often higher than those involving veterinary evaluators. The ES system often produced the highest agreement when compared with each rater individually. Main limitations: Horses were evaluated on a straight line only. Lameness diagnosis was limited to visual observation. Outcomes for each horse's four limbs were considered independent measurements. Conclusions: This work highlights the utility of commercially available objective evaluation systems, including the more recent ES system. Hindlimb asymmetries had lower agreement regardless of evaluator type. Objective systems had higher agreements when compared with subjective straight-line veterinary examination. The ability to uniformly assess asymmetries may assist diagnosis when compared with subjective evaluation alone.
This research summary has been generated with artificial intelligence and may contain errors and omissions. Refer to the original study to confirm details provided.
Overview
This study evaluates the level of agreement between traditional subjective evaluations by veterinarians and results from three objective lameness detection systems in horses.
It aims to determine how well these objective technologies align with professional visual assessments in identifying equine lameness.
Background
Lameness evaluation in horses is a key part of assessing their musculoskeletal health.
Typically, veterinarians perform subjective visual examinations during a horse’s movement, often on a straight line.
Objective systems use technology to measure movement asymmetries and may provide quantitative data that supplement or support subjective evaluations.
Understanding agreements between subjective and objective evaluations can inform clinical decisions and improve diagnostic accuracy.
Objective
Compare agreement levels between two veterinarians’ subjective lameness grades and three objective systems:
An AI-based smartphone application called Sleip (AI-SPA)
The Equinosis Q Lameness Locator (LL), an inertial measurement unit (IMU) system
The Equisym (ES) IMU system, with a specially devised manual interpretation process
Quantify how consistently these methods grade naturally occurring equine lameness.
Methods
Subjects: 25 research horses aged between 10 and 30 years were studied.
Evaluation setting: Horses were trotted on a straight line to enable observational and instrumented assessment.
Measurements:
Each horse’s limbs were independently scored for lameness by each evaluator and system.
Scoring was transformed into a common ordinal scale for comparability across subjective and objective formats.
For AI-SPA and LL, default automated output settings were used; for ES, a manual scoring method was applied.
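The conversion of a continuous system output to a shared ordinal scale can be pictured as simple binning. The sketch below is only an illustration: the summary does not report the study's scale or cut points, so the threshold values, the millimetre unit, and the function name `to_ordinal_grade` are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical cut points (NOT taken from the study): asymmetry magnitudes
# at which the ordinal grade steps up by one. Three cuts give grades 0..3.
HYPOTHETICAL_CUTS_MM = [5.0, 10.0, 20.0]

def to_ordinal_grade(asymmetry_mm, cuts=HYPOTHETICAL_CUTS_MM):
    """Map a signed asymmetry value onto an ordinal grade 0..len(cuts).

    The sign (left vs right) is ignored here; only the magnitude is binned.
    A value exactly on a cut point is assigned to the higher grade.
    """
    return bisect_right(cuts, abs(asymmetry_mm))

# Example: grade a few illustrative (invented) readings.
grades = [to_ordinal_grade(v) for v in (0.0, 5.0, -12.0, 25.0)]
```

Once every evaluator and system is expressed on the same integer scale, their grades can be compared directly with agreement statistics.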
Agreement analysis:
Weighted Cohen’s κ was used for pairwise agreement between raters/systems.
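Gwet’s Agreement Coefficient 2 (GA2), a chance-corrected agreement statistic for ordinal ratings that accommodates more than two raters and is less sensitive to trait prevalence than κ, was calculated for comparisons across rater types.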
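For readers unfamiliar with weighted Cohen's κ, the following is a minimal sketch of the statistic for two raters on an ordinal scale. It assumes integer grades from 0 to `n_categories - 1` and linear or quadratic disagreement weights; the summary does not say which weighting scheme the authors used, and the function name is illustrative.

```python
from collections import Counter

def weighted_kappa(r1, r2, n_categories, weights="linear"):
    """Weighted Cohen's kappa for two raters grading the same limbs.

    r1, r2: equal-length sequences of integer grades in 0..n_categories-1.
    weights: "linear" or "quadratic" disagreement weights (an assumption;
    the study's weighting scheme is not stated in this summary).
    """
    if len(r1) != len(r2) or len(r1) == 0:
        raise ValueError("raters must grade the same non-empty set of limbs")
    n = len(r1)
    power = 1 if weights == "linear" else 2

    def w(i, j):
        # Disagreement weight: 0 for identical grades, 1 for the extremes.
        return (abs(i - j) / (n_categories - 1)) ** power

    pair_counts = Counter(zip(r1, r2))   # observed joint grades
    m1, m2 = Counter(r1), Counter(r2)    # each rater's marginal counts

    observed = sum(w(i, j) * c for (i, j), c in pair_counts.items()) / n
    expected = sum(w(i, j) * m1[i] * m2[j] for i in m1 for j in m2) / n ** 2
    if expected == 0:
        # Both raters used one identical grade throughout: kappa undefined.
        return float("nan")
    return 1.0 - observed / expected
```

Weighting means that a one-grade disagreement is penalised less than a three-grade disagreement, which suits ordinal lameness grades better than unweighted κ.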
Key Results
Across all limbs, the three objective systems agreed more closely with one another (GA2 = 0.84, 95% CI: 0.77-0.91) than the two veterinarians did (GA2 = 0.73, 95% CI: 0.63-0.83).
Across all five evaluators/systems, agreement was higher for forelimbs (GA2 = 0.82, 95% CI: 0.69-0.95) than for hindlimbs (GA2 = 0.73, 95% CI: 0.59-0.87).
Pairwise agreements between objective systems (AI-SPA, LL, ES) were usually stronger than those involving veterinarians’ subjective evaluations.
The Equisym (ES) system often produced the highest agreement when compared individually with each of the other raters, whether veterinarian or objective system.
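The GA2 values above summarise agreement within a group of raters (for example, the three objective systems, or all five evaluators). A minimal sketch of Gwet's AC2 for complete data follows, using the commonly published form of the coefficient. It assumes every rater graded every limb, integer grades from 0 to `n_categories - 1`, and linear agreement weights; the study's weighting scheme and confidence-interval method are not given in this summary, and the function name is illustrative.

```python
def gwet_ac2(ratings, n_categories):
    """Gwet's AC2 with linear weights, for complete data (no missing grades).

    ratings: list of per-limb lists, each holding one integer grade
             (0..n_categories-1) per rater; at least two raters are required.
    """
    q = n_categories
    n = len(ratings)
    r = len(ratings[0])
    # Linear agreement weights: 1 on the diagonal, 0 for extreme disagreement.
    w = [[1.0 - abs(k - l) / (q - 1) for l in range(q)] for k in range(q)]
    t_w = sum(sum(row) for row in w)

    pa_sum = 0.0
    pi = [0.0] * q
    for limb in ratings:
        counts = [0] * q
        for grade in limb:
            counts[grade] += 1
        for k in range(q):
            # Weighted count of raters whose grade is "close to" category k.
            weighted = sum(w[k][l] * counts[l] for l in range(q))
            pa_sum += counts[k] * (weighted - 1.0) / (r * (r - 1))
            pi[k] += counts[k] / r

    pa = pa_sum / n                    # weighted observed agreement
    pi = [p / n for p in pi]           # mean share of grades per category
    pe = t_w / (q * (q - 1)) * sum(p * (1.0 - p) for p in pi)  # chance term
    return (pa - pe) / (1.0 - pe)

# Invented toy grades (not study data): rows are limbs, columns are raters.
toy = [[0, 0, 0], [1, 1, 2], [3, 3, 3], [0, 1, 0]]
toy_ac2 = gwet_ac2(toy, n_categories=4)
```

To compare rater groups as in the results above, the same function would be applied to the relevant columns only (objective systems, veterinarians, or all five).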
Conclusions and Implications
The study highlights that objective measurement tools can reliably complement traditional veterinary assessments, particularly in recognizing asymmetries linked to lameness.
Lower agreement for hindlimb lameness indicates this area is more challenging to assess reliably, whether subjectively or objectively.
The higher agreement among the objective systems suggests that they could improve diagnostic consistency and reduce observer-dependent variation.
All evaluations were performed on horses trotting in a straight line, and lameness diagnosis was limited to visual observation. Both are limitations, since other conditions, such as circling, or further diagnostic procedures might change the results.
Each limb was treated as an independent measurement, which is also a limitation because lameness in one limb can alter the movement of the others.
The manual grading process developed for the ES system produced grades that often agreed most closely with the other raters, supporting the utility of this more recent system.
In clinical practice, integrating such objective technologies could enhance the accuracy and confidence of lameness diagnosis and treatment planning.
Cite This Article
McPeek JL, Menarim B, Sponseller B, McClendon M, Adam EN, Adams AA, Slone S, Page AE. (2025). Agreement between veterinarians and three objective evaluation systems in naturally occurring equine lameness. Equine Vet J. https://doi.org/10.1111/evj.70116