Despite what you think, it’s not based on sound.
This set of seven experiments suggests that novices’ judgment mirrors that of professionals; both novices and experts make judgments about music performance quickly and automatically on the basis of visual information. Given the relative lack of consensus about competition outcomes noted among even expert judges, the fact that novices are able to quickly identify the actual competition winners at such high rates through silent videos alone is of both statistical and practical significance. These findings point to a powerful effect of vision-biased preferences on selection processes even at the highest levels of performance. […]
Professional musicians and competition judges consciously value sound as central to this domain of performance, yet they arrive at different winners depending on whether visual information is available or not. This finding suggests that visual cues are indeed persuasive and sway judges away from recognizing the best performance that they themselves have, by consensus, defined as dependent on sound. Professional judgment appears to be made with little conscious awareness that visual cues factor so heavily into preferences and decisions.