Wrong Posture Detection from Videos
Date of Award
5-9-2026
Degree Name
M.S. in Computer Science
Department
Department of Computer Science
Advisor/Chair
Tam V. Nguyen
Abstract
This thesis started with a straightforward question: can a camera and a laptop replace an expert instructor when it comes to catching bad piano technique? The short answer is mostly yes, at least for binary Normal versus Abnormal classification, but the path to that answer turned out to be more methodologically interesting than expected. Working with 2,556 eight-second video clips from 10 volunteers, I evaluated three complementary approaches to feature extraction: raw CNN embeddings from eight pretrained architectures (TIMM, S3D, I3D, R21D, ResNet, CLIP, I3D FLOW, RAFT), skeletal joint distances computed via MediaPipe, and features extracted after YOLO-based background removal. Classification throughout used plain k-NN (k = 5) on raw, uncompressed feature vectors, a deliberate choice that turned out to expose something important: PCA, which is almost reflexively applied in fusion pipelines, was hiding real differences between extractors rather than simplifying them. The clearest surprise was how differently spatial and temporal models responded to background removal. Stripping the background improved S3D and R21D by 1.5% to 4.0% while degrading TIMM and CLIP by nearly 20%. That asymmetry is not noise; it reflects something fundamental about what each architecture is actually encoding. Combining TIMM, S3D, and I3D features (33,792 dimensions, uncompressed) reached 92.57% accuracy. Skeletal features alone hit 75.39%, which is respectable given how little information pairwise joint distances carry compared to a full video signal. SHAP attribution, backbone GradCAM, and occlusion sensitivity were used together to trace why the pipeline made the predictions it did, and the results were surprisingly coherent across all three methods.
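As a rough illustration of the pipeline the abstract describes, the sketch below shows pairwise joint distances, feature fusion by plain concatenation without PCA, and a k-NN vote with k = 5. It is a minimal standard-library sketch, not the thesis code: the function names, the Euclidean distance metric, and the toy vectors are all assumptions made for illustration, and real feature vectors would come from the pretrained extractors and MediaPipe landmarks.

```python
from collections import Counter
from itertools import combinations
import math


def pairwise_joint_distances(joints):
    """Euclidean distance between every unordered pair of joints.

    `joints` is a list of (x, y) coordinates (hypothetical landmark format).
    """
    return [math.dist(a, b) for a, b in combinations(joints, 2)]


def concatenate_features(*feature_vectors):
    """Fuse per-clip feature vectors by simple concatenation (no PCA)."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


def knn_predict(train_X, train_y, query, k=5):
    """Majority vote among the k training vectors closest to `query`.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up 2-D feature vectors standing in for real clip features.
train_X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
           [0.9, 1.0], [1.0, 0.9], [1.1, 1.0]]
train_y = ["Normal", "Normal", "Normal",
           "Abnormal", "Abnormal", "Abnormal"]

prediction = knn_predict(train_X, train_y, [0.05, 0.05], k=5)
```

Because k-NN operates directly on distances between raw vectors, any difference between extractors shows up in the neighbor ranking itself, which is one reason a classifier this simple can be informative when comparing feature sources.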
Keywords
Computer Science
Rights Statement
Copyright 2026, author.
Recommended Citation
Chousalkar, Soham, "Wrong Posture Detection from Videos" (2026). Graduate Theses and Dissertations. 7674.
https://ecommons.udayton.edu/graduate_theses/7674

Comments
OCLC No. 1591628142