Data Analysis on Classifying the Severity of Genetic Mutations
Kelly Laureen Pleiman
Cancer tumors can carry thousands of mutations, but determining which of those mutations actually contribute to tumor growth is critical to understanding the disease. Using predictive machine learning models, this capstone project focuses on determining the severity of different genetic mutations from data available on Kaggle describing each mutation’s gene, variation, and clinical text evidence. By performing data analysis and applying different models to this complex data set, the class, or severity, of a genetic mutation can be predicted on a scale from 1 to 9. Decision tree, random forest, SVD, logistic regression, and k-nearest neighbors are among the models that were used to classify genetic variations. Higher model accuracy allows for better classification of genetic mutations and could eventually reduce the time pathologists spend manually classifying mutations.
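To make the classification setup concrete, the following is a minimal sketch of one of the model families named above, k-nearest neighbors, applied to gene, variation, and text features. It is not the project's actual pipeline: the feature encoding (a simple bag of words), the cosine similarity measure, the helper names (`featurize`, `cosine`, `knn_predict`), and the five training rows are all assumptions made up for illustration, and the class labels are arbitrary values in the 1 to 9 range rather than entries from the Kaggle data.

```python
from collections import Counter
import math


def featurize(gene, variation, text):
    """Bag-of-words counts over the gene, the variation, and the text tokens."""
    tokens = [f"gene={gene}", f"variation={variation}"]
    tokens += text.lower().split()
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query, k=3):
    """Majority vote among the k training rows most similar to the query."""
    ranked = sorted(train, key=lambda row: cosine(row[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy rows (NOT from the Kaggle data); labels are arbitrary
# classes in the 1-9 range used only to exercise the code.
train = [
    (featurize("GENE_A", "V1", "loss of function truncating mutation"), 1),
    (featurize("GENE_A", "V2", "truncating mutation causes loss of function"), 1),
    (featurize("GENE_B", "V3", "activating mutation increases kinase activity"), 7),
    (featurize("GENE_B", "V4", "kinase activity increases with activating mutation"), 7),
    (featurize("GENE_C", "V5", "neutral variant with no functional effect"), 6),
]

query = featurize("GENE_B", "V9", "activating mutation with increased kinase activity")
predicted_class = knn_predict(train, query, k=3)
print(predicted_class)
```

On real clinical text, a weighted representation such as TF-IDF and a held-out evaluation split would likely be needed before any accuracy figure is meaningful; the sketch only shows how the three inputs can feed a single classifier.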
Primary Advisor's Department
Stander Symposium project, College of Arts and Sciences
United Nations Sustainable Development Goals
Industry, Innovation, and Infrastructure
"Data Analysis on Classifying the Severity of Genetic Mutations" (2021). Stander Symposium Projects. 2207.