Training Convolutional Neural Network Classifiers Using Simultaneous Scaled Supercomputing

Date of Award

2020

Degree Name

M.S. in Electrical Engineering

Department

Department of Electrical and Computer Engineering

Advisor/Chair

Advisor: Eric Balster

Abstract

Convolutional neural networks (CNNs) are revolutionizing today's technological landscape at a remarkable rate. Yet even with this success, producing an optimal trained network depends on expensive empirical experimentation: powerful processors, expansive datasets, days of training time, and hundreds of training runs across a range of hyperparameters are needed to identify the best results. These requirements are difficult for the typical CNN technologist to access and are ultimately wasteful of resources, since only the single best model is utilized. To overcome these challenges and create a foundation for the next generation of CNN technologists, a three-stage solution is proposed: (1) cultivate a new dataset containing millions of domain-specific (aerial) annotated images; (2) design a flexible experiment-generator framework that is easy to use, operates on some of the fastest supercomputers in the world, and can simultaneously train hundreds of unique CNN networks; and (3) establish benchmarks of accuracies and optimal training hyperparameters. An aerial imagery database is presented that contains 260 newly cultivated datasets, features tens of millions of annotated image chips, and provides several distinct vehicular classes. Accompanying the database, a CNN-training framework is presented that can generate hundreds of CNN experiments with extensively customizable input parameters. It operates across 11 cutting-edge CNN architectures and any Keras-formatted database, and it is supported on three distinct Linux operating systems, including two supercomputers ranked in the top 70 worldwide. Training is performed by simply entering the desired parameter ranges in a pre-formatted spreadsheet; the framework then creates a unique training experiment for every combination of dataset, hyperparameter, data augmentation, and supercomputer requested. The resulting hundreds of trained networks enable intensive qualitative analysis that highlights key input requirements for optimal results. Finally, as proof of this capability, selected benchmarking results of both the accuracies on aerial image datasets and their training hyperparameters are presented to illustrate the framework's utility.
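
For illustration, a minimal sketch of how such an experiment generator might enumerate one training configuration per combination of parameter values is shown below. This is not the thesis framework itself; all parameter names, values, and functions here are hypothetical stand-ins for the spreadsheet-driven inputs described in the abstract.

```python
# Illustrative sketch only: enumerate one training experiment per combination
# of dataset, architecture, and hyperparameter values, in the spirit of the
# spreadsheet-driven experiment generator described above.
# All names and values below are hypothetical.
import itertools

# Hypothetical parameter ranges, as they might be read from one spreadsheet row.
param_grid = {
    "dataset":       ["aerial_vehicles_small", "aerial_vehicles_large"],
    "architecture":  ["ResNet50", "VGG16", "DenseNet121"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size":    [32, 64],
    "augmentation":  ["none", "flip_rotate"],
}

def generate_experiments(grid):
    """Yield one experiment configuration per combination of parameter values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

if __name__ == "__main__":
    experiments = list(generate_experiments(param_grid))
    print(f"{len(experiments)} unique training experiments generated")
    print(experiments[0])  # e.g. {'dataset': 'aerial_vehicles_small', ...}
```

Each generated configuration would then be dispatched as an independent training job, which is what allows hundreds of networks to be trained simultaneously on supercomputing resources.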

Keywords

Engineering, Electrical Engineering, Computer Engineering, Computer Science, Artificial Intelligence, Convolutional Neural Networks, Machine Learning, High Performance Computing, Image Classification, Image Dataset

Rights Statement

Copyright © 2020, author
