Assessment of Generative AI for Text-to-Image Synthesis

Date of Award

5-5-2024

Degree Name

M.C.S. in Computer Science

Department

Department of Computer Science

Advisor/Chair

Tam Nguyen

Abstract

Text-to-image models, a prominent application of generative AI, face several challenges common to the broader fields of computer vision and machine learning. Generating realistic images from textual descriptions is a complex task: while recent models have made significant progress, photorealism remains difficult to achieve, and translating natural-language descriptions into meaningful visual representations is inherently hard. Many text-to-image models struggle to produce images with sufficient resolution and clarity, especially for complex scenes or objects. Assessing the quality of generated images is also challenging, because image-quality perception is subjective and generated images lack a clear ground truth against which fine details can be judged. To address this challenge in the study of artificially generated images, we introduce a novel approach built around a new dataset, Generative Artificial Image Assessment (GAIA), which serves as a common reference point. It comprises images from eight popular text-to-image AI models together with user rankings collected through crowd-sourced annotation. In a comparative study, we evaluate and predict these rankings with pre-trained state-of-the-art networks using ranking classes and diverse regression techniques. The approach combines objective evaluation metrics, subjective human judgment, a benchmark dataset with diverse ground-truth annotations, and advances in multimodal learning, offering a way forward for the field of text-to-image generation.
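The evaluation idea described above, predicting crowd-sourced quality rankings from features extracted by pre-trained networks using regression, can be sketched in miniature. The snippet below is a hypothetical illustration only, not the thesis's actual pipeline: random vectors stand in for embeddings from a pre-trained backbone, synthetic scores stand in for human rankings, and closed-form ridge regression stands in for one of the "diverse regression techniques."

```python
import numpy as np

# Hypothetical sketch: regress human quality scores from fixed image features.
# The random "features" stand in for embeddings from a pre-trained network;
# the synthetic "scores" stand in for crowd-sourced quality rankings.
rng = np.random.default_rng(0)
n_images, n_features = 200, 32
features = rng.normal(size=(n_images, n_features))          # stand-in embeddings
true_w = rng.normal(size=n_features)
scores = features @ true_w + 0.1 * rng.normal(size=n_images)  # stand-in rankings

# Ridge regression in closed form: w = (X^T X + lam*I)^-1 X^T y
lam = 1.0
w = np.linalg.solve(features.T @ features + lam * np.eye(n_features),
                    features.T @ scores)
pred = features @ w

# Rank-correlation sanity check: do predicted scores order images
# consistently with the (synthetic) human rankings?
rank = lambda v: np.argsort(np.argsort(v))
rho = float(np.corrcoef(rank(pred), rank(scores))[0, 1])
print(rho > 0.9)
```

A real assessment model would replace the random features with embeddings from a network such as a CNN or vision transformer, and would report agreement with human rankings via correlation metrics like the one sketched here.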

Keywords

Computer Science

Rights Statement

Copyright 2024, author
