The amount of data generated and shared each day is growing at an unprecedented rate, and managing it effectively is a growing challenge. Approximately fifteen billion images are shared on social media per day. The same image may exist in multiple locations, in different formats and sizes, and with slight variations, which makes it difficult for end users to filter and detect duplicates. This duplication leads to unnecessary storage costs, reduced data quality, and lost productivity as users spend time searching for the right image.

Detecting duplicate images is a crucial task in many fields, and there is a growing need to automate it. The primary objective of this project is to create a system that identifies duplicate images by comparing two images, even if they differ slightly in color, size, or format. To achieve this goal, we developed a system that detects and flags duplicates using visual similarity, image hashing, computer vision, and machine learning techniques. The system is integrated into a web application that lets users upload images, detects duplicates, and highlights the differences between the images. For organizations with extensive image collections, automating duplicate detection in this way can save time, reduce costs, and improve overall data quality.
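To illustrate how image hashing can tolerate changes in size and brightness, here is a minimal sketch of a difference hash (dHash) comparison. It is not the project's implementation: the abstract does not say which hashing scheme was used, so the choice of dHash, the function names (`resize_box`, `dhash`, `hamming`, `is_duplicate`), and the threshold of 5 bits are all assumptions made for illustration. Images are represented as nested lists of grayscale values so the example runs without any imaging library.

```python
from typing import List

Gray = List[List[float]]  # rows of grayscale intensities (hypothetical representation)


def resize_box(img: Gray, out_w: int, out_h: int) -> Gray:
    """Shrink an image by averaging the source pixels that fall in each output cell."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for oy in range(out_h):
        y0 = oy * in_h // out_h
        y1 = max(y0 + 1, (oy + 1) * in_h // out_h)
        row = []
        for ox in range(out_w):
            x0 = ox * in_w // out_w
            x1 = max(x0 + 1, (ox + 1) * in_w // out_w)
            cells = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def dhash(img: Gray, hash_size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair in a tiny thumbnail."""
    small = resize_box(img, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_duplicate(img_a: Gray, img_b: Gray, threshold: int = 5) -> bool:
    """Flag two images as duplicates when their hashes differ in few bits.

    The threshold is an assumed value, not one taken from the project.
    """
    return hamming(dhash(img_a), dhash(img_b)) <= threshold


# Example: a horizontal gradient, a 2x enlarged and brightened copy, and an inverted copy.
original = [[200 - x for x in range(72)] for _ in range(64)]
enlarged = [[original[y // 2][x // 2] + 10 for x in range(144)] for y in range(128)]
inverted = [[255 - v for v in row] for row in original]

same = is_duplicate(original, enlarged)
different = is_duplicate(original, inverted)
```

Because the hash records only whether each thumbnail cell is brighter than its right-hand neighbor, uniform brightness shifts and rescaling leave it largely unchanged, while a genuinely different image flips many bits. In a real application the grayscale grid would come from an image decoding library rather than hand-built lists.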
Course Project 202310 CPS 595 P1
Ahmed El Ouadrhiri, Phu Phung
Primary Advisor's Department
Stander Symposium, College of Arts and Sciences
Institutional Learning Goals
Practical Wisdom; Community; Critical Evaluation of Our Times
"Duplicate Image Detection using Machine Learning" (2023). Stander Symposium Projects. 3004.
Presentation: 1:15-2:30 p.m., Kennedy Union Ballroom