Document Type

Article

Publication Date

8-23-2025

Abstract

Identifying the underlying structural patterns in data and extracting meaningful insights is a key challenge in data analysis. One effective approach to this problem is matrix factorization (MF), which approximates a large matrix by a product of lower-dimensional factors and thereby exposes hidden patterns. MF techniques apply across many domains, including recommender systems, cancer genomics, system identification, clustering, and image processing. Despite their effectiveness, existing MF methods often struggle with computational constraints and convergence challenges when tackling the large-scale, nonsmooth, and nonconvex optimization problems that are common in real-world applications.

This project aims to explore both the theoretical understanding and practical application of MF by integrating optimization techniques designed for large-scale problems. Specifically, we focus on Federated Binary Matrix Factorization (FBMF), an extension of traditional MF designed for decentralized settings with binary-valued data. This approach is particularly relevant for privacy-preserving and large-scale distributed applications, such as healthcare and recommender systems. To address the challenges of optimizing MF in these settings, we leverage integer programming (IP). The research combined methodological development with empirical validation. We first introduced an algorithm that integrates alternating optimization, a randomized block-coordinate strategy, and integer programming to enhance solution accuracy for FBMF. We then applied the proposed approach to large-scale cancer genomics and recommendation tasks, comparing its performance against a state-of-the-art FBMF method.
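As a rough illustration of how alternating optimization and a randomized block-coordinate strategy can fit together for binary factorization, the sketch below factors a 0/1 matrix X into 0/1 factors U and V under the Boolean product. It is not the algorithm from the paper: it runs on a single machine with no federated component, and it replaces the integer programming solver with exhaustive enumeration of all binary vectors of length k, which is only feasible for small k. All names (`boolean_product`, `best_rows`, `alternating_bmf`, `block_frac`) are hypothetical and chosen for this example.

```python
import itertools
import numpy as np

def boolean_product(U, V):
    # (U o V)[i, j] = OR over l of (U[i, l] AND V[l, j])
    return (U.astype(int) @ V.astype(int) > 0).astype(int)

def hamming_error(X, U, V):
    # Number of entries where the Boolean reconstruction differs from X.
    return int(np.sum(X != boolean_product(U, V)))

def best_rows(X_rows, V):
    # For each row x, exactly minimize the mismatch between x and u o V over
    # u in {0,1}^k by enumeration (a stand-in for an IP solver; small k only).
    k = V.shape[0]
    candidates = np.array(list(itertools.product([0, 1], repeat=k)))
    recon = boolean_product(candidates, V)  # shape (2^k, n)
    # errors[c, r] = mismatches between candidate c and row r
    errors = (recon[:, None, :] != X_rows[None, :, :]).sum(axis=2)
    return candidates[errors.argmin(axis=0)]

def alternating_bmf(X, k, n_iters=20, block_frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.integers(0, 2, size=(m, k))
    V = rng.integers(0, 2, size=(k, n))
    history = [hamming_error(X, U, V)]
    for _ in range(n_iters):
        # Fix V; update a random block of rows of U.
        rows = rng.choice(m, size=max(1, int(block_frac * m)), replace=False)
        U[rows] = best_rows(X[rows], V)
        # Fix U; update a random block of columns of V (transposed problem).
        cols = rng.choice(n, size=max(1, int(block_frac * n)), replace=False)
        V[:, cols] = best_rows(X[:, cols].T, U.T).T
        history.append(hamming_error(X, U, V))
    return U, V, history
```

Because each block update solves its subproblem exactly while the other factor is held fixed, the recorded error never increases from one iteration to the next. The sketch gives no guarantee of reaching a global optimum.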

Keywords

matrix factorization, federated learning, integer programming

Disciplines

Computer Sciences | Theory and Algorithms

Comments

Note: This report is based on our paper "Federated Boolean matrix factorization using integer programming" (Phan, D. N., Nguyen, Q. A., & Nguyen, T. N., 2025), accepted to the 24th IEEE International Conference on Machine Learning and Applications (ICMLA 2025) and under review at the time of publication in this series.

Acknowledgment: I’d like to thank the Berry Summer Thesis Institute and the University Honors Program at the University of Dayton for allowing me to do research full time over the summer and for funding that work. I also want to thank all the excellent staff who have run this program over 13 years, giving students an incredible experience and plenty of resources. Special thanks go to Dr. Ngoc Nguyen and Dr. Nhat Phan for being great mentors this summer and guiding me through the research process.

