Abstract
The Huffman algorithm is a standard algorithm used to generate an optimized binary encoding of a string. Using this algorithm, it may be possible to come up with more than one possible optimized binary encoding for a given string, so the following question arises: what strings have a unique optimized binary encoding? By unique encoding, we mean a unique vertex-labeled binary tree resulting from the Huffman algorithm. We do not worry about edge labels as every binary tree corresponding to a string with n distinct characters has 2n−1 ways of assigning the edge labels and determining the code. This investigation was inspired by an exercise in Pearls in Graph Theory by Nora Hartsfield and Gerhard Ringel. The Huffman algorithm is used primarily for file compression. This includes but is not limited to image, audio, and text compression. Given this algorithm’s standard use and current relevance, anything that adds to the body of knowledge on this algorithm is potentially helpful for those who employ its use. Our primary focus in this paper concerns the following question:
Question 1.0.1. Is the percentage of strings with n distinct characters and a total size of k that produce unique vertex-labeled trees under the Huffman algorithm monotonic with respect to k for fixed n? Is the percentage monotonic with respect to n for fixed k?
We will provide experimental data that motivated this and a corresponding conjecture. We will answer one of the two questions in the negative, and provide some evidence that the other may be true.
Recommended Citation
Diller, Elijah and Volk, Adam
(2026)
"Huffman Tree Uniqueness,"
Electronic Proceedings of Undergraduate Mathematics Day: Vol. 9, Article 2.
Available at:
https://ecommons.udayton.edu/epumd/vol9/iss1/2