Document Type
Article
Publication Date
8-2013
Publication Source
ACM Transactions on Multimedia Computing, Communications, and Applications
Abstract
Decrypting the secret of beauty or attractiveness has been the pursuit of artists and philosophers for centuries. Computational models for attractiveness estimation have been actively explored in the computer vision and multimedia communities, but mainly with a focus on facial features. In this article, we conduct a comprehensive study on female attractiveness conveyed by single/multiple modalities of cues, that is, face, dressing and/or voice; the aim is to discover how different modalities individually and collectively affect the human sense of beauty.
To investigate the problem extensively, we collect the Multi-Modality Beauty (M2B) dataset, which is annotated with attractiveness levels converted from manual k-wise ratings and semantic attributes of different modalities. Inspired by the common consensus that middle-level attribute prediction can assist higher-level computer vision tasks, we manually label many attributes for each modality. Next, a tri-layer Dual-supervised Feature-Attribute-Task (DFAT) network is proposed to jointly learn the attribute model and attractiveness model of single/multiple modalities.
To remedy the possible loss of information caused by incomplete manual attributes, we also propose a novel Latent Dual-supervised Feature-Attribute-Task (LDFAT) network, where latent attributes are combined with manual attributes to contribute to the final attractiveness estimation. Extensive experimental evaluations on the collected M2B dataset demonstrate the effectiveness of the proposed DFAT and LDFAT networks for female attractiveness prediction.
Inclusive pages
1-20
ISBN/ISSN
1551-6857
Document Version
Postprint
Copyright
Copyright © 2013, Association for Computing Machinery
Publisher
Association for Computing Machinery
Volume
9
Peer Reviewed
yes
Issue
4
eCommons Citation
Nguyen, Tam; Liu, Si; Ni, Bingbing; Tan, Jun; Rui, Yong; and Yan, Shuicheng, "Towards Decrypting Attractiveness via Multi-Modality Cues" (2013). Computer Science Faculty Publications. 73.
https://ecommons.udayton.edu/cps_fac_pub/73
Included in
Graphics and Human Computer Interfaces Commons, Other Computer Sciences Commons, Social Psychology Commons
Comments
This document available for download is the authors' accepted manuscript, provided in compliance with the publisher's policy on self-archiving. Differences may exist between this document and the published version, which is available using the link provided. Permission documentation is on file.