Please use this identifier to cite or link to this item:
http://hdl.handle.net/10609/109843
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Melenchón, Javier | - |
dc.contributor.author | Martínez Marroquín, Elisa | - |
dc.contributor.author | Torre Frade, Fernando de la | - |
dc.contributor.author | Montero Morales, José Antonio | - |
dc.contributor.other | Universitat Oberta de Catalunya. eLearning Innovation Center | - |
dc.contributor.other | Universitat Ramon Llull | - |
dc.contributor.other | Carnegie Mellon University | - |
dc.date.accessioned | 2020-02-18T08:24:04Z | - |
dc.date.available | 2020-02-18T08:24:04Z | - |
dc.date.issued | 2009-01 | - |
dc.identifier.citation | Melenchón Maldonado, J., Martínez Marroquín, E., De la Torre Frade, F. & Montero, J. (2009). Emphatic Visual Speech Synthesis. IEEE Transactions on Audio, Speech and Language Processing, 17(3), 459-468. doi: 10.1109/TASL.2008.2010213 | es |
dc.identifier.issn | 1558-7916 | - |
dc.identifier.uri | http://hdl.handle.net/10609/109843 | - |
dc.description.abstract | The synthesis of talking heads has been a flourishing research area over the last few years. Since human beings have an uncanny ability to read people's faces, most related applications (e.g., advertising, video-teleconferencing) require absolutely realistic photometric and behavioral synthesis of faces. This paper proposes a person-specific facial synthesis framework that allows high realism and includes a novel way to control visual emphasis (e.g., level of exaggeration of visible articulatory movements of the vocal tract). There are three main contributions: a geodesic interpolation with visual unit selection, a parameterization of visual emphasis, and the design of minimum size corpora. Perceptual tests with human subjects reveal high realism properties, achieving similar perceptual scores as real samples. Furthermore, the visual emphasis level and two communication styles show a statistical interaction relationship. | en |
dc.language.iso | eng | - |
dc.publisher | IEEE Transactions on Audio, Speech and Language Processing | - |
dc.relation.ispartof | IEEE Transactions on Audio, Speech and Language Processing, 2009, 17(3) | - |
dc.relation.uri | https://doi.org/10.1109/TASL.2008.2010213 | - |
dc.subject | audiovisual speech synthesis | en |
dc.subject | emphatic visual-speech | en |
dc.subject | talking head | en |
dc.subject | síntesi audiovisual de la veu | ca |
dc.subject | síntesis audiovisual de la voz | es |
dc.subject | discurs visual emfàtic | ca |
dc.subject | discurso visual enfático | es |
dc.subject | tertulià | ca |
dc.subject | tertuliano | es |
dc.subject.lcsh | Speech processing systems | en |
dc.title | Emphatic visual speech synthesis | - |
dc.type | info:eu-repo/semantics/article | - |
dc.subject.lemac | Processament de la parla | ca |
dc.subject.lcshes | Procesamiento del habla | es |
dc.rights.accessRights | info:eu-repo/semantics/closedAccess | - |
dc.identifier.doi | 10.1109/TASL.2008.2010213 | - |
dc.gir.id | AR/0000004340 | - |
dc.relation.projectID | info:eu-repo/grantAgreement/TEC2006-08043/TCM | - |
Appears in Collections: | Articles científics |
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.