Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/109843
Full metadata record
DC Field | Value | Language
dc.contributor.author | Melenchón, Javier | -
dc.contributor.author | Martínez Marroquín, Elisa | -
dc.contributor.author | Torre Frade, Fernando de la | -
dc.contributor.author | Montero Morales, José Antonio | -
dc.contributor.other | Universitat Oberta de Catalunya. eLearning Innovation Center | -
dc.contributor.other | Universitat Ramon Llull | -
dc.contributor.other | Carnegie Mellon University | -
dc.date.accessioned | 2020-02-18T08:24:04Z | -
dc.date.available | 2020-02-18T08:24:04Z | -
dc.date.issued | 2009-01 | -
dc.identifier.citation | Melenchón Maldonado, J., Martínez Marroquín, E., De la Torre Frade, F. & Montero, J. (2009). Emphatic Visual Speech Synthesis. IEEE Transactions on Audio, Speech and Language Processing, 17(3), 459-468. doi: 10.1109/TASL.2008.2010213 | es
dc.identifier.issn | 1558-7916 | -
dc.identifier.uri | http://hdl.handle.net/10609/109843 | -
dc.description.abstract | The synthesis of talking heads has been a flourishing research area over the last few years. Since human beings have an uncanny ability to read people's faces, most related applications (e.g., advertising, video-teleconferencing) require absolutely realistic photometric and behavioral synthesis of faces. This paper proposes a person-specific facial synthesis framework that allows high realism and includes a novel way to control visual emphasis (e.g., level of exaggeration of visible articulatory movements of the vocal tract). There are three main contributions: a geodesic interpolation with visual unit selection, a parameterization of visual emphasis, and the design of minimum size corpora. Perceptual tests with human subjects reveal high realism properties, achieving similar perceptual scores as real samples. Furthermore, the visual emphasis level and two communication styles show a statistical interaction relationship. | en
dc.language.iso | eng | -
dc.publisher | IEEE Transactions on Audio, Speech and Language Processing | -
dc.relation.ispartof | IEEE Transactions on Audio, Speech and Language Processing, 2009, 17(3) | -
dc.relation.uri | https://doi.org/10.1109/TASL.2008.2010213 | -
dc.subject | audiovisual speech synthesis | en
dc.subject | emphatic visual-speech | en
dc.subject | talking head | en
dc.subject | síntesi audiovisual de la veu | ca
dc.subject | síntesis audiovisual de la voz | es
dc.subject | discurs visual emfàtic | ca
dc.subject | discurso visual enfático | es
dc.subject | tertulià | ca
dc.subject | tertuliano | es
dc.subject.lcsh | Speech processing systems | en
dc.title | Emphatic visual speech synthesis | -
dc.type | info:eu-repo/semantics/article | -
dc.subject.lemac | Processament de la parla | ca
dc.subject.lcshes | Procesamiento del habla | es
dc.rights.accessRights | info:eu-repo/semantics/closedAccess | -
dc.identifier.doi | 10.1109/TASL.2008.2010213 | -
dc.gir.id | AR/0000004340 | -
dc.relation.projectID | info:eu-repo/grantAgreement/TEC2006-08043/TCM | -
Appears in collections: Articles científics

Files in this item:
There are no files associated with this item.

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.