End-to-end global to local convolutional neural network learning for hand pose recovery in depth data

Madadi, Meysam; Escalera, Sergio; Baró, Xavier; González, Jordi

Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/137009

Full metadata record

DC Field	Value	Language
dc.contributor.author	Madadi, Meysam	-
dc.contributor.author	Escalera, Sergio	-
dc.contributor.author	Baró, Xavier	-
dc.contributor.author	González, Jordi	-
dc.contributor.other	Universitat Oberta de Catalunya (UOC)	-
dc.date.accessioned	2022-01-03T18:11:50Z	-
dc.date.available	2022-01-03T18:11:50Z	-
dc.date.issued	2021-08-12	-
dc.identifier.citation	Madadi, M., et al.: End-to-end global to local convolutional neural network learning for hand pose recovery in depth data. IET Comput. Vis. 1-17 (2021). https://doi.org/10.1049/cvi2.12064	-
dc.identifier.issn	1751-9632	-
dc.identifier.uri	http://hdl.handle.net/10609/137009	-
dc.description.abstract	Despite recent advances in 3-D pose estimation of human hands, thanks to the advent of convolutional neural networks (CNNs) and depth cameras, this task is still far from being solved in uncontrolled setups. This is mainly due to the highly non-linear dynamics of fingers and self-occlusions, which make hand model training a challenging task. In this study, a novel hierarchical tree-like structured CNN is exploited, in which branches are trained to become specialised in predefined subsets of hand joints called local poses. Further, local pose features, extracted from hierarchical CNN branches, are fused to learn higher order dependencies among joints in the final pose by end-to-end training. Lastly, the loss function used is also defined to incorporate appearance and physical constraints about doable hand motions and deformations. Finally, a non-rigid data augmentation approach is introduced to increase the amount of training depth data. Experimental results suggest that feeding a tree-shaped CNN, specialised in local poses, into a fusion network for modelling joints' correlations and dependencies, helps to increase the precision of final estimations, showing competitive results on NYU, MSRA, Hands17 and SyntheticHand datasets.	en
dc.language.iso	eng	-
dc.publisher	IET Computer Vision	-
dc.rights.uri	http://creativecommons.org/licenses/by-nc/3.0/es/	-
dc.subject	computer vision	en
dc.subject	data acquisition	en
dc.subject	human computer interaction	en
dc.subject	learning (artificial-intelligence)	en
dc.subject	pose estimation	en
dc.title	End-to-end global to local convolutional neural network learning for hand pose recovery in depth data	-
dc.type	info:eu-repo/semantics/article	-
dc.rights.accessRights	info:eu-repo/semantics/openAccess	-
dc.identifier.doi	https://doi.org/10.1049/cvi2.12064	-
dc.gir.id	AR/0000009134	-
Appears in collections:	Articles científics; Articles

Files in this item:

File	Description	Size	Format
IET Computer Vision - 2021 - Madadi - End-to-end global to local convolutional neural network learning for hand pose.pdf		2.73 MB	Adobe PDF

This item is subject to a Creative Commons license.