Toward sensitive document release with privacy guarantees

Sánchez Ruenes, David; Batet, Montserrat

Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/93054

Full metadata record

DC Field	Value	Language
dc.contributor.author	Sánchez Ruenes, David	-
dc.contributor.author	Batet, Montserrat	-
dc.contributor.other	Universitat Rovira i Virgili (URV)	-
dc.contributor.other	Universitat Oberta de Catalunya. Internet Interdisciplinary Institute (IN3)	-
dc.date.accessioned	2019-04-11T07:53:58Z	-
dc.date.available	2019-04-11T07:53:58Z	-
dc.date.issued	2016-11	-
dc.identifier.citation	Sánchez, D. & Batet, M. (2017). Toward sensitive document release with privacy guarantees. Engineering Applications of Artificial Intelligence, 59, 23-34. doi: 10.1016/j.engappai.2016.12.013	-
dc.identifier.issn	0952-1976	-
dc.identifier.uri	http://hdl.handle.net/10609/93054	-
dc.description.abstract	Privacy has become a serious concern for modern Information Societies. The sensitive nature of much of the data that are exchanged daily or released to untrusted parties requires that responsible organizations undertake appropriate privacy protection measures. Nowadays, many of these data are texts (e.g., emails, messages posted in social media, healthcare outcomes) that, because of their unstructured and semantic nature, constitute a challenge for automatic data protection methods. In fact, textual documents are usually protected manually, in a process known as document redaction or sanitization. To do so, human experts identify sensitive terms (i.e., terms that may reveal identities and/or confidential information) and protect them accordingly (e.g., via removal or, preferably, generalization). To relieve experts from this burdensome task, in a previous work we introduced the theoretical basis of C-sanitization, an inherently semantic privacy model that provides the basis for the development of automatic document redaction/sanitization algorithms and offers clear, a priori privacy guarantees on data protection. Despite its potential benefits, C-sanitization still presents some limitations when applied in practice (mainly regarding flexibility, efficiency and accuracy). In this paper, we propose a new, more flexible model, named (C, g(C))-sanitization, which enables an intuitive configuration of the trade-off between the desired level of protection (i.e., controlled information disclosure) and the preservation of the utility of the protected data (i.e., amount of semantics to be preserved). We also present a set of technical solutions and algorithms that provide an efficient and scalable implementation of the model and improve its practical accuracy, as we illustrate through empirical experiments.	en
dc.language.iso	eng	-
dc.publisher	Engineering Applications of Artificial Intelligence	-
dc.relation.ispartof	Engineering Applications of Artificial Intelligence, 2017, 59	-
dc.relation.uri	https://doi.org/10.1016/j.engappai.2016.12.013	-
dc.rights	CC BY-NC-ND	-
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	-
dc.subject	document redaction	en
dc.subject	sanitization	en
dc.subject	semantics	en
dc.subject	ontologies	en
dc.subject	privacy	en
dc.subject	redacción de documentos	es
dc.subject	desinfección	es
dc.subject	semántica	es
dc.subject	ontologías	es
dc.subject	privacidad	es
dc.subject	redacció de documents	ca
dc.subject	higienització	ca
dc.subject	semàntica	ca
dc.subject	ontologies	ca
dc.subject	privacitat	ca
dc.subject.lcsh	Data protection	en
dc.title	Toward sensitive document release with privacy guarantees	-
dc.type	info:eu-repo/semantics/article	-
dc.subject.lemac	Protecció de dades	ca
dc.subject.lcshes	Protección de datos	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	-
dc.identifier.doi	10.1016/j.engappai.2016.12.013	-
dc.gir.id	AR/0000005425	-
dc.type.version	info:eu-repo/semantics/submittedVersion	-
Appears in Collections:	Articles científics; Articles

Files in This Item:

File	Description	Size	Format
towardsensitive.pdf	Preprint	454,17 kB	Adobe PDF
