Project A5

Semantic and Statistical Linkability

Principal Investigators

Michael Backes

Gerhard Weikum

PhD Students

Praveen Manoharan

Joanna Biega

Project Summary

One of the biggest threats in modern digital habitats is identity linkability across disparate platforms (e.g., identifying a particular user in an anonymous health discussion forum). The long-term goal of this project is to develop a deep semantic understanding of such linkability, together with tool support for analyzing it. How can we characterize user profile linkability risks? How can we automatically analyze the risk at stake for a particular user, given the heterogeneous and unstructured nature of typical user platform content? Given such an understanding, how can we exploit it to best protect the user without affecting functionality? We will address these challenges by devising a user-centric privacy measure that assesses a user’s ability to hide amongst her peers. We will employ text understanding technology, and connect to visual data analysis methods, to gain a technological grasp on linkability through user content, and we will investigate sources of linkability through network traffic. We will use this understanding to inform users about linkability risks and to guide protection mechanisms against linking, in particular automated rephrasing and targeted (and hence cost-effective) anonymization.

