ROI: A Method for Identifying Organizations Receiving Personal Data

Published in Computing (Springer), Volume 106, Pages 163–184 (2024), 2023

Abstract

This paper introduces ROI (Receiver Organization Identifier), a method for identifying the organizations that receive personal data from apps and digital services. By combining privacy policy analysis with WHOIS lookups, ROI achieves 95.71% precision, outperforming prior techniques such as SSL inspection and WebXray.

To demonstrate ROI’s effectiveness in real-world scenarios, the method was applied to over 10,000 Android apps, identifying 1,112 unique domains receiving personal data. The study revealed that only 22% of apps properly disclose all recipients in their privacy policies, while 78% fail to meet transparency obligations.
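
The disclosure check described above can be illustrated with a minimal sketch. It assumes that an app "properly discloses" its recipients when every organization observed receiving its data is named somewhere in its privacy policy text. The function names, the naive substring matching, and the toy inputs are all hypothetical and are not taken from the paper.

```python
def undisclosed_recipients(observed_orgs, policy_text):
    """Return the observed recipient organizations the policy never names.

    Assumption: a case-insensitive substring match stands in for the
    real policy analysis, so "Meta" would also match "metadata".
    """
    text = policy_text.casefold()
    return {org for org in observed_orgs if org.casefold() not in text}


def share_fully_disclosing(apps):
    """Fraction of apps whose policy names every observed recipient."""
    if not apps:
        return 0.0
    compliant = sum(
        1 for observed, policy in apps
        if not undisclosed_recipients(observed, policy)
    )
    return compliant / len(apps)


# Toy, made-up inputs for illustration only (not data from the study).
toy_apps = [
    ({"Meta", "Google"}, "We share data with Google and Meta for analytics."),
    ({"Meta", "ExampleAds"}, "We share data with Meta."),
]
```

A production check would need entity matching that tolerates aliases and subsidiaries (for example, a policy that names a parent company only), which plain substring matching does not handle.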

Key Contributions

  • 📱 Applied ROI to 10,000 Android apps, identifying 40,000+ personal data flows.
  • 🏢 Identified the recipient organization in 82% of flows, with Meta and Google as the most frequent recipients.
  • ⚖️ Found that 78% of apps fail to properly disclose data recipients as required by the GDPR.
  • 📈 ROI combines named-entity recognition, privacy policy mining, and WHOIS lookups to reach 95.71% precision.
  • 🧪 Public datasets released with 1,112 recipient domains and their corresponding organizations.
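
As a rough illustration of how the policy and WHOIS signals listed above might be combined, the sketch below tries the privacy policy first and falls back to the WHOIS record. This ordering is an assumption, as are all function names and the redaction markers. The regular expression is only a stand-in for a real named-entity recognizer, and the code works on text that has already been fetched, so it makes no network calls.

```python
import re
from collections import Counter

# Stand-in for named-entity recognition: capitalized words followed by a
# legal suffix. A real system would use a trained NER model instead.
ORG_PATTERN = re.compile(r"(?:[A-Z][\w&-]*\s)+(?:Inc\.|LLC|Ltd\.|GmbH)")

# Illustrative placeholders that privacy services put in WHOIS records.
REDACTED_MARKERS = ("redacted", "privacy", "proxy")


def org_from_policy(policy_text):
    """Return the most frequently mentioned organization name, or None."""
    mentions = Counter(m.strip() for m in ORG_PATTERN.findall(policy_text))
    return mentions.most_common(1)[0][0] if mentions else None


def org_from_whois(raw_whois):
    """Return the registrant organization from raw WHOIS text, or None."""
    match = re.search(r"^[ \t]*Registrant Organization:[ \t]*(.+)$",
                      raw_whois, flags=re.IGNORECASE | re.MULTILINE)
    if not match:
        return None
    org = match.group(1).strip()
    if not org or any(marker in org.lower() for marker in REDACTED_MARKERS):
        return None
    return org


def identify_receiver(policy_text=None, raw_whois=None):
    """Return (organization, source); the policy is tried before WHOIS."""
    org = org_from_policy(policy_text) if policy_text else None
    if org:
        return org, "privacy_policy"
    org = org_from_whois(raw_whois) if raw_whois else None
    if org:
        return org, "whois"
    return None, "unresolved"
```

Returning the source alongside the organization makes it possible to audit which signal produced each match, which is useful when WHOIS records are redacted or a policy names several companies.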

👉 Read the full paper

Recommended citation: D. Rodriguez, J.M. Del Alamo, M. Cozar, B. García. "ROI: A Method for Identifying Organizations Receiving Personal Data." Computing 106, 163–184 (2024). https://doi.org/10.1007/s00607-023-01209-2