Document Type : Research Article

Authors

1 Information Systems and Security Lab. (ISSL), Sharif University of Tech., Tehran, Iran

2 Information Systems and Security Lab. (ISSL), Sharif University of Tech., Tehran, Iran

Abstract

To enhance the accuracy of learning models, it is necessary to train them on more extensive datasets. Unfortunately, access to such data is often restricted because data providers are hesitant to share their data due to privacy concerns. Hence, it is critical to develop obfuscation techniques that empower data providers to transform their datasets into new ones that ensure the desired level of privacy. In this paper, we present an approach in which data providers use a neural network based on the autoencoder architecture to safeguard the sensitive components of their data while preserving the utility of the remaining parts. More specifically, within the autoencoder framework and after the encoding process, a classifier extracts the private feature from the dataset. This feature is then decorrelated from the remaining features and subsequently perturbed with noise. The proposed method is flexible, allowing data providers to adjust the desired level of privacy by changing the noise level. Additionally, our approach achieves a better trade-off between utility and privacy than similar methods, while maintaining a simpler structure.
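The abstract describes the obfuscation pipeline only at a high level; the following is a minimal PyTorch sketch of that structure for illustration. The layer sizes, the projection used as the decorrelation step, and the Gaussian noise model are assumptions made for this sketch, not the authors' exact design.

import torch
import torch.nn as nn

class ObfuscatingAutoencoder(nn.Module):
    """Autoencoder with a private-feature classifier head (illustrative sketch only)."""

    def __init__(self, in_dim=784, latent_dim=32, noise_std=0.5):
        super().__init__()
        self.noise_std = noise_std  # larger noise -> stronger privacy, lower utility
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        self.private_head = nn.Linear(latent_dim, 1)  # classifier for the private feature
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        s = torch.sigmoid(self.private_head(z))  # predicted private attribute
        # Decorrelation step (assumption): project out the latent direction
        # aligned with the private classifier, so the rest of z carries utility.
        w = self.private_head.weight                  # shape (1, latent_dim)
        direction = w / w.norm()
        z_util = z - (z @ direction.t()) * direction
        # Perturb only the private component with Gaussian noise, then decode.
        s_noisy = s + self.noise_std * torch.randn_like(s)
        x_obf = self.decoder(z_util + s_noisy * direction)
        return x_obf, s

# Usage: obfuscate a batch and inspect the predicted private attribute.
model = ObfuscatingAutoencoder()
x = torch.rand(8, 784)
x_obf, s_pred = model(x)

Training such a network would jointly minimize a reconstruction loss on the released data and a classification loss for the private feature; the noise_std parameter then plays the role of the adjustable privacy level mentioned in the abstract.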

Keywords
