Submitted: 25 April 2024
Posted: 26 April 2024
Abstract
Keywords:
1. Introduction
- To address the poor performance of traditional PSI protocols when the participants’ datasets differ greatly in size, this paper first introduces an unbalanced PSI protocol based on Cuckoo filters. The protocol combines commutative encryption with Cuckoo filters used for private information retrieval, and it is evaluated experimentally.
- To alleviate the computational and storage burden on clients in the first protocol, the paper further proposes an unbalanced PSI protocol based on single cloud assistance and conducts experimental analysis. This strategy effectively migrates computational and storage tasks to cloud services, significantly optimizing resource utilization efficiency.
- Because the unbalanced PSI protocol based on single cloud assistance cannot resist collusion attacks and therefore carries a risk of data leakage, the paper further designs an unbalanced PSI protocol based on dual cloud assistance. By employing homomorphic encryption and other security technologies, this scheme removes the data leakage risk of the single-cloud protocol and prevents collusion attacks.
- Building on the unbalanced PSI protocol based on dual cloud assistance, the research introduces the concept of a PSI network and formulates corresponding data update strategies, significantly enhancing the practicality of the protocol.
2. Related Works
2.1. Design framework of private set intersection protocol
2.1.1. Design Framework Based on Public Key Encryption
- Based on Diffie-Hellman (DH) theory: Meadows [5] used the DH key exchange mechanism, which is based on the discrete logarithm problem, to implement a PSI protocol. In contrast, Huberman [6] and his team explored the use of elliptic curve cryptography in PSI, noting its significant advantages in security and efficiency compared to traditional discrete logarithm-based PSI methods.
- Based on the RSA assumption: DeCristofaro and others [7] developed a semi-honest PSI protocol using RSA blind signature technology based on the integer factorization problem. Another study [8] showed that PSI schemes based on discrete logarithm cryptography demonstrated higher efficiency compared to those based on integer factorization cryptography.
- Based on homomorphic encryption: Freedman and his team [9] represented elements as roots of polynomials and encrypted the coefficients of these polynomials using Paillier homomorphic encryption, combined with zero-knowledge proofs, to implement a two-party PSI protocol resistant to malicious attacks. In 2016, Freedman et al. [10] further improved computational efficiency through the ElGamal encryption mechanism and reduced the protocol’s computational complexity using Cuckoo hashing [4]. Abadi et al. [11] introduced a set representation method based on point-value pairs of d-degree polynomials, implemented through the Paillier encryption scheme, reducing the multiplication complexity from $O(d^2)$ to $O(d)$ [4]. Kissner and other researchers [12] adopted different polynomial representation methods, reducing computational costs to be linearly proportional to the number of participants. Jarecki and others [13] used additive homomorphic encryption and zero-knowledge proofs to implement an oblivious pseudorandom function (PRF). Hazay and others [14] developed an additive homomorphic encryption scheme that supports threshold decryption for implementing multi-party semi-honest PSI protocols. Dou Jiawei and others [15] combined Paillier encryption to propose a PSI protocol based on the formula for calculating the area of triangles and rational number encoding.
2.1.2. Design Framework Based on Garbled Circuits
2.1.3. Design Framework Based on Oblivious Transfer
2.2. Private contact discovery
3. Related theories and technologies
3.1. Multi-party secure computation security model
- Semi-honest model: In this model, participants adhere to the protocol’s execution rules but may attempt to gather other participants’ inputs, outputs, and any accessible information during the execution of the protocol. This model assumes that the participants do not deviate from the established procedural rules but will use all available information to deduce the private data of others.
- Malicious adversary model: Unlike the semi-honest model, the malicious adversary model accounts for the possibility that attackers may manipulate a subset of the participants to perform illicit actions, such as submitting incorrect input data or maliciously altering data to steal the private information of honest participants. Malicious adversaries might also disrupt the protocol by intentionally terminating its execution or by refusing to participate, thus preventing the protocol’s completion.
3.2. Cuckoo filter
3.3. Paillier homomorphic encryption
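- Additive Homomorphism: If $c_1 = E(m_1)$ and $c_2 = E(m_2)$, then $D(c_1 \cdot c_2 \bmod n^2) = (m_1 + m_2) \bmod n$, where $E$ and $D$ denote Paillier encryption and decryption under the modulus $n$. This allows addition to be performed on ciphertexts without decrypting them first.
- Scalar Multiplication Homomorphism: If $c = E(m)$, then $D(c^k \bmod n^2) = (k \cdot m) \bmod n$ for a plaintext scalar $k$. This means that a ciphertext can be multiplied by a plaintext scalar without decryption.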
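Both properties can be checked on a toy Paillier instance. The following is a minimal sketch, assuming the common simplification g = n + 1 and two very small primes chosen only for illustration; the function names (`encrypt`, `decrypt`, `add`, `scalar_mul`) are hypothetical and not taken from the paper.

```python
import math
import random

# Toy key material (assumption): real deployments use primes of 1024+ bits.
P, Q = 1789, 1861
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # inverse of lambda mod n; sufficient when g = n + 1


def encrypt(m: int) -> int:
    """E(m) = (1 + n)^m * r^n mod n^2, with a fresh random r coprime to n."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    # (1 + n)^m mod n^2 equals 1 + m*n, which avoids one exponentiation.
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """D(c) = L(c^lambda mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    u = pow(c, LAM, N2)
    return (u - 1) // N * MU % N


def add(c1: int, c2: int) -> int:
    """Ciphertext product corresponds to plaintext sum."""
    return c1 * c2 % N2


def scalar_mul(c: int, k: int) -> int:
    """Ciphertext power corresponds to plaintext times scalar k."""
    return pow(c, k, N2)


total = decrypt(add(encrypt(1234), encrypt(5678)))
scaled = decrypt(scalar_mul(encrypt(1234), 7))
```

Because encryption is randomized, two encryptions of the same plaintext differ, yet both decrypt to the same value; the homomorphic results above are taken modulo n.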
4. Unbalanced PSI Protocol Based on Cuckoo Filters
4.1. PSI Protocol Constructed Based on DH Key Exchange Mechanism
4.1.1. Preprocessing Stage:
- Hash Processing: Each participant applies the same hash function to each element in their dataset to form a hash-processed dataset, ready for subsequent encryption and computation processes.
4.1.2. Exchange and Computation Stage:
- Exponentiation: Participant A raises each element of its dataset to the power a, where a is A’s private key, and forms a new set.
- Data Exchange: Participant A sends the above-computed set to Participant B.
- Auxiliary Dataset Construction: Upon receiving the dataset from A, Participant B raises each element to the power b, where b is B’s private key, to build an auxiliary dataset, and sends the resulting set back to A.
- Exponentiation: At the same time, Participant B raises each element of its own dataset to the power b and sends this set to A as well.
4.1.3. Intersection Identification Stage:
- Exponentiation and Comparison: After receiving the two datasets from B, Participant A raises each element of the second dataset (B’s own elements, already raised to the power b) to the power a. A then compares the results with the auxiliary dataset received from B.
- Intersection Determination: If an element of B’s dataset, after being raised to the power a, matches an element of the auxiliary dataset sent by B, then that element belongs to the intersection of the datasets of A and B.
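The three stages can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the paper's implementation: the modulus `P` is a toy Mersenne prime, SHA-256 plays the role of the shared hash function, both parties run in one process, and the helper names (`h`, `keygen`, `dh_psi`) are hypothetical.

```python
import hashlib
import secrets
from math import gcd

P = 2**127 - 1  # toy prime modulus (assumption); far too small for real use


def h(item: str) -> int:
    """Shared hash function mapping an element into [1, P - 1]."""
    digest = hashlib.sha256(item.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def keygen() -> int:
    """Private exponent coprime to P - 1, so exponentiation is a permutation."""
    while True:
        k = secrets.randbelow(P - 2) + 1
        if gcd(k, P - 1) == 1:
            return k


def dh_psi(set_a, set_b):
    a, b = keygen(), keygen()  # private keys of A and B
    # A -> B: H(x)^a for every x held by A.
    a_once = [pow(h(x), a, P) for x in set_a]
    # B -> A: auxiliary dataset (H(x)^a)^b, kept in A's order,
    # together with B's own elements H(y)^b.
    auxiliary = [pow(v, b, P) for v in a_once]
    b_once = [pow(h(y), b, P) for y in set_b]
    # A raises B's elements to the power a and compares with the auxiliary set.
    b_twice = {pow(v, a, P) for v in b_once}
    return {x for x, v in zip(set_a, auxiliary) if v in b_twice}


common = dh_psi(["alice", "bob", "carol"], ["bob", "dave", "carol"])
```

The comparison works because exponentiation commutes: (H(x)^a)^b and (H(y)^b)^a are equal exactly when H(x) = H(y), up to hash collisions. Both parties exponentiate every element they hold, which is the cost the unbalanced setting aims to avoid.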
4.1.4. Experimental Analysis
4.2. Unbalanced PSI Protocol Based on Cuckoo Filters
4.2.1. Definition of main participants and related symbols
- database server: Represents the database server that holds all user data.
- client: Represents the mobile client who wants to perform private contact discovery services.
- X and Y represent the dataset of the database server and the client, respectively.
- represents the private key of the database server in the Diffie-Hellman encryption algorithm.
- represents the random number generated by the client for the Diffie-Hellman encryption algorithm.
- H represents the hash function negotiated by the client and database server for use.
- represents Cuckoo Filter, represents the operation of adding an element to the Cuckoo filter, represents the operation of checking whether a specific element exists in the filter.
- represents the i-th element of set X. Similarly, , , etc., also represent similar meanings.
- represents the set containing ciphertexts sent by the client to the database server.
- represents the set containing ciphertexts sent by the database server to the client.
- represents the result obtained through a series of exchange and decryption operations, used to retrieve the filter to obtain the intersection.
4.2.2. Protocol Process
- Preprocessing: In the preprocessing phase, the client and server perform a series of preparatory steps to ensure the security and efficiency of subsequent interactions. The specific steps are as follows:
- Security parameter negotiation: The client and database server agree on the large prime number q used in the DH encryption algorithm and the hash function H used.
- Database Server Generates Private Key: The database server generates its own private key , used for the Diffie-Hellman (DH) encryption algorithm.
- Data Scrambling: The client and database server scramble their own datasets X and Y for randomization, enhancing data privacy and security.
- Client Data Preprocessing: The client calculates and generates random numbers , used for the Diffie-Hellman (DH) encryption algorithm.
- Creation of Cuckoo Filter: The database server generates a Cuckoo filter by using the operation , and sends the filter to the client for privacy-preserving set intersection queries.
- Intersection: In the intersection phase, the client and database server perform a series of encryption and decryption operations to blind the client’s elements securely and compute the intersection of the two sets. The specific operations are as follows:
- Element Blinding and Interactive Encryption Operations: The client and the database server interact through a series of asymmetric encryption and decryption operations to blind the client’s elements. Specifically, the client calculates and sends C to the database server. The database server uses its private key to compute and sends back to the client.
- Intersection Computation: After receiving , the client checks whether they belong to the filter through the check operation , thereby calculating the intersection of the sets. Specifically, after receiving sent by the database server, the client computes and uses the result to query the filter to obtain the intersection element .
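The blind-exponentiate-unblind pattern described above can be sketched as follows. Several things here are assumptions, since the formulas are not reproduced in this section: the client blinds each hashed element with a random exponent r, the server raises it to its private key (called `alpha` here), and the client removes r with its modular inverse. A plain Python set stands in for the Cuckoo filter, the modulus is a toy prime, and all names are hypothetical.

```python
import hashlib
import secrets
from math import gcd

P = 2**127 - 1  # toy prime modulus (assumption); not secure


def h(item: str) -> int:
    digest = hashlib.sha256(item.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def rand_exp() -> int:
    """Random exponent invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 2) + 1
        if gcd(k, P - 1) == 1:
            return k


class DatabaseServer:
    def __init__(self, items):
        self.alpha = rand_exp()  # server's private key
        self._items = list(items)

    def publish_filter(self):
        # Stand-in for the Cuckoo filter: the set of H(x)^alpha values.
        return {pow(h(x), self.alpha, P) for x in self._items}

    def respond(self, blinded):
        # The server only ever sees blinded values H(y)^r.
        return [pow(c, self.alpha, P) for c in blinded]


def client_intersect(items, server, received_filter):
    exps = [rand_exp() for _ in items]
    blinded = [pow(h(y), r, P) for y, r in zip(items, exps)]
    answers = server.respond(blinded)
    found = []
    for y, r, s in zip(items, exps, answers):
        # (H(y)^(r*alpha))^(1/r) = H(y)^alpha, by Fermat's little theorem.
        unblinded = pow(s, pow(r, -1, P - 1), P)
        if unblinded in received_filter:
            found.append(y)
    return found


server = DatabaseServer(f"+1-555-{i:04d}" for i in range(1000))
contacts = ["+1-555-0007", "+1-999-0000", "+1-555-0999"]
matches = client_intersect(contacts, server, server.publish_filter())
```

The server's exponentiations over its large set happen once, when the filter is built, while each query costs the client two exponentiations per element. A real Cuckoo filter would add a small false-positive rate that the set stand-in does not have.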
4.2.3. Correctness Analysis
4.2.4. Security Analysis
4.2.5. Experimental Analysis
4.3. Summary of This Chapter
5. Unbalanced PSI Protocol Based on Single Cloud Assistance
5.1. Definition of main participants and related symbols
- database server: Represents the database server that holds all user data.
- client: Represents a mobile client that wants to discover private contacts.
- cloud server: Represents an auxiliary server that assists the client in performing intersection operations, undertaking most of the computational and storage pressures.
- X and Y respectively represent the dataset of the database server and the client dataset.
- represents the private key of the database server in the Diffie-Hellman encryption algorithm.
- represents the random number generated by the client, used to blind the data.
- represents the random number generated by the client for the Diffie-Hellman encryption algorithm.
- H represents the hash function negotiated for use by the client and database server.
- represents the Cuckoo Filter, represents the operation to add an element to the Cuckoo filter, represents the operation to check if a specified element exists in the filter.
- represents the i-th element of the set X. Similarly, , , etc., also represent similar meanings.
- represents the set of ciphertexts sent by the client to the database server.
- represents the set of ciphertexts sent by the database server to the client.
- represents the result obtained through a series of exchange and decryption operations, used to retrieve the filter to obtain the intersection.
5.2. Protocol Process
5.2.1. Preprocessing
- Security parameter negotiation: Each role discusses the necessary security parameters—all parties share the large prime q used in the DH cryptographic algorithm. The client and database server negotiate to generate and the hash function H.
- Database server generates a private key: The database server generates its own private key , for use in the Diffie-Hellman encryption algorithm.
- Data scrambling: The client and database server each scramble their own datasets X and Y.
- Client data preprocessing: The client calculates , generates random numbers , and calculates .
5.2.2. Outsourcing
- Database server sends data to the cloud server: The database server uses its private key to perform the operation , creates a Cuckoo filter , and sends it to the cloud server.
- Client sends data to the cloud server: The client sends the random numbers and to the cloud server. After receiving the data sent by the client, the cloud server calculates . At this point, the cloud server has saved the client’s blinded data.
5.2.3. Intersection
- Cloud server sends data: The cloud server sends the blinded data to the database server.
- Database server processes data: Upon receiving , the database server uses its private key to calculate , and sends the result back to the cloud server.
- Cloud server processes data: After receiving from the database server, the cloud server calculates and uses the result to search . If exists in , it returns the index j of and sends j to the client.
- Obtaining the intersection: The client obtains the intersection element through the received index j.
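The message flow of the three phases can be illustrated with a simplified sketch. It is not the paper's construction: the exact blinding formulas are not reproduced in this section, so the sketch assumes that the client hands the cloud server a blinded value H(y)^r together with the unblinding exponent 1/r, that a Python set stands in for the Cuckoo filter, and that a toy prime is used. All class and function names are hypothetical.

```python
import hashlib
import secrets
from math import gcd

P = 2**127 - 1  # toy prime modulus (assumption); not secure


def h(item: str) -> int:
    digest = hashlib.sha256(item.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def rand_exp() -> int:
    while True:
        k = secrets.randbelow(P - 2) + 1
        if gcd(k, P - 1) == 1:
            return k


class DatabaseServer:
    def __init__(self, items):
        self.alpha = rand_exp()  # private key
        self._items = list(items)

    def build_filter(self):
        # Outsourcing: filter of H(x)^alpha (a set stands in for it here).
        return {pow(h(x), self.alpha, P) for x in self._items}

    def exponentiate(self, blinded):
        return [pow(c, self.alpha, P) for c in blinded]


def client_outsource(items):
    """Client-side preprocessing: blinded values and unblinding exponents."""
    exps = [rand_exp() for _ in items]
    blinded = [pow(h(y), r, P) for y, r in zip(items, exps)]
    unblind = [pow(r, -1, P - 1) for r in exps]
    return blinded, unblind


class CloudServer:
    def __init__(self, stored_filter, blinded, unblind):
        self.stored_filter = stored_filter
        self.blinded = blinded
        self.unblind = unblind

    def intersect(self, db):
        # Forward blinded data, unblind the reply, query the filter,
        # and report only the matching indices j.
        answers = db.exponentiate(self.blinded)
        return [j for j, (s, e) in enumerate(zip(answers, self.unblind))
                if pow(s, e, P) in self.stored_filter]


db = DatabaseServer(["ann", "ben", "cat", "dan"])
contacts = ["cat", "eve", "ann"]
cloud = CloudServer(db.build_filter(), *client_outsource(contacts))
indices = cloud.intersect(db)
intersection = [contacts[j] for j in indices]
```

After outsourcing, the client stores nothing but its own list and performs no exponentiation during the intersection phase; it only maps the returned indices back to elements. In this simplified form the cloud server holds everything needed to unblind, which illustrates why collusion between the cloud server and the database server is a concern.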
5.3. Correctness Analysis
5.4. Security Analysis
5.5. Experimental Analysis
5.5.1. Data Storage Volume
5.5.2. Protocol Running Time
5.6. Summary of This Chapter
6. Unbalanced PSI Protocol Based on Dual Cloud Assistance
6.1. Definition of main participants and related symbols
- database server: Represents the database server that holds all user data.
- client: Represents the mobile client that wishes to perform private contact discovery services.
- cloud server : Acts as an auxiliary server for the client, handling the majority of computation and storage pressures.
- cloud server : Another auxiliary server handling substantial computational and storage demands.
- X and Y: Represent the dataset of the database server and the client, respectively.
- : Represents the private key of the database server used in the Diffie-Hellman encryption algorithm.
- H: The hash function agreed upon by the client and the database server for use.
- : Represents the Cuckoo Filter, where denotes the operation to add elements, and checks for the presence of specific elements.
- : Random exponentials generated by the client for cloud server , for cloud server .
- a: A secret value held by the client.
- : Random numbers used by the client for sending obfuscated data to cloud server , and for where .
- : The ciphertext collection sent from cloud server to the database server, and from ; and are specific elements within these collections.
- and : Processed ciphertext collections returned to and from the database server; and are specific elements within these collections.
- and : Final processed ciphertext collections at and after receiving data from the database server; and are specific elements within these collections.
- : Represents the result of multiplying and used to query the filter.
- j: Represents the index used by the client to obtain the intersection.
6.2. Protocol Process
6.2.1. Preprocessing
- Discuss security parameters: Each party discusses the necessary security parameters—the large prime q used in DH encryption and the client’s public key required for the Paillier encryption system. The client and the database server negotiate the creation of hash function H.
- Client sends : The client generates its private secret number a and sends to the database server.
- Database server generates private key: The database server creates its private key , used for the DH encryption algorithm.
- Data scrambling: The client and the database server each shuffle their respective datasets.
- Client calculates hashes and generates random numbers: The client computes and generates random numbers , , , , and computes , where .
6.2.2. Outsourcing
- Client sends data to cloud servers: The client sends , to cloud server , and , to cloud server . computes , and computes . At this point, and hold the client’s obfuscated data.
- Database server sends data to cloud servers: Using , the database server performs the filter insertion operation to generate a Cuckoo filter and sends it to cloud server . stores the filter sent by the database server.
6.2.3. Intersection
- and send data: and each send their respective collections and to the database server.
- Database server processes data: Upon receiving the data, the database server uses its private key to compute and sends the results back to . It also processes and sends the results back to .
- processes data: After receiving data from the database server, uses the random number to calculate and sends the results to .
- processes data: Upon receiving data from and the database server, calculates . checks if exists in . If it does, it returns the index j of and sends it to the client.
- Obtaining the intersection: The client receives the index j and retrieves the intersecting element .
6.3. Correctness Analysis
6.4. Security Analysis
- The client runs the preprocessing algorithm, sharing the cryptographic hash function H and the large prime q used in the protocol with the adversary.
- The client simulates the outsourcing algorithm and sends their (encrypted) input to the adversary.
- The client and the adversary simulate the intersection algorithm and discard any output.
- The adversary is asked to output a guess of the client’s input y.
- In step four of Figure 3, since and are unknown to the adversary, cannot be derived. The adversary can only attempt exhaustive guessing, thus making negligible.
- In subsequent steps, as A does not know the client’s private key for the Paillier encryption system, it is impractical to decrypt the ciphertexts, making it even more challenging to derive . For instance, , and since the private key used in Paillier’s system by the client is unknown, decrypting this compound is complex and hence remains secure.
6.5. Experimental Analysis
6.5.1. Data Computation Volume
- unbalanced PSI protocol based on Cuckoo filters: Two rounds of modular exponentiation operations and filter retrieval.
- unbalanced PSI protocol based on single cloud assistance: A single round of multiplication operations and outputting based on index j.
- unbalanced PSI protocol based on dual cloud assistance: Two rounds of multiplication operations and outputting based on index j.
- Modular Exponentiation Operation: Representing computation-intensive operations, modular exponentiation becomes particularly time-consuming when dealing with large numbers. On standard hardware setups, a single instance of modular exponentiation might take from a few milliseconds to several tens of milliseconds, depending mainly on the size of the numbers involved and the efficiency of the algorithm.
- Multiplication Operation: Compared to modular exponentiation, multiplication operations execute much faster on modern computing systems, even when involving large numbers, thanks to optimized algorithms that can keep times in the microsecond range. Therefore, whether it’s a single round of multiplication in the single-cloud protocol or two rounds in the dual-cloud protocol, the processing times are relatively short, typically ranging from a few microseconds to a few hundred microseconds.
- Cuckoo Filter Retrieval: Although relatively quick, the retrieval operation for a Cuckoo filter involves memory access, which may make it slightly slower than simple arithmetic operations. This type of operation typically takes from a few microseconds to several tens of microseconds, depending on the size of the filter and the efficiency of the implementation.
- Outputting Based on Index j: This operation involves retrieving an element from an array or list based on a specific index and is generally very fast, with processing times possibly ranging from a few nanoseconds to a few microseconds, primarily limited by memory access speeds.
- Unbalanced PSI Protocol Based On Cuckoo Filters: Primarily relies on two rounds of modular exponentiation, which are computation-intensive, especially when dealing with large numbers, making it the most time-consuming of all the operations reviewed.
- Unbalanced PSI Protocol Based On Single Cloud Assistance: By executing a single round of multiplication and an index-based data retrieval process, it significantly alleviates the computational burden on the client. Multiplication operations, even for large numbers, can be completed within the microsecond range (from a few to several hundred microseconds), and index-based data retrieval takes an extremely short time, usually just a few nanoseconds to a few microseconds.
- Unbalanced PSI Protocol Based On Dual cloud Assistance: Includes two rounds of multiplication operations and an index-based data retrieval process, also aiming to distribute the computational pressure on the client. Although it involves two rounds of multiplication, due to the inherent efficiency of the operation, the total processing time remains within an acceptable range.
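The relative cost of these operations can be checked with a small micro-benchmark. The sketch below is an illustration only: the 2048-bit operand size, the repetition counts, and the variable names are assumptions, the modulus is a random odd number rather than a prime (which does not affect timing), and absolute figures depend on the machine and include Python call overhead.

```python
import secrets
import timeit

BITS = 2048  # assumed operand size; the text does not state one

# Random odd modulus of full bit length; primality is irrelevant for timing.
modulus = secrets.randbits(BITS) | (1 << (BITS - 1)) | 1
base = secrets.randbelow(modulus)
other = secrets.randbelow(modulus)
exponent = secrets.randbits(BITS)
table = list(range(1000))


def per_call(fn, number: int) -> float:
    """Average wall-clock seconds per call over `number` repetitions."""
    return timeit.timeit(fn, number=number) / number


t_modexp = per_call(lambda: pow(base, exponent, modulus), 20)
t_mulmod = per_call(lambda: base * other % modulus, 20_000)
t_index = per_call(lambda: table[500], 200_000)

for name, seconds in [("modular exponentiation", t_modexp),
                      ("modular multiplication", t_mulmod),
                      ("index lookup", t_index)]:
    print(f"{name:>24}: {seconds * 1e6:12.3f} microseconds per call")
```

On typical hardware the exponentiation is expected to be slower than the multiplication by several orders of magnitude, which is the gap the cloud-assisted protocols exploit by leaving the client with multiplications and index lookups only.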
6.5.2. Protocol Running Time
6.6. Summary of This Chapter
6.7. Extensions
6.7.1. PSI Network
- Access and Authentication of Cloud Servers: Any server can apply to become a cloud server, also known as a server assistant. These servers must undergo a series of certification processes (including hardware performance verification, security vulnerability scanning, and compliance checks) to ensure they meet security and performance standards. Servers that pass the certification but later violate regulations will be blacklisted and removed. The system maintains platform security and trust through mechanisms such as regular security scans and real-time monitoring, with any violations leading to immediate removal and further investigation of the server.
- Mechanism for Selecting Server Assistants: When needing to perform PSI, clients choose two cloud servers based on their performance (such as processing power, storage capacity, and network bandwidth), stability, security capabilities, and compliance with regulations, among other hard and soft factors. Cloud servers with high availability promises are preferred to minimize the risk of failures.
- Execution Mechanism for PSI Operations: The PSI network supports client flexibility and system scalability; clients can execute PSI on different database servers by merely changing , without needing to redesign the entire system. This design enhances client flexibility and the system’s efficiency, reliability, and security.
6.7.2. Summary of the PSI Network
6.7.3. Data Updates
- Data Updates on the Database Server Side: As shown in Figure 6, the update details of the database server are as follows.
Definition of main participants and related symbols:
- database server: Represents the database server that wants to encrypt and upload updated data to cloud server .
- cloud server: Represents the cloud-assisted server that assists the database server in completing update operations.
- Z represents the set of data to be updated, represents the k-th element of Z.
- represents the load factor of the filter.
- represents the data after encryption processing.
- represents the operation index, used to determine whether the update operation is an insertion or deletion.
- U represents the set of data sent by the database server to the cloud-assisted server , represents the k-th element of U.
Update process:
- The database server has a set of elements Z it wants to insert or delete. These elements are blinded before being sent to cloud server . Specifically, .
- In addition to sending the blinded elements, the database server also sends an identifier variable to inform the cloud server whether the operation is an insertion or a deletion.
- During an insertion operation, first checks whether the current filter’s load factor exceeds 0.95.
- If the load factor is greater than 0.95, then must request the database server to generate a new filter using all elements to maintain high spatial and lookup efficiency of the filter.
- If the load factor is less than or equal to 0.95, then can directly insert the element into the current filter .
- In a deletion operation, removes the specified element from the filter , a process that does not require generating a new filter.
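The update rule can be sketched with a small Cuckoo filter that supports insertion, deletion, and a load-factor query. This is a minimal illustration, not the paper's code: the 16-bit fingerprints, the bucket size of 4, the SHA-256 based hashing, and the names `CuckooFilter`, `apply_update`, and `rebuild` are assumptions. The sketch also falls back to a rebuild when an insertion fails, a case the steps above do not cover.

```python
import hashlib
import random

MAX_LOAD = 0.95  # threshold from the update rule above


class CuckooFilter:
    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        # A power-of-two bucket count keeps the XOR alternate-index trick valid.
        assert num_buckets & (num_buckets - 1) == 0
        self.m, self.b, self.max_kicks = num_buckets, bucket_size, max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.count = 0

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _alt(self, index: int, fp: int) -> int:
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.m

    def _locate(self, item: bytes):
        fp = self._hash(b"fp" + item) % 65535 + 1  # 16-bit, never zero
        i1 = self._hash(item) % self.m
        return fp, i1, self._alt(i1, fp)

    def insert(self, item: bytes) -> bool:
        fp, i1, i2 = self._locate(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                self.count += 1
                return True
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(self.b)  # evict a random resident
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.b:
                self.buckets[i].append(fp)
                self.count += 1
                return True
        return False  # one fingerprint was displaced; caller should rebuild

    def contains(self, item: bytes) -> bool:
        fp, i1, i2 = self._locate(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: bytes) -> bool:
        fp, i1, i2 = self._locate(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                self.count -= 1
                return True
        return False

    def load_factor(self) -> float:
        return self.count / (self.m * self.b)


def apply_update(cf, element: bytes, op: str, rebuild):
    """Apply one update; `rebuild` asks the database server for a new filter
    built from all elements (assumed to include the new one)."""
    if op == "delete":
        cf.delete(element)
        return cf
    if cf.load_factor() > MAX_LOAD or not cf.insert(element):
        return rebuild()
    return cf
```

Deletion only removes a fingerprint from one of two candidate buckets, so it never forces a rebuild, whereas insertions into a nearly full filter trigger long eviction chains, which is the reason for a load-factor threshold.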
This section introduces the data update process for the database server under the unbalanced PSI protocol based on single cloud assistance. This series of update operations allows the database server to flexibly handle the insertion and deletion of elements based on the current state of the filter, ensuring the system’s efficiency and accuracy.
- Data Updates on the Client Side: As shown in Figure 7, the update details of the client are as follows.
Definition of main participants and related symbols:
- client: Represents the client who wants to perform data updates.
- cloud server : Represents the cloud-assisted server that assists the client in completing update operations.
- cloud server : Represents the cloud-assisted server that assists the client in completing update operations.
- Z represents the set of data to be updated, represents the k-th element of Z.
- represents the data after being processed by the hash function H.
- k represents the data index, used to determine the type of update, either insertion or deletion, and to retrieve the updated data based on the index.
- When adding data, represents the data processed through the dual-cloud scheme and sent to the two cloud-assisted servers. When deleting, is null.
- V represents the set of data sent by the client to the cloud-assisted server , represents the k-th element of V.
- represents the set of data sent by the client to the cloud-assisted server , represents the k-th element of .
Update process:
- The client has a set of elements Z it wants to insert or delete. In both cases, the client blinds each element and sends them to and respectively.
- The client sends a data index k to inform the cloud servers about the type of update, whether it is an insertion or a deletion. If the index is less than , it indicates a deletion operation. In this case, is null, and and delete the corresponding data based on the index.
- If the index is greater than , it indicates an addition operation, and the corresponding calculation results and index are saved.
- After completing a batch of deletion and addition operations, the relative order of the indices also needs to be adjusted. The update process is illustrated in Figure 5.
This section introduces a client data update process based on dual cloud assistance, designed to enhance the database’s dynamic management capabilities while ensuring data privacy and efficiency. This update protocol supports both data insertion and deletion operations, and through the cooperation of cloud-assisted servers and , it optimizes the speed and security of client data updates.
- Summary of Data Updates: This section has explored two key data update processes for the unbalanced PSI protocol based on dual cloud assistance: the database server update process and the client update process. Both processes are designed to handle data insertions and deletions efficiently while ensuring data security, and to use cloud server resources to optimize overall operation efficiency.
7. Conclusions and Future Work
7.1. Work Summary
- Addressing the shortcomings of traditional private set intersection protocols when the participants’ dataset sizes differ significantly, this paper first proposes the unbalanced PSI protocol based on Cuckoo filters. This protocol constructs a private set intersection approach from commutative encryption and Cuckoo filters used for private information retrieval.
- Given the complexity of the cryptographic operations and the storage demands in the unbalanced PSI protocol based on Cuckoo filters, this paper introduces an unbalanced PSI protocol based on single cloud assistance. This protocol offloads most of the client’s computational and storage burdens onto the cloud.
- In response to potential collusion between the cloud and database servers in the unbalanced PSI protocol based on single cloud assistance, this paper proposes an unbalanced PSI protocol based on dual cloud assistance with security mechanisms such as homomorphic encryption, which prevents collusion attacks while still offloading computational and storage burdens.
- Concerning practical issues in the unbalanced PSI protocol based on dual cloud assistance, this paper also introduces a conceptually meaningful PSI network and a data update mode tailored for the unbalanced PSI protocol based on dual cloud assistance.
7.2. Protocol Summary
7.3. Future Outlook
- All protocols are designed for two-party unbalanced private set intersections. Extending these protocols to multi-party scenarios is an important future direction, given the practical needs for multi-party computations.
- The protocols are developed under a semi-honest security model. Extending their robustness to malicious models, where adversaries may actively attempt to undermine the protocols, represents a crucial area for further research.
- The current protocols focus solely on set intersection. In practical applications, there may be requirements to perform further computations on the intersection results. Developing functionalities to support such computations post-intersection is another significant direction for future work.
References
- Bald, P.; Baronio, R.; Cristofaro, E.; Gasti, P.; Tsudik, G. Efficient and secure testing of fully-sequenced human genomes. Biological Sciences Initiative 2000, 470, 7–10. [Google Scholar]
- Chen, H.; Laine, K.; Rindal, P. Fast private set intersection from homomorphic encryption. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 1243–1255.
- Nagaraja, S.; Mittal, P.; Hong, C.Y.; Caesar, M.; Borisov, N. {BotGrep}: Finding {P2P} Bots with Structured Graph Analysis. 19th USENIX Security Symposium (USENIX Security 10), 2010.
- Li, W.; Liu, J.; Zhang, L.; Wang, Q.; He, C. A Survey on Set Intersection Computation for Privacy Protection. Journal of Computer Research and Development 2022, 59, 1782–1799. [Google Scholar]
- Meadows, C. A More Efficient Cryptographic Matchmaking Protocol for Use in the Absence of a Continuously Available Third Party. Proc. of the 7th IEEE Symposium on Security and Privacy; IEEE Computer Society: Los Alamitos, CA, 1986; pp. 134–134. [Google Scholar] [CrossRef]
- Huberman, B.; Franklin, M.; Hogg, T. Enhancing Privacy and Trust in Electronic Communities. Proc. of the 1st ACM Conference on Electronic Commerce; ACM: New York, 1999; pp. 78–86. [Google Scholar]
- DeCristofaro, E.; Tsudik, G. Experimenting with Fast Private Set Intersection. Proc. of Int. Conf. on Trust and Trustworthy Computing; Springer: Berlin, 2012; pp. 55–73. [Google Scholar] [CrossRef]
- Pinkas, B.; Schneider, T.; Zohner, M. Faster Private Set Intersection Based on OT Extension. Proc. of the 23rd USENIX Security Symposium; USENIX Association: Berkeley, CA, 2014; pp. 797–812. [Google Scholar]
- Freedman, M.; Nissim, K.; Pinkas, B. Efficient Private Matching and Set Intersection. Proc. of the 23rd Int. Conf. on the Theory and Applications of Cryptographic Techniques; Springer: Berlin, 2004. [Google Scholar] [CrossRef]
- Freedman, M.J.; Hazay, C.; Nissim, K.; et al. Efficient Set Intersection with Simulation-Based Security. Journal of Cryptology 2016, 29, 115–155. [Google Scholar] [CrossRef]
- Abadi, A.; Terzis, S.; Dong, C. O-PSI: Delegated Private Set Intersection on Outsourced Datasets. Proc of the 27th IFIP International Information Security and Privacy Conference; Springer: Berlin, 2015; pp. 3–17. [Google Scholar] [CrossRef]
- Kissner, L.; Song, D. Privacy-Preserving Set Operations. Proc. of the 25th Annual International Cryptology Conference; Springer: Berlin, 2005; pp. 241–257.
- Jarecki, S.; Liu, X. Efficient Oblivious Pseudorandom Function with Applications to Adaptive OT and Secure Computation of Set Intersection. LNCS 5444: Proc. of the 6th Theory of Cryptography Conference; Springer: Berlin, 2009; pp. 577–594.
- Hazay, C.; Venkitasubramaniam, M. Scalable Multi-party Private Set-Intersection. Proc. of the 20th IACR International Workshop on Public Key Cryptography; Springer: Berlin, 2017; pp. 175–203.
- Dou, J.; Liu, X.; Wang, W.; et al. Efficient and Secure Calculation of Two-Party Sets in the Field of Rational Numbers. Chinese Journal of Computers 2020, 43, 1397–1413.
- Damgård, I.; Pastro, V.; Smart, N.; et al. Multiparty Computation from Somewhat Homomorphic Encryption. Proceedings of the 32nd Annual Cryptology Conference; Springer: Berlin, 2012; pp. 643–662.
- Yao, A.C. Protocols for Secure Computations. Proc. of the 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982); IEEE: Piscataway, NJ, 1982; pp. 160–164.
- Goldreich, O.; Micali, S.; Wigderson, A. How to Play Any Mental Game. Proc. of the 19th ACM Symposium on Theory of Computing; ACM: New York, 1987; pp. 218–229.
- Pinkas, B.; Schneider, T.; Segev, G.; et al. Phasing: Private set intersection using permutation-based hashing. Proceedings of the 24th USENIX Security Symposium; USENIX Association, 2015; pp. 515–530.
- Pinkas, B.; Schneider, T.; Weinert, C.; et al. Efficient circuit-based PSI via cuckoo hashing. Proceedings of the 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques; Springer, 2018; pp. 125–157.
- Pinkas, B.; Schneider, T.; Tkachenko, O.; et al. Efficient circuit-based PSI with linear communication. Proceedings of the 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques; Springer, 2019; pp. 122–153.
- Huang, Y.; Evans, D.; Katz, J. Private Set Intersection: Are Garbled Circuits Better Than Custom Protocols? Proc. of the 19th Network and Distributed System Security Symposium; ISOC: Reston, VA, 2012.
- Naor, M.; Pinkas, B. Efficient oblivious transfer protocols. SODA 2001, 1, 448–457.
- Dong, C.; Chen, L.; Wen, Z. When private set intersection meets big data: an efficient and scalable protocol. Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, 2013, pp. 789–800.
- Rindal, P.; Rosulek, M. Improved private set intersection against malicious adversaries. Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer International Publishing, 2017, pp. 235–259.
- Zhang, E.; Liu, F.H.; Lai, Q.; et al. Efficient multi-party private set intersection against malicious adversaries. Proceedings of the 2019 ACM SIGSAC conference on cloud computing security workshop, 2019, pp. 93–104.
- Pinkas, B.; Rosulek, M.; Trieu, N.; et al. PSI from PaXoS: Fast, malicious private set intersection. Proceedings of the 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 2020, pp. 739–767.
- Orrù, M.; Orsini, E.; Scholl, P. Actively secure 1-out-of-n OT extension with application to private set intersection. Proceedings of Cryptographers’ Track at the RSA Conference. Springer, 2017, pp. 381–396.
- Rindal, P.; Schoppmann, P. VOLE-PSI: Fast OPRF and circuit-PSI from vector-OLE. IACR Cryptology ePrint Archive, 2021. https://eprint.iacr.org/2021/266.
- Schoppmann, P.; Gascón, A.; Reichert, L.; et al. Distributed vector-OLE: Improved constructions and implementation. Proceedings of the 26th ACM SIGSAC Conference on Computer and Communications Security. ACM, 2019, pp. 1055–1072.
- Weng, C.; Yang, K.; Katz, J.; et al. Wolverine: Fast, scalable, and communication-efficient zero-knowledge proofs for Boolean and arithmetic circuits. Cryptology ePrint Archive, 2020. https://eprint.iacr.org/2020/925.
- Hill, K. Facebook Figured Out My Family Secrets, And It Won’t Tell Me How. Gizmodo 2017. Published on August 25, 2017.
- Marlinspike, M. Private Contact Discovery for Signal, 2017. Accessed on September 26, 2017.
- Mittal, P.; Papamanthou, C.; Song, D. Preserving Link Privacy in Social Network Based Systems. NDSS, 2013.
- Abebe, R.; Nakos, V. Private Link Prediction in Social Networks. Technical report, Harvard University, 2014.
- Karwa, V.; Raskhodnikova, S.; Smith, A.; Yaroslavtsev, G. Private Analysis of Graph Structure. PVLDB 2011, 4.
- Dwork, C. A Firm Foundation for Private Data Analysis. Communications of the ACM 2011.
- Erlingsson, Ú.; Pihur, V.; Korolova, A. RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response. Proc. of the ACM Conference on Computer and Communications Security (CCS), 2014.
- Brendel, W.; Han, F.; Marujo, L.; Jie, L.; Korolova, A. Practical privacy-preserving friend recommendations on social networks. Companion Proceedings of The Web Conference 2018, 2018, 111–112.
- Su, G.; Xu, M. A Survey on Secure Multi-party Computation Technology and Applications. Information Communication Technologies and Policy 2019, 19–22.
- Li, A. Research on Multi-party Statistical Computations Based on Functional Encryption. PhD thesis, Wuhan University of Technology, Wuhan, 2017.
- Wang, H.; Dai, H.; Chen, S.; Chen, Z.; Chen, G. A Survey of Filter Data Structures. Computer Science 2024, 51, 35–40.
- Yu, M.; Fabrikant, A.; Rexford, J. BUFFALO: Bloom filter forwarding architecture for large organizations. Proceedings of International Conference on Emerging Networking Experiments and Technologies, 2009, pp. 313–324.
- Li, P.; Luo, B.; Zhu, W.; et al. Cluster-based distributed dynamic cuckoo filter system for Redis. International Journal of Parallel, Emergent and Distributed Systems 2020, 35, 340–353.
- Wang, F.; Chen, H.; Liao, L.; et al. The power of better choice: Reducing relocations in cuckoo filter. Proceedings of International Conference on Distributed Computing Systems, 2019, pp. 358–367.
- Gur, L.; Lis, D.; Dai, H.; et al. Adaptive online cache capacity optimization via lightweight working set size estimation at scale. Proceedings of USENIX Annual Technical Conference, 2023, pp. 467–484.
- Reviriego, P.; Martínez, J.; Larrabeiti, D.; et al. Cuckoo Filters and Bloom Filters: Comparison and Application to Packet Classification. IEEE Transactions on Network and Service Management 2020, 17, 2690–2701.







| Cardinality of Dataset from Participant One | Cardinality of Dataset from Participant Two | Protocol Runtime (seconds) |
| | | 1.7442 |
| | | 6.8024 |
| | | 55.2655 |
| | | 1849.2111 |
| | | 4.9248 |
| | | 10.0466 |
| | | 58.6051 |
| | | 1852.7098 |
| | | 20.0932 |
| | | 68.9472 |
| | | 1863.5443 |
| | | 165.4733 |
| | | 1964.6669 |
| Cardinality of Dataset from Participant One | Cardinality of Dataset from Participant Two | Original Protocol Runtime (seconds) | New Protocol Runtime (seconds) |
| | | 1.7442 | 0.1539 |
| | | 6.8024 | 0.1569 |
| | | 55.2655 | 0.1616 |
| | | 1849.2111 | 0.1693 |
| | | 4.9247 | 4.9239 |
| | | 10.0465 | 5.0232 |
| | | 58.6050 | 5.1709 |
| | | 1852.7097 | 5.4172 |
| | | 20.0931 | 20.0930 |
| | | 68.9471 | 20.6841 |
| | | 1863.5442 | 21.6690 |
| | | 165.4732 | 165.4731 |
| | | 1964.6668 | 173.3531 |
| Data Set Count | Size of Cuckoo Filter (MB) |
| | 0.535 |
| | 2.363 |
| | 21.678 |
| | 93.645 |
| | 194.436 |
| | 403.201 |
| | 3571.206 |
| | 7372.835 |
| | 15206.421 |
| Client Dataset Size | Database Server Dataset Size | Protocol 1 Running Time (seconds) | Protocol 2 Running Time (seconds) |
| | | 0.1539 | 0.1543 |
| | | 0.1569 | 0.1573 |
| | | 0.1616 | 0.1611 |
| | | 0.1693 | 0.1683 |
| | | 4.9239 | 3.8223 |
| | | 5.0232 | 3.9145 |
| | | 5.1709 | 4.0267 |
| | | 5.4172 | 4.2233 |
| | | 20.0930 | 15.6768 |
| | | 20.6841 | 16.1281 |
| | | 21.6690 | 16.8939 |
| | | 165.4731 | 129.0516 |
| | | 173.3531 | 135.2534 |
| Data Volume | Protocol I Running Time (s) | Protocol II Running Time (s) | Protocol III Running Time (s) |
| | 0.1539 | 0.1543 | 0.1551 |
| | 0.1569 | 0.1573 | 0.1586 |
| | 0.1616 | 0.1611 | 0.1635 |
| | 0.1693 | 0.1683 | 0.1701 |
| | 4.9239 | 3.8223 | 4.3707 |
| | 5.0232 | 3.9145 | 4.4709 |
| | 5.1709 | 4.0267 | 4.6001 |
| | 5.4172 | 4.2233 | 4.8206 |
| | 20.0930 | 15.6768 | 17.8801 |
| | 20.6841 | 16.1281 | 18.4004 |
| | 21.6690 | 16.8939 | 19.2802 |
| | 165.4731 | 129.0516 | 147.2608 |
| | 173.3531 | 135.2534 | 154.3003 |
| Protocol | Security | Client Storage & Computational Burden | Runtime |
| Unbalanced PSI Protocol based on Cuckoo Filters | High Security (No collusion attacks) | Requires storing Cuckoo filters and intensive computation | Longest |
| Unbalanced PSI Protocol based on Single Cloud Assistance | Security Risks (Cannot resist collusion attacks) | Shifted to cloud server | Fastest |
| Unbalanced PSI Protocol based on Dual Cloud Assistance | High Security (Can resist collusion attacks) | Shifted to cloud server | Moderate |
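
The qualitative runtime ranking in the comparison table (Longest, Fastest, Moderate) can be checked against the measured runtimes of Protocols I, II and III reported in the preceding runtime table. The following is a minimal Python sketch of that check. The names `RUNTIMES`, `NAMES`, `ranking` and `relative_to_first` are hypothetical helpers introduced here for illustration only, and the dataset sizes belonging to each row are not reproduced.

```python
# Runtimes in seconds, copied row by row from the three-protocol runtime table
# (columns: Protocol I, Protocol II, Protocol III).
# Assumption: row order follows the table; dataset sizes are not included.
RUNTIMES = [
    (0.1539, 0.1543, 0.1551),
    (0.1569, 0.1573, 0.1586),
    (0.1616, 0.1611, 0.1635),
    (0.1693, 0.1683, 0.1701),
    (4.9239, 3.8223, 4.3707),
    (5.0232, 3.9145, 4.4709),
    (5.1709, 4.0267, 4.6001),
    (5.4172, 4.2233, 4.8206),
    (20.0930, 15.6768, 17.8801),
    (20.6841, 16.1281, 18.4004),
    (21.6690, 16.8939, 19.2802),
    (165.4731, 129.0516, 147.2608),
    (173.3531, 135.2534, 154.3003),
]

NAMES = ("Protocol I", "Protocol II", "Protocol III")


def ranking(row):
    """Return the protocol names ordered from fastest to slowest for one row."""
    return [name for _, name in sorted(zip(row, NAMES))]


def relative_to_first(row):
    """Express each runtime as a fraction of Protocol I's runtime in that row."""
    return tuple(round(t / row[0], 3) for t in row)


if __name__ == "__main__":
    for row in RUNTIMES:
        print(ranking(row), relative_to_first(row))
```

On these figures, Protocol II is the fastest in all but the two smallest rows, where the three runtimes differ by about a millisecond and Protocol I is marginally ahead. From the fifth row onward the order is Protocol II, then Protocol III, then Protocol I, which matches the Fastest, Moderate and Longest labels in the comparison table.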
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).