A New Lightweight Symmetric Searchable Encryption Scheme for String Identification

ABSTRACT

In this paper, we provide an efficient and easy-to-implement symmetric searchable encryption scheme (SSE) for string search, which takes one round of communication, O(n) times of computations over n documents. Unlike previous schemes, we use hash-chaining instead of chain of encryption operations for index generation, which makes it suitable for lightweight applications. Unlike the previous SSE schemes for string search, with our scheme, server learns nothing about the frequency and the relative positions of the words being searched except what it can learn from the history. We are the first to propose probabilistic trapdoors in SSE for string search. We provide concrete proof of nonadaptive security of our scheme against honest-but-curious server based on the definitions. We also introduce a new notion of search pattern privacy, which gives a measure of security against the leakage from trapdoor. We have shown that our scheme is secure under search pattern indistinguishability definition. We show why SSE scheme for string search cannot attain adaptive indistinguishability criteria . We also propose modifications of our scheme so that the scheme can be used against active adversaries at the cost of more rounds of communications and memory space. We validate our scheme against two different commercial datasets .

EXISTING SYSTEM

The cloud is designed to hold a large number of encrypted documents. With the advent of cloud computing, growing number of clients and leading organizations have started adapting to the private storage outsourcing. This allows resource constrained clients to privately store large amounts of encrypted data in cloud at low cost. However, this prevents one from searching. This gives rise to a newly emerging field of research, called searchable encryption (SE). SE can be classified into symmetric searchable encryptions (SSE) and asymmetric searchable encryptions (ASE). In the SSE, the client encrypts the data and stores it on the cloud. It may be noted that client can organize the data in an arbitrary manner and can maintain additional data structures to achieve desired data efficiently. In this process, the initial client side computation is thus as large as the data, but subsequent computations to access data is less for both client and the cloud server.

PROBLEM STATEMENT

• The huge volumes of documents are stored in a cloud server, searching against a keyword may result into large number of documents.
• It causes unnecessary network traffic.
• The Heavy computations of public key encryption at the cost of small leakage of information.

PROPOSED SYSTEM

In this paper we have introduced the notion of search pattern security and have shown our scheme to be secure under search pattern indistinguishability definition. The novelty of our scheme is that although the index is generated by the client at the beginning, and remains same for the same dataset throughout the process and thus static in nature, however the trapdoors are dynamic in nature, making it more difficult for the eavesdroppers to understand the search patterns and thus is more secure against attacks like replay attacks, frequency analysis based attacks and many more. Our scheme achieves searches in one communication round and requires O(n) times computations for searching a string in n documents, which is optimal. Also the scheme requires no storage on the client side and O(n) storage on the server side for the n- document collection. Lastly, the scheme guarantees minimal leakage in a sense that server directly knows nothing about the frequency of the words being searched and their relative positions in the documents except what it can learn from the history of search. Unlike the index generation techniques (sequence of encryptions of a key), we use the hash-chain technique, which is faster, and is thus suitable for lightweight applications.

CONCLUSION

Searching for the desired document can be a difficult and resource intensive task. One solution may be to use symmetric searchable encryption (SSE) which allows one party to outsource the storage of its data to another party (a cloud) privately while enabling to search selectively over it. In this paper we revisited the security definitions of [12] and proposed a new lightweight SSE scheme _s;s for string search. We have shown that our scheme is secure under the non-adaptive indistinguishability definition [12]. For active adversary, we propose modification of the scheme _s;s at the additional cost of memory at client’s end and two rounds of communications for one modification of document collection. Towards this direction, future research can be performed to design efficient SSE scheme ideally with one round of communication. With our scheme, server does not learn the information related to word frequency and word positions except what it can learn from the history. We, for the first time, introduce new security notion in SSE, named, search pattern indistinguishability. It may be observed that with non adaptive indistinguishability security, although the keywords are guaranteed to be secure from the possible leakage from index, however it does not guarantee the security from the possible leakage from trapdoor. Towards this, we for the first time introduce probabilistic trapdoor and prove that our scheme is secure under such criterion. We have implemented our scheme for the first time to search over phone symbols and validated it using the TIMIT dataset. We have also implemented our scheme over DNA data of [1] and successfully achieve pattern matching functionality over encrypted domain. While dealing with string search, designing a SSE scheme satisfying adaptive-indistinguishability-security definition of [12] seems intuitively impossible. This is because to generate an index in advance which is consistent with future search, one unavoidable assumption needed is the presence of all possible strings in each document. This can be done by considering all permutations of keywords for every document which makes the index size exponential in n for n- document collection. According to the definition of [12], index size is linear in n which is essential from efficiency point of view. From the angle of this intuition, future research can be carried out to give a formal proof in support of non-existence of adaptively secure SSE scheme for string search. In this paper we have considered honest but- curious adversaries and active adversaries. Also, designing SSE scheme for string search with adaptive indistinguishability-security against some newly defined adversary can be a future research direction.