Data Leakage Detection

  • Published : March 4, 2012
International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE II, JUNE 2011]

[ISSN: 2231-4946]

Development of Data Leakage Detection Using Data Allocation Strategies

Rudragouda G Patil
Dept of CSE, The Oxford College of Engg, Bangalore.

Abstract - A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). If the data distributed to third parties is later found in a public or private domain, finding the guilty party is a nontrivial task for the distributor. Traditionally, data leakage is handled by watermarking, which requires modifying the data. If the watermarked copy is found at an unauthorized site, the distributor can claim ownership. To overcome the disadvantages of watermarking [2], data allocation strategies are used to improve the probability of identifying guilty third parties. In this project, we implement and analyze a guilt model that detects guilty agents using allocation strategies, without modifying the original data. A guilty agent is one who leaks a portion of the distributed data. The idea is to distribute the data intelligently to agents, based on sample data requests and explicit data requests, in order to improve the chance of detecting guilty agents. The algorithms implemented using fake objects improve the distributor’s chance of detecting guilty agents. We observe that minimizing the sum objective increases the chance of detecting guilty agents. We also developed a framework for generating fake objects.

Keywords - sensitive data; fake objects; data allocation strategies

I. INTRODUCTION

In the course of doing business, sensitive data must sometimes be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. We call the owner of the data the distributor and the supposedly trusted third parties the agents. The goal of this project is to detect when the distributor’s sensitive data has been leaked by agents, and to show the probability of identifying the agent that leaked the data.

We study unobtrusive techniques for detecting leakage of a set of objects or records. Specifically, we study the following scenario: after giving a set of objects to agents, the distributor discovers some of those same objects in an unauthorized place. (For example, the data may be found on a web site, or may be obtained through a legal discovery process.) At this point, the distributor can assess the likelihood that the leaked data came from one or more agents, as opposed to having been independently gathered by other means. We develop a model for assessing the “guilt” of agents. We also present algorithms for distributing objects to agents in a way that improves our chances of identifying a leaker. Finally, we consider the option of adding “fake” objects to the distributed set.

II. PROBLEM DEFINITION

Suppose a distributor owns a set T = {t_1, ..., t_m} of valuable data objects. The distributor wants to share some of the objects with a set of agents U_1, U_2, ..., U_n, but does not wish the objects to be leaked to other third parties. An agent U_i receives a subset of objects R_i ⊆ T, determined either by a sample request or an explicit request:

Sample request R_i = SAMPLE(T, m_i): any subset of m_i records from T can be given to U_i.

Explicit request R_i = EXPLICIT(T, cond_i): agent U_i receives all the T objects that satisfy cond_i.

The objects in T could be of any type and size, e.g., they could be tuples in a relation, or relations in a database. After giving objects to agents, the distributor discovers that a set S ⊆ T has leaked. This means that some third party, called the target, has been caught in possession of S. For example, this target may be displaying S on its web site, or perhaps as part of a legal discovery process, the target turned over S to the...
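
The two request types in the problem definition can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the parameter names `m` and `cond`, and the tuple-shaped example objects are assumptions made for illustration. The sample request is served here by uniform random choice, which is only one of the possible ways to pick "any subset". The `leaked_overlap` helper only intersects a leaked set S with an agent's allocation; it is not the guilt model.

```python
import random
from typing import Callable, Set, Tuple

# Hypothetical object type: an (id, city) tuple stands in for a record in T.
Obj = Tuple[str, str]


def sample_request(T: Set[Obj], m: int, rng: random.Random) -> Set[Obj]:
    """SAMPLE(T, m): return some subset of m objects from T.

    Assumption: the subset is drawn uniformly at random; the definition
    only requires that any m-subset is acceptable.
    """
    if not 0 <= m <= len(T):
        raise ValueError("m must be between 0 and |T|")
    # Sort first so the draw is reproducible for a given seed.
    return set(rng.sample(sorted(T), m))


def explicit_request(T: Set[Obj], cond: Callable[[Obj], bool]) -> Set[Obj]:
    """EXPLICIT(T, cond): return every object in T that satisfies cond."""
    return {t for t in T if cond(t)}


def leaked_overlap(S: Set[Obj], R: Set[Obj]) -> Set[Obj]:
    """Objects of the leaked set S that were also given to an agent holding R."""
    return S & R


if __name__ == "__main__":
    T = {("t1", "NY"), ("t2", "LA"), ("t3", "NY"), ("t4", "SF")}
    R1 = sample_request(T, 2, random.Random(0))         # agent 1: any 2 objects
    R2 = explicit_request(T, lambda t: t[1] == "NY")    # agent 2: all NY objects
    S = {("t1", "NY")}                                  # leaked set, S is a subset of T
    print(sorted(R1), sorted(R2), sorted(leaked_overlap(S, R2)))
```

A plain intersection shows which agents could have supplied a leaked object, but it does not by itself say how likely each agent is to be the leaker.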