Data Leakage Detection

International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE II, JUNE 2011]

[ISSN: 2231-4946]

Development of Data leakage Detection Using Data Allocation Strategies
Rudragouda G Patil
Dept of CSE, The Oxford College of Engg, Bangalore. patilrudrag@gmail.com

Abstract - A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). If the data distributed to third parties is found in a public or private domain, then finding the guilty party is a nontrivial task for the distributor. Traditionally, this leakage of data is handled by watermarking techniques, which require modification of the data. If the watermarked copy is found at some unauthorized site, then the distributor can claim ownership. To overcome the disadvantages of using watermarks [2], data allocation strategies are used to improve the probability of identifying guilty third parties. In this project, we implement and analyze a guilt model that detects the agents using allocation strategies without modifying the original data. The guilty agent is one who leaks a portion of the distributed data. The idea is to distribute the data intelligently to agents, based on sample data requests and explicit data requests, in order to improve the chance of detecting the guilty agents. The algorithms implemented using fake objects improve the distributor's chance of detecting guilty agents. It is observed that minimizing the sum objective increases the chance of detecting guilty agents. We also developed a framework for generating fake objects.

Keywords - sensitive data; fake objects; data allocation strategies

I. INTRODUCTION

In the course of doing business, sometimes sensitive data must be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. We call the owner of the data,



References:
[1] P. Papadimitriou and H. Garcia-Molina, "Data leakage detection," IEEE Transactions on Knowledge and Data Engineering, vol. 23, pp. 51-63, 2011.
[2] S. Czerwinski, R. Fromm, and T. Hodes, "Digital music distribution and audio watermarking."
[3] L. Sweeney, "Achieving k-anonymity privacy protection using generalization and suppression," 2002.
[4] S. U. Nabar, B. Marthi, K. Kenthapadi, N. Mishra, and R. Motwani, "Towards robustness in query auditing," in VLDB '06.

Hence, there are different allocations. In every allocation, the distributor can permute T objects and keep the same chances of guilty agent detection. The reason is that the guilt probability depends only on which agents have received the leaked objects and not on the identity of the leaked objects. Therefore, from the distributor's perspective there are different allocations. An object allocation that satisfies requests and ignores the distributor's objective is to give each agent a unique subset of T of size m.

The s-max algorithm allocates to an agent the data record that yields the minimum increase of the maximum relative overlap among any pair of agents. Following the notation of [1], R_i is the set of objects already allocated to agent U_i, m_i is the size of that agent's request, and t_k is an object of T. The s-max algorithm is as follows.

Step 1: Initialize min_overlap ← 1, the minimum out of the maximum relative overlaps that the allocations of different objects to U_i yield.
Step 2: for k ∈ {k' | t_k' ∉ R_i} do
    Initialize max_rel_ov ← 0, the maximum relative overlap between R_i and any set R_j that the allocation of t_k to U_i yields.
Step 3: for all j = 1, ..., n : j ≠ i and t_k ∈ R_j do
    Calculate absolute overlap as abs_ov ← |R_i ∩ R_j| + 1
    Calculate relative overlap as rel_ov ← abs_ov / min(m_i, m_j)
Step 4: Find maximum relative overlap as max_rel_ov ← MAX(max_rel_ov, rel_ov)
    If max_rel_ov ≤ min_overlap then
        min_overlap ← max_rel_ov
        ret_k ← k
Return ret_k

It can be shown that algorithm s-max is optimal for the sum-objective and the max-objective in problems where M ≤ |T| and n < |T|. It is also optimal for the max-objective if |T| ≤ M ≤ 2|T| or all agents request data of the same size.
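The s-max selection step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `select_object_s_max` and `allocate_s_max`, the use of Python sets for each agent's allocation, and the round-robin driver loop are all hypothetical choices made for this example. Ties are resolved in favor of the last candidate examined, which follows from the ≤ comparison in the algorithm.

```python
def select_object_s_max(i, R, m, T):
    """Pick the object to give agent i next (hypothetical helper).

    R: list of sets, R[j] holds the objects already allocated to agent j.
    m: list of ints, m[j] is the number of objects agent j requested.
    T: ordered list of all distributable objects.
    """
    min_overlap = 1.0  # smallest "maximum relative overlap" seen so far
    ret_k = None
    for t in T:
        if t in R[i]:
            continue  # only consider objects agent i does not hold yet
        max_rel_ov = 0.0
        for j in range(len(R)):
            if j != i and t in R[j]:
                # Overlap between agents i and j if t were given to agent i.
                abs_ov = len(R[i] & R[j]) + 1
                rel_ov = abs_ov / min(m[i], m[j])
                max_rel_ov = max(max_rel_ov, rel_ov)
        if max_rel_ov <= min_overlap:
            min_overlap = max_rel_ov
            ret_k = t
    return ret_k


def allocate_s_max(T, m):
    """Round-robin driver (an assumption of this sketch).

    Assumes every m[j] <= len(T), so a candidate object always exists.
    """
    R = [set() for _ in m]
    remaining = sum(m)
    while remaining > 0:
        for i in range(len(m)):
            if len(R[i]) < m[i]:
                R[i].add(select_object_s_max(i, R, m, T))
                remaining -= 1
    return R


if __name__ == "__main__":
    # Two agents each request two of four objects.
    print(allocate_s_max(["a", "b", "c", "d"], [2, 2]))
```

In the small usage example, the total request size does not exceed the number of objects, so the selection step can always find an object that no other agent holds and the two allocations end up disjoint.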
It is observed that the relative performance of the algorithms and the main conclusions do not change. If p approaches 0, it becomes easier to find guilty agents and algorithm performance converges. On the other hand, if p approaches 1, the relative differences among the algorithms grow, since more evidence is needed to find an agent guilty. The algorithm presented implements a variety of data distribution strategies that can improve the distributor's chances of identifying a leaker. It is shown that distributing objects judiciously can make a significant difference in identifying guilty agents, especially in cases where there is large overlap in the
