Watermarking Relational Databases using Optimization Based Techniques

Mohamed Shehab, Member, IEEE, Elisa Bertino, Fellow, IEEE, and Arif Ghafoor, Fellow, IEEE

Abstract: Proving ownership rights on outsourced relational databases is a crucial issue in today's Internet-based application environments and in many content distribution applications. In this paper, we present a mechanism for proof of ownership based on the secure embedding of a robust imperceptible watermark in relational data. We formulate the watermarking of relational databases as a constrained optimization problem, and discuss efficient techniques to solve the optimization problem and to handle the constraints. Our watermarking technique is resilient to watermark synchronization errors because it uses a partitioning approach that does not require marker tuples. Our approach overcomes a major weakness in previously proposed watermarking techniques. Watermark decoding uses a threshold-based technique, with the threshold chosen optimally to minimize the probability of decoding errors. We built a proof-of-concept implementation of our watermarking technique and show experimentally that it is resilient to tuple deletion, alteration, and insertion attacks.

Index Terms: Watermarking, Digital Rights, Optimization.
M. Shehab is with the Department of Software and Information Systems, University of North Carolina at Charlotte, Charlotte, NC 28223. E-mail: email@example.com. E. Bertino is with the Department of Computer Sciences, Purdue University, 250 N. University Street, West Lafayette, IN 47906. E-mail: firstname.lastname@example.org. A. Ghafoor is with the School of Electrical and Computer Engineering, Purdue University, 465 Northwestern Ave., West Lafayette, IN 47907. E-mail: email@example.com.

I. INTRODUCTION

The rapid growth of the Internet and related technologies has offered an unprecedented ability to access and redistribute digital content. In such a context, enforcing data ownership is an important requirement that calls for articulated solutions encompassing technical, organizational, and legal aspects. Though we are still far from such comprehensive solutions, in recent years watermarking techniques have emerged as an important building block that plays a crucial role in addressing the ownership problem. Such techniques allow the owner of the data to embed an imperceptible watermark into the data. A watermark describes information that can be used to prove the ownership of data, such as the owner, origin, or recipient of the content. Secure embedding requires that the embedded watermark must not be easily tampered with, forged, or removed from the watermarked data. Imperceptible embedding means that the presence of the watermark is unnoticeable in the data. Furthermore, watermark detection is blind; that is, it requires knowledge of neither the original data nor the watermark. Watermarking techniques have been developed for video, images, audio, and text data, and also for software and natural language text. By contrast, the problem of watermarking relational data has not been given appropriate attention. There are, however, many application contexts for which data represent an important asset, the ownership of which must thus be carefully enforced. This is the case, for example, for weather data, stock market data, power consumption data, consumer behavior data, and medical and scientific data. Watermark embedding for relational data is made possible by the fact that real data can very often tolerate a small amount of error without any significant degradation of their usability. For example, when dealing with weather data, changing some daily temperatures by 1 or 2 degrees is a modification that leaves the data still usable. To date only a few approaches to the problem of...