With the rapid development of speech, audio, image, and video compression methods, distributing digital multimedia over the Internet is no longer a difficult task. As a result, protecting digital intellectual property rights and authenticating content have become serious problems, and digital watermarking technology has received a great deal of attention. Generally, digital watermarking techniques are based on either spread spectrum methods or changing the least significant bits of selected coefficients of a certain signal transform. For speech watermarking, to ensure that the embedded watermark is imperceptible, the audio masking phenomenon is considered together with these conventional techniques.
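The least-significant-bit idea mentioned above can be illustrated with a minimal sketch. It is not the scheme of any particular system: it assumes integer 16-bit PCM samples stand in for the "selected coefficients" (a real method would typically operate on transform coefficients), and the names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(samples, bits):
    """Return a copy of `samples` with `bits` written into the least
    significant bit of the first len(bits) samples.

    Assumption: `samples` are integers (e.g. 16-bit PCM values) used here
    in place of transform coefficients.
    """
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the watermark")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        marked[i] = (marked[i] & ~1) | (bit & 1)
    return marked


def extract_lsb(samples, n_bits):
    """Read the watermark back from the lowest bit of each sample."""
    return [s & 1 for s in samples[:n_bits]]


# Illustrative host samples and watermark bits (made-up values).
host = [1200, -345, 78, 9001, -2, 15, 0, -32768]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(host, watermark)
recovered = extract_lsb(marked, len(watermark))
```

Each sample changes by at most one quantization step, which is why the mark is hard to hear. By the same token, any lossy compression that alters the low-order bits erases it.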
In addition, a speech watermarking system should be robust to various speech compression operations. The development of speech watermarking algorithms therefore involves a trade-off among speech fidelity, robustness, and watermark embedding rate. Speech watermarking techniques usually embed the watermark in perceptually unimportant parts of the speech signal, or in regions to which the human auditory system is insensitive. Some speech watermarking methods change an interval to embed the watermark. However, this kind of method has the drawback of an unavoidable degradation of robustness.
In other methods, the watermark is embedded by means of counterfeit human speech. Unfortunately, this type of method also suffers from weak robustness, especially when the counterfeit human speech is destroyed: any distortion of the counterfeit human speech also damages the watermark.
Fig 1.1: Block diagram of a general watermarking scheme
Therefore, we can define watermarking systems as systems in which the hidden message is related to the host signal, and non-watermarking systems as those in which the message is unrelated to the host signal. On the other hand, systems for embedding messages into host signals can be divided into steganographic systems, in which the existence of the message is kept secret, and non-steganographic systems, in which the presence of the embedded message does not have to be secret.
Audio watermarking algorithms are characterized by five essential properties: perceptual transparency, watermark bit rate, robustness, blind/informed watermark detection, and security.
Perceptual transparency
In most applications, the watermark-embedding algorithm has to insert additional data without affecting the perceptual quality of the audio host signal. The fidelity of the watermarking algorithm is usually defined as the perceptual similarity between the original and the watermarked audio sequence. However, the quality of the watermarked audio is usually degraded, either intentionally by an adversary or unintentionally in the transmission process, before a person perceives it. In that case, it is more appropriate to define the fidelity of a watermarking algorithm as the perceptual similarity between the watermarked audio and the original host audio at the point at which they are presented to a consumer.
Watermark bit rate
The bit rate of the embedded watermark is the number of embedded bits per unit of time and is usually given in bits per second (bps). Some audio watermarking applications, such as copy control, require the insertion of a serial number or author ID, with an average bit rate of up to 0.5 bps. For a broadcast monitoring watermark, the bit rate is higher, because the ID signature of a commercial must be embedded within the first second of the broadcast clip, with an average bit rate of up to 15 bps. In some envisioned applications, e.g. hiding speech in audio or a compressed audio stream in audio, algorithms have to be able to embed watermarks with a bit rate that is a significant fraction of the host audio bit...