Create README.md (#1)
Create README.md (f82cc8665fef399bbe5de542d6ff78ae4cf5ae7c)
README.md
ADDED
@@ -0,0 +1,89 @@
---
tags:
- audioseal
inference: false
---
# AudioSeal

We introduce AudioSeal, a method for localized speech watermarking with state-of-the-art robustness and detector speed. It jointly trains a generator that embeds a watermark in the audio and a detector that finds the watermarked fragments within longer audio, even in the presence of editing.
AudioSeal achieves state-of-the-art detection performance on both natural and synthetic speech at the sample level (1/16k second resolution). It alters signal quality only slightly and is robust to many types of audio editing.
AudioSeal is designed with a fast, single-pass detector that significantly surpasses existing models in speed, achieving detection up to two orders of magnitude faster. This makes it well suited to large-scale and real-time applications.

# :mate: Installation

AudioSeal requires Python >= 3.8, PyTorch >= 1.13.0, [omegaconf](https://omegaconf.readthedocs.io/), [julius](https://pypi.org/project/julius/), and numpy. To install from PyPI:

```
pip install audioseal
```

To install from source, clone this repo and install in editable mode:

```
git clone https://github.com/facebookresearch/audioseal
cd audioseal
pip install -e .
```

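After installing, it can be useful to confirm that the dependencies listed above are present. The following is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and the distribution names (`audioseal`, `torch`, `omegaconf`, `julius`, `numpy`) are assumed to match the package names on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them if your environment differs.
    for name in ("audioseal", "torch", "omegaconf", "julius", "numpy"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```
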
# :gear: Models

We provide the checkpoints for the following models:

- AudioSeal Generator.
  It takes as input an audio signal (as a waveform) and outputs a watermark of the same size as the input, which can be added to the input to watermark it.
  Optionally, it can also take as input a 16-bit secret message that will be encoded in the watermark.
- AudioSeal Detector.
  It takes as input an audio signal (as a waveform) and outputs, for each sample of the audio (every 1/16k s), the probability that the input contains a watermark.
  Optionally, it may also output the secret message encoded in the watermark.

Note that the message is optional and has no influence on the detection output. It may be used, for instance, to identify a model version (up to $2^{16} = 65536$ possible choices).

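To illustrate the model-version use case mentioned above, the sketch below packs an integer identifier into 16 bits and recovers it from per-bit probabilities such as those the detector returns. It uses plain Python lists so that it runs without PyTorch. The helper names (`int_to_bits`, `probs_to_bits`, `bits_to_int`), the most-significant-bit-first ordering, and the 0.5 threshold are illustrative assumptions, not part of the AudioSeal API.

```python
from typing import List

NBITS = 16  # message length stated in this README


def int_to_bits(value: int, nbits: int = NBITS) -> List[int]:
    """Encode an integer as a list of bits, most significant bit first."""
    if not 0 <= value < 2 ** nbits:
        raise ValueError(f"value must be in [0, {2 ** nbits - 1}]")
    return [(value >> shift) & 1 for shift in range(nbits - 1, -1, -1)]


def probs_to_bits(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Round per-bit probabilities to hard 0/1 decisions."""
    return [1 if p > threshold else 0 for p in probs]


def bits_to_int(bits: List[int]) -> int:
    """Decode a most-significant-bit-first list of bits back to an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


version_id = 513  # hypothetical model-version identifier
bits = int_to_bits(version_id)
# Simulated detector output: confident probabilities around the true bits.
probs = [0.9 if b else 0.1 for b in bits]
assert bits_to_int(probs_to_bits(probs)) == version_id
```

To embed such a message, the bit list would presumably be converted to a tensor of shape (batch, 16), for example with `torch.tensor([bits])`, before being passed as the `message` argument.
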
**Note**: We are working to release the training code for anyone who wants to build their own watermarker. Stay tuned!

# :abacus: Usage

AudioSeal provides a simple API to watermark an audio sample and to detect watermarks in it. Example usage:

```python

import torch  # only needed for the optional message example below

from audioseal import AudioSeal

# The model name corresponds to the YAML card file name found in audioseal/cards
model = AudioSeal.load_generator("audioseal_wm_16bits")

# Alternatively, load directly from the checkpoint
# model = Watermarker.from_pretrained(checkpoint_path, device = wav.device)

# A torch tensor of shape (batch, channels, samples) and a sample rate.
# It is important to resample the audio to the sample rate the model
# expects. In our case, we support 16 kHz audio
wav, sr = ..., 16000

watermark = model.get_watermark(wav, sr)

# Optional: you can add a 16-bit message to embed in the watermark
# msg = torch.randint(0, 2, (wav.shape[0], model.msg_processor.nbits), device=wav.device)
# watermark = model.get_watermark(wav, message = msg)

watermarked_audio = wav + watermark

detector = AudioSeal.load_detector("audioseal_detector_16bits")

# High-level detection: one score and one message for the audio.
result, message = detector.detect_watermark(watermarked_audio, sr)

print(result)  # result is a float indicating the probability that the audio is watermarked
print(message)  # message is a binary vector of 16 bits


# Low-level detection: per-frame probabilities and per-bit probabilities.
result, message = detector(watermarked_audio, sr)

# result is a tensor of size batch x 2 x frames, indicating the probability (positive and negative) of watermarking for each frame
# A watermarked audio should have result[:, 1, :] > 0.5
print(result[:, 1 , :])

# message is a tensor of size batch x 16, indicating the probability of each bit being 1.
# message will be a random tensor if the detector detects no watermarking in the audio
print(message)
```
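
Because the low-level output gives one probability per sample, watermarked regions can be localized by thresholding and grouping consecutive samples. The sketch below does this on a plain list of probabilities (for example, one obtained with `result[0, 1, :].tolist()`). The function name `watermarked_segments` and the default threshold of 0.5 are assumptions for illustration, not part of the AudioSeal API.

```python
from typing import List, Tuple


def watermarked_segments(
    probs: List[float], sample_rate: int = 16000, threshold: float = 0.5
) -> List[Tuple[float, float]]:
    """Group consecutive above-threshold samples into (start_s, end_s) intervals."""
    segments = []
    start = None  # index where the current watermarked run began
    for i, p in enumerate(probs):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            segments.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:  # the run extends to the end of the audio
        segments.append((start / sample_rate, len(probs) / sample_rate))
    return segments


# One second of audio at 16 kHz whose second half is watermarked.
probs = [0.1] * 8000 + [0.9] * 8000
print(watermarked_segments(probs))  # [(0.5, 1.0)]
```

In practice, short isolated runs may be false positives, so a minimum segment duration could be added as a further filter.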