Making deepfake tools doesn’t have to be irresponsible. Here’s how. - MIT Technology Review
Synthetic media technologies—popularly known as deepfakes—have real potential for positive impact. Voice synthesis, for example, will allow us to speak in hundreds of languages in our own voice. Video synthesis may help us simulate self-driving-car accidents to avoid mistakes in the future. And text synthesis can accelerate our ability to write both programs and prose.
But these advances can come at a gargantuan cost if we aren’t careful: the same underlying technologies can also enable deception with global ramifications.
Thankfully, we can both enable the technology’s promise and mitigate its peril. It will just take some hard work.
Approach 2: Discouraging malicious use
For synthetic media tools that are general and may be made widely available, there are still many possible ways to reduce malicious use. Here are some examples.
– Clear disclosure: Request that synthesized media be clearly indicated as such—particularly material that might be used to mislead. Tools may be able to support this by including clear visual or audible notices in output files, such as visible warnings or spoken disclosures. At minimum, metadata should indicate how media was synthesized or manipulated.
– Consent protection: Require the consent of those being impersonated. The voice cloning tool Lyrebird requires users to speak particular phrases in order to model their voice. This makes it more difficult to impersonate someone without consent, which would be easy if the tool simply generated voices from any provided data set. This, of course, applies only to tools that enable impersonation.
– Detection friendliness: Ensure that the synthesized media is not inordinately difficult to detect; keep detector tools up to date; collaborate with those working on detection to keep them in the loop on new developments.
– Hidden watermarks: Embed context about synthesis, or even the original media, through robust watermarks, using both methods that are accessible to anyone with the proper tools and approaches that are secret and difficult to remove. (For example, Modulate.ai watermarks audio that it generates, while products like Imatag and its open-source equivalents enable watermarking for imagery.)
– Usage logs: Store information about usage and media outputs in a way that researchers and journalists can access to determine whether, for example, a video was probably synthesized using a particular tool. This could include storing timestamps of synthesis with a robust hash or media embedding.
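To make the usage-log idea concrete, here is a minimal Python sketch of what "timestamps of synthesis with a robust hash" might look like. It is an illustration under stated assumptions, not a description of how any tool named above works. The names `average_hash`, `hamming_distance`, `UsageLog` and the tool label passed to `record` are hypothetical, and the "average hash" used here is only a simple stand-in for a robust hash: a production system would more likely use a stronger perceptual hash or a learned media embedding. The image is assumed to arrive as rows of grayscale brightness values.

```python
from datetime import datetime, timezone


def average_hash(pixels, hash_size=8):
    """Return a (hash_size * hash_size)-bit perceptual hash as an int.

    `pixels` is a list of rows of grayscale values (0-255). The image must be
    at least hash_size pixels tall and wide. This is an illustrative stand-in
    for a "robust hash", not a production-grade one.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a hash_size x hash_size grid by averaging blocks.
    for i in range(hash_size):
        r0, r1 = i * height // hash_size, (i + 1) * height // hash_size
        for j in range(hash_size):
            c0, c1 = j * width // hash_size, (j + 1) * width // hash_size
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    # One bit per cell: is it brighter than the average cell?
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


class UsageLog:
    """Hypothetical in-memory log of synthesis events."""

    def __init__(self):
        self.entries = []

    def record(self, tool, pixels, when=None):
        """Store the tool name, a UTC timestamp, and the output's hash."""
        when = when or datetime.now(timezone.utc)
        entry = {
            "tool": tool,
            "timestamp": when.isoformat(),
            "hash": f"{average_hash(pixels):016x}",  # 64 bits for hash_size=8
        }
        self.entries.append(entry)
        return entry

    def lookup(self, pixels, max_distance=5):
        """Return logged entries whose hash is within max_distance bits."""
        query = average_hash(pixels)
        return [
            e for e in self.entries
            if hamming_distance(query, int(e["hash"], 16)) <= max_distance
        ]
```

A researcher holding a suspicious image would call `lookup` on it. Because the hash tolerates small changes such as a slight brightening, a lightly edited copy of a logged output should still match, while an unrelated image should not. The `max_distance` threshold is an assumption that would need tuning against real data.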
Not all these strategies are applicable to every system. Some may have their risks, and none are perfect—or sufficient on their own. They are all part of a “defense in depth,” where more is more.