• After the fooling of automatic image recognition based on #deep_learning, the same thing for sound…
    (seen via Jean-Paul Delahaye's column in Pour la Science, no. 488, June 2018, "Intelligences artificielles : un apprentissage pas si profond", which deals with images (already covered here) but also with sound)

    [1801.01944] Audio #Adversarial_Examples: Targeted Attacks on Speech-to-Text

    Nicholas Carlini, David Wagner

    We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla’s implementation DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduce a new domain to study adversarial examples.

    the (technical) PDF online, and its presentation on May 24 at the IEEE Symposium on Security and Privacy
    (the audio examples are at around 9:00,…)

    or how to make Mozilla's DeepSpeech interpret the first phrase below as the second:

    most of them were staring quietly at the big table


    ok google, browse to evil.com

    or even transcribe pure music as (bogus!) speech…
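
    To give a feel for the "white-box iterative optimization-based attack" mentioned in the Carlini and Wagner abstract above, here is a minimal toy sketch in Python. It is not the authors' method and does not touch DeepSpeech: the "recognizer" is an assumed three-class linear model over a 100-sample waveform, and every name in it (`tone`, `targeted_attack`, `eps`, `lr`) is invented for the illustration. All it shows is the loop itself: read the model's gradient, nudge the waveform toward the chosen target, and keep the change within a bound.

```python
import math

N = 100  # samples in the toy "waveform"

def tone(freq):
    """One sine template with `freq` cycles over the N samples."""
    return [math.sin(2 * math.pi * freq * i / N) for i in range(N)]

# Hypothetical white-box model: one fully known weight vector per class.
W = [tone(3), tone(5), tone(7)]

def logits(x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def predict(x):
    z = logits(x)
    return max(range(len(z)), key=lambda k: z[k])

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def targeted_attack(x, target, eps=0.15, lr=0.01, steps=500):
    """Search for delta with |delta_i| <= eps so that predict(x + delta) == target."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        adv = [a + d for a, d in zip(x, delta)]
        if predict(adv) == target:
            break
        p = softmax(logits(adv))
        # Gradient of the cross-entropy loss for `target` with respect to the input.
        err = [p[k] - (1.0 if k == target else 0.0) for k in range(len(W))]
        grad = [sum(err[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]
        # Signed gradient step, then clip so the change stays bounded by eps.
        delta = [max(-eps, min(eps, d - lr * math.copysign(1.0, g)))
                 for d, g in zip(delta, grad)]
    return [a + d for a, d in zip(x, delta)]

x = [0.1 * v for v in tone(3)]       # clean input, classified as class 0
adv = targeted_attack(x, target=2)   # pushed toward class 2
max_change = max(abs(a - b) for a, b in zip(adv, x))
print(predict(x), predict(adv), round(max_change, 3))
```

    In this toy the perturbation is not especially small compared with the signal; the real attack works against a far larger model and a transcription loss, which this sketch makes no attempt to reproduce.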

    And, on the same theme

    [1801.00554] Did you hear that? Adversarial Examples Against Automatic Speech Recognition

    Moustafa Alzantot, Bharathan Balaji, Mani Srivastava

    Speech is a common and effective way of communication between humans, and modern consumer devices such as smartphones and home hubs are equipped with deep learning based accurate automatic speech recognition to enable natural interaction between humans and machines. Recently, researchers have demonstrated powerful attacks against machine learning models that can fool them to produce incorrect results. However, nearly all previous research in adversarial attacks has focused on image recognition and object detection models. In this short paper, we present a first of its kind demonstration of adversarial attacks against speech classification model. Our algorithm performs targeted attacks with 87% success by adding small background noise without having to know the underlying model parameter and architecture. Our attack only changes the least significant bits of a subset of audio clip samples, and the noise does not change 89% the human listener’s perception of the audio clip as evaluated in our human study.

    with a table of sounds doctored to make them say whatever you want (or nearly)
    (the deceptive messages are very noisy, unlike the previous examples)

    Adversarial Speech Commands
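
    The second abstract says the attack "only changes the least significant bits of a subset of audio clip samples". Below is a minimal sketch of what such a perturbation looks like on 16-bit PCM samples. It assumes nothing about how the authors pick the samples and bits to touch: here the subset and the new bit values are simply drawn at random, and `perturb_lsb`, `fraction` and `n_bits` are hypothetical names. On its own this only adds faint noise; the actual attack has to search for the particular noise that changes the classifier's answer, and that search is not shown.

```python
import math
import random

def perturb_lsb(samples, fraction=0.25, n_bits=3, seed=0):
    """Overwrite the n_bits lowest bits of a random subset of integer samples."""
    rng = random.Random(seed)
    mask = (1 << n_bits) - 1
    out = list(samples)
    count = int(len(out) * fraction)
    for i in rng.sample(range(len(out)), count):
        # Clear the low bits, then fill them with random ones.
        out[i] = (out[i] & ~mask) | rng.getrandbits(n_bits)
    return out

# A stand-in clip: 440 Hz tone, 16 kHz sample rate, 16-bit signed range.
clip = [int(20000 * math.sin(2 * math.pi * 440 * t / 16000)) for t in range(1600)]
noisy = perturb_lsb(clip)

changed = sum(1 for a, b in zip(clip, noisy) if a != b)
largest = max(abs(a - b) for a, b in zip(clip, noisy))
print(changed, "samples changed, largest change:", largest, "out of 32767")
```

    With `n_bits=3` no sample can move by more than 7 steps out of 32767, which is why noise confined to the low bits can stay faint.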