technology:machine translation

  • Implementing a Sequence-to-Sequence Model
    https://hackernoon.com/implementing-a-sequence-to-sequence-model-45a6133958ca?source=rss----3a8

    Learn how to implement a sequence-to-sequence model in this article by Matthew Lamons, founder and CEO of Skejul (the AI platform to help people manage their activities) and Rahul Kumar, an AI scientist, deep learning practitioner, and independent researcher. In this article, you’ll implement a seq2seq model (an encoder-decoder RNN) for a simple sequence-to-sequence question-answer task. This model can be trained to map an input sequence (questions) to an output sequence (answers), where the two sequences need not be of the same length. This type of seq2seq model has shown impressive performance in other tasks such as speech recognition, Neural Machine Translation (NMT), question answering, and image caption generation. The following diagram helps you (...)
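
    To make the encoder-decoder idea concrete, here is a minimal Keras sketch of such a model. The vocabulary sizes, state dimension, and layer choices are illustrative assumptions, not the exact architecture from the article; the decoder is trained with teacher forcing (target tokens shifted right by one step).

      # Minimal encoder-decoder (seq2seq) sketch in Keras.
      # All sizes below are assumed for illustration.
      from tensorflow.keras.layers import Dense, Embedding, Input, LSTM
      from tensorflow.keras.models import Model

      num_encoder_tokens = 5000  # question vocabulary size (assumed)
      num_decoder_tokens = 5000  # answer vocabulary size (assumed)
      latent_dim = 256           # LSTM state size (assumed)

      # Encoder: read the question; keep only the final LSTM states.
      encoder_inputs = Input(shape=(None,))
      enc_emb = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
      _, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

      # Decoder: generate the answer, initialized with the encoder states,
      # so input and output sequences may differ in length.
      decoder_inputs = Input(shape=(None,))
      dec_emb = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
      dec_seq, _, _ = LSTM(latent_dim, return_sequences=True,
                           return_state=True)(dec_emb,
                                              initial_state=[state_h, state_c])
      decoder_outputs = Dense(num_decoder_tokens,
                              activation="softmax")(dec_seq)

      model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
      model.compile(optimizer="rmsprop",
                    loss="sparse_categorical_crossentropy")
      model.summary()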

    #keras #python #deep-learning #tensorflow #machine-learning

  • 18 open source translation tools | Opensource.com
    https://opensource.com/article/17/6/open-source-localization-tools

    Localization plays a central role in the ability to customize an open source project to suit the needs of users around the world. Besides coding, language translation is one of the main ways people around the world contribute to and engage with open source projects.

    There are tools specific to the language services industry (surprised to hear that’s a thing?) that enable a smooth localization process with a high level of quality. Categories that localization tools fall into include:

    Computer-assisted translation (CAT) tools
    Machine translation (MT) engines
    Translation management systems (TMS)
    Terminology management tools
    Localization automation tools

  • The Shallowness of Google Translate - The Atlantic
    https://www.theatlantic.com/technology/archive/2018/01/the-shallowness-of-google-translate/551570

    An excellent paper by Douglas Hofstadter (ah, D.H., Gödel, Escher and Bach... !!!)

    As a language lover and an impassioned translator, as a cognitive scientist and a lifelong admirer of the human mind’s subtlety, I have followed the attempts to mechanize translation for decades. When I first got interested in the subject, in the mid-1970s, I ran across a letter written in 1947 by the mathematician Warren Weaver, an early machine-translation advocate, to Norbert Wiener, a key figure in cybernetics, in which Weaver made this curious claim, today quite famous:

    When I look at an article in Russian, I say, “This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.”

    Some years later he offered a different viewpoint: “No reasonable person thinks that a machine translation can ever achieve elegance and style. Pushkin need not shudder.” Whew! Having devoted one unforgettably intense year of my life to translating Alexander Pushkin’s sparkling novel in verse Eugene Onegin into my native tongue (that is, having radically reworked that great Russian work into an English-language novel in verse), I find this remark of Weaver’s far more congenial than his earlier remark, which reveals a strangely simplistic view of language. Nonetheless, his 1947 view of translation-as-decoding became a credo that has long driven the field of machine translation.

    Before showing my findings, though, I should point out that an ambiguity in the adjective “deep” is being exploited here. When one hears that Google bought a company called DeepMind whose products have “deep neural networks” enhanced by “deep learning,” one cannot help taking the word “deep” to mean “profound,” and thus “powerful,” “insightful,” “wise.” And yet, the meaning of “deep” in this context comes simply from the fact that these neural networks have more layers (12, say) than do older networks, which might have only two or three. But does that sort of depth imply that whatever such a network does must be profound? Hardly. This is verbal spinmeistery.

    I began my explorations very humbly, using the following short remark, which, in a human mind, evokes a clear scenario:

    In their house, everything comes in pairs. There’s his car and her car, his towels and her towels, and his library and hers.

    The translation challenge seems straightforward, but in French (and other Romance languages), the words for “his” and “her” don’t agree in gender with the possessor, but with the item possessed. So here’s what Google Translate gave me:

    Dans leur maison, tout vient en paires. Il y a sa voiture et sa voiture, ses serviettes et ses serviettes, sa bibliothèque et les siennes.

    We humans know all sorts of things about couples, houses, personal possessions, pride, rivalry, jealousy, privacy, and many other intangibles that lead to such quirks as a married couple having towels embroidered “his” and “hers.” Google Translate isn’t familiar with such situations. Google Translate isn’t familiar with situations, period. It’s familiar solely with strings composed of words composed of letters. It’s all about ultrarapid processing of pieces of text, not about thinking or imagining or remembering or understanding. It doesn’t even know that words stand for things. Let me hasten to say that a computer program certainly could, in principle, know what language is for, and could have ideas and memories and experiences, and could put them to use, but that’s not what Google Translate was designed to do. Such an ambition wasn’t even on its designers’ radar screens.

    It’s hard for a human, with a lifetime of experience and understanding and of using words in a meaningful way, to realize how devoid of content all the words thrown onto the screen by Google Translate are. It’s almost irresistible for people to presume that a piece of software that deals so fluently with words must surely know what they mean. This classic illusion associated with artificial-intelligence programs is called the “Eliza effect,” since one of the first programs to pull the wool over people’s eyes with its seeming understanding of English, back in the 1960s, was a vacuous phrase manipulator called Eliza, which pretended to be a psychotherapist, and as such, it gave many people who interacted with it the eerie sensation that it deeply understood their innermost feelings.

    To me, the word “translation” exudes a mysterious and evocative aura. It denotes a profoundly human art form that graciously carries clear ideas in Language A into clear ideas in Language B, and the bridging act not only should maintain clarity, but also should give a sense for the flavor, quirks, and idiosyncrasies of the writing style of the original author. Whenever I translate, I first read the original text carefully and internalize the ideas as clearly as I can, letting them slosh back and forth in my mind. It’s not that the words of the original are sloshing back and forth; it’s the ideas that are triggering all sorts of related ideas, creating a rich halo of related scenarios in my mind. Needless to say, most of this halo is unconscious. Only when the halo has been evoked sufficiently in my mind do I start to try to express it—to “press it out”—in the second language. I try to say in Language B what strikes me as a natural B-ish way to talk about the kinds of situations that constitute the halo of meaning in question.

    This process, mediated via meaning, may sound sluggish, and indeed, in comparison with Google Translate’s two or three seconds per page, it certainly is—but it is what any serious human translator does. This is the kind of thing I imagine when I hear an evocative phrase like “deep mind.”

    A friend asked me whether Google Translate’s level of skill isn’t merely a function of the program’s database. He figured that if you multiplied the database by a factor of, say, a million or a billion, eventually it would be able to translate anything thrown at it, and essentially perfectly. I don’t think so. Having ever more “big data” won’t bring you any closer to understanding, since understanding involves having ideas, and lack of ideas is the root of all the problems for machine translation today. So I would venture that bigger databases—even vastly bigger ones—won’t turn the trick.

    Another natural question is whether Google Translate’s use of neural networks—a gesture toward imitating brains—is bringing us closer to genuine understanding of language by machines. This sounds plausible at first, but there’s still no attempt being made to go beyond the surface level of words and phrases. All sorts of statistical facts about the huge databases are embodied in the neural nets, but these statistics merely relate words to other words, not to ideas. There’s no attempt to create internal structures that could be thought of as ideas, images, memories, or experiences. Such mental etherea are still far too elusive to deal with computationally, and so, as a substitute, fast and sophisticated statistical word-clustering algorithms are used. But the results of such techniques are no match for actually having ideas involved as one reads, understands, creates, modifies, and judges a piece of writing.

    Let me return to that sad image of human translators, soon outdone and outmoded, gradually turning into nothing but quality controllers and text tweakers. That’s a recipe for mediocrity at best. A serious artist doesn’t start with a kitschy piece of error-ridden bilgewater and then patch it up here and there to produce a work of high art. That’s not the nature of art. And translation is an art.

    In my writings over the years, I’ve always maintained that the human brain is a machine—a very complicated kind of machine—and I’ve vigorously opposed those who say that machines are intrinsically incapable of dealing with meaning. There is even a school of philosophers who claim computers could never “have semantics” because they’re made of “the wrong stuff” (silicon). To me, that’s facile nonsense. I won’t touch that debate here, but I wouldn’t want to leave readers with the impression that I believe intelligence and understanding to be forever inaccessible to computers. If in this essay I seem to come across sounding that way, it’s because the technology I’ve been discussing makes no attempt to reproduce human intelligence. Quite the contrary: It attempts to make an end run around human intelligence, and the output passages exhibited above clearly reveal its giant lacunas.

    From my point of view, there is no fundamental reason that machines could not, in principle, someday think, be creative, funny, nostalgic, excited, frightened, ecstatic, resigned, hopeful, and, as a corollary, able to translate admirably between languages. There’s no fundamental reason that machines might not someday succeed smashingly in translating jokes, puns, screenplays, novels, poems, and, of course, essays like this one. But all that will come about only when machines are as filled with ideas, emotions, and experiences as human beings are. And that’s not around the corner. Indeed, I believe it is still extremely far away. At least that is what this lifelong admirer of the human mind’s profundity fervently hopes.

    When, one day, a translation engine crafts an artistic novel in verse in English, using precise rhyming iambic tetrameter rich in wit, pathos, and sonic verve, then I’ll know it’s time for me to tip my hat and bow out.

    #Traduction #Google_translate #Deep_learning

  • The AI Takeover Is Coming. Let’s Embrace It.
    https://backchannel.com/the-ai-takeover-is-coming-lets-embrace-it-d764d61f83a

    [T]he White House released a chilling report on AI and the economy. It began by positing that “it is to be expected that machines will continue to reach and exceed human performance on more and more tasks,” and it warned of massive job losses.

    Yet to counter this threat, the government makes a recommendation that may sound absurd: we have to increase investment in AI. The risk to productivity and the US’s competitive advantage is too high to do anything but double down on it.

    [...]

    In September, Google announced an enormous upgrade in the performance of Google Translate, using a system it’s calling Google Neural Machine Translation (GNMT). Google’s Pereira called the jump in translation quality “something I never thought I’d see in my working life.”

    “We’d been making steady progress,” he added. “This is not steady progress. This is radical.”

    #Apprentissage_profond #Google_Neural_Machine_Translation #Intelligence_artificielle #Numérique #OpenAI #Économie

    • The article’s angle is exasperating:

      Our medical system is deeply flawed; intelligent agents could spread affordable, high-quality healthcare to more people in more places. Our education infrastructure is not adequately preparing students for the looming economic upheaval; here, too, AI systems could chip in where teachers are spread too thin.

      Oh sure, let’s fix the world with heavy doses of technology; that’s bound to work.

      To ignore this trend — to not plunge headlong into understanding it, shaping it, monitoring it — might well be the biggest mistake a country could make.

      Let’s make American AI great (again)

      Sorry for the grumpiness ^^

  • SYSTRAN announces the launch of its “Purely Neural MT” engine, a revolution for the machine translation market
    https://globenewswire.com/news-release/2016/08/30/868116/10164884/en/SYSTRAN-announces-the-launch-of-its-Purely-Neural-MT-engine-a-revolut

    Unlike statistical (SMT) or rule-based (RMT) engines, NMT engines process the entire sentence, paragraph, or document. The entire chain is processed end-to-end, with no intermediate stages between the source sentence and the target. The NMT engine models the whole process of machine translation through a single artificial neural network.
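
    To see what “end-to-end with no intermediate stages” means in practice, here is a minimal greedy-decoding sketch, assuming a trained encoder-decoder model like the Keras example earlier in this digest; the special token IDs and length limit are placeholders.

      # Greedy decoding with a trained encoder-decoder model: the source
      # sequence maps directly to target tokens, with no separate
      # alignment, phrase-table, or reordering stages in between.
      # START_ID, END_ID, and MAX_LEN are assumed placeholders.
      import numpy as np

      START_ID, END_ID, MAX_LEN = 1, 2, 50

      def translate_greedy(model, source_ids):
          target_ids = [START_ID]
          for _ in range(MAX_LEN):
              # The model returns next-token probabilities for every
              # decoder position; only the last position matters here.
              probs = model.predict(
                  [np.array([source_ids]), np.array([target_ids])],
                  verbose=0)
              next_id = int(probs[0, -1].argmax())
              if next_id == END_ID:
                  break
              target_ids.append(next_id)
          return target_ids[1:]  # drop the start-of-sequence marker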

    #machine_learning #traduction @lewer