Speech Based Machine Aided Human Translation for a Document Translation Task

Posted on: 2013-04-07
Degree: Ph.D.
Type: Thesis
University: McGill University (Canada)
Candidate: Reddy, Aarthi
Full Text: PDF
GTID: 2455390008469527
Subject: Engineering
Abstract/Summary:
Translating documents into multiple languages represents an extremely large expense for businesses, governments, and international agencies. In Canada, for example, all official documents are required to exist in both official languages, French and English. This has produced a large translation industry employing a large number of skilled professional translators.

The main contributions of this thesis are as follows. First, we describe novel algorithms that provide efficient and accurate transcriptions of dictations provided by the human translator. We show that by integrating information extracted from the source language document with the statistical models used in the automatic speech recognition system, a more accurate transcription of the dictations can be obtained. Second, we use key information from the source language document, such as named-entity-tagged words, together with acoustic, language, and phonetic information to ensure that this information is also present in the translated document. Third, we describe a system that is specific to document translation. The document translation task domain addressed here can be distinguished from the tasks addressed in most previous machine aided human translation (MAHT) research, which has focused on translating isolated sentences or phrases. Fourth, we created a new corpus specifically for use in this thesis. This corpus was collected at McGill from professional translators dictating their translations and has been essential for characterizing the issues associated with the dictation-based MAHT task domain.

It is well known that the quality standards imposed on translations of business and government documents are far too high for existing automatic machine translation technology to be applied to the document translation task. A large number of tools for increasing the efficiency of human translators at various stages of their workflow have become commercially available to translation bureaus. These human translators may directly enter translated text, dictate their translations so they may be automatically transcribed, or post-edit first-draft translations produced by an automatic machine translation system. The work in this thesis is concerned with an MAHT scenario where a human translator dictates translations of a source language document. Automatic techniques are developed for improving the quality of the transcriptions obtained from these dictated translations by simultaneously incorporating knowledge from the source language text and the target language speech.
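The idea of combining source-text knowledge with speech recognition output can be illustrated with a toy N-best rescoring sketch. This is not the algorithm developed in the thesis, which the abstract does not specify; it only assumes that the recognizer emits scored hypotheses and that a bilingual lexicon is available. The names `translation_bag`, `rescore`, `lexicon`, `weight`, and `floor`, and all scores shown, are hypothetical.

```python
import math


def translation_bag(source_words, lexicon):
    """Collect the target-language words the lexicon proposes for the source words."""
    bag = set()
    for word in source_words:
        bag.update(lexicon.get(word.lower(), []))
    return bag


def rescore(nbest, bag, weight=1.0, floor=0.1):
    """Return hypothesis texts ordered by ASR log score plus a source-text term.

    A word found in the bag contributes log(1.0) = 0; any other word
    contributes log(floor), which is negative for floor < 1.
    """
    rescored = []
    for text, asr_logprob in nbest:
        words = text.lower().split()
        penalty = sum(0.0 if w in bag else math.log(floor) for w in words)
        rescored.append((asr_logprob + weight * penalty, text))
    rescored.sort(reverse=True)
    return [text for _, text in rescored]


# Hypothetical example: French source sentence, English dictation.
lexicon = {
    "le": ["the"],
    "ministre": ["minister"],
    "a": ["has"],
    "annoncé": ["announced"],
    "budget": ["budget"],
}
source_words = "Le ministre a annoncé le budget".split()
bag = translation_bag(source_words, lexicon)

# (hypothesis text, made-up ASR log probability); the misrecognition scores higher.
nbest = [
    ("the mini stir announced the budget", -11.5),
    ("the minister announced the budget", -12.0),
]
best = rescore(nbest, bag)[0]
print(best)
```

With `weight=0.0` the ranking falls back to the recognizer's own scores, so the weight controls how strongly the source document is trusted over the acoustic evidence.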
Keywords/Search Tags: Translation, Document, Language, Speech, Human, Machine, Large