1st Edition
Machine Translation and Transliteration involving Related, Low-resource Languages
Preface. Introduction. Need for Machine Translation and Transliteration. Need for Machine Translation involving Related Languages. Language Relatedness: Origins and Key Properties. Do we need SMT approaches customized for Related Languages? Translation, Transliteration and Related Languages: The Connection. What does the monograph contain? Past Work on MT for Related Languages. Translation between Related Languages. Translation involving Related Languages and a Lingua franca. Neural Machine Translation and Related Languages. Rule-based MT Systems involving Related Languages. Summary. I Machine Translation. Utilizing Lexical Similarity by using Subword Translation Units. Motivation. Related Work. Translation Units for Related Languages. Training Subword-level Translation Models. Experimental Setup. Results and Discussion. Why are Subword Units better than other Translation Units? Summary and Future Work. Improving Subword-level Translation Quality. Effect of Resource Availability. Investigation of Design Choices and Hyperparameters. Improving Decoding Speed. Summary. Subword-level Pivot-based SMT. Motivation. Pivot-based SMT for Related Languages. Related Work. Experimental Setup. Results and Analysis. Using Multiple, Related Pivot Languages. Choice of Pivot Language and Language Relatedness. Summary and Future Directions. A Case Study on Indic Language Translation. A Primer on Indian Languages. Relatedness among Indian Languages. Dataset used for Study. Lexical Similarity between Indian Languages. Translation between Indian Languages. Translation from English to Indian Languages. II Machine Transliteration. Utilizing Orthographic Similarity for Unsupervised Transliteration. Motivation. Related Work. Unsupervised Substring-based Transliteration. Character-based Unsupervised Transliteration. Bootstrapping Substring-based models. Experimental Setup. Results and Discussion. Summary and Future Work. Multilingual Neural Transliteration. Motivation. Related Work.
Multilingual Transliteration Learning. Experimental Setup. Results and Discussion. Zero-shot Transliteration. Incorporating Phonetic Information. Summary and Future Work. Conclusion and Future Directions. Conclusions. Future Work and Directions. Appendices. A Extended ITRANS Romanization Scheme. B Software and Data Resources. C Conferences/Workshops for Translation between Related Languages. Bibliography.
Biography
Dr. Anoop Kunchukuttan is a Senior Applied Researcher in the machine translation team at Microsoft India, Hyderabad. He received his Ph.D. from the Indian Institute of Technology Bombay. He is broadly interested in natural language processing and machine learning. His research interests include multilingual learning, language relatedness, machine translation, machine transliteration and distributional semantics. He has also explored problems in information extraction, automated grammar correction, multiword expressions and crowdsourcing for NLP. His work has been published in top-tier Natural Language Processing (NLP) conferences and journals. He is passionate about building software and resources for NLP in Indian languages. He actively develops and maintains the Indic NLP Library and the Indic NLP Catalog, and has contributed to the development of resources like the AI4Bharat Indic NLP Suite and the IIT Bombay parallel corpus. He is a co-organizer of the Workshop on Asian Translation and a co-founder of the AI4Bharat NLP Initiative.
Dr. Pushpak Bhattacharyya is a Professor in the Computer Science and Engineering Department at IIT Bombay. His research areas are Natural Language Processing, Machine Learning and AI (NLP-ML-AI). Prof. Bhattacharyya has published more than 350 research papers in various areas of NLP. His textbook 'Machine Translation' sheds light on all paradigms of machine translation, with abundant examples from Indian languages. Two recent monographs co-authored by him, 'Investigations in Computational Sarcasm' and 'Cognitively Inspired Natural Language Processing: An Investigation Based on Eye Tracking', describe cutting-edge research in NLP and ML. Prof. Bhattacharyya is a Fellow of the Indian National Academy of Engineering (FNAE) and an Abdul Kalam National Fellow. For sustained contribution to technology, he received the Manthan Award of the Ministry of IT, the P.K. Patwardhan Award of IIT Bombay and the VNMM Award of IIT Roorkee. He is also a Distinguished Alumnus of IIT Kharagpur and a past President of the Association for Computational Linguistics.






