08-12, 11:15–12:45 (Asia/Yerevan), 214W PAB
Sequential models are widely used in natural language understanding, for everything from machine translation to speech recognition. In machine translation, the encoder-decoder architecture, and especially the Transformer, is one of the most prominent approaches. Attention has been one of the most influential ideas in deep learning: it lets a model take a long sequence of data (for example, the words of a long sentence to be translated), process it in small parts while looking at all the other parts simultaneously, and generate the output at the end.
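As a preview of the idea, here is a minimal sketch of scaled dot-product attention (the variant popularized by the Transformer) in plain NumPy; the function name and array shapes are illustrative assumptions, not the tutorial's actual code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Illustrative scaled dot-product attention.

    Q: (seq_len_q, d_k) queries
    K: (seq_len_k, d_k) keys
    V: (seq_len_k, d_v) values
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled for numerical stability.
    scores = Q @ K.T / np.sqrt(d_k)           # (seq_len_q, seq_len_k)
    # Softmax over the key axis turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of all value vectors,
    # so the model "looks at" the whole sequence at once.
    return weights @ V                        # (seq_len_q, d_v)

# Toy usage: 5 source positions, 3 target positions, width 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(5, 8))
V = rng.normal(size=(5, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```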
Sequence-to-sequence models can be augmented with an attention mechanism, which helps a model decide where to focus given a sequence of inputs. This tutorial introduces sequential and attention models by implementing a neural machine translation (NMT) model from scratch. We will cover several sequence-to-sequence architectures, from basic models to attention-based ones, along with the intuitions behind the attention mechanism. We will then implement the model step by step together and see how this kind of model translates (or transforms) sequences of data such as text and speech.
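For a flavor of the seq2seq setting specifically, the sketch below computes one decoder step's context vector from a set of encoder hidden states using simple dot-product alignment; the names and dimensions are made up for illustration and only hint at what the full implementation covers.

```python
import numpy as np

def decoder_attention_step(decoder_state, encoder_states):
    """One attention step in a seq2seq decoder (dot-product alignment).

    decoder_state:  (hidden,)          current decoder hidden state
    encoder_states: (src_len, hidden)  one vector per source word
    Returns the context vector and the alignment weights.
    """
    # Score each source position against the current decoder state.
    scores = encoder_states @ decoder_state            # (src_len,)
    # Normalize the scores into a distribution over source words.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: an attention-weighted summary of the source sentence.
    context = weights @ encoder_states                 # (hidden,)
    return context, weights

# Toy usage: a 6-word source sentence with hidden size 4.
rng = np.random.default_rng(1)
enc = rng.normal(size=(6, 4))
dec = rng.normal(size=(4,))
context, weights = decoder_attention_step(dec, enc)
print(weights.round(2), context.shape)  # weights sum to 1; context is (4,)
```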
Only basic knowledge of neural networks is needed; no prior knowledge of machine translation or speech recognition is required. The material for this talk will be available online and shared with the audience. During the implementation, you can access the code and get hands-on experience with it.
Highlighted reference:
- Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).
No previous knowledge expected
Hadi leads a software team as chief engineer in the R&D department of TELIGHT (Czechia/France) and is a lecturer at the Institute for Advanced Studies in Basic Sciences (IASBS), Iran. He is a former researcher at the Institute of Formal and Applied Linguistics (ÚFAL) at Charles University, Prague, and has participated in several international projects in collaboration with experts in CV, NLP, HLT, CL, ML, and DL. His research focuses on multimodal learning with neural models that are both linguistically motivated and tailored to language and vision, visual reasoning, and deep learning. His main research interests are machine learning, deep learning, computer vision, multimodal learning, and visual reasoning, and he has experience in a wide variety of international projects on cutting-edge technologies. Currently, his team is developing a new generation of a patented holographic microscope that uses label-free live-cell imaging to make otherwise invisible live cells visible.