From Recurrent Models to the advent of Attention

Name: From Recurrent Models to the advent of Attention
Start: 2023-02-15T18:30:00+01:00
Location: eDreams ODIGEO Tech Hub

Abstract

All recent language technologies, such as BERT, GPT-3, or ChatGPT, rely on complex neural networks called Transformers. But what brought us here? What problems did Transformers fix? In this talk, we will introduce Recurrent Neural Networks (RNNs) and discuss their main applications to language modeling, such as machine translation, image captioning, and multimodal text generation. Next, we will go through the weaknesses of RNNs and explain what motivated the Attention mechanism, the core innovation behind Transformers. We will conclude with a brief introduction to the Transformer architecture.

Date

Feb 15, 2023 6:30 PM

Event

Papers We Love Milano Chapter

Location

eDreams ODIGEO Tech Hub

Milano

Giuseppe Attanasio

Postdoctoral Researcher