9 Week 9: Transformers

Slides

  • 11 Transformers (link or in Perusall)

Setup

For this week, our code is solely in Python and we will be running it through Google Colab. Click here to access the Jupyter Notebook for this week.

Google Colab

Google Colab is “a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs” (as per the Google Colab page). This is convenient for a number of reasons. The most important one is that it gives you GPU access for free (although there is no such thing as free in late-stage capitalism). GPUs (graphics processing units) were designed to accelerate computer graphics and, more importantly for us, are very good at parallel computing. This allows for more efficient computation and faster processing times when training machine-learning models. Transformer-based models can use a lot of GPU memory, especially if we want to further pre-train models, or if we have particularly difficult tasks that require larger batches.
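Since GPU memory matters so much here, it can be useful to check what hardware your Colab session actually has before you start training. The snippet below is a minimal sketch, not part of this week's notebook: `gpu_summary` is a hypothetical helper name, and it assumes that either PyTorch is installed or that the `nvidia-smi` command-line tool is available on the machine (both are typically the case on a Colab GPU runtime, but this is an assumption).

```python
import importlib.util
import shutil
import subprocess


def gpu_summary() -> str:
    """Return a short description of the first GPU found, or a notice if there is none."""
    # Option 1: ask PyTorch, if it happens to be installed in this environment.
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch

            if torch.cuda.is_available():
                props = torch.cuda.get_device_properties(0)
                mem_gib = props.total_memory / 1024**3
                return f"{props.name} with {mem_gib:.1f} GiB of memory"
        except ImportError:
            pass  # PyTorch is present but unusable; try the fallback below.

    # Option 2: fall back to NVIDIA's command-line tool, if it is on the PATH.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip().splitlines()[0]

    return "No GPU detected"


print(gpu_summary())
```

If this prints "No GPU detected" in Colab, the session is most likely running on a CPU-only runtime. You can usually switch to a GPU from the Runtime menu ("Change runtime type") and then re-run the cell.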