Solvro Talks - Quick Start with ML Projects Using PyTorch Lightning


What is PyTorch Lightning and how does it differ from regular PyTorch?

PyTorch Lightning is, in essence, a Python framework for deep learning. Its name rightly brings to mind another framework, PyTorch, and this is no coincidence: Lightning is built on top of PyTorch and focuses mainly on organizing code written in that framework. This makes it well suited to more complex model training processes, and it is why Lightning is often described as a wrapper for PyTorch.

Starting a New Project - Where to begin?

Starting a new project with Lightning is not much different from starting any other Python project. The most important thing is to create a separate virtual environment before installing anything. When installing PyTorch via pip, it is also worth pointing pip at PyTorch's own package index so that the framework is installed directly from there, rather than from PyPI along with a slew of dependencies.
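
For example, on a Linux machine with CUDA 12.1 the setup might look like this (the CUDA version in the index URL is an assumption; pick the build matching your hardware on pytorch.org):

    python -m venv .venv
    source .venv/bin/activate
    # install PyTorch from its own index instead of PyPI
    pip install torch --index-url https://download.pytorch.org/whl/cu121
    pip install lightning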

How should you structure the project? That depends mainly on how you want to train and optimize your model. The template presented below is a fairly universal layout that can easily be extended as needs change.

[Image: example project directory template]

The entire template can be divided into three parts:

  • the data, logs, and notebooks directories: items found in virtually every ML project, but on which Lightning has little impact

  • the configs directory: this is where the project's configuration files live. The configuration can be kept in one large .yaml file or split across several smaller ones, which is more readable in larger projects

  • the src directory: the project's source files. It can be divided into a few basic modules: one for the model training logic (modules), one for data management (datamodules), one for the models themselves (models), plus additional modules with custom metrics or helper functions (e.g., in natural language processing, helpers for tokenization)

The last part is the most important because it best showcases the advantages of Lightning.
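
As a sketch, one possible layout following this template (the exact names beyond the modules listed above are illustrative) could be:

    project/
    ├── configs/           # .yaml configuration files
    ├── data/              # raw and processed data sets
    ├── logs/              # training logs and checkpoints
    ├── notebooks/         # exploratory notebooks
    └── src/
        ├── datamodules/   # LightningDataModule classes
        ├── models/        # model architectures (nn.Module)
        ├── modules/       # LightningModule training logic
        └── utils/         # custom metrics and helper functions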

 

Organization of the Training Process

[Image: side-by-side comparison of a plain PyTorch training loop (left) and the equivalent Lightning code (right)]

The illustration above is based on an animation in the Lightning documentation, which you can see here:

https://lightning.ai/docs/pytorch/stable/starter/introduction.html.

On the left is plain PyTorch code, on the right the Lightning equivalent; blocks responsible for the same functionality are marked with matching colors. When the whole training process is written as one long script, any change during experiments, e.g., to the model architecture or to the data being trained on, requires modifying and re-verifying the entire script. The same chores also get repeated in every project, e.g., making sure a given tensor really has been moved to the appropriate device by calling .to(device). One of Lightning's goals was to structure this code so that it is more flexible and adapts to more complex training processes. For this purpose, the notion of a system was introduced.

It is worth mentioning that LightningModule is not the same as nn.Module from classic PyTorch. An nn.Module defines a single model, whereas a LightningModule can group several models and, more broadly, defines the environment in which those models live and the relationships between them.
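
A minimal sketch of such a system, assuming a toy classifier for 28x28 images (the architecture and hyperparameters here are purely illustrative):

    import torch
    from torch import nn
    import lightning as L

    class LitClassifier(L.LightningModule):
        def __init__(self, lr: float = 1e-3):
            super().__init__()
            self.save_hyperparameters()  # stores lr under self.hparams
            self.model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

        def training_step(self, batch, batch_idx):
            # Lightning has already moved the batch to the right device,
            # so no manual .to(device) calls are needed here
            x, y = batch
            loss = nn.functional.cross_entropy(self.model(x), y)
            self.log("train_loss", loss)
            return loss

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)

A Trainer then runs the loop itself, e.g. L.Trainer(max_epochs=5).fit(LitClassifier(), datamodule=...), with no manual iteration over epochs and batches.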

Managing Data Sets


By default, PyTorch provides two classes for data management: Dataset and DataLoader. A Dataset stores samples together with their labels, while a DataLoader makes it possible to iterate over the data in a Dataset. Before the data reach the DataLoaders, they need to be split into training, validation, and test sets, and the appropriate transformations have to be applied. Lightning streamlines this whole process with the LightningDataModule class.

Its basic methods include setup, in which the data can be transformed and split into the appropriate sets. Also important are train_dataloader, val_dataloader, and test_dataloader, which, as their names suggest, return the respective dataloaders.
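
A minimal sketch of such a data module, assuming torchvision is installed and using MNIST as a stand-in data set:

    import lightning as L
    from torch.utils.data import DataLoader, random_split
    from torchvision import datasets, transforms

    class MNISTDataModule(L.LightningDataModule):
        def __init__(self, data_dir: str = "data/", batch_size: int = 64):
            super().__init__()
            self.data_dir = data_dir
            self.batch_size = batch_size
            self.transform = transforms.ToTensor()

        def prepare_data(self):
            # runs once, on a single process: download only
            datasets.MNIST(self.data_dir, train=True, download=True)
            datasets.MNIST(self.data_dir, train=False, download=True)

        def setup(self, stage=None):
            # apply transforms and split into the appropriate sets
            full = datasets.MNIST(self.data_dir, train=True, transform=self.transform)
            self.train_set, self.val_set = random_split(full, [55_000, 5_000])
            self.test_set = datasets.MNIST(self.data_dir, train=False, transform=self.transform)

        def train_dataloader(self):
            return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True)

        def val_dataloader(self):
            return DataLoader(self.val_set, batch_size=self.batch_size)

        def test_dataloader(self):
            return DataLoader(self.test_set, batch_size=self.batch_size)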

[Image: diagram of how Dataset, DataLoader, and LightningDataModule fit together]

Source: https://www.assemblyai.com/blog/pytorch-lightning-for-dummies/

Project Configuration via CLI

Changing training parameters (here: the model architecture, the data set, or hyperparameters) from the console is a common way to avoid hard-coding them directly into scripts. Python's built-in argparse module, which handles command-line arguments, can be used for this.
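
A bare-bones argparse version might look like this (the flag names are illustrative):

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--batch-size", type=int, default=64)
    args = parser.parse_args()  # e.g. python train.py --lr 0.01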

Here, too, Lightning has a class of its own: LightningCLI. It makes it possible to invoke the script with different configuration files (stored in the configs directory mentioned earlier) and comes with default subcommands for training, validating, or testing the model.

[Image: console invocation of the training script with a config file and an overridden parameter]

Above is an example of invoking a script with a configuration file while overriding one training parameter. The main advantage of LightningCLI is that users do not have to implement their own CLI and can immediately focus on the training parameters themselves.
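
A sketch of how this can be wired up, reusing the LitClassifier and MNISTDataModule classes sketched earlier (the file name, import paths, and config name are illustrative):

    # main.py
    from lightning.pytorch.cli import LightningCLI

    from src.modules import LitClassifier        # illustrative import path
    from src.datamodules import MNISTDataModule  # illustrative import path

    def main():
        # builds a full CLI with fit / validate / test / predict subcommands
        LightningCLI(LitClassifier, MNISTDataModule)

    if __name__ == "__main__":
        main()

An invocation like the one described above could then be:

    python main.py fit --config configs/default.yaml --model.lr 0.01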

Tracking Metrics with Custom or External Loggers

When training and validating a model, tracking metrics (e.g., accuracy) and changes in the loss value is important for judging its quality. Lightning records these things through its Logger classes. Depending on the user's needs, one can decide dynamically which logger to use during training. Logs can be written to a .csv file, printed to the console, or passed to external experiment-tracking tools, most often TensorBoard or Wandb (Weights & Biases).

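A minimal sketch of code using Wandb, again reusing the classes sketched earlier (the project name is illustrative; wandb must be installed and you must be logged in via wandb login):

    import lightning as L
    from lightning.pytorch.loggers import WandbLogger

    # everything logged via self.log(...) in the LightningModule
    # ends up in the Wandb project below
    wandb_logger = WandbLogger(project="solvro-demo")

    trainer = L.Trainer(max_epochs=5, logger=wandb_logger)
    trainer.fit(LitClassifier(), datamodule=MNISTDataModule())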

From the application's perspective, the results collected by the logger may look like this:

[Image: Wandb dashboard showing charts of the logged metrics]

Source: https://docs.wandb.ai/quickstart

In summary, PyTorch Lightning was created mainly with structuring and organizing code in mind, which translates into greater flexibility when developing ML projects. I hope you enjoyed the presentation; if you have any questions or comments, feel free to raise them. Any feedback is welcome.


Katarzyna Matuszek

AI Engineer