2023-Q3-AI 7. Language models - LSTM

 

 

7.1. Video / Materials

Video (21 Jul 2023, 10:00): https://youtube.com/live/5WG6mJtQKLo?feature=share

Jamboard: https://jamboard.google.com/d/1525UGA3rOthYMUOr2hgAvtS7QHw3u5utk9B6NjPwtvs/edit?usp=sharing

Materials:
https://calvinfeng.gitbook.io/machine-learning-notebook/supervised-learning/recurrent-neural-network/recurrent_neural_networks/

https://lanwuwei.github.io/courses/SP19/3521_slides/11-Recurrent_Neural_Networks_2.pdf

https://danijar.com/tips-for-training-recurrent-neural-networks/

https://medium.com/datadriveninvestor/attention-in-rnns-321fbcd64f05

https://arxiv.org/abs/1610.09513

https://wiki.pathmind.com/word2vec

 


Access to the Jamboard has been granted, and screen streaming must be done with OBS using the following settings:

Youtube video key: h5bk-pucs-ft1w-e902-fcf9

Each task is worth 100 points.


 

Contents:

  1. RNN (shared weights)

  2. Language modeling task

  3. Embedding dict

  4. [END] token

  5. ⚠️ [ANY]: it would be good to replace every word that occurs fewer than 3 times with an [ANY] token, so that more sentences can be kept in the dataset (see the sketch after this list)

  6. Training vs. inference: many-to-one (inference), many-to-many (training)

  7. Explain how sentences of different lengths are processed in a single batch (see the padding sketch under "Different lengths in same batch" below)
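
A minimal sketch of the [ANY] replacement from item 5, assuming plain Python preprocessing; the function names, the min_count threshold, and the token indices are illustrative, not taken from the course template:

```python
from collections import Counter

def build_vocab(sentences, min_count=3):
    # Count word frequencies across the whole corpus
    counts = Counter(word for sent in sentences for word in sent)
    # Words rarer than min_count all map to the shared [ANY] token,
    # so rare words do not force whole sentences out of the dataset
    vocab = {'[ANY]': 0, '[END]': 1}
    for word, count in counts.items():
        if count >= min_count:
            vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    # Unknown or rare words fall back to [ANY]; every sentence ends with [END]
    return [vocab.get(word, vocab['[ANY]']) for word in sentence] + [vocab['[END]']]
```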

Previous year's lecture

Video https://youtu.be/-nuoRn1ohzI

Jamboard: https://jamboard.google.com/d/1nEQLzDVjXrK7RfkxifA9jlyVBLTA-x3vQPIFOYuQ1UU/edit?usp=sharing


 

7.2. Implement Vanilla RNN

Following the instructions in the video from section 7.1, implement a vanilla RNN without using the built-in RNN modules from torch.nn. A minimal sketch of the recurrent cell is given after the template link below.

Submit the source code and screenshots of the results.

Template:

http://share.yellowrobot.xyz/quick/2023-4-3-3F68F1D8-DF36-4D8C-BB56-F0407D2A6512.zip
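
For orientation, a hedged sketch of the recurrent cell this task asks for, i.e. the update h_t = tanh(x_t W_x + h_{t-1} W_h + b) written as a custom module; the class name, shapes, and initialization are assumptions, so follow the template for the actual interface:

```python
import torch

class VanillaRNN(torch.nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.W_x = torch.nn.Parameter(torch.randn(input_size, hidden_size) * 0.01)
        self.W_h = torch.nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.01)
        self.b = torch.nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_seq, h=None):
        # x_seq: (batch, seq_len, input_size); the same weights are
        # shared across all time steps
        batch_size, seq_len, _ = x_seq.shape
        if h is None:
            h = torch.zeros(batch_size, self.b.shape[0], device=x_seq.device)
        outputs = []
        for t in range(seq_len):
            h = torch.tanh(x_seq[:, t] @ self.W_x + h @ self.W_h + self.b)
            outputs.append(h)
        return torch.stack(outputs, dim=1), h  # all hidden states, final state
```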


7.3. Implement GRU

Using the instructions from the video in section 7.1 and the template from task 7.2, implement a GRU model. Replace the RNN cell with your own implementation; you are not allowed to use built-in modules such as torch.nn.GRU. Submit the source code and screenshots of the results. A sketch of one GRU step is given after the equation link below.

GRU equation: http://share.yellowrobot.xyz/upic/8f34c76492d8b3a520255d023e962dc9_1680532330.jpg
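
A hedged sketch of one GRU step matching the standard formulation in the linked equations (update gate z, reset gate r, candidate state h̃); the stacked-parameter layout and names are my own choices:

```python
import torch

class GRUCell(torch.nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        stdv = 1.0 / hidden_size ** 0.5
        # One slice per gate: [0] update z, [1] reset r, [2] candidate
        self.W = torch.nn.Parameter(torch.empty(3, input_size, hidden_size).uniform_(-stdv, stdv))
        self.U = torch.nn.Parameter(torch.empty(3, hidden_size, hidden_size).uniform_(-stdv, stdv))
        self.b = torch.nn.Parameter(torch.zeros(3, hidden_size))

    def forward(self, x, h):
        # x: (batch, input_size), h: (batch, hidden_size)
        z = torch.sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])           # update gate
        r = torch.sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])           # reset gate
        h_tilde = torch.tanh(x @ self.W[2] + (r * h) @ self.U[2] + self.b[2])  # candidate
        return (1.0 - z) * h + z * h_tilde  # blend old state with candidate
```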


7.4. Homework - Implement LSTM

Using the template from task 7.2 and the instructions from tasks 7.2 and 7.3, modify the source code to implement the following:

  1. Create an LSTM. Replace the RNN cell with your own implementation; you are not allowed to use built-in modules such as torch.nn.LSTM (a sketch of one LSTM step follows the equation link below).

  2. Save the model weights whenever a new lowest test_loss value is reached (see the checkpoint sketch below).

  3. Implement a separate script that loads the saved model weights, lets the user type the first few words of a sentence in the console, and has the model predict the rest of the sentence.

  4. Also train the built-in torch.nn.LSTM model and compare its results with your own implementation.

  5. Submit the code and screenshots of the training and rollout results

LSTM equation: http://share.yellowrobot.xyz/upic/70d53425be0fec7c7dc0ebb246b6fecb_1680532356.jpg
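
For item 1, a hedged sketch of one LSTM step following the standard gates in the linked equations (input i, forget f, candidate g, output o); names and initialization are assumptions:

```python
import torch

class LSTMCell(torch.nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        stdv = 1.0 / hidden_size ** 0.5
        # One slice per gate: [0] input, [1] forget, [2] candidate, [3] output
        self.W = torch.nn.Parameter(torch.empty(4, input_size, hidden_size).uniform_(-stdv, stdv))
        self.U = torch.nn.Parameter(torch.empty(4, hidden_size, hidden_size).uniform_(-stdv, stdv))
        self.b = torch.nn.Parameter(torch.zeros(4, hidden_size))

    def forward(self, x, h, c):
        # x: (batch, input_size); h, c: (batch, hidden_size)
        i = torch.sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])  # input gate
        f = torch.sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])  # forget gate
        g = torch.tanh(x @ self.W[2] + h @ self.U[2] + self.b[2])     # candidate values
        o = torch.sigmoid(x @ self.W[3] + h @ self.U[3] + self.b[3])  # output gate
        c = f * c + i * g      # update the long-term cell state
        h = o * torch.tanh(c)  # gated exposure of the cell state
        return h, c
```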
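
For items 2 and 3, a minimal sketch of best-checkpoint saving and a console rollout, assuming a full model that maps token indices to per-step vocabulary logits and the vocab helpers sketched in section 7.1; the file name and function names are illustrative:

```python
import torch

def save_if_best(model, test_loss, best_test_loss, path='model-best.pt'):
    # Store the weights whenever a new lowest test loss is reached
    if test_loss < best_test_loss:
        torch.save(model.state_dict(), path)
        return test_loss
    return best_test_loss

def rollout(model, vocab, max_steps=20):
    # Prime the model with the user's words, then generate greedily
    # until [END] is produced or max_steps tokens are reached
    idx_to_word = {i: w for w, i in vocab.items()}
    words = input('Start of sentence: ').split()
    tokens = [vocab.get(w, vocab['[ANY]']) for w in words]
    for _ in range(max_steps):
        logits = model(torch.tensor(tokens).unsqueeze(0))  # (1, seq_len, vocab_size)
        next_token = int(logits[0, -1].argmax())           # many-to-one: last step only
        if next_token == vocab['[END]']:
            break
        tokens.append(next_token)
    print(' '.join(idx_to_word[t] for t in tokens))
```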

 


 

Materials

RNN execution


 

Language modelling


Train vs. inference (one-to-many)


 

Embeddings / Word tokens
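
A minimal sketch of the embedding dictionary idea from the contents (a trainable vector per token index); the sizes and token indices are illustrative:

```python
import torch

vocab_size, emb_dim = 1000, 32                        # illustrative sizes
embedding = torch.nn.Embedding(vocab_size, emb_dim)   # trainable lookup table
tokens = torch.tensor([[4, 17, 256, 1]])              # (batch=1, seq_len=4) token indices
vectors = embedding(tokens)                           # -> (1, 4, emb_dim) dense word vectors
```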


 

Model structure

RNN cell


 

Loss function CCE

$$C_i = 1.0 - \frac{y^{\text{count}}_i}{\sum_j y^{\text{count}}_j} \qquad L_{CCE} = -\frac{1}{N} \sum C[y] \, \log\big(y'[y] + \epsilon\big) \tag{2}$$
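
Read as code, the weighted loss above might look like this sketch, assuming y_prim holds softmax outputs and class_weights the per-class factors C; all names are illustrative:

```python
import torch

def weighted_cce(y_prim, y_idx, class_weights, eps=1e-8):
    # y_prim: (N, classes) softmax probabilities
    # y_idx: (N,) integer target labels
    # class_weights: (classes,), e.g. C_i = 1.0 - count_i / sum_j count_j
    w = class_weights[y_idx]            # one weight per sample
    rows = torch.arange(y_idx.shape[0])
    return -torch.mean(w * torch.log(y_prim[rows, y_idx] + eps))
```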

 


Different lengths in same batch

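
A minimal sketch of the usual approach: pad shorter sequences to the batch maximum, record the true lengths, and mask padded positions out of the loss; the names and pad value are assumptions:

```python
import torch

def pad_batch(sequences, pad_value=0):
    # sequences: list of 1-D LongTensors with different lengths
    lengths = torch.tensor([len(s) for s in sequences])
    max_len = int(lengths.max())
    batch = torch.full((len(sequences), max_len), pad_value, dtype=torch.long)
    for i, s in enumerate(sequences):
        batch[i, :len(s)] = s
    # True at real tokens, False at padding; multiply the per-position
    # loss by this mask so padding does not contribute to training
    mask = torch.arange(max_len)[None, :] < lengths[:, None]
    return batch, lengths, mask
```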

 

Dropout regularization against overfitting


 

LSTM


SOTA LSTM
