2023-Q4-AI-EN 12. Time-Series, Recurrent Neural Networks, RNN, LSTM

 

12.1. Video / Materials

Video: https://youtube.com/live/xpLDWDdqGQ4?feature=share

Jamboard: https://jamboard.google.com/d/16pII0AOesnL8Igf27idEVNdpu4xZuVYgRcx169TDfs0/edit?usp=sharing

Materials:
https://calvinfeng.gitbook.io/machine-learning-notebook/supervised-learning/recurrent-neural-network/recurrent_neural_networks/

https://lanwuwei.github.io/courses/SP19/3521_slides/11-Recurrent_Neural_Networks_2.pdf

https://danijar.com/tips-for-training-recurrent-neural-networks/

https://medium.com/datadriveninvestor/attention-in-rnns-321fbcd64f05

https://arxiv.org/abs/1610.09513

https://wiki.pathmind.com/word2vec

 


Jamboard given rights to: amir.zkn85@gmail.com

YouTube OBS stream key: 2jfa-fqm7-g3sz-kaad-b8ys

Finished source code: http://share.yellowrobot.xyz/quick/2023-11-26-DB83680C-31DB-41B5-8DB7-46071B507C22.zip

Previous Year lecture (English): https://youtube.com/live/3tMxReZ7wrQ

Jamboard (previous): https://jamboard.google.com/d/1ryRdI08rThos-KY-2aV2jLxL8T0wt808dAc3V2iRR_A/edit?usp=sharing

 

 

 

Content:

  1. RNN (shared weights)

  2. Language modeling task

  3. Embedding dict

  4. [END] token

  5. [ANY] token: replace every word that occurs fewer than 3 times with an [ANY] token, instead of dropping the sentences that contain rare words, so the dataset keeps a larger number of sentences (see the vocabulary sketch after this list)

  6. Training vs. inference: many-to-many during training, one-to-many during inference (autoregressive generation)

  7. Explain the processing of different length sentences in one batch
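The vocabulary handling from points 3–5 can be illustrated with the minimal sketch below. It is not the lecture code; the function names, the corpus format and the way the 3-occurrence threshold is applied are assumptions based on the list above.

```python
from collections import Counter

def build_vocab(sentences, min_count=3):
    # sentences: list of lists of word strings (illustrative format)
    counts = Counter(word for sent in sentences for word in sent)
    # special tokens: [END] marks the end of a sentence, [ANY] replaces rare words
    vocab = {"[END]": 0, "[ANY]": 1}
    for word, c in counts.items():
        if c >= min_count:
            vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    # map words to indices, rare words fall back to [ANY], append [END]
    idxs = [vocab.get(w, vocab["[ANY]"]) for w in sentence]
    idxs.append(vocab["[END]"])
    return idxs
```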

 


12.2. Implement Vanilla RNN

Following the instructions in the video from 12.1, implement a Vanilla RNN without using the built-in RNN modules from torch.nn (e.g. torch.nn.RNN).

Submit the source code and screenshots with the results.

Template: http://share.yellowrobot.xyz/quick/2023-4-3-3F68F1D8-DF36-4D8C-BB56-F0407D2A6512.zip
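For orientation, a minimal hand-written RNN cell could look like the sketch below. This is not the reference solution from the template; the sizes, initialization and names are assumptions, but it shows the key idea from the content list: the same weights are reused at every time step.

```python
import torch
import torch.nn as nn

class RNNCellManual(nn.Module):
    # h_t = tanh(W_x x_t + W_h h_{t-1} + b)
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.W_x = nn.Parameter(torch.randn(input_size, hidden_size) * 0.1)
        self.W_h = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.1)
        self.b = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_seq, h=None):
        # x_seq: (batch, seq_len, input_size)
        batch, seq_len, _ = x_seq.shape
        if h is None:
            h = torch.zeros(batch, self.W_h.shape[0], device=x_seq.device)
        outputs = []
        for t in range(seq_len):  # the same weights are applied at every time step
            h = torch.tanh(x_seq[:, t] @ self.W_x + h @ self.W_h + self.b)
            outputs.append(h)
        return torch.stack(outputs, dim=1), h  # (batch, seq_len, hidden), last hidden state
```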


12.3. Implement GRU

Using the template and the directions from task 12.2, implement a GRU model: replace the RNN cell with a GRU cell of your own. Do not use the built-in torch.nn.GRU or similar modules. Submit the source code and screenshots with the results.

GRU equation: http://share.yellowrobot.xyz/upic/8f34c76492d8b3a520255d023e962dc9_1680532330.jpg
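A hedged sketch of a hand-written GRU cell is shown below. It follows the common PyTorch-documentation form of the GRU equations, so verify it against the linked equation image; all names and the initialization are illustrative.

```python
import torch
import torch.nn as nn

class GRUCellManual(nn.Module):
    # r_t = sigmoid(W_r x_t + U_r h_{t-1} + b_r)        reset gate
    # z_t = sigmoid(W_z x_t + U_z h_{t-1} + b_z)        update gate
    # n_t = tanh(W_n x_t + r_t * (U_n h_{t-1}) + b_n)   candidate state
    # h_t = (1 - z_t) * n_t + z_t * h_{t-1}
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        def lin(i, o):
            return nn.Parameter(torch.randn(i, o) * 0.1)
        self.W_r, self.U_r = lin(input_size, hidden_size), lin(hidden_size, hidden_size)
        self.W_z, self.U_z = lin(input_size, hidden_size), lin(hidden_size, hidden_size)
        self.W_n, self.U_n = lin(input_size, hidden_size), lin(hidden_size, hidden_size)
        self.b_r = nn.Parameter(torch.zeros(hidden_size))
        self.b_z = nn.Parameter(torch.zeros(hidden_size))
        self.b_n = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_seq, h=None):
        batch, seq_len, _ = x_seq.shape
        if h is None:
            h = torch.zeros(batch, self.hidden_size, device=x_seq.device)
        outputs = []
        for t in range(seq_len):
            x = x_seq[:, t]
            r = torch.sigmoid(x @ self.W_r + h @ self.U_r + self.b_r)
            z = torch.sigmoid(x @ self.W_z + h @ self.U_z + self.b_z)
            n = torch.tanh(x @ self.W_n + r * (h @ self.U_n) + self.b_n)
            h = (1.0 - z) * n + z * h
            outputs.append(h)
        return torch.stack(outputs, dim=1), h
```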


12.4. Homework - Implement LSTM

Using the template from task 12.2 and building on the work from tasks 12.2 and 12.3, modify the source code so that the following is implemented:

  1. Create an LSTM: replace the RNN cell with your own LSTM cell implementation. Do not use the built-in torch.nn.LSTM or similar modules.

  2. Save the model weights whenever a new lowest test_loss value is reached

  3. Implement a separate script that loads the saved model weights, lets the user enter the first few words of a sentence, and uses the model to predict the rest of the sentence

  4. Implement the built-in torch.nn.LSTM model and compare the results with your own model

  5. Submit the code and screenshots with training and rollout results

LSTM equation: http://share.yellowrobot.xyz/upic/70d53425be0fec7c7dc0ebb246b6fecb_1680532356.jpg
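For comparison with the linked equations, a minimal hand-written LSTM cell could look like the sketch below. It is an illustration only, not the expected homework solution; the fused-gate layout and the initialization are implementation choices.

```python
import torch
import torch.nn as nn

class LSTMCellManual(nn.Module):
    # i_t = sigmoid(W_i x_t + U_i h_{t-1} + b_i)   input gate
    # f_t = sigmoid(W_f x_t + U_f h_{t-1} + b_f)   forget gate
    # o_t = sigmoid(W_o x_t + U_o h_{t-1} + b_o)   output gate
    # g_t = tanh(W_g x_t + U_g h_{t-1} + b_g)      candidate cell state
    # c_t = f_t * c_{t-1} + i_t * g_t
    # h_t = o_t * tanh(c_t)
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        # one fused weight matrix for all four gates keeps the code short
        self.W = nn.Parameter(torch.randn(input_size, 4 * hidden_size) * 0.1)
        self.U = nn.Parameter(torch.randn(hidden_size, 4 * hidden_size) * 0.1)
        self.b = nn.Parameter(torch.zeros(4 * hidden_size))

    def forward(self, x_seq, state=None):
        batch, seq_len, _ = x_seq.shape
        if state is None:
            h = torch.zeros(batch, self.hidden_size, device=x_seq.device)
            c = torch.zeros(batch, self.hidden_size, device=x_seq.device)
        else:
            h, c = state
        outputs = []
        for t in range(seq_len):
            gates = x_seq[:, t] @ self.W + h @ self.U + self.b
            i, f, o, g = gates.chunk(4, dim=-1)
            i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
            c = f * c + i * g
            h = o * torch.tanh(c)
            outputs.append(h)
        return torch.stack(outputs, dim=1), (h, c)
```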


 

Materials

RNN execution


 

Language modelling


Train VS Inference one-to-many
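A minimal sketch of the inference-time (one-to-many) rollout: the model is repeatedly fed its own previous prediction until [END] is produced. The `model` here is assumed to return logits over the vocabulary for every position; greedy decoding is used for simplicity.

```python
import torch

@torch.no_grad()
def rollout(model, prompt_idxs, vocab, max_len=20):
    # prompt_idxs: list of token indices for the beginning of the sentence (without [END])
    idxs = list(prompt_idxs)
    for _ in range(max_len):
        x = torch.tensor(idxs).unsqueeze(0)      # (1, seq_len)
        logits, _ = model(x)                     # assumed to return (logits over vocab, hidden state)
        next_idx = int(logits[0, -1].argmax())   # greedy pick of the next word
        if next_idx == vocab["[END]"]:
            break
        idxs.append(next_idx)
    return idxs
```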


 

Embeddings / Word tokens


 

Model structure

RNN cell


 

Loss function: categorical cross-entropy (CCE) with class weights

$$C_i = 1.0 - \frac{y^{count}_i}{\sum_j y^{count}_j}, \qquad L_{CCE} = -\frac{1}{N}\sum_{n=1}^{N} C[y_n]\,\log\big(y'_n[y_n] + \epsilon\big) \tag{1}$$
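A sketch of this loss in PyTorch, assuming the reconstruction of equation (1) above; tensor shapes and the epsilon value are illustrative.

```python
import torch

def weighted_cce(y_prime, y_idx, class_counts, eps=1e-8):
    # y_prime:      (N, num_classes) predicted probabilities (after softmax)
    # y_idx:        (N,) target class indices
    # class_counts: (num_classes,) number of training samples per class
    C = 1.0 - class_counts / class_counts.sum()              # class weights per eq. (1)
    picked = y_prime[torch.arange(y_idx.shape[0]), y_idx]    # y'[y] for each sample
    return -(C[y_idx] * torch.log(picked + eps)).mean()
```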

 


Different lengths in same batch
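A minimal sketch of one way to batch sentences of different lengths: pad every sentence to the longest one in the batch and keep the lengths, so padded positions can later be masked out of the loss. The function names and the padding index are illustrative.

```python
import torch

def pad_batch(encoded_sentences, pad_idx=0):
    # encoded_sentences: list of lists of token indices, different lengths
    lengths = torch.tensor([len(s) for s in encoded_sentences])
    max_len = int(lengths.max())
    batch = torch.full((len(encoded_sentences), max_len), pad_idx, dtype=torch.long)
    for i, s in enumerate(encoded_sentences):
        batch[i, :len(s)] = torch.tensor(s, dtype=torch.long)
    return batch, lengths

def length_mask(lengths, max_len):
    # True for real tokens, False for padded positions; use it to zero out padded loss terms
    return torch.arange(max_len).unsqueeze(0) < lengths.unsqueeze(1)  # (batch, max_len) bool
```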


 

Dropout regularization against overfitting


 

LSTM


 

 

State-of-the-art (SOTA) LSTM
