2024-Q1-AI-M 9. Auto-Encoders, AE, DAE, VAE, VQ-VAE

 

Contents:

  1. The importance of unsupervised learning - Yann LeCun's cake analogy

  2. Vanilla AE, the loss function

  3. Explanation of the purpose - anomaly detection, unlabeled data, generative features

  4. Measuring distances between latent vectors

    1. L1, L2 distances

    2. Cosine distance

    3. Mahalanobis distance

    4. PLDA

  5. AE Work - Encoder

  6. Transposed Convolutions, Dilations

  7. Work - Decoder

  8. In the test loss, use the real y_ground

  9. DAE Denoising Auto Encoder

  10. VAE

    1. Discuss the meaning of VAE, generative models, Re-Identification, Zero-Shot learning tasks

    2. Discuss the reparameterization trick (without it, training would be very inefficient, as one would have to collect data from the entire dataset to determine (μ, σ) before generation could be done)

    3. Discuss KL, its logic, and derivation

    4. Implement a VAE model together

    5. Train together on Google Colab using GPU

  11. Homework - Modern AE, VQ-VAE

    1. Saving model weights

    2. Upsampling / Downsampling

    3. Explanation of GroupNorm, LayerNorm, InstanceNorm

 

Source code

AE Source code (finished code examples)

http://share.yellowrobot.xyz/quick/2023-11-2-9BFED98D-BBC1-41A0-B9D2-6A7E48F8102A.zip

 

VAE Finished code: http://share.yellowrobot.xyz/quick/2023-11-11-7C76C73A-BA2B-4FB2-A21F-9B89B7BAC1A1.zip

 

 

9.1. Video / Materials (3 Apr 2024, 18:00)

Video: https://youtube.com/live/U3qM0iGi4Yc?feature=share

Jamboard: https://jamboard.google.com/d/1GQlMyT-snhxOnezkkX7NVzjg9EAk7FQX3hSiy6XbZBQ/edit?usp=sharing

Preparation materials:

  1. AE / DAE: https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726

  2. Cosine / Euclidean / mahalanobis distances: https://cmry.github.io/notes/euclidean-v-cosine

  3. PCA whitening: http://ufldl.stanford.edu/tutorial/unsupervised/PCAWhitening/

  4. https://medium.com/@chhablani.gunjan/can-auto-encoders-generate-images-a3a16c83cf6a

  5. https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/

  6. https://towardsdatascience.com/demystifying-kl-divergence-7ebe4317ee68

  7. https://arxiv.org/abs/1804.03599

  8. https://arxiv.org/pdf/1804.00104.pdf

 

^ Jamboard access granted to stefan.dayneko@gmail.com

 

When starting the lecture, start screen recording using the OBS software (you can choose a window to record with it + mirror it on the wall display in class)

Stream key: r3jz-x9j1-9afp-ud8h-apt9

Before the lecture, test streaming NOT with this key - create your own livestream on YouTube to make sure everything works

 

This is the way to make the OBS output visible on the wall display (left click)


 

 

Previous year materials (check)

AE - Video: https://youtu.be/WOmH67RU33U

Jamboard: https://jamboard.google.com/d/1AC-5eRS5LV30lN7PmMHE3M7_L4XqemFPgFHMjOb6UZA/edit?usp=sharing

VAE - Video:

https://youtube.com/live/qDOP4jZSOf4?feature=share

Jamboard:

https://jamboard.google.com/d/1fVbR_NhYagSXEPLDoA9LJxvzFacnS8rEfsHxnggWnL8/edit?usp=sharing

 

Video (Latvian)

https://youtu.be/OUJ5S0N7ijg

Video (English)

https://www.youtube.com/watch?v=Sy5hqyBefv0

Jamboard:

https://jamboard.google.com/d/1SYZYgylWaLTf5jUvl9rdYGy1HTckRN_YH8wPAB9Rj0I/edit?usp=sharing

!! Explanation of VQ-VAE (for second task) https://www.youtube.com/watch?v=VZFVUrYcig0


 

 

9.2. Implement the Encoder Part of the AE Model

  1. Implement the encoder part of the AE model using the convolution output-size formula - reduce the dimensions down to z.shape = (B, 32, 1, 1); after flattening, z.shape = (B, 32)

  2. Implement a loss function of your choice

Submit the source code, using the template: https://share.yellowrobot.xyz/quick/2024-4-1-09F993AE-D2D9-46B2-BB00-5B69EE8999CC.zip
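A minimal encoder sketch, assuming a 1-channel 28×28 input (the template's exact dataset and layer choices may differ); each layer's output size follows the convolution formula W_out = (W_in + 2P - K)/S + 1:

```python
import torch

# Hypothetical encoder sketch (not the course template): reduces a
# 1-channel 28x28 input down to z.shape = (B, 32, 1, 1).
encoder = torch.nn.Sequential(
    torch.nn.Conv2d(1, 8, kernel_size=4, stride=2, padding=1),   # 28 -> 14
    torch.nn.ReLU(),
    torch.nn.Conv2d(8, 16, kernel_size=4, stride=2, padding=1),  # 14 -> 7
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), # 7 -> 4
    torch.nn.ReLU(),
    torch.nn.Conv2d(32, 32, kernel_size=4, stride=1, padding=0), # 4 -> 1
)

x = torch.randn(5, 1, 28, 28)   # dummy batch, B=5
z = encoder(x)                  # (5, 32, 1, 1)
z_flat = z.view(z.size(0), -1)  # (5, 32) after flattening
```

The final 4×4 convolution with no padding is what collapses the remaining 4×4 spatial map to 1×1.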

 

9.3. Implement the Decoder Part of the AE Model

Implement the decoder part of the AE model using the transposed-convolution formula. Also remember within what range the output values must lie.

Submit the source code, using the template from the previous task.
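A matching decoder sketch (again assuming 1×28×28 targets, not necessarily the template's architecture); sizes follow W_out = S(W_in - 1) - 2P + K, and the final Sigmoid keeps the output in the [0, 1] pixel range:

```python
import torch

# Hypothetical decoder sketch mirroring the encoder above.
decoder = torch.nn.Sequential(
    torch.nn.ConvTranspose2d(32, 32, kernel_size=4, stride=1, padding=0),  # 1 -> 4
    torch.nn.ReLU(),
    torch.nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1),  # 4 -> 7
    torch.nn.ReLU(),
    torch.nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),   # 7 -> 14
    torch.nn.ReLU(),
    torch.nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),    # 14 -> 28
    torch.nn.Sigmoid(),  # keeps outputs in [0, 1], matching the pixel range
)

z = torch.randn(5, 32, 1, 1)
y_prim = decoder(z)  # (5, 1, 28, 28), values in [0, 1]
```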

9.4. Implement a DAE

Implement a DAE. In the dataset part of the program, using torch.rand(shape) or another function of your choice, make each pixel of the input x be replaced with the value 0 in 50% of cases. For 50% of the input samples, also use the original image without corruption.

Submit the source code, using the template from the previous task. Attach screenshots of the results.
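The corruption step can be sketched like this (shapes and variable names are my assumptions, not the template's):

```python
import torch

# DAE corruption sketch: each pixel is zeroed with probability 0.5,
# and roughly half of the samples in the batch stay uncorrupted.
x = torch.rand(16, 1, 28, 28)  # dummy batch of images in [0, 1]

pixel_mask = (torch.rand_like(x) > 0.5).float()  # 1 = keep pixel, 0 = zero it
x_noisy = x * pixel_mask

# For ~50% of the samples, keep the original image without corruption
keep_sample = (torch.rand(x.size(0), 1, 1, 1) > 0.5).float()
x_in = keep_sample * x + (1 - keep_sample) * x_noisy

# Train the DAE with x_in as input and the clean x as the target
```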

 

9.5. Implement a VAE Model

Implement a VAE model based on video instructions. Use the template: http://share.yellowrobot.xyz/quick/2023-11-11-8C7CB032-EAB5-4172-A4D6-DA24D7CAACF1.zip
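A minimal sketch of the VAE bottleneck with the reparameterization trick (layer sizes are hypothetical; the template's architecture may differ). Writing z = μ + σ·ε keeps sampling differentiable with respect to μ and σ:

```python
import torch

class VAEBottleneck(torch.nn.Module):
    def __init__(self, hidden=128, z_dim=32):
        super().__init__()
        self.to_mu = torch.nn.Linear(hidden, z_dim)
        self.to_sigma = torch.nn.Linear(hidden, z_dim)

    def forward(self, h):
        z_mu = self.to_mu(h)
        z_sigma = torch.nn.functional.softplus(self.to_sigma(h))  # sigma > 0
        eps = torch.randn_like(z_sigma)  # noise sampled outside the graph
        z = z_mu + z_sigma * eps         # reparameterization trick
        return z, z_mu, z_sigma

h = torch.randn(4, 128)                # dummy encoder features
z, z_mu, z_sigma = VAEBottleneck()(h)  # z.shape = (4, 32)
```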

 

9.6. Implement VQ-VAE Model (Optional)

Implement a VQ-VAE model based on video instructions. Use the template: http://share.yellowrobot.xyz/quick/2023-11-11-625DD308-641E-4183-BAFF-8DDC643AC7F5.zip

 

9.7. Implement a New DAE Version (Homework)

Based on the code from 9.4, implement a more modern encoder/decoder version.

EncoderBlock: torch.nn.Conv2d(in_channels=, out_channels=, kernel_size=3, stride=1, padding=1), torch.nn.GroupNorm(num_groups=, num_channels=), torch.nn.Mish(), torch.nn.Upsample(size=),

DecoderBlock:

torch.nn.ConvTranspose2d(in_channels=, out_channels=, kernel_size=3, stride=1, padding=1), torch.nn.GroupNorm(num_groups=, num_channels=), torch.nn.Mish(), torch.nn.Upsample(size=)

torch.nn.Upsample can also be inserted only every 3 blocks, to increase the number of parameters
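One such block could be sketched as follows (num_groups and the sizes are my assumptions); note that torch.nn.Upsample with a smaller target size acts as downsampling, which is how the same block shape works in the encoder:

```python
import torch

# Sketch of one Conv + GroupNorm + Mish + Upsample block.
def make_block(in_ch, out_ch, size):
    return torch.nn.Sequential(
        torch.nn.Conv2d(in_channels=in_ch, out_channels=out_ch,
                        kernel_size=3, stride=1, padding=1),
        torch.nn.GroupNorm(num_groups=4, num_channels=out_ch),
        torch.nn.Mish(),
        torch.nn.Upsample(size=size),  # size < input: downsample (encoder)
    )

block = make_block(1, 8, size=14)     # 28x28 -> 14x14 (encoder direction)
y = block(torch.randn(2, 1, 28, 28))  # (2, 8, 14, 14)
```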

  1. Add the given examples to the test set and check whether they separate as anomalies in the latent Z space (submit screenshots along with the source code),

    The simplest way to do this is to call model.forward at the end of the test iteration with manually chosen, pre-prepared samples; an "anomaly" label can be added to dataset_full.labels so that they show up visually. If you want to include them in the dataset itself, the dataset_full code has to be rewritten using torch.utils.data.Subset for train and test (the anomalies are included in the test set, but not in the train set)

    Implement MSE / L1 loss

  2. Implement saving the model whenever the best loss value is achieved


Submit the source code and screenshots

 

9.8. Homework - Latent Vector Arithmetic

Using latent vector arithmetic (the average value of samples from one class and another class), generate new images representing semantic properties - for example, adding or removing a part of a letter.

Tasks:

  1. Choose a sample from the MNIST dataset and generate new samples from a normal (Gaussian) distribution using z_mu and z_sigma. Add the resulting images and the source code to the submission. To obtain the original z_mu and z_sigma, manually select similar samples.

  2. Choose a sample from the BALLS dataset and generate new samples from a normal (Gaussian) distribution using z_mu and z_sigma. Add the resulting images and the source code to the submission. To obtain the original z_mu and z_sigma, manually select similar samples.

  3. Perform Z latent vector arithmetic, add or subtract the resulting vectors of different classes. You can also use the average value from both class vectors. Add the obtained result images and the source code to the submission.

  4. Train your models using the hyper-parameters given below - add to the submission the training error curves, model weights, and the source code.
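The latent arithmetic in task 3 can be sketched as follows (the 32-dimensional latent space and the encoder/decoder producing these codes are assumptions for illustration):

```python
import torch

# Latent vector arithmetic sketch: average class centroids and move
# along their difference to mix semantic properties of two classes.
z_class_a = torch.randn(20, 32)  # latent codes of samples from class A
z_class_b = torch.randn(20, 32)  # latent codes of samples from class B

mu_a = z_class_a.mean(dim=0)
mu_b = z_class_b.mean(dim=0)

z_mixed = 0.5 * (mu_a + mu_b)           # average of both class centroids
z_shifted = mu_a + 0.3 * (mu_b - mu_a)  # move A partway towards B

# decoder(z_mixed.unsqueeze(0)) would then give the generated image
```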

Example for tasks 1, 2, 3: http://share.yellowrobot.xyz/quick/2023-11-11-3AA4BE22-93D8-4D1A-B386-29DC57686891.zip

MNIST template: http://share.yellowrobot.xyz/quick/2023-11-11-E767405F-9A7A-4C59-8B2A-74CD83790DB4.zip

BALLS template: http://share.yellowrobot.xyz/quick/2023-11-11-956FC6B2-FC92-400F-8494-095421010EEA.zip

Trained MNIST weights: http://share.yellowrobot.xyz/quick/2023-11-11-BEEF5CBD-1ACA-4B1B-88D7-410DCB73BBB7.zip

BALLS dataset: http://share.yellowrobot.xyz/quick/2023-11-11-9CB72E5A-AFE9-4FA5-9A62-EB5B81D1E33F.zip

Training hyper-parameters:

BALLS (mean MSE + mean KL): "batch_size": 32, "learning_rate": 0.001, "vae_beta": 0.0001

MNIST (mean MSE + mean KL): "batch_size": 16, "learning_rate": 0.001, "vae_beta": 0.001

 


 

 


 

 

Materials

 

 


$$L_{L2} = \frac{1}{N}\sum_{N}(x - \hat{x})^2 \tag{1}$$

 


 

The goal is to get a power of 2 (initially 64)

encoder:100 => 64 => 4/4/4 decoder: 1×2×2×5×5 => SIGMOID!

dilation = 1 by default

rewrite equations without D

 

buggy - (w_in + 2 * p - k) / s + 1

$$W_{out} = \frac{W_{in} + 2P - K}{S} + 1 \qquad W_{out} = S\,(W_{in} - 1) - 2P + K \tag{2}$$

Deconvolutions / Transposed convolutions

https://towardsdatascience.com/what-is-transposed-convolutional-layer-40e5e6e31c11

Artificially prepares a larger-sized input by spacing the values out according to the stride and adding padding
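A quick way to sanity-check both size formulas in equation form (with dilation = 1) against PyTorch itself:

```python
import torch

# Output-size helpers matching the convolution and transposed-convolution
# formulas above (dilation = 1).
def conv_out(w_in, k, s, p):
    return (w_in + 2 * p - k) // s + 1

def tconv_out(w_in, k, s, p):
    return s * (w_in - 1) - 2 * p + k

x = torch.randn(1, 1, 28, 28)
y = torch.nn.Conv2d(1, 1, kernel_size=4, stride=2, padding=1)(x)
assert y.shape[-1] == conv_out(28, k=4, s=2, p=1)    # 28 -> 14

y2 = torch.nn.ConvTranspose2d(1, 1, kernel_size=4, stride=2, padding=1)(y)
assert y2.shape[-1] == tconv_out(14, k=4, s=2, p=1)  # 14 -> 28
```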


 

 

 

 


 

https://distill.pub/2016/deconv-checkerboard/

Checkerboard Artifacts


Main equations

 

$$\mathcal{L} = MSE(y, \hat{y}) + \beta \, KL(q(z|x), p(z)) \tag{3}$$

$$MSE(y, \hat{y}) = \frac{1}{N}\sum_{N}(y - \hat{y})^2$$

$$KL(q(z|x), p(z)) = \frac{1}{N}\sum_{N} -\frac{1}{2}\left(2\log(\sigma + \epsilon) - \mu^2 - \sigma^2 + 1\right)$$
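The VAE loss above can be sketched in code (the epsilon value and the mean reduction over the batch are assumptions):

```python
import torch

# VAE loss sketch: reconstruction MSE plus beta-weighted KL divergence
# between q(z|x) = N(mu, sigma) and the prior p(z) = N(0, 1).
def vae_loss(y, y_prim, z_mu, z_sigma, beta=0.001, eps=1e-8):
    mse = torch.mean((y - y_prim) ** 2)
    kl = torch.mean(-0.5 * (2.0 * torch.log(z_sigma + eps)
                            - z_mu ** 2 - z_sigma ** 2 + 1.0))
    return mse + beta * kl

y = torch.rand(4, 1, 28, 28)       # dummy targets
y_prim = torch.rand(4, 1, 28, 28)  # dummy reconstructions
z_mu = torch.randn(4, 32)
z_sigma = torch.rand(4, 32) + 0.1  # keep sigma strictly positive
loss = vae_loss(y, y_prim, z_mu, z_sigma)
```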

 

 


 

$$\mathcal{L} = \mathcal{L}_{recon} + \left\| sg[z_e(x)] - z_q(x) \right\|_2^2 + \beta \left\| sg[z_q(x)] - z_e(x) \right\|_2^2 \tag{4}$$

^ sg (Stop Grad) means torch.detach()
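The quantization step and this loss can be sketched with torch.detach() as follows (the codebook size, latent dimension, and β value are assumptions):

```python
import torch

# VQ-VAE quantization sketch: pick the nearest codebook vector for each
# encoder output, then build the codebook and commitment loss terms.
codebook = torch.nn.Embedding(64, 32)  # 64 code vectors of dimension 32
z_e = torch.randn(8, 32)               # dummy encoder outputs z_e(x)

dists = torch.cdist(z_e, codebook.weight)  # (8, 64) pairwise distances
idx = dists.argmin(dim=1)                  # nearest codebook entry per sample
z_q = codebook(idx)                        # quantized latents z_q(x), (8, 32)

beta = 0.25
codebook_loss = ((z_e.detach() - z_q) ** 2).mean()  # ||sg[z_e] - z_q||^2
commit_loss = ((z_q.detach() - z_e) ** 2).mean()    # ||sg[z_q] - z_e||^2
vq_loss = codebook_loss + beta * commit_loss

# Straight-through estimator: gradients flow to the encoder as if z_q == z_e
z_q_st = z_e + (z_q - z_e).detach()
```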

 
