2023-Q4-AI 10. VAE, VQ-VAE

10.1. Video / Materials

Video:

https://youtube.com/live/qDOP4jZSOf4?feature=share

Jamboard:

https://jamboard.google.com/d/1fVbR_NhYagSXEPLDoA9LJxvzFacnS8rEfsHxnggWnL8/edit?usp=sharing

 

Materials:

  1. https://medium.com/@chhablani.gunjan/can-auto-encoders-generate-images-a3a16c83cf6a

  2. https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/

  3. https://towardsdatascience.com/demystifying-kl-divergence-7ebe4317ee68

  4. https://arxiv.org/abs/1804.03599

  5. https://arxiv.org/pdf/1804.00104.pdf

 


Example from the previous year

Video (Latvian)

https://youtu.be/OUJ5S0N7ijg

Video (English)

https://www.youtube.com/watch?v=Sy5hqyBefv0

Jamboard:

https://jamboard.google.com/d/1SYZYgylWaLTf5jUvl9rdYGy1HTckRN_YH8wPAB9Rj0I/edit?usp=sharing

!! Explanation of VQ-VAE (for second task) https://www.youtube.com/watch?v=VZFVUrYcig0


Jamboard shared to: amir.zkn85@gmail.com

RTMP Key: 4cjw-mc89-du2v-xad6-3g3g

⚠️ Do not forget to start streaming and post tasks in ORTUS

 

Content

  1. Discuss the meaning of VAE, generative models, Re-Identification, Zero-Shot learning tasks

  2. Discuss the reparameterization trick (without it, training would be very inefficient, as one would have to collect data from the entire dataset to determine (μ, σ) before generation could be done)

  3. Discuss KL, its logic, and derivation

  4. Implement a VAE model together

  5. Train together on Google Colab using GPU
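The reparameterization trick discussed above can be sketched in a few lines of PyTorch: the sampling noise is isolated in a separate variable so that gradients still flow through the mean and standard deviation (variable names here are illustrative).

```python
import torch

def reparameterize(z_mu, z_sigma):
    # z = mu + sigma * eps with eps ~ N(0, I); the randomness lives in eps,
    # so gradients can flow back through z_mu and z_sigma.
    eps = torch.randn_like(z_sigma)
    return z_mu + z_sigma * eps

z_mu = torch.zeros(4, 32, requires_grad=True)
z_sigma = torch.ones(4, 32, requires_grad=True)
z = reparameterize(z_mu, z_sigma)
z.sum().backward()  # gradients reach z_mu and z_sigma despite the sampling step
```

Without the trick, the sampling operation itself would be non-differentiable and the encoder could not be trained end-to-end.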

Finished code: http://share.yellowrobot.xyz/quick/2023-11-11-7C76C73A-BA2B-4FB2-A21F-9B89B7BAC1A1.zip

10.2. Implement a VAE Model

Implement a VAE model based on video instructions. Use the template: http://share.yellowrobot.xyz/quick/2023-11-11-8C7CB032-EAB5-4172-A4D6-DA24D7CAACF1.zip
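For orientation before opening the template, here is a minimal fully-connected VAE sketch for 28×28 inputs (e.g. MNIST). The layer sizes and names are illustrative assumptions, not the template's actual architecture.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    # Minimal fully-connected VAE for 28x28 images; sizes are illustrative.
    def __init__(self, z_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, z_dim)
        self.to_sigma = nn.Sequential(nn.Linear(256, z_dim), nn.Softplus())  # sigma > 0
        self.decoder = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, 784), nn.Sigmoid())

    def forward(self, x):
        h = self.encoder(x)
        z_mu, z_sigma = self.to_mu(h), self.to_sigma(h)
        z = z_mu + z_sigma * torch.randn_like(z_sigma)  # reparameterization trick
        y_prim = self.decoder(z).view(-1, 1, 28, 28)
        return y_prim, z, z_mu, z_sigma

model = VAE()
y_prim, z, z_mu, z_sigma = model(torch.rand(8, 1, 28, 28))
```

Note that the encoder outputs two heads (z_mu and z_sigma) rather than a single latent vector as in a plain autoencoder.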

 

10.3. Implement VQ-VAE Model

Implement a VQ-VAE model based on video instructions. Use the template: http://share.yellowrobot.xyz/quick/2023-11-11-625DD308-641E-4183-BAFF-8DDC643AC7F5.zip
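The core of a VQ-VAE is the nearest-neighbour codebook lookup with a straight-through gradient. A minimal sketch (codebook size and dimensions are illustrative, not the template's values):

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    # Nearest-neighbour codebook lookup, as in VQ-VAE.
    def __init__(self, num_codes=64, code_dim=16):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, z_e):  # z_e: (B, code_dim) encoder outputs
        # Distances from each encoder output to every codebook vector
        dist = torch.cdist(z_e, self.codebook.weight)
        idx = dist.argmin(dim=1)
        z_q = self.codebook(idx)
        # Straight-through estimator: forward pass uses z_q,
        # backward pass copies the decoder gradient to the encoder
        z_q_st = z_e + (z_q - z_e).detach()
        return z_q_st, z_q, idx

vq = VectorQuantizer()
z_q_st, z_q, idx = vq(torch.randn(8, 16))
```

The straight-through line is what makes the non-differentiable argmin trainable; the codebook itself is trained by the loss terms in equation (2) below.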

10.4. Homework - Latent Vector Arithmetic

Using latent vector arithmetic (average value from one class and another class samples), generate new images representing semantic properties. For example, adding or removing a part to the letter.

Tasks:

  1. Choose a sample from the MNIST dataset and generate new samples from a normal (Gaussian) distribution using z_mu and z_sigma. Add the resulting images and the source code to the submission. To obtain the original z_mu and z_sigma, manually select similar samples.

  2. Choose a sample from the BALLS dataset and generate new samples from a normal (Gaussian) distribution using z_mu and z_sigma. Add the resulting images and the source code to the submission. To obtain the original z_mu and z_sigma, manually select similar samples.

  3. Perform arithmetic on the Z latent vectors: add or subtract the vectors of different classes. You can also use the average of both class vectors. Add the resulting images and the source code to the submission.

  4. Train your models using the hyperparameters given below; add the training error curves, model weights, and source code to the submission.
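The latent arithmetic in task 3 can be sketched as follows. The encodings here are random stand-ins; in the homework they would come from the trained VAE encoder (all names are illustrative).

```python
import torch

# Stand-in encodings: z_mu vectors of manually selected samples from
# two classes (in the homework these come from the trained encoder).
z_class_a = torch.randn(10, 32)  # e.g. encodings of one letter/digit class
z_class_b = torch.randn(10, 32)  # e.g. encodings of another class

# Class centroids in latent space (average value per class)
mu_a = z_class_a.mean(dim=0)
mu_b = z_class_b.mean(dim=0)

# Move a class-A sample in the direction of class B, e.g. to add a part to a letter
z_sample = z_class_a[0]
z_new = z_sample + (mu_b - mu_a)  # decode z_new to see the changed semantic property
z_avg = 0.5 * (mu_a + mu_b)      # alternative: average of both class centroids
```

Decoding `z_new` (or `z_avg`) with the trained decoder produces the images to include in the submission.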

Example for tasks 1, 2, 3: http://share.yellowrobot.xyz/quick/2023-11-11-3AA4BE22-93D8-4D1A-B386-29DC57686891.zip

MNIST template: http://share.yellowrobot.xyz/quick/2023-11-11-E767405F-9A7A-4C59-8B2A-74CD83790DB4.zip

BALLS template: http://share.yellowrobot.xyz/quick/2023-11-11-956FC6B2-FC92-400F-8494-095421010EEA.zip

Trained MNIST weights: http://share.yellowrobot.xyz/quick/2023-11-11-BEEF5CBD-1ACA-4B1B-88D7-410DCB73BBB7.zip

BALLS dataset: http://share.yellowrobot.xyz/quick/2023-11-11-9CB72E5A-AFE9-4FA5-9A62-EB5B81D1E33F.zip

Training hyperparameters:

  - BALLS (mean MSE + mean KL): "batch_size": 32, "learning_rate": 0.001, "vae_beta": 0.0001
  - MNIST (mean MSE + mean KL): "batch_size": 16, "learning_rate": 0.001, "vae_beta": 0.001

 


Materials

(lecture slide images)

Main equations

 

$$
\begin{aligned}
(1)\quad L &= \mathrm{MSE}(y, \hat{y}) + \beta\, \mathrm{KL}\big(q(z|x),\, p(z)\big) \\
\mathrm{MSE}(y, \hat{y}) &= \frac{1}{N} \sum_{N} (y - \hat{y})^2 \\
\mathrm{KL}\big(q(z|x),\, p(z)\big) &= -\frac{1}{N} \sum_{N} \frac{1}{2} \Big( 2\log(\sigma + \epsilon) - \mu^2 - \sigma^2 + 1 \Big)
\end{aligned}
$$
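A direct PyTorch transcription of this loss, assuming mean reduction over the batch and a small epsilon for numerical stability (function and argument names are illustrative):

```python
import torch

def vae_loss(y, y_prim, z_mu, z_sigma, vae_beta=0.001, eps=1e-8):
    # Equation (1): mean MSE reconstruction + beta-weighted mean KL term
    mse = torch.mean((y - y_prim) ** 2)
    kl = -torch.mean(0.5 * (2.0 * torch.log(z_sigma + eps) - z_mu ** 2 - z_sigma ** 2 + 1.0))
    return mse + vae_beta * kl

# With mu = 0 and sigma = 1 (the prior), the KL term is approximately zero
loss = vae_loss(torch.rand(4, 784), torch.rand(4, 784),
                torch.zeros(4, 32), torch.ones(4, 32))
```

The `vae_beta` weight matches the hyperparameters listed in the homework (0.001 for MNIST, 0.0001 for BALLS).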

 

 

 

image-20220412115732155

image-20220412115735847

 

 

$$
(2)\quad L = L_{\mathrm{recon}} + \big\lVert \mathrm{sg}[z_e(x)] - z_q(x) \big\rVert_2^2 + \beta\, \big\lVert \mathrm{sg}[z_q(x)] - z_e(x) \big\rVert_2^2
$$

^ sg[·] is the stop-gradient operator; in PyTorch it corresponds to torch.detach()

 
