Contents:
The importance of unsupervised learning - Yann LeCun's cake analogy
Standard AE, loss function
Explanation of the purpose - anomaly detection, unlabeled data, generative features
Measuring distances between latent vectors
L1, L2 distances
Cosine distance
Mahalanobis distance
PLDA
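The distance measures above can be sketched in a few lines of PyTorch (the vectors and the covariance batch are illustrative stand-ins for real latent codes):

```python
import torch

# illustrative latent vectors; in practice these come from the encoder
z_a = torch.tensor([1.0, 2.0, 3.0])
z_b = torch.tensor([2.0, 0.0, 4.0])

# L1 (Manhattan) distance
d_l1 = torch.sum(torch.abs(z_a - z_b))

# L2 (Euclidean) distance
d_l2 = torch.sqrt(torch.sum((z_a - z_b) ** 2))

# cosine distance = 1 - cosine similarity (compares direction, ignores magnitude)
d_cos = 1.0 - torch.nn.functional.cosine_similarity(z_a.unsqueeze(0), z_b.unsqueeze(0))

# Mahalanobis distance needs the covariance of a set of latent vectors
Z = torch.randn(100, 3)            # stand-in batch of latent codes
cov_inv = torch.linalg.inv(torch.cov(Z.T))
diff = (z_a - z_b).unsqueeze(1)    # (3, 1)
d_mah = torch.sqrt(diff.T @ cov_inv @ diff).squeeze()
```

Mahalanobis reduces to the L2 distance when the covariance is the identity, which is why whitening the latent space (see the PCA whitening link below) makes the two equivalent.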
AE Work - Encoder
Transposed Convolutions, Dilations
Work - Decoder
In the test loss part, use the real ground-truth y
DAE Denoising Auto Encoder
VAE
Discuss the meaning of VAE, generative models, Re-Identification, Zero-Shot learning tasks
Discuss the reparameterization trick (without it, it would be very inefficient, as one would have to collect data from the entire dataset to determine (\mu, \sigma) before generating could be done)
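A minimal sketch of the trick described above (tensor names are illustrative): instead of sampling z directly from N(mu, sigma), sample eps from N(0, 1) and shift/scale it, so gradients can flow through mu and sigma:

```python
import torch

def reparameterize(z_mu, z_log_var):
    # eps ~ N(0, 1) is sampled outside the computation graph;
    # the path z = mu + sigma * eps stays differentiable w.r.t. mu and sigma
    eps = torch.randn_like(z_mu)
    z_sigma = torch.exp(0.5 * z_log_var)  # log-variance parametrization keeps sigma > 0
    return z_mu + z_sigma * eps
```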
Discuss KL, its logic, and derivation
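For reference, when the posterior is N(mu, sigma^2) and the prior is N(0, I), the KL term has the closed form commonly used in the VAE loss:

```latex
D_{KL}\left(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, 1)\right)
= \frac{1}{2} \sum_{i} \left( \mu_i^2 + \sigma_i^2 - \log \sigma_i^2 - 1 \right)
```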
Implement a VAE model together
Train together on Google Colab using GPU
Homework - Modern AE, VQ-VAE
Saving weights
Upsampling / Downsampling
Explanation of GroupNorm, LayerNorm, InstanceNorm
AE Source code (finished code examples)
http://share.yellowrobot.xyz/quick/2023-11-2-9BFED98D-BBC1-41A0-B9D2-6A7E48F8102A.zip
VAE Finished code: http://share.yellowrobot.xyz/quick/2023-11-11-7C76C73A-BA2B-4FB2-A21F-9B89B7BAC1A1.zip
Video: https://youtube.com/live/U3qM0iGi4Yc?feature=share
Jamboard: https://jamboard.google.com/d/1GQlMyT-snhxOnezkkX7NVzjg9EAk7FQX3hSiy6XbZBQ/edit?usp=sharing
Preparation materials:
AE / DAE: https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726
Cosine / Euclidean / Mahalanobis distances: https://cmry.github.io/notes/euclidean-v-cosine
PCA whitening: http://ufldl.stanford.edu/tutorial/unsupervised/PCAWhitening/
https://medium.com/@chhablani.gunjan/can-auto-encoders-generate-images-a3a16c83cf6a
https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/
https://towardsdatascience.com/demystifying-kl-divergence-7ebe4317ee68
^ Jamboard access has been granted to stefan.dayneko@gmail.com
When starting the lecture, start screen recording using the OBS software (you can choose a window to record with it and mirror it on the wall display in class)
Stream key: r3jz-x9j1-9afp-ud8h-apt9
Before the lecture, test streaming NOT with this key - create your own livestream on YouTube to make sure everything works
This is how to make the OBS output visible on the wall display (left click)
Stream settings
AE - Video: https://youtu.be/WOmH67RU33U
Jamboard: https://jamboard.google.com/d/1AC-5eRS5LV30lN7PmMHE3M7_L4XqemFPgFHMjOb6UZA/edit?usp=sharing
VAE - Video: https://youtube.com/live/qDOP4jZSOf4?feature=share
Jamboard: https://jamboard.google.com/d/1fVbR_NhYagSXEPLDoA9LJxvzFacnS8rEfsHxnggWnL8/edit?usp=sharing
Video (Latvian)
Video (English): https://www.youtube.com/watch?v=Sy5hqyBefv0
Jamboard: https://jamboard.google.com/d/1SYZYgylWaLTf5jUvl9rdYGy1HTckRN_YH8wPAB9Rj0I/edit?usp=sharing
!! Explanation of VQ-VAE (for the second task): https://www.youtube.com/watch?v=VZFVUrYcig0
Implement the encoder part of the AE model using the convolution formula - reduce the dimensions down to z.shape = (B, 32, 1, 1); after flattening, z.shape = (B, 32)
Implement a loss function of your choice
Submit the source code using the template: https://share.yellowrobot.xyz/quick/2024-4-1-09F993AE-D2D9-46B2-BB00-5B69EE8999CC.zip
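A minimal encoder sketch meeting the z.shape requirement (assuming a 28×28 single-channel input, e.g. MNIST; channel counts and layer choices are illustrative, not from the template):

```python
import torch

class Encoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # output size per layer: floor((w_in + 2*p - k) / s) + 1
        self.layers = torch.nn.Sequential(
            torch.nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),    # 28 -> 14
            torch.nn.ReLU(),
            torch.nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),   # 14 -> 7
            torch.nn.ReLU(),
            torch.nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 7 -> 4
            torch.nn.ReLU(),
            torch.nn.Conv2d(32, 32, kernel_size=4, stride=1, padding=0),  # 4 -> 1
        )

    def forward(self, x):
        z = self.layers(x)            # (B, 32, 1, 1)
        return z.view(z.size(0), -1)  # (B, 32) after flattening
```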
Implement the decoder part of the AE model using the transposed convolution formula. Remember also the range within which the output values must lie.
Submit the source code using the template from the previous task.
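A matching decoder sketch (same illustrative channel counts as the encoder sketch above; the final Sigmoid keeps outputs in the [0, 1] pixel range):

```python
import torch

class Decoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # transposed conv output size: (w_in - 1)*s - 2*p + k + output_padding
        self.layers = torch.nn.Sequential(
            torch.nn.ConvTranspose2d(32, 32, kernel_size=4, stride=1, padding=0),                   # 1 -> 4
            torch.nn.ReLU(),
            torch.nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1),                   # 4 -> 7
            torch.nn.ReLU(),
            torch.nn.ConvTranspose2d(16, 8, kernel_size=3, stride=2, padding=1, output_padding=1),  # 7 -> 14
            torch.nn.ReLU(),
            torch.nn.ConvTranspose2d(8, 1, kernel_size=3, stride=2, padding=1, output_padding=1),   # 14 -> 28
            torch.nn.Sigmoid(),  # pixel values must stay in [0, 1]
        )

    def forward(self, z):
        z = z.view(z.size(0), 32, 1, 1)  # unflatten (B, 32) -> (B, 32, 1, 1)
        return self.layers(z)
```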
Implement a DAE. In the dataset part of the program, use torch.rand(shape) or another function of your choice so that each pixel x in the input data is replaced with the value 0 in 50% of cases. For 50% of the input samples, also use the original image without corruption.
Submit the source code using the template from the previous task. Add screenshots of the results.
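The corruption step can be sketched as follows (the function name and the probability parametrization are assumptions, not from the template):

```python
import torch

def corrupt(x, p_sample=0.5, p_pixel=0.5):
    # with probability p_sample, pass the original image through untouched;
    # otherwise zero out each pixel independently with probability p_pixel
    if torch.rand(1).item() < p_sample:
        return x
    mask = (torch.rand(x.shape) > p_pixel).float()
    return x * mask
```

The DAE is then trained to reconstruct the clean x from corrupt(x), so the loss still compares against the uncorrupted image.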
Implement a VAE model based on video instructions. Use the template: http://share.yellowrobot.xyz/quick/2023-11-11-8C7CB032-EAB5-4172-A4D6-DA24D7CAACF1.zip
Implement a VQ-VAE model based on video instructions. Use the template: http://share.yellowrobot.xyz/quick/2023-11-11-625DD308-641E-4183-BAFF-8DDC643AC7F5.zip
Based on the 9.4 code, implement a more modern encoder/decoder version
EncoderBlock: torch.nn.Conv2d(in_channels=, out_channels=, kernel_size=3, stride=1, padding=1), torch.nn.GroupNorm(num_groups=, num_channels=), torch.nn.Mish(), torch.nn.Upsample(size=)
DecoderBlock: torch.nn.ConvTranspose2d(in_channels=, out_channels=, kernel_size=3, stride=1, padding=1), torch.nn.GroupNorm(num_groups=, num_channels=), torch.nn.Mish(), torch.nn.Upsample(size=)
torch.nn.Upsample can also be inserted only every 3 blocks, to increase the parameter count
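Following the layer list above, one such block might look like this (channel counts, group count, and target size are illustrative; num_groups must divide out_channels):

```python
import torch

class EncoderBlock(torch.nn.Module):
    # size = target spatial size after the block; torch.nn.Upsample interpolates
    # both up and down, so the same pattern works in encoder and decoder stacks
    def __init__(self, in_channels, out_channels, size, num_groups=4):
        super().__init__()
        self.layers = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1),
            torch.nn.GroupNorm(num_groups=num_groups, num_channels=out_channels),
            torch.nn.Mish(),
            torch.nn.Upsample(size=size),
        )

    def forward(self, x):
        return self.layers(x)
```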
Add the given examples to the test set and check whether they separate as anomalies in the latent Z space (submit screenshots along with the source code).
The simplest way to do this is to call model.forward at the end of the test iteration with manually selected, pre-prepared samples; "anomaly" can be added to dataset_full.labels so that it is shown visually. If you want to include the anomalies in the dataset, the dataset_full code must be rewritten using torch.utils.data.Subset for train and test (anomalies are included in the test set but not in the train set).
Implement MSE / L1 loss
Implement saving the model weights whenever the best loss value is achieved
Add the given examples to the test set and check whether they separate as anomalies in the latent Z space (submit screenshots along with the source code)
Submit the source code and screenshots
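The best-loss saving requirement above can be sketched as follows (function name and checkpoint path are assumptions):

```python
import torch

best_loss = float('inf')

def save_if_best(model, test_loss, path='model_best.pt'):
    # persist the weights only when this epoch's test loss beats the best so far
    global best_loss
    if test_loss < best_loss:
        best_loss = test_loss
        torch.save(model.state_dict(), path)
        return True
    return False
```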
Using latent vector arithmetic (average value from one class and another class samples), generate new images representing semantic properties. For example, adding or removing a part to the letter.
Tasks:
Choose a sample from the MNIST dataset and generate new samples from a normal (Gaussian) distribution using z_mu and z_sigma. Add the resulting images and the source code to the submission. To obtain the original z_mu and z_sigma, manually select similar samples.
Choose a sample from the BALLS dataset and generate new samples from a normal (Gaussian) distribution using z_mu and z_sigma. Add the resulting images and the source code to the submission. To obtain the original z_mu and z_sigma, manually select similar samples.
Perform Z latent vector arithmetic: add or subtract the latent vectors of different classes. You can also use the average of both classes' vectors. Add the resulting images and the source code to the submission.
Train your models using the hyperparameters given below; add the training loss curves, model weights, and source code to the submission.
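The latent arithmetic task can be sketched with a stand-in model (TinyVAE here is only illustrative; a trained VAE's encode would return z_mu and z_sigma):

```python
import torch

class TinyVAE(torch.nn.Module):
    # stand-in model with linear encoder/decoder, just to show the arithmetic
    def __init__(self, d_in=784, d_z=32):
        super().__init__()
        self.enc = torch.nn.Linear(d_in, d_z)
        self.dec = torch.nn.Linear(d_z, d_in)

    def encode(self, x):
        return self.enc(x)

    def decode(self, z):
        return torch.sigmoid(self.dec(z))  # outputs in [0, 1] like image pixels

def latent_arithmetic(model, x_a, x_b, x_c):
    # average latent per class, then transfer the A -> B direction onto class C,
    # e.g. adding or removing a part of a letter
    z_a = model.encode(x_a).mean(dim=0)
    z_b = model.encode(x_b).mean(dim=0)
    z_c = model.encode(x_c).mean(dim=0)
    return model.decode((z_c + (z_b - z_a)).unsqueeze(0))
```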
Example for tasks 1, 2, 3: http://share.yellowrobot.xyz/quick/2023-11-11-3AA4BE22-93D8-4D1A-B386-29DC57686891.zip
MNIST template: http://share.yellowrobot.xyz/quick/2023-11-11-E767405F-9A7A-4C59-8B2A-74CD83790DB4.zip
BALLS template: http://share.yellowrobot.xyz/quick/2023-11-11-956FC6B2-FC92-400F-8494-095421010EEA.zip
Trained MNIST weights: http://share.yellowrobot.xyz/quick/2023-11-11-BEEF5CBD-1ACA-4B1B-88D7-410DCB73BBB7.zip
BALLS dataset: http://share.yellowrobot.xyz/quick/2023-11-11-9CB72E5A-AFE9-4FA5-9A62-EB5B81D1E33F.zip
Training hyperparameters:
BALLS: mean MSE + mean KL, "batch_size": 32, "learning_rate": 0.001, "vae_beta": 0.0001
MNIST: mean MSE + mean KL, "batch_size": 16, "learning_rate": 0.001, "vae_beta": 0.001
The goal is to get powers of 2 (initially 64)
encoder: 100 => 64 => 4/4/4; decoder: 1×2×2×5×5 => SIGMOID!
dilation = 1 by default
rewrite the equations without the dilation term D
buggy without the floor: (w_in + 2*p - k) / s + 1; the full formula is w_out = floor((w_in + 2*p - d*(k-1) - 1) / s) + 1, which reduces to floor((w_in + 2*p - k) / s) + 1 for d = 1
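The floor can be checked against PyTorch directly (a quick sanity check, not part of the lecture code; for w_in = 8 the un-floored expression would give 4.5):

```python
import torch

def conv_out_size(w_in, k, s, p, d=1):
    # PyTorch's Conv2d output size: floor((w_in + 2p - d*(k-1) - 1) / s) + 1
    return (w_in + 2 * p - d * (k - 1) - 1) // s + 1

# compare with an actual Conv2d on an even input size where the floor matters
conv = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)
w_out = conv(torch.randn(1, 1, 8, 8)).shape[-1]
assert w_out == conv_out_size(8, k=3, s=2, p=1) == 4
```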
Deconvolutions / Transposed convolutions
https://towardsdatascience.com/what-is-transposed-convolutional-layer-40e5e6e31c11
A transposed convolution artificially prepares a larger input by arranging strides (zero insertion) and paddings, then applies an ordinary convolution
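This view can be sanity-checked on the output size (illustrative numbers):

```python
import torch

# a stride-2 transposed convolution effectively inserts s-1 zeros between input
# pixels (the "artificially enlarged input") and then runs a normal convolution
tconv = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1, output_padding=1)
y = tconv(torch.randn(1, 1, 7, 7))
# output size: (w_in - 1)*s - 2*p + k + output_padding = (7-1)*2 - 2 + 3 + 1 = 14
print(y.shape)  # torch.Size([1, 1, 14, 14])
```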
https://distill.pub/2016/deconv-checkerboard/
Checkerboard Artifacts
Main equations
^ Stop Grad means torch.detach()