2025-03-31 FaradAI Dataset V0

1. Summary
2. Labelling methodology
3. Examples
3.1. Military Watercraft: Ship
3.2. Military Watercraft: Submarine
3.3. Military Vehicle: APC
3.4. Military Vehicle: Tank
3.5. Military Vehicle: Multiple launch rocket system
3.6. Military aircraft: fighter jet
3.7. Military aircraft: helicopter
3.8. Military aircraft: Drones and drone footage
3.9. Infrared footage
4. Further work
Total unfiltered videos: 184162, 3.4 TB (more are still being downloaded)
Number of filtered videos: 10141, 150 GB (about 5% usable so far)
Containing people: 3183 (~30% of the filtered videos)
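The two percentages in the summary can be recomputed from the reported counts. The snippet below is only illustrative arithmetic; the variable names are not part of the dataset tooling.

```python
# Counts as reported in the summary.
total_videos = 184162
filtered_videos = 10141
with_people = 3183

# Share of downloaded videos that survived filtering.
usable_share = filtered_videos / total_videos
# Share of the filtered videos in which people were detected.
people_share = with_people / filtered_videos

print(f"usable: {usable_share:.1%}")       # usable: 5.5%
print(f"with people: {people_share:.1%}")  # with people: 31.4%
```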
Extract 3 frames (beginning 10%, middle, end 10%)
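A minimal sketch of how the three sampling positions could be chosen, assuming that "beginning 10%, middle, end 10%" means the frames at 10%, 50% and 90% of the video length. That reading, and the function name `sample_frame_indices`, are assumptions. Decoding the frames themselves (for example with OpenCV or ffmpeg) is not shown.

```python
def sample_frame_indices(frame_count: int) -> list[int]:
    """Return the frame indices at 10%, 50% and 90% of a video."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    # Assumed interpretation of "beginning 10%, middle, end 10%".
    percentages = (10, 50, 90)
    last_index = frame_count - 1
    # Integer arithmetic avoids floating-point rounding surprises.
    return [min(last_index, frame_count * p // 100) for p in percentages]


print(sample_frame_indices(1000))  # [100, 500, 900]
```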
Military object detector based on YOLO11, trained on
Datasets:
Classes:
0: person
1: military vehicle
2: civilian vehicle
3: military aircraft
4: civilian aircraft
5: military watercraft
6: civilian watercraft
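The class list can be expressed as a lookup table and used to summarize the detections of the frames sampled from one video. This is a sketch only: the input format (one list of class ids per frame) and the `has_military` flag are assumptions, since the report does not describe the detector output or the exact rule for keeping a video.

```python
# Class ids as listed in the report.
CLASS_NAMES = {
    0: "person",
    1: "military vehicle",
    2: "civilian vehicle",
    3: "military aircraft",
    4: "civilian aircraft",
    5: "military watercraft",
    6: "civilian watercraft",
}


def summarize_video(frame_detections):
    """Summarize detections given one list of class ids per sampled frame."""
    seen = {class_id for frame in frame_detections for class_id in frame}
    return {
        "classes": sorted(CLASS_NAMES[c] for c in seen),
        "has_person": 0 in seen,
        # Hypothetical flag: true if any military class was detected.
        "has_military": any(CLASS_NAMES[c].startswith("military") for c in seen),
    }


# Hypothetical detections for the 3 sampled frames of one video.
example = [[1, 0], [1], []]
print(summarize_video(example))
```

A `has_person` flag of this kind would be enough to reproduce a count such as "Containing people", assuming that count is derived from class 0 detections.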
vit-gpt2-image-captioning -> Zero-shot image captioning
LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned -> Predefined text labels to match images (https://share.yellowrobot.xyz/quick/2025-4-24-4CD8B1B3-8572-409B-ACB9-5E8A4644C92F.zip)
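Matching predefined text labels to an image is typically done by comparing embeddings: the image and each label are embedded into the same vector space, and the label with the highest cosine similarity wins. The sketch below shows only that comparison step with made-up three-dimensional vectors. In practice the vectors would come from the LLM2CLIP model, which is not loaded here, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_label(image_vec, label_vecs):
    """Return the label whose embedding is closest to the image embedding."""
    return max(label_vecs, key=lambda name: cosine(image_vec, label_vecs[name]))


# Toy embeddings, for illustration only (not real model output).
label_vecs = {
    "tank": [0.9, 0.1, 0.0],
    "helicopter": [0.1, 0.9, 0.2],
    "ship": [0.0, 0.2, 0.9],
}
image_vec = [0.8, 0.2, 0.1]
print(best_label(image_vec, label_vecs))  # tank
```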
Keyword filtering using manually crafted algorithm
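The report does not describe the keyword algorithm itself. As a rough illustration of the general idea, a caption can be tokenized and checked against a keyword set. The keyword list and the function name below are hypothetical and are not the rules actually used.

```python
import re

# Hypothetical keyword set; the real list is not given in the report.
MILITARY_KEYWORDS = frozenset({"tank", "helicopter", "submarine", "drone", "jet"})


def caption_matches(caption, keywords=MILITARY_KEYWORDS):
    """Return the sorted keywords that occur as whole words in a caption."""
    tokens = set(re.findall(r"[a-z]+", caption.lower()))
    return sorted(tokens & keywords)


print(caption_matches("A tank driving down a dirt road"))  # ['tank']
print(caption_matches("A man holding a phone"))            # []
```

Matching on whole tokens rather than substrings avoids false hits such as "jet" inside "jetty".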
Manual filtering by a human labeller
Some examples from the dataset. Each video comes with 3 labelled frames, a defaced (blurred) version of the video, and a JSON label file. The samples below were selected at random from the dataset.
apwagner_7846_0943136f-b229-42f7-8ce9-ee6ed778691a.mp4
boris_rozhin_92803_42b7975a-6784-4690-aff8-3b267a777445.MP4
boris_rozhin_141267_4a5d4536-6352-4bda-9205-e1347f6a500c.mp4
boris_rozhin_48823_9f7938c6-3856-4366-80cd-9e085f2b0198.mp4
ab3army_780_49507586-cb0d-4437-8d6f-82da4f3c7c3e.MOV
apwagner_21098_624d67a3-70c4-4658-a371-77ce6710334d.mp4
boris_rozhin_33060_1d15c89b-07d9-4d1b-81af-141bedaf599c_defaced.mp4
wargonzo_21991_7161e01a-ee8b-43ef-a078-836fc3c7ff13_defaced.mp4
RVvoenkor_82008_01aee5e5-fd03-41f3-9c95-310b00db2cd7.mp4
RVvoenkor_18612_0c0966a3-c0d8-4e81-a24c-cfd0c89fd975.mp4
apwagner_10044_c0796b28-39cc-45c9-972e-3af40987985c.mp4
apwagner_16988_be589a28-c472-4684-ab3e-576b6a7b4276.mp4
boris_rozhin_34403_de178228-2da9-4fbb-bc06-7c27da3a0e29.mp4
boris_rozhin_32147_f4928272-3568-4fee-9bf3-aa7181d88bab.mp4
boris_rozhin_35111_27b7f0bd-2f84-45d5-b47a-34703afd6f55.mp4
boris_rozhin_34808_08ce9d86-8187-46e5-9edd-806d7464b919.mp4
apwagner_31792_529b364b-cda7-4d72-8ae0-aa48a77c1a7d_defaced.mp4
ab3army_3123_af378ed7-7663-45af-88a6-4e2f2dff84a4.MP4
boris_rozhin_92736_0997a867-c9ec-42b7-bf4f-8e7761ba502d.MP4
apwagner_10471_9ac97b63-d006-45f5-9431-feb58418f424.mp4
boris_rozhin_123417_b1d66471-f4d5-44f9-8e15-0ca05b88623a.mp4
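The sample file names above follow a visible pattern: a name, a number, a UUID, an optional `_defaced` suffix, and an extension. A small parser for that pattern might look as follows. The field names `source` and `number` are guesses at what those parts mean, and the function itself is hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SampleName(NamedTuple):
    source: str      # leading name part (meaning assumed)
    number: int      # numeric part (meaning assumed)
    uuid: str
    defaced: bool    # True if the name carries the "_defaced" suffix
    extension: str   # lower-cased file extension


_PATTERN = re.compile(
    r"^(?P<source>.+)_(?P<number>\d+)_"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"(?P<defaced>_defaced)?\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_sample_name(filename: str) -> Optional[SampleName]:
    """Split a sample file name into its parts, or return None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return SampleName(
        source=match["source"],
        number=int(match["number"]),
        uuid=match["uuid"],
        defaced=match["defaced"] is not None,
        extension=match["ext"].lower(),
    )


print(parse_sample_name("boris_rozhin_33060_1d15c89b-07d9-4d1b-81af-141bedaf599c_defaced.mp4"))
```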
Data is still being scraped from the military conflicts in Ukraine and the Middle East. Dataset quality is being improved through manual filtering and by testing the image-captioning capabilities of the multimodal models Gemma 3, Qwen 2.5 VL, and Llama 4. There are also plans to test, in place of LLM2CLIP, other embedding-based matching models that do not block military content, such as JinaCLIP, CLIPBert, and others.