2025-03-31 FaradAI Dataset V0

 

1. Summary

Total unfiltered videos: 184162, 3.4 TB (still downloading more)

Number of filtered videos: 10141, 150 GB (~5% usable so far)

Filtered videos containing people: 3183 (~30%)

 


 

 

2. Labelling methodology

  1. Extract 3 frames per video (from the first 10%, the middle, and the last 10% of the video)

  2. Military object detector based on YOLO11, trained on:

    1. Datasets:

      1. https://data.mendeley.com/datasets/njdjkbxdpn/1

      2. https://universe.roboflow.com/zeki-8vreq/vehicle-a5vis/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true

      3. https://universe.roboflow.com/lee1/hj-hnab7/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true

      4. https://universe.roboflow.com/dk-uun3s/danger-fxawo

      5. https://universe.roboflow.com/ships-kev8a/military-ship-detection/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true

      6. https://universe.roboflow.com/hanif-noer-r/military-ships/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true

      7. https://universe.roboflow.com/seaobjects-2rxjz/8_final

      8. https://universe.roboflow.com/ttest/bb-ajv15

      9. https://universe.roboflow.com/object-detection-1y7d0/military-aircraft-iaddd/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true

      10. https://universe.roboflow.com/aiden-lee-roboflow/aircraft-detection-thesis/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true

      11. https://universe.roboflow.com/planes-zmdv1/aircraft-classification-2/browse?queryText=class:commercial&pageSize=50&startingIndex=0&browseQuery=true

    2. Classes:

      0: person

      1: military vehicle

      2: civilian vehicle

      3: military aircraft

      4: civilian aircraft

      5: military watercraft

      6: civilian watercraft

  3. vit-gpt2-image-captioning -> Zero-shot image captioning

  4. LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned -> Predefined text labels to match images (https://share.yellowrobot.xyz/quick/2025-4-24-4CD8B1B3-8572-409B-ACB9-5E8A4644C92F.zip)

  5. Keyword filtering using a manually crafted algorithm

  6. Manual filtering using a manual labeller
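
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code. It assumes the three frames are taken at 10%, 50% and 90% of the video length, and that the source datasets provide YOLO-style text labels ("class cx cy w h") whose class ids have to be remapped onto the 7 unified classes. The names sample_frame_indices, remap_label_line and the example mapping are hypothetical.

```python
# Sketch only: sampling positions and helper names are assumptions,
# not taken from the actual labelling pipeline.

# Unified class ids, exactly as listed in step 2.
CLASS_NAMES = {
    0: "person",
    1: "military vehicle",
    2: "civilian vehicle",
    3: "military aircraft",
    4: "civilian aircraft",
    5: "military watercraft",
    6: "civilian watercraft",
}

# Assumed sampling positions, as percentages of the video length.
SAMPLE_PERCENTS = (10, 50, 90)


def sample_frame_indices(frame_count, percents=SAMPLE_PERCENTS):
    """Return one frame index per percentage, clamped to the valid range."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    return [min(frame_count - 1, frame_count * p // 100) for p in percents]


def remap_label_line(line, source_to_unified):
    """Rewrite the class id of one YOLO label line ("cls cx cy w h").

    Returns None when the source class has no unified counterpart,
    so the caller can drop that annotation.
    """
    parts = line.split()
    unified = source_to_unified.get(int(parts[0]))
    if unified is None:
        return None
    return " ".join([str(unified)] + parts[1:])


if __name__ == "__main__":
    # A 300-frame video is sampled at frames 30, 150 and 270.
    print(sample_frame_indices(300))
    # Hypothetical source dataset where class 2 means "tank".
    print(remap_label_line("2 0.5 0.5 0.2 0.1", {2: 1}))
```

Each source dataset would need its own source_to_unified mapping, since the datasets listed in step 2 do not share a class list.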

 

3. Examples

Some examples from the dataset. Each video comes with 3 labelled frames, a defaced (blurred) version of the video, and a JSON label file. Below are randomly selected samples from the dataset.

3.1. Military Watercraft: Ship

apwagner_7846_0943136f-b229-42f7-8ce9-ee6ed778691a.mp4