2023-03-22 Deploy API: Manual Docker Deployment Instructions

 

1. Test the full production setup locally and get the code ready

  1. Merge stable branches into master and resolve any issues.

  2. Copy all ./pretrained_models to the correct local locations.

  3. For testing purposes, make sure that your host machine connects to the real (production) database (do not use this setup for development). Put a breakpoint and run this script to make sure the connection works correctly: image-20230322110932981

  4. Make sure that the production database contains ALL necessary DB changes (column modifications from apidev, worker data, etc.). If new features were added, they also need to be added to production. For example: image-20230322115041362

  5. In a separate local CMD window, run python ./worker_feature_coordinator.py

    • To avoid disrupting production workers, the host_name column is used: only workers with the same host_name get grouped and assigned files that are available on that machine (see controllers.controller_database.ControllerDatabase.get_tasks_by_feature; the is_file_available_on_this_machine check is very important).

  6. In another separate local CMD window, run python ./api.py

     

  7. Submit an example task from BPO calls: randomly pick a call longer than 2 minutes with a known speaker ID and an unknown language, and select ALL audio features to test. For example, BPO api_key: 3e995f31-f686-450c-8364-010ca85262fb, http://127.0.0.1:8000/docs#/default/task_submit_task_submit_post (a curl sketch is shown after this list).

    https://dev.pitchpatterns.com/conversation/a3481319-094a-48b6-ab44-baa8d6bc474b Use a specific known user_id and download the audio from the side panel in Dev mode: image-20230322120656448

    When submitting, set an email to receive the final result and copy down the task_uuid: image-20230322121344525

  8. When video models are available, also pick a random video call longer than 2 minutes from the Asya team to process ALL audio + video features. Asya team api_key: AFB6B3E4-6688-471E-AE6C-12C5242B61C5

  9. Run each of the workers one by one in the required order (they will appear in the coordinator and should process the input); the order is determined by the features order_idx: image-20230322132132572

    image-20230322121638903

    If everything is correct, you should be able to run each worker one by one, with the coordinator running in another CMD tab. Run and stop each worker:

    1. worker_audio_denoise.py

    2. worker_audio_diarisation.py

    3. worker_audio_tempo.py

    4. worker_audio_energy.py

    5. worker_audio_emotions.py

    6. worker_audio_language_detection.py

    7. worker_audio_text.py

  10. If a task gets FAILED during testing, you can reset its state in the tasks and features_in_task tables by task_uuid. Fix any bugs and commit them in GIT.

  11. Validate the result in the test JavaScript tool.
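A minimal command-line alternative to the Swagger UI for step 7, sketched below. The request schema is not documented in this manual, so the endpoint path and the payload field names (audio_url, features, email) are assumptions; check the real schema at http://127.0.0.1:8000/docs before using it.

```bash
# Hedged sketch: submit a test task to the locally running api.py.
# The payload field names are placeholders; verify them against the
# FastAPI docs page (http://127.0.0.1:8000/docs) before use.
curl -X POST "http://127.0.0.1:8000/task_submit" \
  -H "Content-Type: application/json" \
  -d '{
        "api_key": "3e995f31-f686-450c-8364-010ca85262fb",
        "audio_url": "https://example.com/call_over_2min.mp3",
        "features": ["audio_denoise", "audio_diarisation", "audio_tempo",
                     "audio_energy", "audio_emotions",
                     "audio_language_detection", "audio_text"],
        "email": "you@example.com"
      }'
# The response should contain the task_uuid; copy it down for later steps.
```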

 


 

2. Building docker images

  1. Before building, create the mamba environments and test locally: mamba env create -f env_common.yaml. Maintain an environment_common.yaml variant for each worker if necessary (some workers might not work with the same environment) and state in the worker startup script which environment needs to be used. For example, environment_audio_energy.yaml is used for audio_energy.

  2. Make sure you correctly update all records in model_versions.json.

  3. Make sure that the environment running docker_builder.py has the basic requirements: pip install docker-py

  4. Run the script python ./docker_builder.py -feature {feature}, replacing {feature} with each of the following (a batch-build sketch is shown after this list):

    1. audio_denoise

    2. audio_diarisation

    3. audio_tempo

    4. audio_energy

    5. audio_emotions

    6. audio_language_detection

    7. audio_text

  5. The resulting tar file will be located in ./dockers/audio_denoise.tar (for example).

  6. 📗 You can also use docker_builder.sh to build multiple docker images at once

  7. ⚠️ This file can be 10 GB+ in size.

  8. Useful commands will be printed on the command line; copy them for later use.
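A minimal sketch of building all feature images in one go, equivalent to using docker_builder.sh; it only assumes the -feature argument shown above:

```bash
# Build one Docker image per feature; each run produces ./dockers/<feature>.tar
for feature in audio_denoise audio_diarisation audio_tempo audio_energy \
               audio_emotions audio_language_detection audio_text; do
    python ./docker_builder.py -feature "$feature"
done
```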

 


3. Deploying docker images

  1. Copy ./dockers/audio_denoise.tar to the target machine.

  2. On the target machine, make sure Docker is installed via apt-get, NOT via snap or any other method. You must use the NVIDIA-enabled version of Docker (NVIDIA Container Toolkit).

    Docker must work without sudo (see Troubleshooting below).

     

  3. The target machine must also have the NVIDIA drivers (nvidia-smi) and CUDA installed, as well as fuse3.

  4. Make sure Docker is running.

    On the target machine, load in the Docker image (see the sketch after this list).

  5. Launch the Docker container. The exact run command is generated by docker_builder.py (see the previous section); a hedged example is also shown after this list.

     

  6. Once it is running, connect to the container (e.g. with docker exec; see the sketch after this list).

  7. Check that the worker appears in features_workers with the correct hostname on api.asya.ai and wait for the coordinator to give a job to the worker: image-20230328163927967

  8. If an existing container is running, stop it as follows:

    1. Go inside and terminate the worker gracefully, so that it removes itself from the API DB. Do not just force-kill it!

    2. From the target machine: docker kill audio_text (audio_text is the container name, not the image name).

    3. To delete the existing container: docker rm audio_text (or another container name; list them with docker container ls).

Other Docker commands are available here: http://share.yellowrobot.xyz/quick/2023-3-28-08441FD2-72B9-4879-8D97-BE4FF565F10E.html
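A hedged sketch of steps 4-6 for a single worker (audio_denoise as an example). Prefer the commands generated by docker_builder.py; the image tag, container name, --gpus flag and restart policy below are assumptions.

```bash
# Load the image that was copied to the target machine (step 4).
docker load -i ./dockers/audio_denoise.tar

# Launch the container (step 5). Assumptions: the image is tagged audio_denoise,
# the container is named after the feature, and GPU access is granted with
# --gpus all (requires the NVIDIA Container Toolkit).
docker run -d --gpus all --name audio_denoise --restart unless-stopped audio_denoise

# Connect to the running container (step 6).
docker exec -it audio_denoise /bin/bash

# Follow the worker's output to confirm it registers in features_workers (step 7).
docker logs -f audio_denoise
```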


4. Balancing how many workers are needed

 


Troubleshooting

  1. If CUDA is not working inside Docker, you can check it using torch (see the sketch below).

  2. If Docker only works as sudo, you can enable it for all users (see the sketch below).
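Minimal sketches for both points above; these are standard commands, and the container name audio_denoise is only an example:

```bash
# 1. Check CUDA inside a running container using torch.
docker exec -it audio_denoise \
    python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"

# 2. Allow running Docker without sudo (re-login or run newgrp afterwards).
sudo groupadd docker            # group usually exists already
sudo usermod -aG docker "$USER"
newgrp docker
docker run hello-world          # should now work without sudo
```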

     

 


 

Dockerfile explanations

Only copy models that are required by the controllers; do not include huggingface or torch hub/cloud models, as they will be downloaded when the workers start.

Only include in the COPY step the files that are needed: {copy_models}

SSH keys only work with chmod 600: RUN chmod 600 /app/keys/oracle-4-p100.key

 

RUN - shell commands, executed in sequence. It is better to combine as many as possible into one instruction so that layer caching can be used effectively and the image stays small.

 

COPY - copies files and directories into the target image, executed in sequence. ⚠️ Very important to use together with .dockerignore, because without it Docker will include all files under the Dockerfile's root directory in the build context, regardless of the COPY commands.

 

CMD - the startup shell command.

There can only be one CMD instruction in a Dockerfile; if more than one CMD is listed, only the last one takes effect.

 

Clean up as much as you can in the Docker image to reduce its size (see the sketch below).
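A sketch of the kind of cleanup that is typically chained into a single RUN instruction near the end of the build; which package managers apply depends on the worker image, so treat these as assumptions:

```bash
# Typical cleanup chained into one RUN instruction to keep the layer small.
apt-get clean && rm -rf /var/lib/apt/lists/* \
    && mamba clean --all --yes \
    && pip cache purge \
    && rm -rf /root/.cache /tmp/*
```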

 

Modifiable parameters

These can be passed at run time with -e (see the sketch below).
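A sketch of overriding such parameters at run time with -e. The variable names HOST_NAME and API_URL are placeholders; use whatever variables the worker startup script actually reads:

```bash
# Pass modifiable parameters as environment variables when starting the container.
# HOST_NAME and API_URL are placeholder names, not confirmed by this manual.
docker run -d --gpus all --name audio_denoise \
    -e HOST_NAME="oracle-4-p100" \
    -e API_URL="https://api.asya.ai" \
    audio_denoise
```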

Size and performance

It is important to have a .dockerignore, otherwise Docker will copy ALL files from the directory into the build context (see the example below).
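An example .dockerignore, written here as a shell heredoc; the entries are assumptions and should be adapted to the repository (make sure nothing that {copy_models} needs to COPY is excluded):

```bash
# Example .dockerignore (entries are assumptions; adapt to the repository).
# Do not exclude model files that {copy_models} actually needs to COPY.
cat > .dockerignore <<'EOF'
.git
.idea/
__pycache__/
*.pyc
dockers/
*.tar
EOF
```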