Dockerize

Azure Batch is well suited to on-demand batch processing: it creates and manages a pool of compute nodes (virtual machines), installs the applications you want to run, and schedules jobs to run on those nodes.

Getting Started

Azure Container Registry

  • Go to Container registries and create a container registry whose name starts with the prefix cr
  • Fill in the registry details, for example the name cr-ba-python-dev, then click Create (ACR registry names accept only alphanumeric characters, so treat the hyphenated name on this page as a placeholder)
  • After the registry is created, open the cr-ba-python-dev registry and go to Access keys
  • Enable the Admin user option
  • Save the Login server, Username, and Password values
  • Switch to your local terminal to prepare the Docker files
  • Create your Dockerfile

    Dockerfile
    FROM python:3.9-slim
    
    WORKDIR /app
    
    COPY main.py ./
    
    # main.py writes to the absolute path /output, so create it at the root
    RUN mkdir -p /output
    
    CMD ["python", "./main.py"]
    

    The main Python file that runs in this container

    ./main.py
    print("This is a Main file for testing Dockerize.")
    with open('/output/docker_test.txt', 'w') as f:
        f.write("Write Output file on Dockerize.")
    
  • Test that this Docker image runs locally

    docker build -t python-btch . --no-cache
    docker run --name python-btch -v "${pwd}\output:/output" python-btch
    
  • Push your Docker image to Azure Container Registry

    docker login cr-ba-python-dev.azurecr.io
    docker tag python-btch:latest cr-ba-python-dev.azurecr.io/btch/python-btch:0.0.1-test
    docker push cr-ba-python-dev.azurecr.io/btch/python-btch:0.0.1-test
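
The login, tag, and push steps above follow a fixed pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of the original walkthrough: `image_ref` and `push_commands` are hypothetical helper names, and the registry, repository, and tag values are simply the ones used on this page.

```python
def image_ref(login_server: str, repository: str, tag: str) -> str:
    """Compose a fully qualified image reference: <login server>/<repository>:<tag>."""
    return f"{login_server}/{repository}:{tag}"


def push_commands(local_image: str, login_server: str, repository: str, tag: str) -> list[str]:
    """Return the docker CLI commands that publish a local image to the registry."""
    remote = image_ref(login_server, repository, tag)
    return [
        f"docker login {login_server}",
        f"docker tag {local_image} {remote}",
        f"docker push {remote}",
    ]


if __name__ == "__main__":
    # Values taken from the steps above; replace them with your own registry details.
    for command in push_commands(
        "python-btch:latest",
        "cr-ba-python-dev.azurecr.io",
        "btch/python-btch",
        "0.0.1-test",
    ):
        print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run; piping them into a shell or `subprocess.run` is left as a deliberate next step.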
    

Azure Batch Accounts

  • Go to your Azure Batch account, click Pools, and add a new pool that supports containers
  • Set the Container configuration option to Custom
  • Under Container registries, add the cr-ba-python-dev registry using the values saved from ACR
  • Create the pool with the name btch-pool-cntn
  • Go to Jobs and create a new job named btch-job-cntn in the btch-pool-cntn pool
  • Go to Tasks and create a new task in this job

    • In Image name, enter cr-ba-python-dev.azurecr.io/btch/python-btch:0.0.1-test
    • In Container run options, add the options below

      --rm --workdir /app
      
  • Create an automation script

    Package an image version locally and push it to Azure Container Registry

    @echo off
    set "version=%~1"
    if defined version (
        echo Start package docker image version: %version% ...
        call docker build -t python-test:latest . --no-cache
        call docker tag python-test:latest cr-ba-python-dev.azurecr.io/poc/python-test:%version%
        call docker push cr-ba-python-dev.azurecr.io/poc/python-test:%version%
        call docker rmi cr-ba-python-dev.azurecr.io/poc/python-test:%version%
    
        rem Remove leftover images tagged for the registry and untagged images
        for /f "tokens=1-3" %%c IN ('docker image ls ^| Findstr /r "^cr-ba-python-dev.azurecr.io* ^<none>"') do (
            echo Start remove image: `%%c:%%d` with ID: %%e
            if "%%d" equ "<none>" (
                echo Delete image with id ...
                call docker rmi %%e > nul 2>&1
            ) else (
                echo Delete image with name:tag ...
                call docker rmi "%%c:%%d" > nul 2>&1
            )
        )
    )
    

    Additionally, you can run this task from JSON

    {
      "id": "container-job-10",
      "commandLine": "",
      "containerSettings": {
          "containerRunOptions": "--rm --workdir /app",
          "imageName": "cr-ba-python-dev.azurecr.io/poc/python-test:0.0.8",
          "workingDirectory": "taskWorkingDirectory"
      },
      "userIdentity": {
          "autoUser": {
              "scope": "pool",
              "elevationLevel": "admin"
          }
      }
    }
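
Only the task id and the image tag change between runs, so the task definition above can be generated per version. The following is a minimal sketch using only the standard library: `build_task` is a hypothetical helper, and its fields mirror the JSON above rather than introducing any further Batch settings.

```python
import json


def build_task(task_id: str, image_name: str, run_options: str = "--rm --workdir /app") -> dict:
    """Build a task definition with the same fields as the JSON shown above."""
    return {
        "id": task_id,
        # Left empty, exactly as in the JSON above.
        "commandLine": "",
        "containerSettings": {
            "containerRunOptions": run_options,
            "imageName": image_name,
            "workingDirectory": "taskWorkingDirectory",
        },
        "userIdentity": {
            "autoUser": {
                "scope": "pool",
                "elevationLevel": "admin",
            }
        },
    }


if __name__ == "__main__":
    task = build_task(
        "container-job-10",
        "cr-ba-python-dev.azurecr.io/poc/python-test:0.0.8",
    )
    # Pretty-print so the result can be pasted wherever the JSON form is accepted.
    print(json.dumps(task, indent=2))
```

Keeping the run options as a parameter makes it easy to add mount flags later without touching the rest of the structure.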
    

Note

When mounting a volume on an Azure Batch node, pass either of the following Docker options:

--mount type=bind,source=/datadisks/disk1,target=/data

-v {<volume_id>}:{<path>}
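
Either option ends up inside the task's container run options string. The following is a minimal sketch of composing that string: `bind_mount` and `run_options` are hypothetical helpers, and the paths are the ones from the note above, not values verified against a particular pool.

```python
def bind_mount(source: str, target: str) -> str:
    """Format one bind mount in the --mount syntax shown above."""
    return f"--mount type=bind,source={source},target={target}"


def run_options(workdir: str, mounts: list[tuple[str, str]]) -> str:
    """Join the base options with one --mount flag per (source, target) pair."""
    parts = ["--rm", f"--workdir {workdir}"]
    parts += [bind_mount(source, target) for source, target in mounts]
    return " ".join(parts)


if __name__ == "__main__":
    # Node path /datadisks/disk1 is exposed inside the container as /data.
    print(run_options("/app", [("/datadisks/disk1", "/data")]))
```

For the values shown, the result is `--rm --workdir /app --mount type=bind,source=/datadisks/disk1,target=/data`, which can be pasted into the Container run options field or the `containerRunOptions` JSON property.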

Warning

Azure Data Factory does not currently support running Azure Batch with a Docker container in the Custom Activity.

Run with Mount Volume
