Create
type: "io.kestra.plugin.azure.batch.job.Create"
Create an Azure Batch job with tasks.
Examples
```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
    tasks:
      - id: env
        commands:
          - 'echo t1=$ENV_STRING'
        environments:
          ENV_STRING: "{{ inputs.first }}"
      - id: echo
        commands:
          - 'echo t2={{ inputs.second }} 1>&2'
      - id: for
        commands:
          - 'for i in $(seq 10); do echo t3=$i; done'
      - id: vars
        commands:
          - echo '::{"outputs":{"extract":"'$(cat files/in/in.txt)'"}::'
        resourceFiles:
          - httpUrl: https://unittestkt.blob.core.windows.net/tasks/***?sv=***&se=***&sr=***&sp=***&sig=***
            filePath: files/in/in.txt
      - id: output
        commands:
          - 'mkdir -p outs/child/sub'
          - 'echo 1 > outs/1.txt'
          - 'echo 2 > outs/child/2.txt'
          - 'echo 3 > outs/child/sub/3.txt'
        outputFiles:
          - outs/1.txt
        outputDirs:
          - outs/child
```
Use a container to start the task. The pool must use a `microsoft-azure-batch` publisher.
```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
    tasks:
      - id: echo
        commands:
          - 'python --version'
        containerSettings:
          imageName: python
```
Properties
delete
- Type: boolean
- Dynamic: ❌
- Required: ✔️
- Default:
true
Whether the job should be deleted upon completion.
endpoint
- Type: string
- Dynamic: ✔️
- Required: ✔️
The Batch service endpoint.
job
- Type: Job
- Dynamic: ✔️
- Required: ✔️
The job to create.
poolId
- Type: string
- Dynamic: ✔️
- Required: ✔️
The ID of the pool.
resume
- Type: boolean
- Dynamic: ❌
- Required: ✔️
- Default:
true
Whether to reconnect to the current job if it already exists.
tasks
- Type: array
- SubType: Task
- Dynamic: ❌
- Required: ✔️
The list of tasks to be run.
accessKey
- Type: string
- Dynamic: ❓
- Required: ❌
account
- Type: string
- Dynamic: ❓
- Required: ❌
completionCheckInterval
- Type: string
- Dynamic: ❌
- Required: ❌
- Default:
1 second (PT1S)
- Format:
duration
The frequency with which the task checks whether the job is completed.
maxDuration
- Type: string
- Dynamic: ❌
- Required: ❌
- Format:
duration
The maximum total wait duration.
If null, there is no timeout and the task is delegated to Azure Batch.
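As a sketch, the polling-related properties sit directly on the `Create` task alongside the connection properties; the `PT30S` and `PT1H` durations below are illustrative values, not defaults:

```yaml
- id: create
  type: io.kestra.plugin.azure.batch.job.Create
  endpoint: https://***.francecentral.batch.azure.com
  account: <batch-account>
  accessKey: <access-key>
  poolId: <pool-id>
  completionCheckInterval: PT30S  # poll job status every 30 seconds (illustrative)
  maxDuration: PT1H               # give up if the job is not done within 1 hour (illustrative)
  delete: false                   # keep the job in the Batch account after completion
  resume: true                    # reconnect to the job if it already exists (the default)
  job:
    id: <job-name>
  tasks:
    - id: hello
      commands:
        - 'echo hello'
```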
Outputs
outputFiles
- Type: object
- SubType: string
- Required: ❌
The output files' URIs in Kestra's internal storage.
vars
- Type: object
- Required: ❌
The values from the output of the commands.
Definitions
io.kestra.plugin.azure.batch.models.OutputFileBlobContainerDestination
Properties
containerUrl
- Type: string
- Dynamic: ❓
- Required: ✔️
The URL of the container within Azure Blob Storage to which to upload the file(s).
If not using a managed identity, the URL must include a Shared Access Signature (SAS) granting write permissions to the container.
identityReference
- Type: ComputeNodeIdentityReference
- Dynamic: ❓
- Required: ❌
The reference to the user assigned identity to use to access Azure Blob Storage specified by `containerUrl`.
The identity must have write access to the Azure Blob Storage container.
path
- Type: string
- Dynamic: ✔️
- Required: ❌
The destination blob or virtual directory within the Azure Storage container.
If `filePattern` refers to a specific file (i.e., contains no wildcards), then `path` is the name of the blob to which to upload that file. If `filePattern` contains one or more wildcards (and therefore may match multiple files), then `path` is the name of the blob virtual directory (which is prepended to each blob name) to which to upload the file(s). If omitted, file(s) are uploaded to the root of the container with a blob name matching their file name.
io.kestra.plugin.azure.batch.models.ContainerRegistry
Properties
identityReference
- Type: ComputeNodeIdentityReference
- Dynamic: ✔️
- Required: ❌
The reference to the user assigned identity to use to access the Azure Container Registry instead of username and password.
password
- Type: string
- Dynamic: ✔️
- Required: ❌
The password to log into the registry server.
registryServer
- Type: string
- Dynamic: ✔️
- Required: ❌
The registry server URL.
If omitted, the default is "docker.io".
userName
- Type: string
- Dynamic: ✔️
- Required: ❌
The user name to log into the registry server.
io.kestra.plugin.azure.batch.models.OutputFileUploadOptions
Properties
uploadCondition
- Type: string
- Dynamic: ❌
- Required: ✔️
- Default:
TASK_COMPLETION
- Possible Values:
TASK_SUCCESS
TASK_FAILURE
TASK_COMPLETION
The conditions under which the Task output file or set of files should be uploaded.
io.kestra.plugin.azure.batch.models.ComputeNodeIdentityReference
Properties
resourceId
- Type: string
- Dynamic: ✔️
- Required: ❌
The ARM resource ID of the user assigned identity.
io.kestra.plugin.azure.batch.models.ResourceFile
Properties
autoStorageContainerName
- Type: string
- Dynamic: ✔️
- Required: ❌
The storage container name in the auto storage Account.
The `autoStorageContainerName`, `storageContainerUrl`, and `httpUrl` properties are mutually exclusive; one of them must be specified.
blobPrefix
- Type: string
- Dynamic: ✔️
- Required: ❌
The blob prefix to use when downloading blobs from the Azure Storage container.
Only the blobs whose names begin with the specified prefix will be downloaded. The property is valid only when `autoStorageContainerName` or `storageContainerUrl` is used. This prefix can be a partial file name or a subdirectory. If a prefix is not specified, all the files in the container will be downloaded.
fileMode
- Type: string
- Dynamic: ✔️
- Required: ❌
The file permission mode attribute in octal format.
This property applies only to files being downloaded to Linux Compute Nodes. It will be ignored if it is specified for a `resourceFile` which will be downloaded to a Windows Compute Node. If this property is not specified for a Linux Compute Node, a default value of `0770` is applied to the file.
filePath
- Type: string
- Dynamic: ✔️
- Required: ❌
The location on the Compute Node to which to download the file(s), relative to the Task's working directory.
If the `httpUrl` property is specified, `filePath` is required and describes the path to which the file will be downloaded, including the file name. Otherwise, if the `autoStorageContainerName` or `storageContainerUrl` property is specified, `filePath` is optional and is the directory to download the files to. In the case where `filePath` is used as a directory, any directory structure already associated with the input data will be retained in full and appended to the specified `filePath` directory. The specified relative path cannot break out of the Task's working directory (for example by using `..`).
httpUrl
- Type: string
- Dynamic: ✔️
- Required: ❌
The URL of the file to download.
The `autoStorageContainerName`, `storageContainerUrl`, and `httpUrl` properties are mutually exclusive; one of them must be specified. If the URL points to Azure Blob Storage, it must be readable from compute nodes. There are three ways to get such a URL for a blob in Azure Storage: include a Shared Access Signature (SAS) granting read permissions on the blob, use a managed identity with read permission, or set the ACL for the blob or its container to allow public access.
identityReference
- Type: ComputeNodeIdentityReference
- Dynamic: ✔️
- Required: ❌
The reference to the user assigned identity to use to access Azure Blob Storage specified by `storageContainerUrl` or `httpUrl`.
storageContainerUrl
- Type: string
- Dynamic: ✔️
- Required: ❌
The URL of the blob container within Azure Blob Storage.
The `autoStorageContainerName`, `storageContainerUrl`, and `httpUrl` properties are mutually exclusive; one of them must be specified. This URL must be readable and listable from compute nodes. There are three ways to get such a URL for a container in Azure Storage: include a Shared Access Signature (SAS) granting read and list permissions on the container, use a managed identity with read and list permissions, or set the ACL for the container to allow public access.
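Tying these properties together, here is a hedged sketch of a `resourceFiles` entry that downloads a prefixed set of blobs into a subdirectory of the Task working directory; the storage account, container, SAS token, and prefix are all placeholders:

```yaml
resourceFiles:
  # Download every blob whose name starts with "data/" from the container
  # into the Task working directory under "input/".
  - storageContainerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-token>
    blobPrefix: data/
    filePath: input     # a directory here, since storageContainerUrl is used
    fileMode: "0640"    # octal permissions; applies to Linux Compute Nodes only
```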
io.kestra.plugin.azure.batch.models.TaskContainerSettings
Properties
imageName
- Type: string
- Dynamic: ✔️
- Required: ✔️
The Image to use to create the container in which the Task will run.
This is the full Image reference, as would be specified to `docker pull`. If no tag is provided as part of the Image name, the tag `:latest` is used as a default.
containerRunOptions
- Type: string
- Dynamic: ✔️
- Required: ❌
Additional options to the container create command.
These additional options are supplied as arguments to the `docker create` command, in addition to those controlled by the Batch Service.
registry
- Type: ContainerRegistry
- Dynamic: ❌
- Required: ❌
The private registry which contains the container image.
This setting can be omitted if it was already provided at Pool creation.
workingDirectory
- Type: string
- Dynamic: ❌
- Required: ❌
- Possible Values:
TASK_WORKING_DIRECTORY
CONTAINER_IMAGE_DEFAULT
The location of the container Task working directory.
The default is `taskWorkingDirectory`. Possible values include: `taskWorkingDirectory`, `containerImageDefault`.
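A sketch of `containerSettings` pulling an image from a private registry; the registry server, image name, credentials, and run options below are placeholders, not values from this plugin's defaults:

```yaml
containerSettings:
  imageName: myregistry.azurecr.io/tools/python:3.11
  containerRunOptions: '--memory 2g'        # extra arguments passed to `docker create`
  workingDirectory: TASK_WORKING_DIRECTORY  # or CONTAINER_IMAGE_DEFAULT
  registry:
    registryServer: myregistry.azurecr.io
    userName: <registry-user>
    password: <registry-password>
```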
io.kestra.plugin.azure.batch.models.Task
Properties
commands
- Type: array
- SubType: string
- Dynamic: ✔️
- Required: ✔️
The command line of the Task.
For multi-instance Tasks, the command line is executed as the primary Task, after the primary Task and all subtasks have finished executing the coordination command line. The command line does not run under a shell, and therefore cannot take advantage of shell features such as environment variable expansion. If you want to take advantage of such features, you should invoke the shell in the command line, for example, using `cmd /c MyCommand` on Windows or `/bin/sh -c MyCommand` on Linux. If the command line refers to file paths, it should use a relative path (relative to the Task working directory), or use the Batch provided environment variable.
By default, the command will be passed as `/bin/sh -c "command"`.
id
- Type: string
- Dynamic: ✔️
- Required: ✔️
- Max length:
64
A string that uniquely identifies the Task within the Job.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within a Job that differ only by case). If not provided, a random UUID will be generated.
interpreter
- Type: string
- Dynamic: ❌
- Required: ✔️
- Default:
/bin/sh
- Min length:
1
Interpreter to be used.
constraints
- Type: TaskConstraints
- Dynamic: ❌
- Required: ❌
The execution constraints that apply to this Task.
containerSettings
- Type: TaskContainerSettings
- Dynamic: ✔️
- Required: ❌
The settings for the container under which the Task runs.
If the Pool that will run this Task has `containerConfiguration` set, this must be set as well. If the Pool that will run this Task doesn't have `containerConfiguration` set, this must not be set. When this is specified, all directories recursively below `AZ_BATCH_NODE_ROOT_DIR` (the root of Azure Batch directories on the node) are mapped into the container, all Task environment variables are mapped into the container, and the Task command line is executed in the container. Files produced in the container outside of `AZ_BATCH_NODE_ROOT_DIR` might not be reflected to the host disk, meaning that Batch file APIs will not be able to access those files.
displayName
- Type: string
- Dynamic: ✔️
- Required: ❌
- Max length:
1024
A display name for the Task.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
environments
- Type: object
- SubType: string
- Dynamic: ✔️
- Required: ❌
A list of environment variable settings for the Task.
interpreterArgs
- Type: array
- SubType: string
- Dynamic: ❌
- Required: ❌
- Default:
[-c]
Interpreter args to be used.
outputDirs
- Type: array
- SubType: string
- Dynamic: ❌
- Required: ❌
Output directories list that will be uploaded to the internal storage.
List of keys that will generate temporary directories. In the command, you can use a special variable named `outputDirs.key`. If you add an entry with `["myDir"]`, you can use the special variable in commands such as `echo 1 >> {{ outputDirs.myDir }}/file1.txt` and `echo 2 >> {{ outputDirs.myDir }}/file2.txt`, and both files will be uploaded to the internal storage. Then, you can use them in other tasks using `{{ outputs.taskId.files['myDir/file1.txt'] }}`.
outputFiles
- Type: array
- SubType: string
- Dynamic: ❌
- Required: ❌
Output file list that will be uploaded to the internal storage.
List of keys that will generate temporary files. In the command, you can use a special variable named `outputFiles.key`. If you add an entry with `["first"]`, you can use the special variable `echo 1 >> {{ outputFiles.first }}` in this task, and reference the file in other tasks using `{{ outputs.taskId.outputFiles.first }}`.
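The two mechanisms above can be sketched side by side; `report` and `logs` are arbitrary keys chosen for this example:

```yaml
- id: produce
  commands:
    - 'echo "done" > {{ outputFiles.report }}'       # a single temporary file
    - 'mkdir -p {{ outputDirs.logs }}'
    - 'echo "line" > {{ outputDirs.logs }}/run.log'  # everything under the dir is uploaded
  outputFiles:
    - report
  outputDirs:
    - logs
```

Downstream tasks would then reference `{{ outputs.<createTaskId>.outputFiles.report }}` or `{{ outputs.<createTaskId>.files['logs/run.log'] }}`, where `<createTaskId>` is the id of the `Create` task.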
requiredSlots
- Type: integer
- Dynamic: ❌
- Required: ❌
The number of scheduling slots that the Task requires to run.
The default is 1. A Task can only be scheduled to run on a compute node if the node has enough free scheduling slots available. For multi-instance Tasks, this must be 1.
resourceFiles
- Type: array
- SubType: ResourceFile
- Dynamic: ✔️
- Required: ❌
A list of files that the Batch service will download to the Compute Node before running the command line.
For multi-instance Tasks, the resource files will only be downloaded to the Compute Node on which the primary Task is executed. There is a maximum size for the list of resource files. When the max size is exceeded, the request will fail and the response error code will be RequestEntityTooLarge. If this occurs, the collection of ResourceFiles must be reduced in size. This can be achieved using .zip files, Application Packages, or Docker Containers.
uploadFiles
- Type: array
- SubType: OutputFile
- Dynamic: ✔️
- Required: ❌
A list of files that the Batch service will upload from the Compute Node after running the command line.
For multi-instance Tasks, the files will only be uploaded from the Compute Node on which the primary Task is executed.
io.kestra.plugin.azure.batch.models.OutputFile
Properties
destination
- Type: OutputFileDestination
- Dynamic: ❌
- Required: ✔️
The destination for the output file(s).
uploadOptions
- Type: OutputFileUploadOptions
- Dynamic: ❌
- Required: ✔️
- Default:
{uploadCondition=TASK_COMPLETION}
Additional options for the upload operation, including the conditions under which to perform the upload.
filePattern
- Type: string
- Dynamic: ✔️
- Required: ❌
A pattern indicating which file(s) to upload.
Both relative and absolute paths are supported. Relative paths are relative to the Task working directory. The following wildcards are supported: `*` matches 0 or more characters (for example, pattern `abc*` would match `abc` or `abcdef`), `**` matches any directory, `?` matches any single character, `[abc]` matches one character in the brackets, and `[a-c]` matches one character in the range. Brackets can include a negation to match any character not specified (for example, `[!abc]` matches any character but `a`, `b`, or `c`). If a file name starts with `.` it is ignored by default but may be matched by specifying it explicitly (for example, `*.gif` will not match `.a.gif`, but `.*.gif` will). A simple example: `**\*.txt` matches any file that does not start with `.` and ends with `.txt` in the Task working directory or any subdirectory. If the filename contains a wildcard character, it can be escaped using brackets (for example, `abc[*]` would match a file named `abc*`). Note that both `\` and `/` are treated as directory separators on Windows, but only `/` is on Linux. Environment variables (`%var%` on Windows or `$var` on Linux) are expanded prior to the pattern being applied.
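Putting `filePattern`, `destination`, and `uploadOptions` together, a hedged sketch of an `uploadFiles` entry; the storage account, container, and SAS token are placeholders, and the container URL must carry write permissions:

```yaml
uploadFiles:
  - filePattern: 'results/**/*.csv'   # all CSVs under results/, at any depth
    destination:
      container:
        containerUrl: https://<storage-account>.blob.core.windows.net/<container>?<write-sas>
        path: run-output              # virtual directory prepended to each blob name
    uploadOptions:
      uploadCondition: TASK_SUCCESS   # upload only when the Task exits successfully
```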
io.kestra.plugin.azure.batch.models.Job
Properties
id
- Type: string
- Dynamic: ✔️
- Required: ✔️
- Max length:
64
A string that uniquely identifies the Job within the Account.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within an Account that differ only by case).
displayName
- Type: string
- Dynamic: ✔️
- Required: ❌
- Max length:
1024
The display name for the Job.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
labels
- Type: object
- SubType: string
- Dynamic: ✔️
- Required: ❌
Labels to attach to the created job.
maxParallelTasks
- Type: integer
- Dynamic: ❌
- Required: ❌
The maximum number of tasks that can be executed in parallel for the Job.
The value of `maxParallelTasks` must be -1 or greater than 0, if specified. If not specified, the default value is -1, which means there's no limit to the number of tasks that can be run at once. You can update a job's `maxParallelTasks` after it has been created using the update job API.
priority
- Type: integer
- Dynamic: ❌
- Required: ❌
The priority of the Job.
Priority values can range from -1000 to 1000, with -1000 being the lowest priority and 1000 being the highest priority. The default value is 0.
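A sketch of a `job` block using these properties; the id, labels, and numeric values are illustrative:

```yaml
job:
  id: nightly-batch
  displayName: Nightly batch run
  labels:
    env: production
  maxParallelTasks: 4   # at most 4 tasks at once; -1 (the default) means no limit
  priority: 100         # range -1000 (lowest) to 1000 (highest); default 0
```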
io.kestra.plugin.azure.batch.models.TaskConstraints
Properties
maxTaskRetryCount
- Type: integer
- Dynamic: ❌
- Required: ❌
The maximum number of times the Task may be retried.
The Batch service retries a Task if its exit code is nonzero. Note that this value specifically controls the number of retries for the Task executable due to a nonzero exit code. The Batch service will try the Task once, and may then retry up to this limit. For example, if the maximum retry count is 3, Batch tries the Task up to 4 times (one initial try and 3 retries). If the maximum retry count is 0, the Batch service does not retry the Task after the first attempt. If the maximum retry count is -1, the Batch service retries the Task without limit.
maxWallClockTime
- Type: string
- Dynamic: ❌
- Required: ❌
- Format:
duration
The maximum elapsed time that the Task may run, measured from the time the Task starts.
If the Task does not complete within the time limit, the Batch service terminates it. If this is not specified, there is no time limit on how long the Task may run.
retentionTime
- Type: string
- Dynamic: ❌
- Required: ❌
- Format:
duration
The minimum time to retain the Task directory on the Compute Node where it ran, from the time it completes execution.
After this time, the Batch service may delete the Task directory and all its contents. The default is 7 days, i.e. the Task directory will be retained for 7 days unless the Compute Node is removed or the Job is deleted.
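A sketch of per-Task `constraints` using these properties; durations use the ISO-8601 format, and the values are illustrative:

```yaml
constraints:
  maxTaskRetryCount: 3      # up to 4 attempts total (1 initial try + 3 retries)
  maxWallClockTime: PT15M   # terminate the Task if it runs longer than 15 minutes
  retentionTime: P1D        # keep the Task directory for at least 1 day (default 7 days)
```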
io.kestra.plugin.azure.batch.models.OutputFileDestination
Properties
container
- Type: OutputFileBlobContainerDestination
- Dynamic: ✔️
- Required: ✔️
A location in Azure Blob Storage to which the files are uploaded.