Ingestion

yaml
type: "io.kestra.plugin.datahub.Ingestion"

DataHub ingestion

Examples

Run DataHub ingestion

yaml
id: datahub_cli
namespace: company.name

tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080

Run DataHub ingestion using local recipe file

yaml
id: datahub_cli
namespace: company.name

tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    recipe: "{{ input('recipe_file') }}"

Properties

recipe

  • Type: object
  • Dynamic:
  • Required: ✔️

The DataHub ingestion recipe.

containerImage

  • Type: string
  • Dynamic: ✔️
  • Required:
  • Default: acryldata/datahub-ingestion:head

The Docker image used to run the DataHub ingestion.
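
The default image uses the moving head tag, so pinning a fixed tag can make runs more reproducible. A minimal sketch; the tag v0.13.0 is a hypothetical placeholder, so check the image registry for tags that actually exist:

```yaml
tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    # Hypothetical tag, shown only to illustrate pinning; verify it exists before use
    containerImage: acryldata/datahub-ingestion:v0.13.0
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080
```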

env

  • Type: object
  • SubType: string
  • Dynamic: ✔️
  • Required:

The environment variables passed to the DataHub ingestion.
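
As an illustration, env could carry a secret that the recipe then references. This sketch assumes that the variables are visible to the ingestion process and that DataHub expands ${...} references inside recipes; the variable name MYSQL_PASSWORD is illustrative:

```yaml
tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    env:
      # Rendered by Kestra, then passed as an environment variable
      MYSQL_PASSWORD: "{{ secret('MYSQL_PASSWORD') }}"
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          # Assumes the DataHub CLI resolves this from the environment
          password: "${MYSQL_PASSWORD}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080
```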

inputFiles

  • Type:
    • object
    • string
  • Dynamic: ✔️
  • Required:

The files to create on the local filesystem. It can be a map or a JSON object.
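
A minimal sketch of the map form, where each key is a path relative to the working directory and each value is the file content. The file name and its content are hypothetical:

```yaml
tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    inputFiles:
      # Hypothetical file created in the working directory before the run
      data/owners.json: |
        {"team": "data-platform"}
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080
```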

namespaceFiles

Inject namespace files.

Inject namespace files into this task. When enabled, all namespace files are loaded into the working directory by default. Use the include or exclude properties to limit which namespace files are injected.
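
For example, the include and exclude filters might be combined as follows. The directory names are hypothetical, and the recipe is the one from the first example:

```yaml
tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    namespaceFiles:
      enabled: true
      include:
        - recipes/**          # hypothetical folder of namespace files to load
      exclude:
        - recipes/drafts/**   # hypothetical subfolder to skip
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080
```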

outputFiles

  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required:

The files from the local filesystem to send to Kestra's internal storage.

Must be a list of glob expressions relative to the current working directory. Examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
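
A sketch of capturing ingestion output, assuming the recipe writes to DataHub's file sink and that the relative path resolves against the task's working directory. The output path is illustrative:

```yaml
tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        # File sink instead of datahub-rest, so metadata lands on disk
        type: file
        config:
          filename: ./output/metadata.json
    outputFiles:
      # Glob relative to the working directory
      - output/*.json
```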

taskRunner

  • Type: TaskRunner
  • Dynamic:
  • Required:
  • Default: {type=io.kestra.plugin.scripts.runner.docker.Docker}

The task runner to use.
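
The default runner can also be declared explicitly, which is useful when it needs extra settings. In this sketch, pullPolicy is assumed to be a property of the Docker task runner; confirm it against the runner's own documentation:

```yaml
tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
      # Assumed Docker runner property; verify before relying on it
      pullPolicy: IF_NOT_PRESENT
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080
```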

Outputs

exitCode

  • Type: integer
  • Required: ✔️
  • Default: 0

The exit code of the entire flow execution.

outputFiles

  • Type: object
  • SubType: string
  • Required:

The output files' URIs in Kestra's internal storage.

vars

  • Type: object
  • Required:

The value extracted from the output of the executed commands.
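
Downstream tasks can read these outputs through Kestra's outputs expression. A minimal sketch that logs the exit code of the task with id cli; the flow id and the report task are illustrative:

```yaml
id: datahub_cli_report
namespace: company.name

tasks:
  - id: cli
    type: io.kestra.plugin.datahub.Ingestion
    recipe:
      source:
        type: mysql
        config:
          host_port: 127.0.0.1:3306
          database: dbname
          username: root
          password: "{{ secret('MYSQL_PASSWORD') }}"
      sink:
        type: datahub-rest
        config:
          server: http://datahub-gms:8080

  # Illustrative follow-up task reading the exitCode output
  - id: report
    type: io.kestra.plugin.core.log.Log
    message: "DataHub ingestion exited with code {{ outputs.cli.exitCode }}"
```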

Definitions

io.kestra.core.models.tasks.NamespaceFiles

Properties

enabled
  • Type: boolean
  • Dynamic:
  • Required:
  • Default: true

Whether to enable namespace files to be loaded into the working directory. If explicitly set to true in a task, it will load all Namespace Files into the task's working directory. Note that this property is by default set to true so that you can specify only the include and exclude properties to filter the files to load without having to explicitly set enabled to true.

exclude
  • Type: array
  • SubType: string
  • Dynamic:
  • Required:

A list of filters to exclude matching glob patterns. This allows you to exclude a subset of the Namespace Files from being downloaded at runtime. You can combine this property together with include to only inject a subset of files that you need into the task's working directory.

include
  • Type: array
  • SubType: string
  • Dynamic:
  • Required:

A list of filters to include only matching glob patterns. This allows you to only load a subset of the Namespace Files into the working directory.

io.kestra.core.models.tasks.runners.TaskRunner

Properties

type
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Validation regExp: \p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*(\.\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*)*
  • Min length: 1
