Pulling by definition

Every data loading (pulling) task is defined using a definition file named haulwork.yaml.

See: layout

The definition file must include the task id, and the file itself must reside in a folder whose path corresponds to that id. Double-check that the two match.

# segment from file .../mydefs/from_arno/ods_arno/school/haulwork.yaml
id: from_arno.ods_arno.school
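The path check can be sketched in a few lines of Python. This is only an illustration and not part of the tool: the helper names are hypothetical, and it assumes that the three folders directly above haulwork.yaml are the three dot-separated parts of the id, as in the segment above.

```python
from pathlib import PurePosixPath


def task_id_from_path(definition_path: str) -> str:
    """Derive the expected task id from the folders above the definition file.

    Assumption: the last three folder names are the three parts of the id.
    """
    parts = PurePosixPath(definition_path).parts
    # parts[-1] is the file name; the three components before it form the id.
    if len(parts) < 4:
        raise ValueError(f"path too short to contain a task id: {definition_path!r}")
    return ".".join(parts[-4:-1])


def id_matches_path(task_id: str, definition_path: str) -> bool:
    """Return True when the id in the file agrees with the folder path."""
    return task_id == task_id_from_path(definition_path)


# Hypothetical relative path, shaped like the segment above.
example_path = "mydefs/from_arno/ods_arno/school/haulwork.yaml"
print(id_matches_path("from_arno.ods_arno.school", example_path))
```

A mismatch (for example, a file copied to a new folder without updating its id) makes id_matches_path return False.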

Task ID

The task id is composed of three parts:

  • route code
  • target schema
  • target table

The parts are separated by a dot (period).

All parts must be lowercase and must not contain special characters or whitespace!
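As a rough illustration, the format rules can be checked with a regular expression. The sketch below is not the tool's own validation. It assumes that "no special characters" means only lowercase ASCII letters, digits and underscores are allowed (underscores appear in the samples in this document), and the function name split_task_id is hypothetical.

```python
import re

# Assumption: a part may contain only lowercase ASCII letters, digits and underscores.
PART = r"[a-z0-9_]+"
TASK_ID_RE = re.compile(rf"{PART}\.{PART}\.{PART}")


def split_task_id(task_id: str) -> tuple[str, str, str]:
    """Validate a task id and return (route code, target schema, target table)."""
    if TASK_ID_RE.fullmatch(task_id) is None:
        raise ValueError(f"malformed task id: {task_id!r}")
    route, schema, table = task_id.split(".")
    return route, schema, table


route, schema, table = split_task_id("from_arno.ods_arno.school")
print(route, schema, table)
```

Ids with uppercase letters, whitespace, or a number of parts other than three raise ValueError.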

Warning

The last two parts together (target schema and target table) must be unique within the target database.

Samples of task_id

Samples with explanations follow.

from_arno.education.dim_school

This id indicates that the data comes from the route with code "from_arno" ("arno" is an education-management system) and is put into schema "education", table "dim_school" (where dim_ stands for dimension).

from_arno.ods_arno.school

The example above shows that the data comes from route "from_arno" and is put into schema "ods_arno", where ods_ stands for "operational data store", indicating that the data maps (almost) 1:1 to the source. This is just a naming convention.

inner.education.dim_school

The example above shows that the data is pulled from our own target database (the id does not say exactly from where; the SQL file serves that purpose) and is put into "education"."dim_school".

By design, you cannot use all three examples for the same target database, because the first and the third both target education.dim_school (see the warning above). It is advisable to use the last two and not the first one.
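The uniqueness rule can be illustrated with a short Python sketch that groups task ids by their last two parts. It is a hypothetical helper for checking a set of ids by hand and does not describe how the tool itself enforces the rule.

```python
from collections import defaultdict


def find_target_collisions(task_ids):
    """Return {(schema, table): [task ids]} for targets claimed by more than one id."""
    by_target = defaultdict(list)
    for task_id in task_ids:
        # Each id is assumed to be well formed: route.schema.table
        _route, schema, table = task_id.split(".")
        by_target[(schema, table)].append(task_id)
    return {target: ids for target, ids in by_target.items() if len(ids) > 1}


samples = [
    "from_arno.education.dim_school",
    "from_arno.ods_arno.school",
    "inner.education.dim_school",
]
for (schema, table), ids in find_target_collisions(samples).items():
    print(f"{schema}.{table} is targeted by: {', '.join(ids)}")
```

For the three samples, the only collision reported is education.dim_school, which is claimed by both the first and the third id.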

Sections

  • id
  • run
  • actions
  • upgrade_nondata

Section "run"

Section "actions"

Section "upgrade_nondata"