I use kedro catalog create to boost my productivity by automatically generating yaml catalog entries for me. It will create new yaml files for each pipeline, fill in missing catalog entries, and respect already existing catalog entries.

👆 Unsure what kedro is? Check out this post.

kedro catalog create --pipeline history_nodes
Kedro respects your CONF_ROOT setting when it creates a new catalog file or looks for existing catalog files. You can change the location of your configuration files by editing the CONF_ROOT variable in your project's settings.py.

# settings.py
from pathlib import Path

# default settings
CONF_ROOT = "conf"

# I like to package my configuration
CONF_ROOT = str(Path(__file__).parent / "conf")
I prefer to keep my configuration packaged inside of my project. This is partly due to how my team operates and deploys pipelines.
The kedro catalog create command will look for a yaml file based on the name of the pipeline (CONF_ROOT/catalog/<pipeline-name>.yml). If it does not find one, it creates it.

⚠️ It will not look in all of your existing catalog files for entries, only in the exact file for your pipeline.
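As a rough illustration of that lookup, the sketch below builds the path described above. The helper name catalog_file_for is hypothetical and not part of kedro; it only mirrors the CONF_ROOT/catalog/<pipeline-name>.yml layout mentioned in this post, and the CONF_ROOT value is assumed to be the default.

```python
from pathlib import Path

# Assumed value; matches the default shown in settings.py.
CONF_ROOT = "conf"


def catalog_file_for(pipeline_name: str, conf_root: str = CONF_ROOT) -> Path:
    """Return the catalog file for one pipeline (hypothetical helper, not kedro API)."""
    return Path(conf_root) / "catalog" / f"{pipeline_name}.yml"


path = catalog_file_for("history_nodes")
print(path.as_posix())  # conf/catalog/history_nodes.yml
```

Only this one file is considered for the pipeline, which is why entries living in other catalog files are not picked up.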
When you run kedro catalog create you get MemoryDataSet, that's it. As of 0.17.4 it's hard-coded into the library and not configurable.

range12:
  type: MemoryDataSet
Here the entry is changed to pandas.CSVDataSet so that the file gets stored and we can pick up and read the file without re-running the whole pipeline.

range12:
  type: pandas.CSVDataSet
  filepath: data/range12.csv
Now the pipeline gains a new dataset, range13.

kedro catalog create --pipeline history_nodes
The command left the range12 entry alone and created range13 for us.

range12:
  type: pandas.CSVDataSet
  filepath: data/range12.csv
range13:
  type: MemoryDataSet
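The behavior shown above (existing entries kept, missing ones filled in with MemoryDataSet) can be mimicked with plain dictionaries. This is only a sketch of the idea, not kedro's implementation; the function name add_missing_entries is made up for illustration.

```python
def add_missing_entries(catalog: dict, dataset_names: list) -> dict:
    """Add a MemoryDataSet entry for each name not already in the catalog."""
    merged = dict(catalog)  # copy so the caller's dict is untouched
    for name in dataset_names:
        # setdefault leaves an existing entry exactly as it was
        merged.setdefault(name, {"type": "MemoryDataSet"})
    # Return the entries sorted by key, like the written file
    return dict(sorted(merged.items()))


existing = {"range12": {"type": "pandas.CSVDataSet", "filepath": "data/range12.csv"}}
updated = add_missing_entries(existing, ["range12", "range13"])
print(updated)
```

Here range12 keeps its pandas.CSVDataSet settings, and only range13 receives the default entry.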
kedro catalog create reformats the file as it writes it, so empty lines will be removed.

range12:
  type: pandas.CSVDataSet
range13:
  type: MemoryDataSet
range12:
  type: pandas.CSVDataSet
  filepath: data/range12.csv
range121:
  type: MemoryDataSet
range13:
  type: MemoryDataSet
Notice that range121 comes before range13. This is all based on how Python's yaml.safe_dump works; kedro has set default_flow_style to False. You can see where they write your file in the source code currently here.
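The ordering is plain string comparison rather than numeric: yaml.safe_dump sorts mapping keys by default, and "range121" sorts before "range13" because strings are compared character by character ("2" comes before "3" at the seventh character). The sketch below reproduces that ordering with Python's built-in sorted instead of calling PyYAML, and contrasts it with a numeric sort that kedro does not use.

```python
keys = ["range13", "range121", "range12"]

# String (lexicographic) order, the same order the catalog keys end up in
ordered = sorted(keys)
print(ordered)  # ['range12', 'range121', 'range13']

# For contrast only: ordering by the numeric suffix gives a different result
numeric = sorted(keys, key=lambda k: int(k[len("range"):]))
print(numeric)  # ['range12', 'range13', 'range121']
```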