Create a YAML config file for Python

Should I use YAML config files for my data science project?

Storing your variables in a single config file saves time when you need to make changes, and it improves your code quality in two key ways: it makes your code more readable for others, and it reduces the chance of bugs caused by mistyped variable names.

How do I create a config file for my data science project in Python?

One of the easiest and most common ways to create a config file is to write a YAML file which stores the values and then read it into your Python script. Here I will demonstrate a simple example of how this can be done.

First we will create our YAML file, config.yaml, to hold the variables:

project_name: "My Project"
variable_list: [1,2,3]
boolean: True

Then in order to use these variables in our Python file, run.py, we will do the following:

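import yaml  # provided by the PyYAML package

# Load the YAML file into a Python dictionary, closing the file afterwards
with open("config.yaml", "r") as f:
    config_file = yaml.safe_load(f)

# Look up a value by its key
project_name = config_file.get("project_name")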
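
To make it clearer what you get back after loading, here is a minimal sketch that parses the same three settings from an inline string instead of from config.yaml (an assumption made only so the snippet runs on its own). The names `config_text` and `missing`, and the key `not_in_file`, are illustrative and not part of the original example.

```python
import yaml

# Same content as config.yaml, inlined so the example is self-contained
config_text = """
project_name: "My Project"
variable_list: [1,2,3]
boolean: True
"""

# safe_load returns a plain dictionary for a top-level YAML mapping
config = yaml.safe_load(config_text)

project_name = config.get("project_name")    # a Python str
variable_list = config.get("variable_list")  # a Python list of ints
boolean = config.get("boolean")              # a Python bool

# .get returns None for a missing key unless a default is supplied
missing = config.get("not_in_file", "fallback")
```

Because the values arrive already converted to Python types, they can be used directly without any extra parsing.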

Creating more complex Python config files

This method can of course be expanded to use more complex YAML data structures and multiple Python files, further improving your code quality and speed of development.
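
As a rough sketch of what that might look like, the example below loads a nested config through a small helper function. Everything here is hypothetical: the `load_config` helper, the `paths` and `model` sections, and their values are made up for illustration, and the file is written to a temporary directory only so that the snippet is runnable on its own.

```python
import tempfile
from pathlib import Path

import yaml

# Hypothetical nested config; the keys below are illustrative only
NESTED_YAML = """
project_name: "My Project"
paths:
  raw_data: "data/raw.csv"
  output_dir: "output"
model:
  features: ["age", "income"]
  params:
    learning_rate: 0.1
    n_estimators: 100
"""


def load_config(path):
    """Read a YAML file and return its contents as a dictionary."""
    with open(path, "r") as f:
        return yaml.safe_load(f)


# Write the example to a temporary file so the sketch runs on its own
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "config.yaml"
    config_path.write_text(NESTED_YAML)
    config = load_config(config_path)

# Nested YAML mappings become nested dictionaries
learning_rate = config["model"]["params"]["learning_rate"]
features = config["model"]["features"]
raw_data = config["paths"]["raw_data"]
```

A helper like this could live in its own module and be imported by each script that needs the settings, so that every file reads from the same source.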




Stephen Allwright


I'm a Data Scientist currently working for Oda, an online grocery retailer, in Oslo, Norway. These posts are my way of sharing some of the tips and tricks I've picked up along the way.