How can I parse a YAML file in Python?
10 Answers
The easiest and purest method without relying on C headers is PyYAML (documentation), which can be installed via pip install pyyaml:
#!/usr/bin/env python
import yaml

with open("example.yaml", "r") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)
And that's it. A plain yaml.load() function also exists, but yaml.safe_load() should always be preferred, because it cannot execute arbitrary code embedded in the YAML file. So unless you explicitly need arbitrary object serialization/deserialization, use safe_load.
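To make the difference concrete, here is a minimal sketch (the document string and the variable names are made up for illustration) showing that safe_load refuses a Python-specific tag while still loading ordinary data:

```python
import yaml

# A document carrying a Python-specific tag. An unrestricted loader could
# turn this into a call to os.system; safe_load does not know the tag.
malicious = "!!python/object/apply:os.system ['echo pwned']"

try:
    yaml.safe_load(malicious)
    rejected = False
except yaml.YAMLError as exc:
    # PyYAML raises a ConstructorError, which is a subclass of YAMLError
    rejected = True
    print("safe_load refused the document:", type(exc).__name__)

# Ordinary data is loaded as plain Python types
config = yaml.safe_load("name: demo\nports: [80, 443]")
print(config)
```

Nothing is executed here: the tagged document is rejected before any object is constructed.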
Note that the PyYAML project supports versions up through the YAML 1.1 specification. If YAML 1.2 specification support is needed, see ruamel.yaml as noted in this answer.
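One place where the YAML 1.1 rules are visible in practice is scalar typing. The sketch below (the keys are invented for illustration) shows how PyYAML resolves a few unquoted values; under the YAML 1.2 core schema, yes and no would be plain strings instead:

```python
import yaml

doc = """
enabled: yes
country: no
version: 1.10
quoted: "no"
"""

data = yaml.safe_load(doc)

# YAML 1.1 treats yes/no as booleans, and 1.10 is read as the float 1.1
print(data)
```

Quoting a value, as in the last key, is the usual way to keep it a string regardless of the specification version.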
You could also use oyaml, a drop-in replacement for PyYAML that keeps your YAML file ordered the same way you had it. See the Snyk Advisor page for oyaml here.
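For what it's worth, key order can often be kept with plain PyYAML as well. This is a sketch under two assumptions: Python 3.7+ (where dicts keep insertion order) and PyYAML 5.1+ (where dump functions accept sort_keys):

```python
import yaml

doc = "zebra: 1\napple: 2\nmango: 3\n"

data = yaml.safe_load(doc)
loaded_order = list(data)  # keys in the order they appear in the document

# By default the dumper sorts keys alphabetically ...
dumped_sorted = yaml.safe_dump(data)
# ... unless sorting is switched off
dumped_ordered = yaml.safe_dump(data, sort_keys=False)

print(loaded_order)
print(dumped_ordered)
```

On older Python or PyYAML versions this does not hold, which is the situation oyaml was written for.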

-
I would add that unless you wish to serialize/deserialize arbitrary objects, it is better to use `yaml.safe_load` as it cannot execute arbitrary code from the YAML file. – ternaryOperator Mar 07 '14 at 08:58
-
Yaml yaml = new Yaml(); Object obj = yaml.load("a: 1\nb: 2\nc:\n - aaa\n - bbb"); – MayTheSchwartzBeWithYou Jul 15 '14 at 11:01
-
I like the article by moose: http://martin-thoma.com/configuration-files-in-python/ – SaurabhM Aug 19 '15 at 02:28
-
You may need to install the PyYAML package first with `pip install pyyaml`; see this post for more options: https://stackoverflow.com/questions/14261614/how-do-i-install-the-yaml-package-for-python – Romain Sep 26 '18 at 09:03
-
What's the point of capturing the exception in this example? It's going to print anyway, and it just makes the example more convoluted. – naught101 Jan 22 '19 at 23:05
-
Also, plain `yaml.load` without a Loader argument has been deprecated; it should be called with one, e.g. `yaml.load(input, Loader=yaml.FullLoader)`. `yaml.safe_load(input)` is still OK. https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation – Zuku Feb 17 '20 at 11:50
-
I feel like this should really be a one-liner. Why isn't there just a `yaml.read()` API or the like that returns a dict? – alex Jun 11 '20 at 02:23
-
See [Munch](https://pypi.org/project/munch/), https://stackoverflow.com/questions/52570869/load-yaml-as-nested-objects-instead-of-dictionary-in-python `import yaml; from munch import munchify; f = munchify(yaml.load(…)); print(fo.d.try)` – Hans Ginzel Jun 21 '20 at 20:39
-
stay away from pyyaml. There is no documentation, and nobody seems to be maintaining it; there are over a hundred unresolved issues. – JayEye Jul 13 '21 at 23:10
-
`YAML 1.1 specification` link is dead == https://yaml.org/spec/1.1/ – alper Jul 24 '21 at 22:09
-
@ternaryOperator What do you mean by `serialize/deserialize arbitrary objects`? Is it not safe to use `yaml.safe_load` if I am not using arbitrary objects? – alper Jul 24 '21 at 22:11
-
A drop in replacement for pyyaml, that keeps your yaml file ordered *the same way you had it*, is [oyaml](https://github.com/wimglenn/oyaml). View [synk of oyaml here](https://snyk.io/advisor/python/oyaml) – Brad Parks Aug 27 '21 at 12:24
-
This answer doesn't protect you from the `billion laughs` YAML attack. – L F Mar 17 '22 at 21:56
Read & Write YAML files with Python 2+3 (and unicode)
# -*- coding: utf-8 -*-
import yaml
import io

# Define data
data = {
    'a list': [
        1,
        42,
        3.141,
        1337,
        'help',
        u'€'
    ],
    'a string': 'bla',
    'another dict': {
        'foo': 'bar',
        'key': 'value',
        'the answer': 42
    }
}

# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)

# Read YAML file (same encoding as for writing, so that '€' is read back correctly)
with io.open('data.yaml', 'r', encoding='utf8') as stream:
    data_loaded = yaml.safe_load(stream)

print(data == data_loaded)
Created YAML file
a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
  foo: bar
  key: value
  the answer: 42
Common file endings
.yml and .yaml
Alternatives
- CSV: Super simple format (read & write)
- JSON: Nice for writing human-readable data; VERY commonly used (read & write)
- YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
- pickle: A Python serialization format (read & write) ⚠️ Using pickle with files from 3rd parties poses an uncontrollable arbitrary code execution risk.
- MessagePack (Python package): More compact representation (read & write)
- HDF5 (Python package): Nice for matrices (read & write)
- XML: exists too *sigh* (read & write)
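The "superset of JSON" point can be checked quickly. The following sketch (the sample document is invented) feeds the same JSON text to both parsers; it holds for typical documents like this one, though PyYAML implements YAML 1.1, so unusual JSON (for example numbers such as 1e3) may be typed differently:

```python
import json
import yaml

json_text = '{"name": "demo", "tags": ["a", "b"], "nested": {"ok": true, "n": 1.5}}'

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(json_text)  # JSON syntax is accepted as YAML flow style

print(from_json == from_yaml)
```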
For your application, the following might be important:
- Support by other programming languages
- Reading / writing performance
- Compactness (file size)
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python

-
What encoding does the file have? Are you sure it is utf-8 encoded? – Martin Thoma Aug 08 '19 at 21:27
-
Thanks for the suggestion. My file has utf-8 encoding. I had to change your code line to `io.open(doc_name, 'r', encoding='utf8')` to read the special character. YAML version 0.1.7 – Cloud Cho Aug 08 '19 at 21:53
-
Huh, interesting. I will try to reproduce that tomorrow and will adjust the question if I can. Thank you! – Martin Thoma Aug 09 '19 at 06:18
-
You can use the built-in `open(doc_name, ..., encoding='utf8')` for read and write, without importing `io`. – dexteritas Aug 13 '19 at 09:29
-
You use `import yaml`, but that isn't a built-in module, and you don't specify which package it is. Running `import yaml` on a fresh Python3 install results in `ModuleNotFoundError: No module named 'yaml'` – cowlinator Nov 19 '19 at 00:05
-
I think this is referring to package pyyaml which gets imported as yaml – cbailiss Dec 13 '21 at 17:40
-
Personally I'd add a big disclaimer to pickle if you list it, namely that it poses a significant arbitrary code execution risk and shouldn't be used with untrusted data. [The official Python docs for pickle](https://docs.python.org/3/library/pickle.html) say as much in a big red box: "Warning The pickle module is not secure. Only unpickle data you trust.". – bob May 26 '22 at 16:34
If you have YAML that conforms to the YAML 1.2 specification (released 2009) then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).
If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.
Upgrading @Jon's example is easy:
import ruamel.yaml as yaml

with open("example.yaml") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)
Use safe_load() unless you really have full control over the input, need it (seldom the case) and know what you are doing.
If you are using pathlib Path for manipulating files, you are better off using the new API ruamel.yaml provides:
from ruamel.yaml import YAML
from pathlib import Path
path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)

-
Hello @Anthon. I was using ruamel.yaml but ran into an issue with documents that are not ASCII compliant (`UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 926: ordinal not in range(128)`). I've tried to set yaml.encoding to utf-8, but it didn't work, as the load method in YAML still uses ascii_decode. Is this a bug? – SnwBr Jan 07 '20 at 17:53
First install pyyaml using pip3.
Then import yaml module and load the file into a dictionary called 'my_dict':
import yaml

with open('filename.yaml') as f:
    my_dict = yaml.safe_load(f)
That's all you need. Now the entire yaml file is in 'my_dict' dictionary.
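Strictly speaking, the result is only a dictionary when the top level of the file is a mapping. A short sketch (the helper name describe is made up for this example) shows what safe_load returns for a few different inputs:

```python
import yaml


def describe(text):
    """Return the name of the Python type that safe_load produces for text."""
    return type(yaml.safe_load(text)).__name__


kinds = {
    "mapping": describe("key: value"),
    "sequence": describe("- hello world"),
    "scalar": describe("just a string"),
    "empty": describe(""),
}
print(kinds)
```

If your code expects a dictionary, an isinstance(my_dict, dict) check after loading is a cheap guard.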

-
If your file contains the line "- hello world" it is inappropriate to call the variable my_dict, as it is going to contain a list. If that file contains specific tags (starting with `!!python`) it can also be unsafe (as in complete harddisc wiped clean) to use `yaml.load()`. As that is clearly documented you should have repeated that warning here (in almost all cases `yaml.safe_load()` can be used). – Anthon Aug 23 '18 at 17:11
-
You use `import yaml`, but that isn't a built-in module, and you don't specify which package it is. Running `import yaml` on a fresh Python3 install results in `ModuleNotFoundError: No module named 'yaml'` – cowlinator Nov 19 '19 at 00:08
-
See [Munch](https://pypi.org/project/munch/), https://stackoverflow.com/questions/52570869/load-yaml-as-nested-objects-instead-of-dictionary-in-python `import yaml; from munch import munchify; f = munchify(yaml.load(…)); print(fo.d.try)` – Hans Ginzel Jun 21 '20 at 20:41
To access a nested value in a YAML file like this:
global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd
You can use the following Python script:
import yaml

with open("/some/path/to/yaml.file", 'r') as f:
    valuesYaml = yaml.load(f, Loader=yaml.FullLoader)

print(valuesYaml['global']['dbConnectionString'])
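Chained indexing raises KeyError as soon as one level is missing. If that matters, a small lookup helper can return a default instead. This is only a sketch: dig is a hypothetical helper, not part of PyYAML, and the dictionary literal is an assumed stand-in for what loading the file above would give:

```python
def dig(data, *keys, default=None):
    """Follow keys through nested dicts; return default if any level is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Assumed shape of the loaded file (an empty YAML value loads as None)
values = {
    "global": {
        "registry": {"url": "dtr-:5000/", "repoPath": None},
        "dbConnectionString": "jdbc:oracle:thin:@x.x.x.x:1521:abcd",
    }
}

print(dig(values, "global", "dbConnectionString"))
print(dig(values, "global", "registry", "url"))
print(dig(values, "global", "missing", "key", default="n/a"))
```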

Example:
defaults.yaml
url: https://www.google.com
environment.py
from ruamel import yaml

# A with block closes the file as soon as loading is done
with open('defaults.yaml') as f:
    data = yaml.safe_load(f)
data['url']

-
I thought it is, but is it? related: https://stackoverflow.com/questions/49512990/does-python-gc-close-files-too – J Kluseczka Jan 23 '21 at 23:02
-
@qrtLs It is definitely not safe. You should explicitly close the descriptor every time, and there are reasons for this: https://stackoverflow.com/a/25070939/3338479 – lucidyan Jul 14 '21 at 21:19
I use ruamel.yaml. Details & debate here.
from ruamel import yaml

with open(filename, 'r') as fp:
    read_data = yaml.load(fp)
ruamel.yaml is compatible with existing PyYAML usage (apart from a few easily solved problems), and as stated in the link I provided, use
from ruamel import yaml
instead of
import yaml
and it will fix most of your problems.
EDIT: PyYAML is not dead as it turns out, it's just maintained in a different place.

-
@Oleksander: PyYaml has commits in the last 7 months, and the most recent closed issue was 12 days ago. Can you please define "long dead?" – abalter Mar 20 '18 at 00:18
-
@abalter I apologize, seems that I got the info from their official site or the post right here https://stackoverflow.com/a/36760452/5510526 – Oleksandr Zelentsov Mar 20 '18 at 16:48
-
@OleksandrZelentsov I can see the confusion. There was a loooong period when it was dead. https://github.com/yaml/pyyaml/graphs/contributors. However, their site IS up and shows releases posted AFTER the SO post referring to PyYaml's demise. So it is fair to say that at this point it is still alive, although its direction relative to ruamel is clearly uncertain. ALSO, there was a lengthy discussion here with recent posts. I added a comment, and now mine is the only one. I guess I don't understand how closed issues work. https://github.com/yaml/pyyaml/issues/145 – abalter Mar 20 '18 at 17:52
-
@abalter FWIW, when that answer was posted, there had been a total of 9 commits in the past... just under 7 years. One of those was an automated "fix" of bad grammar. Two involved releasing a barely-changed new version. The rest were relatively tiny tweaks, mostly made _five_ years before the answer. All but the automated fix were done by one person. I wouldn't judge that answer harshly for calling PyYAML "long dead". – Nic Jun 14 '19 at 15:01
I made my own script for this. Feel free to use it, as long as you keep the attribution. The script can parse yaml from a file (function load), parse yaml from a string (function loads) and convert a dictionary into yaml (function dumps). It respects all variable types.
# © didlly AGPL-3.0 License - github.com/didlly
def is_float(string: str) -> bool:
    try:
        float(string)
        return True
    except ValueError:
        return False


def is_integer(string: str) -> bool:
    try:
        int(string)
        return True
    except ValueError:
        return False
def load(path: str) -> dict:
    with open(path, "r") as yaml:
        levels = []
        data = {}
        indentation_str = ""
        for line in yaml.readlines():
            if line.replace(line.lstrip(), "") != "" and indentation_str == "":
                indentation_str = line.replace(line.lstrip(), "").rstrip("\n")
            if line.strip() == "":
                continue
            elif line.rstrip()[-1] == ":":
                key = line.strip()[:-1]
                quoteless = (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                )
                if len(line.replace(line.strip(), "")) // 2 < len(levels):
                    if quoteless:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                    else:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
                else:
                    if quoteless:
                        levels.append(f"[{line.strip()[:-1]}]")
                    else:
                        levels.append(f"['{line.strip()[:-1]}']")
                if quoteless:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                        + " = {}"
                    )
                else:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                        + " = {}"
                    )
                continue
            key = line.split(":")[0].strip()
            value = ":".join(line.split(":")[1:]).strip()
            if (
                is_float(value)
                or is_integer(value)
                or value == "True"
                or value == "False"
                or ("[" in value and "]" in value)
            ):
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                    )
            else:
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                    )
    return data
def loads(yaml: str) -> dict:
    levels = []
    data = {}
    indentation_str = ""
    for line in yaml.split("\n"):
        if line.replace(line.lstrip(), "") != "" and indentation_str == "":
            indentation_str = line.replace(line.lstrip(), "")
        if line.strip() == "":
            continue
        elif line.rstrip()[-1] == ":":
            key = line.strip()[:-1]
            quoteless = (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            )
            if len(line.replace(line.strip(), "")) // 2 < len(levels):
                if quoteless:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                else:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
            else:
                if quoteless:
                    levels.append(f"[{line.strip()[:-1]}]")
                else:
                    levels.append(f"['{line.strip()[:-1]}']")
            if quoteless:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                    + " = {}"
                )
            else:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                    + " = {}"
                )
            continue
        key = line.split(":")[0].strip()
        value = ":".join(line.split(":")[1:]).strip()
        if (
            is_float(value)
            or is_integer(value)
            or value == "True"
            or value == "False"
            or ("[" in value and "]" in value)
        ):
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                )
        else:
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                )
    return data
def dumps(yaml: dict, indent="") -> str:
    """A procedure which converts the dictionary passed to the procedure into its yaml equivalent.

    Args:
        yaml (dict): The dictionary to be converted.
        indent (str): The indentation prefix for the current nesting level.

    Returns:
        data (str): The dictionary in yaml form.
    """
    data = ""
    for key in yaml.keys():
        if type(yaml[key]) == dict:
            data += f"\n{indent}{key}:\n"
            data += dumps(yaml[key], f"{indent} ")
        else:
            data += f"{indent}{key}: {yaml[key]}\n"
    return data


print(load("config.yml"))
Example
config.yml
level 0 value: 0
level 1:
  level 1 value: 1
  level 2:
    level 2 value: 2
level 1 2:
  level 1 2 value: 1 2
  level 2 2:
    level 2 2 value: 2 2
Output
{'level 0 value': 0, 'level 1': {'level 1 value': 1, 'level 2': {'level 2 value': 2}}, 'level 1 2': {'level 1 2 value': '1 2', 'level 2 2': {'level 2 2 value': '2 2'}}}
-
It's so cool! But it does not work with lists like `one:\n - two\n - three ` – Maksym Sivash Jun 28 '22 at 16:14
#!/usr/bin/env python
import sys
import yaml


def main(argv):
    # Usage: python main.py example.yaml
    with open(argv[0]) as stream:
        try:
            print(yaml.safe_load(stream))
            return 0
        except yaml.YAMLError as exc:
            print(exc)
            return 1


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))

-
This code doesn't actually do anything. Did you mean to comment out code? – cowlinator Nov 19 '19 at 00:10
-
I think it's expecting input, i.e. python main.py example.yaml, and maybe print(yaml.safe_load(stream)) for the print? – mirageglobe Dec 20 '21 at 18:30
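A read_yaml_file function that returns all the data as a dictionary:
import yaml


def read_yaml_file(full_path=None, relative_path=None):
    # ProjectPaths is a project-specific helper (not shown here) that
    # returns the project root as a string.
    if relative_path is not None:
        resource_file_location_local = ProjectPaths.get_project_root_path() + relative_path
    else:
        resource_file_location_local = full_path

    with open(resource_file_location_local, 'r') as stream:
        try:
            file_artifacts = yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            print(exc)
            # Nothing was parsed; return an empty dict rather than
            # referencing an unassigned variable below.
            return {}
    return dict(file_artifacts.items())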
