I am trying to convert YAML to JSON. Before converting, I need to check whether the incoming file is YAML or not (this check is mandatory).

I found the following piece of code in "Is there a way to determine whether a file is in YAML or JSON format?":

import re
from pathlib import Path

# Find all commas which are standalone:
#  - not between quotes (comments, answers)
#  - not between brackets (lists)
commas = re.compile(r',(?=(?![\"]*[\s\w\?\.\"\!\-\_]*,))(?=(?![^\[]*\]))')


def detect_format(file_path):
    """Guess the format: any standalone comma is treated as a sign of JSON."""
    signs = commas.findall(Path(file_path).read_text())
    return "json" if signs else "yaml"


file_path = Path("example_file.cfg")
print(detect_format(file_path))

But my input file is not named like this:

example_file.cfg

My input will be either example.yaml or example.json.

So I need this kind of check without relying on example_file.cfg.

I would be thankful for any help.

  • Why don't you just check the file extension of your input? – mck Feb 03 '21 at 13:08
  • If the file extension is not reliable in your case, the accepted answer for the question you linked is what I would do: try to parse it as JSON and YAML, and catch the exceptions. – naicolas Feb 03 '21 at 13:14
  • If you can use the extension, why try to detect anything? Besides, this regex is overcomplicated. A JSON document can only start with `[` or `{`. All you need to do is check if the first non-space character is `[` or `{`. You don't even need to read the entire text – Panagiotis Kanavos Feb 03 '21 at 13:16
  • Every valid JSON file is also a valid YAML 1.2 file [[1]](https://yaml.org/spec/1.2/spec.html#id2759572). Why don't you just load the file as YAML and output it as JSON? – Jasmijn Feb 03 '21 at 13:47
  • @Jasmijn Thanks for the suggestion, but we cannot load every file as a YAML file. We need to offer the customer both file options, YAML or JSON, but if the customer provides a YAML file we need to convert it to JSON internally and then proceed. That's why we need a check before proceeding: it should determine whether the incoming file is YAML and, if so, convert it to JSON first. – Diksha Yadgire Feb 03 '21 at 13:49
  • I guess I don't see the problem. `json.dumps(yaml.safe_load(string_containing_json_or_yaml))` will always produce valid JSON if the string contains valid JSON or YAML. – Jasmijn Feb 04 '21 at 10:54
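
The first-character check suggested in the comments could be sketched roughly as below. This is only a minimal sketch, not a validator: the function name `looks_like_json` is hypothetical, and it assumes the file is UTF-8 text whose top-level JSON value is an object or an array. YAML flow-style documents can also begin with `{` or `[`, so a positive result only means the file may be JSON.

```python
def looks_like_json(file_name):
    """Return True if the first non-whitespace character is '{' or '['.

    Hypothetical helper: reads the file in small chunks so that large
    files do not have to be loaded completely.
    """
    with open(file_name, encoding="utf-8") as f:
        while True:
            chunk = f.read(1024)
            if not chunk:
                # Empty or whitespace-only file: not JSON.
                return False
            stripped = chunk.lstrip()
            if stripped:
                return stripped[0] in "{["
```

A cheap check like this is probably best used as a first filter, with an actual parse attempt as the final word.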

1 Answer

Something like the below should work (assuming the file can only be JSON or YAML):

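import json


def yaml_or_json(file_name):
    """Return 'json' if the file parses as JSON, otherwise assume 'yaml'."""
    with open(file_name) as f:
        try:
            json.load(f)
            return 'json'
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            return 'yaml'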
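
To try this out, here is a small self-contained sketch that restates the function (catching `ValueError`, which covers `json.JSONDecodeError`) and runs it against two throwaway files. The file names and sample contents are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def yaml_or_json(file_name):
    # Anything that fails to parse as JSON is assumed to be YAML.
    with open(file_name) as f:
        try:
            json.load(f)
            return 'json'
        except ValueError:
            return 'yaml'


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample inputs, one per format.
    json_file = Path(tmp) / "example.json"
    yaml_file = Path(tmp) / "example.yaml"
    json_file.write_text('{"name": "demo", "ports": [80, 443]}')
    yaml_file.write_text("name: demo\nports:\n  - 80\n  - 443\n")

    results = {p.name: yaml_or_json(p) for p in (json_file, yaml_file)}

print(results)  # {'example.json': 'json', 'example.yaml': 'yaml'}
```

Once a file is classified as YAML, the conversion itself could be done with a YAML parser, for example `json.dumps(yaml.safe_load(text))` if PyYAML is installed (an assumption, since the answer itself only uses the standard library).
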
balderman