3

I want to split the following nested dictionary by language and create a new JSON file (or dictionary) for each language.

Afterwards I would like to merge them back together.

Grateful for any suggestions on how to continue!

Example:

{
  "All": {
    "label_es_ES": "Todo",
    "label_it_IT": "Tutto",
    "label_en_EN": "All", 
    "label_fr_FR": "Tout"
  },
  "Searchprofile": {
    "label_es_ES": "Perfil de búsqueda",
    "label_it_IT": "Profilo di ricerca",
    "label_en_EN": "Search profile", 
    "label_fr_FR": "Profil de recherche"
  }
}

What I got so far:

import json

with open('translations.json') as json_file:
    data = json.load(json_file)

with open('test.txt', 'w') as store_file:
    for label, translations in data.items():
        for key in translations:
            if key == 'label_en_EN':
                json.dump(???, store_file)
  • Can you elaborate on how you want the individual JSON files formatted? Judging by the input you provided, a new JSON file for each language would contain just a single key-value pair, i.e. `"label_es_ES": "Perfil de búsqueda"` – Kyle Dixon Oct 22 '21 at 15:09
  • Thank you for your quick reply!! The format of the single dictionaries should stay the same, while only containing one language: { "All": { "label_es_ES": "Todo", }, "Searchprofile": { "label_es_ES": "Perfil de búsqueda", }, – heroldonkey Oct 22 '21 at 15:10

2 Answers

2

Going through your dictionary with a loop:

from pprint import pprint

data = {
  "All": {
    "label_es_ES": "Todo",
    "label_it_IT": "Tutto",
    "label_en_EN": "All", 
    "label_fr_FR": "Tout"
  },
  "Searchprofile": {
    "label_es_ES": "Perfil de búsqueda",
    "label_it_IT": "Profilo di ricerca",
    "label_en_EN": "Search profile", 
    "label_fr_FR": "Profil de recherche"
  }
}
# Regroup the translations: outer key = label, inner key = word
new_data = {}
for word, transl_dict in data.items():
    for lbl, transl in transl_dict.items():
        if lbl not in new_data:
            new_data[lbl] = {}
        new_data[lbl][word] = transl

pprint(new_data)

Output:

{'label_en_EN': {'All': 'All', 'Searchprofile': 'Search profile'},
 'label_es_ES': {'All': 'Todo', 'Searchprofile': 'Perfil de búsqueda'},
 'label_fr_FR': {'All': 'Tout', 'Searchprofile': 'Profil de recherche'},
 'label_it_IT': {'All': 'Tutto', 'Searchprofile': 'Profilo di ricerca'}}

You can of course dump the label_... dictionaries to files individually.
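As a minimal sketch of that dumping step, the loop below writes one file per label. The temporary output directory and the `written` list are assumptions made for illustration only; in practice you would pick your own target folder.

```python
import json
import os
import tempfile

# Same shape as new_data above: one inner dictionary per label.
new_data = {
    "label_en_EN": {"All": "All", "Searchprofile": "Search profile"},
    "label_es_ES": {"All": "Todo", "Searchprofile": "Perfil de búsqueda"},
}

# Assumption: a throwaway directory, so the sketch leaves no files behind
# in the working directory. Replace with your own output folder.
out_dir = tempfile.mkdtemp()

written = []  # hypothetical helper list, only used to keep track of the paths
for label, translations in new_data.items():
    path = os.path.join(out_dir, f"{label}.json")
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps characters such as "ú" readable in the file
        json.dump(translations, f, indent=4, ensure_ascii=False)
    written.append(path)
```

Each file then holds the flat `{word: translation}` mapping for a single language.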

Edit: to output your original expected dictionaries it's even shorter if you already know which labels there are:

import json

labels = ["label_es_ES", "label_it_IT", "label_en_EN", "label_fr_FR"]
for label in labels:
    # Keep the original nesting, but only with the current label
    label_dict = {x: {label: data[x][label]} for x in data}
    pprint(label_dict)
    # or dump directly to files:
    with open(f"{label}.json", "w", encoding="utf-8") as f:
        json.dump(label_dict, f, indent=4, ensure_ascii=False)

The JSON files are written as UTF-8 with `ensure_ascii=False`, so special characters appear as-is in the file instead of as `\uXXXX` escapes. Don't forget to specify the encoding (utf-8) when opening the files later!
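To illustrate the difference, here is a small sketch comparing the default behaviour of `json.dumps` with `ensure_ascii=False`. The sample dictionary is made up for the demonstration.

```python
import json

# Hypothetical sample entry containing a non-ASCII character
text = {"label_es_ES": "Sinónimos"}

escaped = json.dumps(text)                       # default: ensure_ascii=True
readable = json.dumps(text, ensure_ascii=False)  # keeps "ó" as-is

print(escaped)   # {"label_es_ES": "Sin\u00f3nimos"}
print(readable)  # {"label_es_ES": "Sinónimos"}

# Both forms are valid JSON and decode to the same Python object
assert json.loads(escaped) == json.loads(readable) == text
```

Either form loads back correctly; the choice only affects how readable the file is in a text editor.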

Tranbi
  • I've just read your comment and it seems it's not exactly what you wanted. Doesn't it make more sense though? It avoids the repetition of label_... as a key – Tranbi Oct 22 '21 at 15:20
  • I should've clarified, the repetition is wanted in that case. Each JSON file, .txt file or whatever should contain one single language and the format { "All": { "label_es_ES": "Todo", }, "Searchprofile": { "label_es_ES": "Perfil de búsqueda", } and so forth. Thank you very much, I'm looking at your proposal right now, maybe I can adapt it – heroldonkey Oct 22 '21 at 15:22
  • Kind of stuck. In any case the label "label_es_ES" should be repeated, as I want to merge all the files again and end up with the same format. – heroldonkey Oct 22 '21 at 15:29
  • This looks really close, THANKS! But how can I create a new file for each language? Also, the apostrophes of e.g. Spanish seem to be substituted with Unicode (?) placeholders. – heroldonkey Oct 22 '21 at 15:46
  • It should already dump the files to `label_.json` (I added the dump part a minute later, so you might have missed it). Can you specify which character causes the problem? – Tranbi Oct 22 '21 at 15:56
  • I edited my answer so that the JSON is now written as UTF-8. Special characters should now be read without issue – Tranbi Oct 22 '21 at 16:06
  • Oh, I indeed missed it by a minute…works perfectly!! For example: "Sinónimos" is dumped as "Sin\u00f3nimos" in the JSON file. – heroldonkey Oct 22 '21 at 16:06
  • Last question: how would you merge it again? To the starting format? – heroldonkey Oct 22 '21 at 16:09
  • Take a look at [this](https://stackoverflow.com/questions/38987/how-do-i-merge-two-dictionaries-in-a-single-expression-taking-union-of-dictiona). It should be straightforward with `update` once you have loaded your dictionaries. If you have difficulties doing so, please open a new question – Tranbi Oct 22 '21 at 16:14
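A rough sketch of such a merge, assuming each per-language dictionary has the nested format requested in the question. The function name `merge_translations` and the sample dictionaries are hypothetical; in practice each dictionary would come from `json.load` on a file opened with `encoding="utf-8"`. Note that `update` is applied to the inner dictionaries, because a top-level `update` would replace each word's translations rather than combine them.

```python
def merge_translations(per_language_dicts):
    """Combine dictionaries shaped like {word: {label: translation}}."""
    merged = {}
    for lang_dict in per_language_dicts:
        for word, translations in lang_dict.items():
            # Create the inner dict on first sight, then add this language
            merged.setdefault(word, {}).update(translations)
    return merged


# Hypothetical stand-ins for dictionaries loaded from the per-language files
es = {
    "All": {"label_es_ES": "Todo"},
    "Searchprofile": {"label_es_ES": "Perfil de búsqueda"},
}
en = {
    "All": {"label_en_EN": "All"},
    "Searchprofile": {"label_en_EN": "Search profile"},
}

merged = merge_translations([es, en])
print(merged["All"])  # {'label_es_ES': 'Todo', 'label_en_EN': 'All'}
```

The result has the same shape as the starting dictionary, with one inner key per language that was merged in.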
1
from itertools import islice

def chunks(data, SIZE=10000):
    # Yield successive sub-dictionaries with at most SIZE keys each
    it = iter(data)
    for i in range(0, len(data), SIZE):
        yield {k: data[k] for k in islice(it, SIZE)}

for item in chunks({i: i for i in range(10)}, 3):
    print(item)
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Oct 22 '21 at 18:53