Here's my JSON file:

(base) -bash-4.2$ cat busco_genome/busco_output/run_basidiomycota_odb10/short_summary.json
{"one_line_summary": "C:82.7%[S:82.7%,D:0.0%],F:4.8%,M:12.5%,n:1764", "C": 1459, "S": 1459, "D": 0, "F": 84, "M": 221, "input_file": "/local/ifs3_scratch/CORE/jespinoz/VEBA/veba_output/binning/eukaryotic/S005_R2_POST-PE-N728-S516-1_S23/intermediate/3__metaeuk/genomes/S005_R2_POST-PE-N728-S516-1_S23__METABAT2__E.1__bin.10.fa", "mode": "genome", "gene_predictor": "metaeuk", "dataset": "/usr/local/scratch/CORE/jespinoz/db/busco/odb10/lineages/basidiomycota_odb10", "dataset_creation_date": "2020-09-10", "dataset_number_species": "133", "dataset_total_buscos": "1764"}

For some reason, it fails when I try to read it with pandas:

>>> pd.read_json("busco_genome/busco_output/run_basidiomycota_odb10/short_summary.json")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 207, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/io/json/_json.py", line 614, in read_json
    return json_reader.read()
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/io/json/_json.py", line 748, in read
    obj = self._get_object_parser(self.data)
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/io/json/_json.py", line 770, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/io/json/_json.py", line 885, in parse
    self._parse_no_numpy()
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/io/json/_json.py", line 1139, in _parse_no_numpy
    self.obj = DataFrame(
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 614, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 464, in dict_to_mgr
    return arrays_to_mgr(
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 119, in arrays_to_mgr
    index = _extract_index(arrays)
  File "/usr/local/devel/ANNOTATION/jespinoz/anaconda3/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 625, in _extract_index
    raise ValueError("If using all scalar values, you must pass an index")
ValueError: If using all scalar values, you must pass an index

  • I don't know your expected output, but maybe pass orient='index'? Like: `pd.read_json("busco_genome/busco_output/run_basidiomycota_odb10/short_summary.json", orient='index')` – anky Mar 09 '22 at 18:27
  • Can you try `df = pd.read_json("./sample_json.json", typ='series')`? – Kabilan Mohanraj Mar 09 '22 at 18:31

1 Answer

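Voted to close because this is a duplicate, as pointed out by @Kabilan-Mohanraj, who also answered the question in the comments. The file is a single flat JSON object whose values are all scalars, so the default typ='frame' tries to build a DataFrame from scalars without an index and raises the ValueError. All you need to do is pass typ='series' (typ, not type):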

>>> pd.read_json("busco_genome/busco_output/short_summary.generic.eukaryota_odb10.busco_output.json", typ="series")
one_line_summary                C:87.9%[S:87.5%,D:0.4%],F:7.1%,M:5.0%,n:255
C                                                                       224
S                                                                       223
D                                                                         1
F                                                                        18
M                                                                        13
input_file                /local/ifs3_scratch/CORE/jespinoz/VEBA/veba_ou...
mode                                                                 genome
gene_predictor                                                      metaeuk
dataset                   /usr/local/scratch/CORE/jespinoz/db/busco/odb1...
dataset_creation_date                                            2020-09-10
dataset_number_species                                                   70
dataset_total_buscos                                                    255
dtype: object
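
For anyone who wants to reproduce this without a BUSCO run, here is a minimal sketch. The JSON string is a trimmed-down stand-in for short_summary.json (values copied from the question, most keys dropped), and the names `text`, `summary`, `record` and `one_row` are only for illustration. It shows the default call failing, the typ="series" fix, and an alternative using the standard json module if you would rather end up with a one-row DataFrame.

```python
import io
import json

import pandas as pd

# Trimmed-down stand-in for BUSCO's short_summary.json (values from the question).
text = (
    '{"one_line_summary": "C:82.7%[S:82.7%,D:0.0%],F:4.8%,M:12.5%,n:1764", '
    '"C": 1459, "S": 1459, "D": 0, "F": 84, "M": 221, "mode": "genome"}'
)

# The default typ="frame" treats each key as a column, but every value is a
# scalar, so pandas has no index to build rows from and raises ValueError.
try:
    pd.read_json(io.StringIO(text))
    frame_error = None
except ValueError as exc:
    frame_error = str(exc)

# Fix from the answer: read the flat object as a Series (keys become the index).
summary = pd.read_json(io.StringIO(text), typ="series")

# Alternative: parse with the json module and wrap the dict in a list,
# which gives a DataFrame with exactly one row.
record = json.loads(text)
one_row = pd.DataFrame([record])

print(frame_error)
print(summary)
print(one_row)
```

With a real file, replace `io.StringIO(text)` with the path to the JSON file; the Series form is convenient for a single summary, while the one-row DataFrame form is easier to concatenate across many bins.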