0

I have a imported a .json list with my data, turning it into a list of dictionaries.

The keys are column headers. For a given value of a certain key I would like to get to the value of another key in the same dictionary.

data = [

   {
   "Nr.": 2,
   "Table data": "S - Sulfur",
   "Ref.": 571,
   "Formula": "S",
   "Name": "Sulfur",
 },
 {
   "Nr.": 3,
   "Table data": "HF - Hydrogen Fluoride",
   "Ref.": 556,
   "Formula": "HF",
   "Name": "Hydrogen Fluoride",
 },
 {
   "Nr.": 4,
   "Table data": "N2 - Nitrogen",
   "Ref.": 5,
   "Formula": "N2",
   "Name": "Nitrogen",
 },
]

so for example given a user input of "Nitrogen" I would like to retrieve the value of the key "Formula" i.e. "N2.

I would convert the list of dictionaries into a dataframe and then use the lookup method.

import pandas

data = pandas.read_json("file.json")

df = pandas.DataFrame(data,
                      columns=('Nr.', 'Table data','Ref','Formula','Name'))

df['Formula'] = df.lookup(df.index,df['Nitrogen'])

however matching a string with a value in this dataframe seems to be giving a KeyError every time.

大陸北方網友
  • 3,696
  • 3
  • 12
  • 37
  • `df['Nitrogen']` means select column called `Nitrogen`, if not exist returned KeyError. – jezrael Feb 23 '21 at 08:03
  • What is expected output after `df['Formula'] = df.lookup(df.index,df['Nitrogen'])` ? – jezrael Feb 23 '21 at 08:04
  • If you want to look up a key for a given value in a dict, you could consider inverting the dict for that purpose; or making an inverted copy of that dict. For inverting a dict, see e.g. https://stackoverflow.com/questions/483666/reverse-invert-a-dictionary-mapping . – 9769953 Feb 23 '21 at 08:06
  • 1
    I think you are looking for the following line of code `df[df['Name'] == 'Nitrogen']` – arhr Feb 23 '21 at 08:11
  • The actual data list is much longer. The expectation would be, given a list of names to get the list of matching formulas. – andreas2010 Feb 23 '21 at 08:29

2 Answers2

0

You can use str.match on the specific column in the dataframe, e.g.,

s= input()
df["Table data"].str.match(f".*{s}.*")
supercooler8
  • 503
  • 2
  • 7
0

You could use isin if you want to match the column values against multiple elements as:

df = pandas.DataFrame(data,
                      columns=('Nr.', 'Table data','Ref','Formula','Name'))

inp_list = ['Nitrogen', 'Sulfur']
out = df[df.Name.isin(inp_list)]
print(out['Formula'])

Output:

0     S
2    N2

Also, note that DataFrame.lookup is deprecated and you should use DataFrame.melt and DataFrame.loc instead.

Krishna Chaurasia
  • 8,924
  • 6
  • 22
  • 35
  • Thanks this works fine. Any idea why it also includes the value of the first column '"Nr." ? – andreas2010 Feb 23 '21 at 08:59
  • it seems it is a default behavior to print the index when a `dataframe` is dumped using print. However, you can skip the index using `out['Formula'].to_string(index=False)` – Krishna Chaurasia Feb 23 '21 at 09:06