Pandas: extract information

Question

This is my pandas DataFrame:

       istat    cap               Comune
0       1001  10011               AGLIE'
1       1002  10060              AIRASCA
2       1003  10070         ALA DI STURA

I want to reproduce the equivalent of this SQL query:

Select cap
from DataFrame
where Comune = 'AIRASCA'

The expected result is:

cap
10060

I tried to achieve this with dataframe.loc, but I cannot retrieve what I need.

And this is my Python code:

import pandas as pd
from lxml import etree
from pykml import parser

def to_upper(l):
    return l.upper()


kml_file_path = '../Source/Kml_Regions/Lombardia.kml'
excel_file_path = '../Source/Milk_Coverage/Milk_Milan_Coverage.xlsx'
zip_file_path = '../Source/ZipCodes/italy_cap.csv'

# Read zipcode csv
zips = pd.read_csv(zip_file_path)
zip_df = pd.DataFrame(zips, columns=['cap',  'Comune']).set_index('Comune')
zips_dict = zips.apply(lambda x: x.astype(str).str.upper())

# Read excel file for coverage
df = pd.ExcelFile(excel_file_path).parse('Comuni')
x = df['City'].tolist()
cities = list(map(to_upper, x))

#-----------------------------------------------------------------------------------------------#
# Check uncovered
# parse the input file into an object tree
with open(kml_file_path) as f:
  tree = parser.parse(f)

# get a reference to the "Document.Folder" node
uncovered = tree.getroot().Document.Folder

# iterate through all "Document.Folder.Placemark" nodes and find and remove all nodes
# which contain child node "name" with content "ZONE"

for pm in uncovered.Placemark:
    if pm.name in cities:
        parent = pm.getparent()
        parent.remove(pm)

# convert the object tree into a string and write it into an output file
with open('../Output/Uncovered_Milkman_LO.kml', 'w') as output:
    output.write(etree.tostring(uncovered, pretty_print=True))

#---------------------------------------------------------------------------------------------#

# Check covered
with open(kml_file_path) as f:
    tree = parser.parse(f)

covered = tree.getroot().Document.Folder

for pmC in covered.Placemark:
    if pmC.name not in cities:
        parentCovered = pmC.getparent()
        parentCovered.remove(pmC)

# convert the object tree into a string and write it into an output file
with open('../Output/Covered_Milkman_LO.kml', 'w') as outputs:
    outputs.write(etree.tostring(covered, pretty_print=True))

# Writing CAP
with open('../Output/Covered_Milkman_LO.kml', 'r') as f:
    in_file = f.readlines()  # in_file is now a list of lines

# Now we start building our output
out_file = []
cap = ''

#for line in in_file:
#    out_file.append(line)  # copy each line, one by one

# Iterate over the covered placemarks and look up each city name in the zip-code frame
for city in covered.Placemark:
    print zips_dict.loc[city.name, ['Comune']]

I cannot understand the errors Python is giving me. What am I doing wrong? In pandas, I should be able to look up a key by finding a value, is that correct?

I think this is not the same as the possible duplicate question, because I'm asking how to retrieve a single value instead of a column.

Brian · Accepted Answer · 2018-08-20T13:49:17.937

3

ksooklall's answer should work fine, but (unless I'm misremembering) using back-to-back brackets in pandas is a bit of a faux pas: it's somewhat slower than using loc, which can matter on larger DataFrames with many calls.

Using loc like this should work just fine:

 df.loc[df['Comune'] == 'AIRASCA', 'cap']
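
Note that the expression above returns a Series (one entry per matching row), not a bare value. A minimal sketch of pulling out the single value, assuming the sample data from the question and that at most one row matches:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "istat": [1001, 1002, 1003],
    "cap": [10011, 10060, 10070],
    "Comune": ["AGLIE'", "AIRASCA", "ALA DI STURA"],
})

# Boolean mask on the rows, column label for the column: the result is a Series
matches = df.loc[df["Comune"] == "AIRASCA", "cap"]

# Take the first match, or None when no row matched
cap = matches.iloc[0] if not matches.empty else None
print(cap)
```

If a name could match several rows, it is probably safer to keep the whole Series than to take only the first element.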

edited Aug 20 '18 at 13:49

answered Aug 20 '18 at 13:34

Brian


Fantastic! This seems to be working perfectly! – xCloudx8 Aug 20 '18 at 13:37

score 2 · Answer 2 · answered Aug 20 '18 at 13:30

2

Try this:

cap = df[df['Comune'] == 'AIRASCA']['cap']
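
For reading values, this chained form and a single loc call select the same data. A small sketch to check that on the sample data from the question (the variable names here are only illustrative):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "istat": [1001, 1002, 1003],
    "cap": [10011, 10060, 10070],
    "Comune": ["AGLIE'", "AIRASCA", "ALA DI STURA"],
})

mask = df["Comune"] == "AIRASCA"

chained = df[mask]["cap"]     # two indexing steps: filter the rows, then pick the column
single = df.loc[mask, "cap"]  # one indexing step: rows and column together

# Compare the two results element by element
print(chained.equals(single))
```

The chained form builds an intermediate DataFrame before selecting the column, which is the extra work that loc avoids.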

answered Aug 20 '18 at 13:30

Kenan


Rakesh · Answer 3 · 2018-08-20T13:34:11.720

2

You can use eq:

Example:

import pandas as pd

df = pd.DataFrame({"istat": [1001, 1002, 1003], "cap": [10011, 10060, 10070 ], "Comune": ['AGLIE', 'AIRASCA', 'ALA DI STURA']})
print( df.loc[df["Comune"].eq('AIRASCA'), "cap"] )

Output:

1    10060
Name: cap, dtype: int64
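
If the lookup has to be repeated for many names, as in the loop in the question, one option is to index by Comune once and look names up by label. This is only a sketch: it assumes every Comune value is unique, that each name is already a plain str, and lookup_cap is a hypothetical helper name, not something from pandas.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "istat": [1001, 1002, 1003],
    "cap": [10011, 10060, 10070],
    "Comune": ["AGLIE'", "AIRASCA", "ALA DI STURA"],
})

# Assumption: each Comune appears only once, so it can serve as a unique index
cap_by_comune = df.set_index("Comune")["cap"]


def lookup_cap(name):
    """Return the cap for a Comune name, or None if the name is unknown."""
    try:
        return cap_by_comune.loc[name]
    except KeyError:
        return None


print(lookup_cap("AIRASCA"))
print(lookup_cap("MILANO"))
```

Label-based loc only works when the names are in the index; with the default integer index, passing a city name as the row label raises a KeyError.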

edited Aug 20 '18 at 13:34

answered Aug 20 '18 at 13:32

Rakesh

