0

I have the code below, bringing data from a table converting into DataFrame. From that, I include a column with the API url + the zip code of each line. When reading each Json URL, I need to return the latitude and longitude, but I'm only getting this data from the last line, not each of them.

import json
import pandas as pd
import psycopg2
import urllib.request
import requests

con_db = psycopg2.connect(host='', database='', user='', password='')
cursor = con.cursor()
cursor.execute("SELECT * FROM table WHERE district = 'Monjolos' limit 10")
result = cursor.fetchall()
df = pd.DataFrame(result)
df.columns = [columns[0] for columns in cursor.description]

url_api = 'https://cep.awesomeapi.com.br/json/'
df['url_api'] = url_api + df['zip_code']

for index, row in df.iterrows():
    url = url_api + row['url_api']
    open_json = urllib.request.urlopen(url)
    data = json.loads(open_json.read())
    df['coord'] = data['lat'] + ', ' + data['lng']
    
print(df[['id', 'url_api', 'coord']])

Result:

     id                                      url_api                     coord
0  3748  https://cep.awesomeapi.com.br/json/24722120  -22.8147264, -42.9442479
1  3859  https://cep.awesomeapi.com.br/json/24722610  -22.8147264, -42.9442479
2  3860  https://cep.awesomeapi.com.br/json/24722630  -22.8147264, -42.9442479
3  3863  https://cep.awesomeapi.com.br/json/24722670  -22.8147264, -42.9442479
4  3882  https://cep.awesomeapi.com.br/json/24723000  -22.8147264, -42.9442479
5  3883  https://cep.awesomeapi.com.br/json/24723002  -22.8147264, -42.9442479
6  3884  https://cep.awesomeapi.com.br/json/24723003  -22.8147264, -42.9442479
7  3885  https://cep.awesomeapi.com.br/json/24723004  -22.8147264, -42.9442479
8  3886  https://cep.awesomeapi.com.br/json/24723005  -22.8147264, -42.9442479
9  3887  https://cep.awesomeapi.com.br/json/24723006  -22.8147264, -42.9442479

Process finished with exit code 0
Lino Costa
  • 67
  • 1
  • 8

2 Answers2

1

The last line in the for loop assign the value to the entire column on each iteration. Use loc to assign to value to the current index.

df.loc[index, 'coord'] = data['lat'] + ', ' + data['lng']
Antoine M
  • 121
  • 2
1
coord = []
for index, row in df.iterrows():
    url = url_api + row['url_api']
    open_json = urllib.request.urlopen(url)
    data = json.loads(open_json.read())
    coord.append(data['lat'] + ', ' + data['lng'])
df['coord'] = coord
Rinshan Kolayil
  • 1,111
  • 1
  • 9
  • 14
  • Thanks for the support. It also worked with the last two lines. – Lino Costa Sep 20 '21 at 13:59
  • @LinoCosta I hope using list will be faster [`What is the fastest and most efficient way to append rows to a DataFrame?`](https://stackoverflow.com/questions/57000903/what-is-the-fastest-and-most-efficient-way-to-append-rows-to-a-dataframe) – Rinshan Kolayil Sep 20 '21 at 14:04