
I'm using this API to get companies data: https://github.com/vkruoso/receita-tools

Here you can see what a record looks like when it comes back (it appears to be a JSON structure): https://www.receitaws.com.br/v1/cnpj/27865757000102

I'm able to download it using the following:

cadastro = os.system("curl -X GET https://www.receitaws.com.br/v1/cnpj/27865757000102")

If I run type(cadastro), it shows class 'int'. I want to turn the response into a dataframe. How can I do that?

aabujamra

1 Answer


os.system returns the command's exit code, not its output, which is why type(cadastro) is int. Use subprocess instead; see "Assign output of os.system to a variable and prevent it from being displayed on the screen".
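To make the difference concrete, here is a minimal sketch that runs a harmless local command instead of curl, so it works without network access. The variable names are only illustrative, and the exit code is platform-dependent (0 conventionally means success).

```python
import os
import subprocess
import sys

# os.system only hands back the exit status; the command's output goes
# straight to the terminal and never reaches Python.
exit_code = os.system("echo hello")

# subprocess.run with stdout=PIPE captures the output instead.
# universal_newlines=True decodes it to str (works on Python 3.5+).
proc = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
captured = proc.stdout

print(type(exit_code).__name__)  # int
print(type(captured).__name__)   # str
```

The same pattern applies to the curl call: whatever the command writes to standard output ends up in proc.stdout rather than on the screen.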

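If you are using Python 3.5+, you should use subprocess.run(). Note that the encoding argument used here requires Python 3.6+; on 3.5, pass universal_newlines=True instead.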

import subprocess
import json
import pandas as pd

proc = subprocess.run(["curl", "-X", "GET",
                       "https://www.receitaws.com.br/v1/cnpj/27865757000102"],
                      stdout=subprocess.PIPE, encoding='utf-8')

cadastro = proc.stdout
df = pd.DataFrame([json.loads(cadastro)])

Otherwise, use subprocess.Popen()

import subprocess
import json
import pandas as pd

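proc = subprocess.Popen(["curl", "-X", "GET",
                         "https://www.receitaws.com.br/v1/cnpj/27865757000102"],
                        stdout=subprocess.PIPE)

# communicate() returns bytes; decode before parsing, since json.loads
# only accepts bytes from Python 3.6 onwards
cadastro, err = proc.communicate()
df = pd.DataFrame([json.loads(cadastro.decode('utf-8'))])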

Or, you can use the Requests library.

import requests
import pandas as pd

response = requests.get("https://www.receitaws.com.br/v1/cnpj/27865757000102")
# response.json() decodes and parses the body; decoding by hand with
# response.encoding fails when the server does not declare a charset
data = response.json()
df = pd.DataFrame([data])
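Wrapping the parsed record in a list gives a one-row DataFrame, and any nested lists inside the record stay packed in a single cell. If that is a problem, pd.json_normalize (available under that name in pandas 1.0+) may help to flatten them. The sketch below uses a made-up record; its field names are hypothetical and are not taken from the actual API response.

```python
import pandas as pd

# Hypothetical record: the keys and values are illustrative only,
# not the real structure returned by the API.
record = {
    "name": "Example Company",
    "status": "OK",
    "activities": [
        {"code": "00.00-0-00", "text": "First activity"},
        {"code": "11.11-1-11", "text": "Second activity"},
    ],
}

# One row per record; the nested list stays inside a single cell.
wide = pd.DataFrame([record])

# One row per nested item, with the chosen top-level fields
# repeated on every row as extra columns.
activities = pd.json_normalize(
    record, record_path="activities", meta=["name", "status"]
)

print(wide.shape)
print(activities.columns.tolist())
```

Which layout is preferable depends on what you want a row to represent: one company, or one nested item per company.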
kaidokuuppa
Tried using subprocess and it didn't work. Requests ran fine, but I will have to spend some time adjusting the data inside the dataframe. – aabujamra Oct 03 '17 at 13:32