
I want to download more than 100k app definitions from an API (JSON). I have a simple script for this:

BASE_PATCH="/media/sf_1/getappid/"

rm -rf ${BASE_PATCH}results

while read -r package <&3; do

            mkdir -p ${BASE_PATCH}results
            curl "https://api.test.com/v2/appid/${package}" -X GET -H "API-KEY: XxXxX-xXxXxXx" -H "Content-Type: application/json" --output ${BASE_PATCH}results/getappid.json

done 3<${BASE_PATCH}appIdId.json

And this works, but it makes one request per loop iteration, which takes a lot of time (hours). So my idea is to do it in parallel, like this (sketched below):

1. Take the first 5 IDs from the list (in the file).
2. Start downloading those 5 JSON files.
3. When they have finished, take the next 5 IDs.
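
Roughly, I imagine something like this in plain bash (an untested sketch; it assumes bash 4+ for mapfile and one ID per line in appIdId.json):

mkdir -p "${BASE_PATCH}results"

# read up to 5 IDs at a time; stop when nothing more was read
while mapfile -t -n 5 batch && ((${#batch[@]})); do
    for package in "${batch[@]}"; do
        # start the downloads for this batch in the background
        curl "https://api.test.com/v2/appid/${package}" \
             -H "API-KEY: XxXxX-xXxXxXx" \
             -H "Content-Type: application/json" \
             --output "${BASE_PATCH}results/${package}.json" &
    done
    wait    # block until all 5 have finished before taking the next 5
done < "${BASE_PATCH}appIdId.json"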

Maybe someone has an idea how to do this better. I want to stick with curl; for now I only need to download, but soon I will probably also need POST, PATCH, or PUT (so the tool needs to support those methods).

kowalski

2 Answers


With GNU Parallel maybe:

parallel -j 5 -a ${BASE_PATCH}appIdId.json curl "http://.../appid/{}" -X ....
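
Spelled out against the script in the question (same placeholder host, key, and paths; a sketch, not tested against the real API):

mkdir -p "${BASE_PATCH}results"

# run 5 curl jobs at a time; {} is replaced with one line of the input file
parallel -j 5 -a "${BASE_PATCH}appIdId.json" \
    curl "https://api.test.com/v2/appid/{}" \
         -H "API-KEY: XxXxX-xXxXxXx" \
         -H "Content-Type: application/json" \
         --output "${BASE_PATCH}results/{}.json"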

You could also add parallel's -X option (context replace, not to be confused with curl's -X) so that each invocation of curl receives as many URLs as fit on a command line, and thereby avoid having to create 100k curl processes. curl would then need a per-URL output option such as --remote-name-all in place of a single --output.

Mark Setchell

GNU Parallel as @Mark suggests, or xargs. See this question: running-programs-in-parallel-using-xargs
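
With GNU xargs that could look roughly like this (same placeholders as in the question; -P 5 runs up to 5 curls at a time, -I{} substitutes one ID per invocation):

# feed one ID per line, replacing {} in the URL and the output path
xargs -P 5 -I{} \
    curl "https://api.test.com/v2/appid/{}" \
         -H "API-KEY: XxXxX-xXxXxXx" \
         -H "Content-Type: application/json" \
         --output "${BASE_PATCH}results/{}.json" \
    < "${BASE_PATCH}appIdId.json"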

Mort