
I have a Python script that takes a URL as a command-line argument:

from urllib.parse import urlparse
import sys
import asyncio

from wapitiCore.main.wapiti import Wapiti, logging


async def scan(url: str):
    wapiti = Wapiti(url)
    wapiti.set_max_scan_time(30)
    wapiti.set_max_links_per_page(20)
    wapiti.set_max_files_per_dir(10)

    wapiti.verbosity(2)
    wapiti.set_color()
    wapiti.set_timeout(20)
    wapiti.set_modules("xss")
    wapiti.set_bug_reporting(False)

    parts = urlparse(url)
    wapiti.set_output_file(f"/tmp/{parts.scheme}_{parts.netloc}.json")
    wapiti.set_report_generator_type("json")

    wapiti.set_attack_options({"timeout": 20, "level": 1})

    stop_event = asyncio.Event()
    await wapiti.init_persister()
    await wapiti.flush_session()
    await wapiti.browse(stop_event, parallelism=64)
    await wapiti.attack(stop_event)

if __name__ == "__main__":
    asyncio.run(scan(sys.argv[1]))

How can I use xargs to run this script on multiple URLs from a file, in parallel?

urls.txt

https://jeboekindewinkel.nl/
https://www.codestudyblog.com/
group72web
  • Plenty of options here: https://stackoverflow.com/questions/1688999/how-can-i-read-a-list-of-filenames-from-a-file-in-bash – kpie Feb 21 '22 at 20:04
  • PS you can run things without blocking with the & – kpie Feb 21 '22 at 20:05

2 Answers


I believe a bash script something like this would work:

cat urls.txt | while read line
do
python scriptName.py $line &
done
kpie
  • You want `cat`, not `echo`. This currently doesn't read the file's contents at all, but runs the loop once with the name as input – Charles Duffy Feb 21 '22 at 20:18
  • 2
    Or better, put ` – Charles Duffy Feb 21 '22 at 20:20
  • 1
    Also, see [BashFAQ #1](https://mywiki.wooledge.org/BashFAQ/001) for more details on correct `while read` use. There are some changes needed to correctly handle lines that contain backslashes or end in whitespace. And put double quotes around `$line` to prevent the line's contents from being split into multiple words and each of those words being expanded as a glob expression. – Charles Duffy Feb 21 '22 at 20:22
  • This answer doesn't read the title at all. – Weihang Jian Feb 22 '22 at 08:20
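Incorporating the fixes suggested in the comments (quote `"$line"`, use `IFS= read -r`, redirect instead of `cat |`), a sketch that also caps the number of concurrent scans might look like this. `max_jobs=4` is an arbitrary choice, and `wait -n` needs bash 4.3+:

```shell
#!/usr/bin/env bash
max_jobs=4   # arbitrary cap on concurrent scans

while IFS= read -r line; do          # -r keeps backslashes literal
    python scriptName.py "$line" &   # quoted to prevent word splitting/globbing
    # If max_jobs scans are already running, wait for one to finish
    while (( $(jobs -r -p | wc -l) >= max_jobs )); do
        wait -n
    done
done < urls.txt

wait   # wait for the remaining background jobs
```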

This would execute your script concurrently, one invocation per URL; with -P0, xargs runs as many processes at once as possible.

cat urls.txt | xargs -L1 -P0 python script.py

Reference

-P maxprocs
    Parallel mode: run at most maxprocs invocations of utility at once.
    If maxprocs is set to 0, xargs will run as many processes as possible.

-L number
    Call utility for every number non-empty lines read.  A line ending with a
    space continues to the next non-empty line.  If EOF is reached and fewer
    lines have been read than number then utility will be called with the
    available lines.  The -L and -n options are mutually-exclusive; the last
    one given will be used.
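Since `-P0` may spawn one process per URL all at once, which can be too aggressive for a long URL list, a variation is to cap parallelism at the core count (assuming GNU coreutils' `nproc` is available):

```shell
# One invocation per non-empty line, at most $(nproc) concurrent processes.
# nproc (GNU coreutils) prints the number of available CPU cores.
xargs -L1 -P "$(nproc)" python script.py < urls.txt
```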
Weihang Jian