
For my college project I am trying to develop a Python-based traffic generator. I have created two CentOS machines on VMware, and I am using one as my client and one as my server machine. I have used the IP aliasing technique to increase the number of clients and servers using just a single client/server machine. So far I have created 50 IP aliases on my client machine and 10 IP aliases on my server machine. I am also using the multiprocessing module to generate traffic concurrently from all 50 clients to all 10 servers. I have also created a few profiles (1kb, 10kb, 50kb, 100kb, 500kb, 1mb) on my server (in the /var/www/html directory, since I am using an Apache server), and I am using urllib2 to send requests to these profiles from my client machine. While running my scripts, when I monitor the number of TCP connections it is always < 50; I want to increase it to, say, 10000. How do I achieve this? I thought that if a new TCP connection is established for every new HTTP request, then this goal could be achieved. Am I on the right path? If not, kindly guide me to the correct path.

'''
Traffic Generator Script:

Here I have used IP aliasing to create multiple clients on a single VM machine.
I have done the same on the server side to create multiple servers. I have around 50 clients and 10 servers.
'''
import multiprocessing
import urllib2
import random
import myurllist    #list of all destination urls for all 10 servers
import time
import socbindtry   #script that binds various virtual/aliased client ips to the script
response_time=[]    #some shared variables
error_count=multiprocessing.Value('i',0)
def send_request3():    #function to send requests from alias client ip 1
    opener=urllib2.build_opener(socbindtry.BindableHTTPHandler3)    #bind to alias client ip1
    try:
        tstart=time.time()
        for i in range(len(myurllist.url)):    #one pass over the url lists of all 10 servers
            x=random.choice(myurllist.url[i])
            opener.open(x).read()
            print "file downloaded:",x
            response_time.append(time.time()-tstart)
    except urllib2.URLError, e:
        error_count.value=error_count.value+1
def send_request4():    #function to send requests from alias client ip 2
    opener=urllib2.build_opener(socbindtry.BindableHTTPHandler4)    #bind to alias client ip2
    try:
        tstart=time.time()
        for i in range(len(myurllist.url)):    #one pass over the url lists of all 10 servers
            x=random.choice(myurllist.url[i])
            opener.open(x).read()
            print "file downloaded:",x
            response_time.append(time.time()-tstart)
    except urllib2.URLError, e:
        error_count.value=error_count.value+1
#50 such functions are defined here for 50 clients
process=[]
def func():
    global process
    process.append(multiprocessing.Process(target=send_request3))
    process.append(multiprocessing.Process(target=send_request4))
    process.append(multiprocessing.Process(target=send_request5))
    process.append(multiprocessing.Process(target=send_request6))
    #append the remaining processes here, one per client alias (50 in total)
    for i in range(len(process)):
        process[i].start()
    for i in range(len(process)):
        process[i].join()
    print "All work Done..!!"
    return
start=float(time.time())
func()
end=float(time.time())-start
print end
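For reference, a bindable handler like the socbindtry.BindableHTTPHandler3 imported above can be sketched roughly as follows. This is a minimal illustration, not the actual socbindtry code: it assumes Python 2.7's httplib (whose HTTPConnection supports a source_address), and the class names and IP addresses are placeholders.

import functools
import httplib
import urllib2

class BoundHTTPConnection(httplib.HTTPConnection):
    #an HTTPConnection that always connects from one given local (aliased) ip
    def __init__(self, source_ip, *args, **kwargs):
        httplib.HTTPConnection.__init__(self, *args, **kwargs)
        self.source_address = (source_ip, 0)    #(ip, 0) lets the kernel pick the source port

class BoundHTTPHandler(urllib2.HTTPHandler):
    #a handler that routes every http:// request through BoundHTTPConnection
    def __init__(self, source_ip):
        urllib2.HTTPHandler.__init__(self)
        self.source_ip = source_ip
    def http_open(self, req):
        factory = functools.partial(BoundHTTPConnection, self.source_ip)
        return self.do_open(factory, req)

#example: one opener per aliased client ip, instead of one handler class per ip
opener = urllib2.build_opener(BoundHTTPHandler("192.168.1.101"))    #placeholder alias ip
#opener.open("http://192.168.1.201/1kb").read()                     #placeholder server url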
Bhoomika Sheth
    You really wrote the (nearly) same code like 50 times? – Klaus D. Mar 27 '15 at 08:56
  • The code is not "complete" (indentation...), but as far as I can tell, you create 50 process, each one performing 1 download at a time. If this is correct, you obviously only have N<50 simultaneous downloads. – Sylvain Leroux Mar 27 '15 at 08:58
  • And just to add: there are tools like httperf [ http://www.hpl.hp.com/research/linux/httperf/ ] for that purpose. – Klaus D. Mar 27 '15 at 09:14
  • @KlausD. yes i had to... because every time i am using different ip to send request. and i cannot show httpref as my college proj – Bhoomika Sheth Mar 27 '15 at 10:04
  • @SylvainLeroux yes what you said is correct. but i somehow want to increase TCP Connections. – Bhoomika Sheth Mar 27 '15 at 10:05

1 Answer


For this sort of thing, you probably need to create a pool of worker processes. I don't know whether a pool of 10000 processes is viable in your use case (it is a very ambitious goal), but you should definitely investigate that idea.


The basic idea behind a pool is that you have M tasks to perform, with a maximum of N of them running simultaneously. When one of the workers has finished its task, it is ready to work on another until all the work is done. One major advantage is that if some tasks take a long time to complete, they will not block the overall progress of the work (as long as the number of "slow" processes is < N).

Along those lines, here is the basic structure of your program using Pool:

from multiprocessing import Pool

import time
import random

def send_request(some_parameter):
    print("Do send_request", some_parameter)

    time.sleep(random.randint(1,10)) # simulate randomly long process

if __name__ == '__main__':
    pool = Pool(processes=100)

    for i in range(200):
        pool.apply_async(send_request, [i])


    print("Waiting")
    pool.close()
    pool.join()
    print("Done")

On my system, this sample program took something like 19 s (real time) to complete. On my Debian system, I was only able to spawn a little more than 1000 processes at a time before reaching the maximum number of open files (given the standard ulimit -n of 1024). You will have to raise that limit somehow if you need such a huge number of worker processes. And even if you do so, as I said at first, 10000 concurrent processes is probably rather ambitious (at least using Python).
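As an illustration of raising that limit from inside the script, here is a minimal sketch assuming a Linux host and Python's standard resource module (the hard limit itself can only be raised by root, e.g. via /etc/security/limits.conf, so treat the numbers as illustrative):

import resource

# current per-process open-file limits; a process may raise its soft limit up to the hard limit
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("current RLIMIT_NOFILE: soft=%d hard=%d" % (soft, hard))

# raise the soft limit to the hard limit for this process (and the workers it forks)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

Along the same lines, the 50 near-identical send_request functions can collapse into a single pooled worker that takes the client IP and the URL as parameters. A rough sketch under that assumption (the IPs and URLs are placeholders, and the per-IP opener binding is left as a stub rather than reproducing the socbindtry handlers):

from multiprocessing import Pool
import urllib2

def send_request(client_ip, url):
    # build an opener bound to client_ip here (e.g. with a socbindtry-style handler)
    opener = urllib2.build_opener()    # stub: unbound opener
    try:
        opener.open(url).read()
        return (client_ip, url, True)
    except urllib2.URLError:
        return (client_ip, url, False)

if __name__ == '__main__':
    client_ips = ["192.168.1.%d" % i for i in range(101, 151)]          # placeholder: 50 aliased client ips
    urls = ["http://192.168.2.%d/1kb" % i for i in range(201, 211)]     # placeholder: 10 aliased servers
    pool = Pool(processes=100)    # at most 100 requests in flight at any time
    results = [pool.apply_async(send_request, (ip, url))
               for ip in client_ips for url in urls]
    pool.close()
    pool.join()
    errors = sum(1 for r in results if not r.get()[2])
    print("requests: %d, errors: %d" % (len(results), errors))

Returning results from the worker, instead of appending to a shared list, also sidesteps the problem of sharing objects between processes.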

Sylvain Leroux
  • But using pool allows me to run same function multiple times right? here i want to execute different functions (all are using different ips though executing the same task) at same time. so i avoided using pool. can i use pool to execute different functions? – Bhoomika Sheth Mar 27 '15 at 10:07
  • @BhoomikaSheth Please take a closer look at the example: You can pass one or several parameters to the called function. For example, the IP address of your host, or the URL or mostly whatever you want. – Sylvain Leroux Mar 27 '15 at 10:13
  • I am getting one RuntimeError: Synchronized objects should only be shared between processes through inheritance while running my scripts – Bhoomika Sheth Mar 27 '15 at 11:32
  • if i dont pass any parameter to the function then it works perfectly – Bhoomika Sheth Mar 27 '15 at 11:39
  • @Bhoomika As this is clearly a different issue, you should definitively ask an other question in order to draw enough attention to get useful answers. – Sylvain Leroux Mar 27 '15 at 17:56
  • @Sylvian Leroux I asked another question addressing my issue of runtime error. Can have a look at it once and guide me further in correct direction? [http://stackoverflow.com/questions/29430355/python-multiprocessing-pool-apply-async-with-shared-variables-value?noredirect=1#comment47030856_29430355] – Bhoomika Sheth Apr 03 '15 at 10:55