I have code like:

import pandas as pd
import multiprocessing as mp

a = {'a' : [1,2,3,1,2,3], 'b' : [5,6,7,4,6,5], 'c' : ['dog', 'cat', 'tree','slow','fast','hurry']}
df = pd.DataFrame(a)

def performDBSCAN(feature): 
    value=scorecalculate(feature)
    print(value)
    for ele in range(4):
        value=value+1
        print('here value is ', value)
    return value

def processing(feature):
    result1=performDBSCAN(feature)
    return result1

def scorecalculate(feature):
    scorecal=0
    for val in ['a','b','c','d']:
        print('alpha is:', val )
        scorecal=scorecal+1
    return scorecal

columns = df.columns
for ele in df.columns:
    processing(ele)

The code above runs serially. I would like to make it faster using Python's multiprocessing module, so I wrote the following code.

import pandas as pd
import multiprocessing as mp     

def performDBSCAN(feature): 
    value=scorecalculate(feature)
    print(value)
    for ele in range(4):
        value=value+1
        print('here value is ', value)
    return value

def scorecalculate(feature):
    scorecal=0
    for val in ['a','b','c','d']:
        print('alpha is:', val )
        scorecal=scorecal+1
    return scorecal

def processing(feature):
    result1=performDBSCAN(feature)
    return result1

a = {'a' : [1,2,3,1,2,3], 'b' : [5,6,7,4,6,5], 
'c' : ['dog','cat','tree','slow','fast','hurry']}
df = pd.DataFrame(a)
columns = df.columns
pool = mp.Pool(4)
resultpool = pool.map(processing, columns)

I expect to see the output, but the kernel keeps running without printing anything. What could be the issue?

Vas
  • Are you on Windows? It's *really* important that the "main-like" functionality be guarded by `if __name__ == '__main__':` on Windows when using `multiprocessing` (it's a good idea everywhere, but it's critical on Windows, which simulates forking by re-importing the main module, which goes bad when a `Pool` is created in the main module without the guard). – ShadowRanger Sep 13 '18 at 22:37
  • Yes, I am running it on Windows. So where should I add the guard? Right before the `pool = mp.Pool(4)` step? – Vas Sep 13 '18 at 22:41
  • Everything at and below the line that assigns `a` should be indented a level, and that `if` guard added above it (really, it's best to indent it and put it in a `def main():` method, then just use `if __name__ == '__main__': main()` so you don't accidentally rely on main variables being shared globals). Read the "Even Better Way" [here](https://stackoverflow.com/a/20158605/364696) for details; it's hard to convey this in comments. – ShadowRanger Sep 13 '18 at 22:44
  • `columns = df.columns` `if __name__ == '__main__':` `pool = mp.Pool(4)` `resultpool = pool.map(processing, columns)`. Is it Right? – Vas Sep 13 '18 at 22:46
  • That didn't help. I still get the same issue. – Vas Sep 13 '18 at 22:48
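
For reference, the layout ShadowRanger describes in the comments could look roughly like the sketch below. It is a minimal illustration and not a confirmed fix for this case. The worker functions stay at module level so child processes can import them, and everything that creates the pool moves into `main()` behind the `if __name__ == '__main__':` guard. Two things here are assumptions that are not in the original code: a plain list stands in for `df.columns` so the sketch runs without pandas, and the `print` calls inside the workers are dropped to keep the output short.

```python
import multiprocessing as mp


def scorecalculate(feature):
    # Same toy scoring as in the question: one point per placeholder value.
    scorecal = 0
    for _ in ['a', 'b', 'c', 'd']:
        scorecal += 1
    return scorecal


def performDBSCAN(feature):
    # Start from the score and add 1 four times, as in the question.
    value = scorecalculate(feature)
    for _ in range(4):
        value += 1
    return value


def processing(feature):
    return performDBSCAN(feature)


def main():
    # Assumption: a plain list standing in for df.columns.
    columns = ['a', 'b', 'c']
    # The with-block shuts the pool down once map() has returned.
    with mp.Pool(4) as pool:
        resultpool = pool.map(processing, columns)
    print(resultpool)
    return resultpool


if __name__ == '__main__':
    # Only the parent process runs this; re-imported children skip it.
    main()
```

Saved as a `.py` file and run from a terminal, this should print `[8, 8, 8]`. If the code is being run inside a Jupyter notebook on Windows (the word "kernel" suggests that, but it is a guess), the guard alone is often not enough, because functions defined in a notebook cell cannot be imported by the spawned worker processes. In that case, moving the worker functions into a separate module and importing them from the notebook is the usual workaround.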
