I asked this question previously but may not have explained my particular situation clearly enough. That question was closed as a duplicate of "How to get the return value from a thread in Python?"
Perhaps I should have explained more. I had already read the referenced thread and tried its answers, but nothing I did from there seemed to work (I could just be implementing it incorrectly).
My main class that does all the work and data transformation is:

    class SolrPull(object):
        def __init__(self, **kwargs):
            self.var1 = kwargs['var1'] if 'var1' in kwargs else 'this'
            self.var2 = kwargs['var2'] if 'var2' in kwargs else 'that'

        def solr_main(self):
            # This is where the main data transformation takes place.
            return self.flattened_df

I need to create multiple objects and have them pull from a Solr database and transform data concurrently, each in its own thread.
My arguments must be passed to the SolrPull class, not to the solr_main function.
I need to wait for those returns before continuing with processing.
I tried a couple of different answers from the referenced thread, but nothing worked.
Using the accepted answer for that thread, I did:

    from multiprocessing.pool import ThreadPool

    class TierPerf(object):
        def pull_current(self):
            pool = ThreadPool(processes=5)
            CustomerRecv_df_result = pool.apply_async(SolrPull(var1='this', var2='that').solr_main())
            APS_df_result = pool.apply_async(SolrPull(var1='this', var2='that').solr_main())
            self.CustomerRecv_df = CustomerRecv_df_result.get()
            self.APS_df = APS_df_result.get()

But the pulls and transformations do not run concurrently; they run one after the other. Then, when I call .get(), I get the error: TypeError: 'DataFrame' object is not callable.
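One possible explanation for both symptoms, sketched with a hypothetical stand-in class (`FakePull` is not part of the original code, and it returns a plain list rather than a DataFrame so the example needs no pandas): writing `solr_main()` with parentheses runs the method immediately in the calling thread, and the pool then receives the method's return value as the "function" it is supposed to call.

```python
import time
from multiprocessing.pool import ThreadPool


class FakePull(object):
    """Hypothetical stand-in for SolrPull; returns a list, not a DataFrame."""

    def __init__(self, **kwargs):
        self.var1 = kwargs.get('var1', 'this')

    def solr_main(self):
        time.sleep(0.1)  # pretend to pull and transform data
        return [self.var1]


pool = ThreadPool(processes=2)

# With parentheses: solr_main() runs here, in the calling thread, and the
# pool is handed its return value (a list) as the function to call.
bad = pool.apply_async(FakePull(var1='a').solr_main())
try:
    bad.get()
    error_message = None
except TypeError as exc:
    # The worker's failure is re-raised by get().
    error_message = str(exc)

# Without parentheses: the bound method itself is handed to the pool,
# so it runs in a worker thread and get() returns its result.
good = pool.apply_async(FakePull(var1='a').solr_main)
good_result = good.get()

pool.close()
pool.join()
```

If this reading is right, the same pattern with a real DataFrame would produce the "not callable" error only at `.get()`, because that is where the pool re-raises exceptions from the worker.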
As an end result, I need to be able to call SolrPull(**kwargs).solr_main() concurrently in several threads and get back a pandas DataFrame from each call, which will then be used for further processing.
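To make the desired behaviour concrete, here is a minimal sketch that uses `concurrent.futures.ThreadPoolExecutor` from the standard library, which is a different tool from the `ThreadPool` used in my attempt. The class `FakePull`, its argument values, and the dictionary it returns are all hypothetical placeholders; the real `SolrPull.solr_main` would return a DataFrame.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class FakePull(object):
    """Hypothetical stand-in for SolrPull; the real class would return a DataFrame."""

    def __init__(self, **kwargs):
        self.var1 = kwargs.get('var1', 'this')
        self.var2 = kwargs.get('var2', 'that')

    def solr_main(self):
        time.sleep(0.2)  # simulated I/O wait for the pull
        return {'var1': self.var1, 'var2': self.var2}


# Arguments go to the constructor; solr_main itself takes none.
pulls = {
    'CustomerRecv': FakePull(var1='customers', var2='recv'),
    'APS': FakePull(var1='aps'),
}

with ThreadPoolExecutor(max_workers=5) as executor:
    # submit() receives the bound method (no parentheses) and returns a Future.
    futures = {name: executor.submit(obj.solr_main) for name, obj in pulls.items()}
    # result() blocks until that pull has finished, so later code cannot run early.
    results = {name: future.result() for name, future in futures.items()}
```

All pulls are submitted before any `result()` call is made, so they can overlap in time, and processing continues only once every result has been collected.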