Sorry, a bit of a newbie with Python.
Can anyone help with the code below? I'm trying to write two dataframes, created by two separate multiprocessing Processes, to the same Excel file.
EDIT: this is simplified code. In my actual project the dataframes are built using pd.read_sql() on different connections. If this won't make any noticeable difference in terms of speed, please let me know. I just assumed that running it normally would mean waiting for the first connection's SQL query to finish before the second connection's query starts.
import pyodbc
import pandas as pd
import os
from datetime import datetime
import multiprocessing

def Test1():
    global df
    df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})

def Test2():
    global df2
    df2 = pd.DataFrame({'Data': [20, 40, 60, 40, 30, 60, 90]})

if __name__ == '__main__':
    Proc1 = multiprocessing.Process(target=Test1)
    Proc2 = multiprocessing.Process(target=Test2)
    Proc1.start()
    Proc2.start()
    Proc1.join()
    Proc2.join()

    writer = pd.ExcelWriter(os.path.join(os.path.join(os.environ['USERPROFILE']), 'Desktop', 'Test.xlsx'), engine='xlsxwriter')
    df.to_excel(writer, sheet_name='Test Title', index=False)
    df2.to_excel(writer, sheet_name='Test Title2', index=False)
    workbook = writer.book
    worksheet = writer.sheets['Test Title']
    worksheet = writer.sheets['Test Title2']
    writer.save()
It doesn't help that I don't know the terminology well enough to search for the answer, so apologies if this is a duplicate of a question asked by someone more Python-literate than myself.
Also, the error message:
  line 37, in <module>
    df.to_excel(writer, sheet_name='Test Title',index=False)
NameError: name 'df' is not defined
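
For context on the error: each multiprocessing.Process runs in its own memory space, so a global assigned inside Test1 or Test2 exists only in that child process and never reaches the parent, which is why df is undefined there. Below is a minimal sketch of one possible way around it, where each worker returns its result instead of setting a global. It makes several assumptions that are not in the code above: it swaps processes for threads via concurrent.futures.ThreadPoolExecutor (on the guess that the SQL queries spend most of their time waiting on the database), the function names load_first, load_second and load_both are hypothetical, and plain dicts stand in for the DataFrames that pd.read_sql() would return.

```python
from concurrent.futures import ThreadPoolExecutor


def load_first():
    # Hypothetical stand-in for pd.read_sql(query, first_connection).
    return {'Data': [10, 20, 30, 20, 15, 30, 45]}


def load_second():
    # Hypothetical stand-in for pd.read_sql(query, second_connection).
    return {'Data': [20, 40, 60, 40, 30, 60, 90]}


def load_both():
    # Submit both loaders so they can run concurrently, then collect
    # their return values in the caller instead of relying on globals.
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(load_first)
        second_future = pool.submit(load_second)
        return first_future.result(), second_future.result()


if __name__ == '__main__':
    data, data2 = load_both()
    # Both results are now ordinary local variables in the main program,
    # so they could be written to one Excel file from here.
    print(len(data['Data']), len(data2['Data']))
```

If separate processes turn out to be needed, concurrent.futures.ProcessPoolExecutor offers the same submit()/result() pattern, with the return values pickled and sent back to the parent process.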