I'm trying to subset a pandas df and return the result. The main obstacle is that the subset is taken from a df that is being continually updated.
I'm appending data on a timer that imports the same dataset every minute. I then want to subset this updated data and return it for a separate function. Specifically, the subset df will be emailed. I'm hoping to continually repeat this process.
I'll lay out each intended step below. I'm falling down on step 3.
1. Import the dataset over a 24 hour period
2. Continually update the same dataset every minute
3. Subset the data frame by condition
4. If a new row is appended to df, execute the email notification
In the code below, data is imported from Yahoo Finance, and the same data is pulled again every minute.
I'm then aiming to subset specific rows from this updated dataframe and return the data to be emailed.
I only want to execute the email function when a new row of data has been appended.
The condition outlined below will return a new row at every minute (which is by design for testing purposes). My actual condition will return between 0-10 instances a day.
The example df, df_out, is a snapshot that might be taken at some point during the day.
import pandas as pd
import yfinance as yf
import datetime
import pytz
from threading import Thread
from time import sleep
import numpy as np
import requests
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import smtplib
def scheduled_update():
    # Step 1. import data for a 24 hour period
    my_date = datetime.datetime.now(pytz.timezone('Etc/GMT-5'))
    prev_24hrs = my_date - datetime.timedelta(hours = 25, minutes = 0)

    data = yf.download(tickers = 'EURUSD=X',
                       start = prev_24hrs,
                       end = my_date,
                       interval = '1m'
                       ).iloc[:-1]#.reset_index()

    # Step 2. update data every minute in 5 min blocks
    while True:
        sleep(60)

        upd_data = yf.download(tickers = 'EURUSD=X',
                               start = my_date - datetime.timedelta(hours = 0, minutes = 5),
                               end = datetime.datetime.now(pytz.timezone('Etc/GMT-5')),
                               interval = '1m')
        print(upd_data)

        if len(upd_data) != 0:
            # Here's the check to see if the data meets the desired condition.
            # The last row is again removed for the same reason as noted earlier.
            upd_data = upd_data.loc[upd_data['High'].lt(0.98000)].iloc[:-1]

            # Merges the two sets of data.
            data = pd.concat([data, upd_data])

            # For the first few minutes after the function starts running, there will be
            # duplicate timestamps, so this gets rid of those. This way is fastest, according to:
            # https://stackoverflow.com/questions/13035764/remove-pandas-rows-with-duplicate-indices
            data = data[~data.index.duplicated(keep = 'first')]
            print(data)
        else:
            print('No new data')

    return data

thread = Thread(target = scheduled_update)
thread.start()
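One thing worth noting about the code above: the `return data` line sits after a `while True` loop that never exits, and a value returned from a `Thread` target is discarded anyway. A minimal sketch of one possible alternative, assuming a `queue.Queue` is used to hand each batch of matching rows from the worker to the caller. The names `updates`, `fetch_rows` and `producer` are hypothetical, and `fetch_rows` is only a stand-in for the `yf.download` call so the example runs offline:

```python
import queue
import threading

import pandas as pd

updates = queue.Queue()  # hypothetical channel between the worker and the caller


def fetch_rows(i):
    # Stand-in for yf.download(...): one row with a unique timestamp index.
    ts = pd.Timestamp("2022-10-10 01:44:00") + pd.Timedelta(minutes=i)
    return pd.DataFrame({"High": [0.9739 + i * 0.0001]}, index=[ts])


def producer(n_polls, stop_event):
    # Worker loop: put each matching batch on the queue instead of returning it.
    for i in range(n_polls):
        if stop_event.is_set():
            break
        batch = fetch_rows(i)
        batch = batch.loc[batch["High"].lt(0.98)]
        if not batch.empty:
            updates.put(batch)


stop = threading.Event()
worker = threading.Thread(target=producer, args=(3, stop))
worker.start()
worker.join()

# Caller side: drain whatever the worker produced.
received = []
while not updates.empty():
    received.append(updates.get())
result = pd.concat(received)
```

In a long-running version the caller would typically block on `updates.get()` in its own loop rather than joining the worker first; the bounded `n_polls` here is only to keep the sketch finite.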
For argument's sake, let's say that a new row has been appended during the day and we call the resulting df df_out. When the new row has been appended, I want to execute the email notification.
# Step 3. return subset of data to be emailed
#df_out = scheduled_update()

# example df
df_out = pd.DataFrame({
    'Datetime' : ['2022-10-10 01:44:00+01:00','2022-10-10 01:45:00+01:00',
                  '2022-10-10 01:46:00+01:00','2022-10-10 01:47:00+01:00',
                  '2022-10-10 01:48:00+01:00'],
    'Open' : [0.973899,0.973710,0.973615,0.973410,0.973799],
    'High' : [0.973999,0.974110,0.973115,0.973210,0.973899],
    'Low' : [0.973899,0.973710,0.973615,0.973710,0.973499],
    'Close' : [0.973999,0.974110,0.973115,0.973410,0.973499],
    'Adj Close' : [0.973999,0.974110,0.973115,0.973410,0.973499],
    'Volume' : [0,0,0,0,0],
})
# Step 4. send notification containing df_out
def send_tradeNotification(send_to, subject, df):
    # google account and password
    send_from = 'xxxx1@gmail.com'
    password = 'password'

    # email message
    message = """\
    <p><strong>Notification </strong></p>
    <p>
    <br>
    </p>
    <p><strong>-
    </strong><br><strong>Regards </strong></p>
    """

    for receiver in send_to:
        multipart = MIMEMultipart()
        multipart['From'] = send_from
        multipart['To'] = receiver
        multipart['Subject'] = subject

        # attach the dataframe as a csv named after the subject
        attachment = MIMEApplication(df.to_csv())
        attachment['Content-Disposition'] = f'attachment; filename="{subject}.csv"'
        multipart.attach(attachment)
        multipart.attach(MIMEText(message, 'html'))

        server = smtplib.SMTP('smtp.gmail.com', 587)
        server.starttls()
        server.login(multipart['From'], password)
        server.sendmail(multipart['From'], multipart['To'], multipart.as_string())
        server.quit()

#send_tradeNotification(['xxxx2@gmail.com'], 'Trade Setup', df_out)
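To illustrate the "only email when a new row has been appended" requirement, one possible approach is to remember which timestamps have already been reported and compare them against the current index after each update. A minimal sketch, assuming the frame is indexed by timestamp as in the downloaded data. The names `process_update`, `seen`, `sent` and `notify` are hypothetical, and `notify` merely stands in for `send_tradeNotification` so nothing is actually emailed:

```python
import pandas as pd

sent = []  # collects what would have been emailed


def notify(df):
    # Hypothetical stand-in for send_tradeNotification(...).
    sent.append(df)


def process_update(data, seen):
    # Notify only for rows whose timestamps have not been reported before,
    # then return the updated set of reported timestamps.
    fresh = data.loc[~data.index.isin(seen)]
    if not fresh.empty:
        notify(fresh)
    return seen.union(data.index)


idx = pd.to_datetime(["2022-10-10 01:44", "2022-10-10 01:45", "2022-10-10 01:46"])
data = pd.DataFrame({"High": [0.973999, 0.974110, 0.973115]}, index=idx)

seen = data.index[:0]                        # empty index: nothing reported yet
seen = process_update(data.iloc[:2], seen)   # two new rows, one notification
seen = process_update(data.iloc[:2], seen)   # nothing new, no notification
seen = process_update(data, seen)            # one new row, second notification
```

Tracking timestamps rather than the row count keeps the check correct even if duplicate rows are dropped between updates.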