BACKGROUND INFORMATION:

I have a DataFrame of x stocks, each with y price series. Currently x is 5 and y is 2: the closing price and a 3-day Simple Moving Average (SMA).

The current output is [2781 rows x 10 columns], covering start_date = '2006-01-01' to end_date = '2016-12-31'. print(df) gives the following:

CURRENT OUTPUT:

            ANZ Price  ANZ 3 day SMA  CBA Price  CBA 3 day SMA  MQG Price   MQG 3 day SMA  NAB Price  NAB 3 day SMA  WBC Price  WBC 3 day SMA 
Date
2006-01-02  23.910000            NaN  42.569401            NaN  66.558502             NaN  30.792999            NaN  22.566401            NaN
2006-01-03  24.040001            NaN  42.619099            NaN  66.086403             NaN  30.935699            NaN  22.705400            NaN
2006-01-04  24.180000      24.043334  42.738400      42.642300  66.587997       66.410967  31.078400      30.935699  22.784901      22.685567 
2006-01-05  24.219999      24.146667  42.708599      42.688699  66.558502       66.410967  30.964300      30.992800  22.794800      22.761700
...               ...             ...       ...            ...        ...            ...         ...            ...        ...            ...
2016-12-27   87.346667     30.670000  30.706666      32.869999  32.729999       87.346667  30.670000      30.706666  32.869999      32.729999
2016-12-28   87.456667     31.000000  30.773333      32.980000  32.829999       87.456667  31.000000      30.773333  32.980000      32.829999
2016-12-29   87.520002     30.670000  30.780000      32.599998  32.816666       87.520002  30.670000      30.780000  32.599998      32.816666

MY WORKING CODE:

#!/usr/bin/python3
from pandas_datareader import data
import pandas as pd
import itertools as it
import os
import numpy as np
import fix_yahoo_finance as yf
import matplotlib.pyplot as plt
yf.pdr_override()

stock_list = sorted(["ANZ.AX", "WBC.AX", "MQG.AX", "CBA.AX", "NAB.AX"])
number_of_decimal_places = 8
moving_average_period = 3

def get_moving_average(df, stock_name):
    df2 = df.rolling(window=moving_average_period).mean()
    df2.rename(columns={stock_name: stock_name.replace("Price", str(moving_average_period) + " day SMA")}, inplace=True)
    # join_axes was removed in pandas 1.0; df and df2 already share the same index
    df = pd.concat([df, df2], axis=1)
    return df


# Function to get the closing price of the individual stocks
# from the stock_list list
def get_closing_price(stock_name, specific_close):
    symbol = stock_name
    start_date = '2006-01-01'
    end_date = '2016-12-31'
    df = data.get_data_yahoo(symbol, start_date, end_date)
    sym = symbol + " "
    print(sym * 10)
    df = df.drop(['Open', 'High', 'Low', 'Adj Close', 'Volume'], axis=1)

    df = df.rename(columns={'Close': specific_close})

    # https://stackoverflow.com/questions/16729483/converting-strings-to-floats-in-a-dataframe
    # df[specific_close] = df[specific_close].astype('float64')
    # print(type(df[specific_close]))
    return df


# Creates a big DataFrame with all the stock's Closing
# Price returns the DataFrame
def get_all_close_prices(directory):
    count = 0
    for stock_name in stock_list:
        specific_close = stock_name.replace(".AX", "") + " Price"
        if not count:
            prev_df = get_closing_price(stock_name, specific_close)
            prev_df = get_moving_average(prev_df,  specific_close)
        else:
            new_df = get_closing_price(stock_name, specific_close)
            new_df = get_moving_average(new_df, specific_close)
            # https://stackoverflow.com/questions/11637384/pandas-join-merge-concat-two-dataframes
            prev_df = prev_df.join(new_df)
        count += 1
    # prev_df.to_csv(directory)

    df = pd.DataFrame(prev_df, columns=list(prev_df))
    df = df.apply(pd.to_numeric)
    convert_df_to_csv(df, directory)
    return df


def convert_df_to_csv(df, directory):
    df.to_csv(directory)

def main():
    # FINDS THE CURRENT DIRECTORY AND CREATES THE CSV TO DUMP THE DF
    csv_in_current_directory = os.getcwd() + "/stock_output.csv"

    csv_in_current_directory_dow_distribution = os.getcwd() + "/dow_distribution.csv"
    # FUNCTION THAT GETS ALL THE CLOSING PRICES OF THE STOCKS
    # AND RETURNS IT AS ONE COMPLETE DATAFRAME
    df = get_all_close_prices(csv_in_current_directory)    
    print(df)


# Main line of code
if __name__ == "__main__":
    main()

QUESTION:

From this df I want to create x line graphs (one graph per stock), each with y lines (the price and its SMAs). How can I do this with matplotlib? Could this be done with a for loop that saves each plot as the loop iterates? If so, how?

3kstc

2 Answers

First, import matplotlib.pyplot as plt. What comes next depends on whether you want x individual plots or one figure with x subplots:

Individual plots

df.plot(y=[0,1])
df.plot(y=[2,3])
df.plot(y=[4,5])
df.plot(y=[6,7])
df.plot(y=[8,9]) 

plt.show()

You can also save the individual plots in a loop:

for i in range(0, 9, 2):
    df.plot(y=[i, i+1])
    plt.savefig('{}.png'.format(i))
    plt.close()  # release the figure once it has been saved
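
The loop above names the files 0.png, 2.png and so on. As a rough sketch of a variant that names each file after its stock, the following assumes the columns alternate price, SMA as in the question's output. The helper save_pair_plots, its out_dir parameter and the synthetic demo frame are illustrative names, not part of the original code, and the Agg backend is an assumption that the plots only need to be written to disk.

```python
import os

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, files only
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def save_pair_plots(df, out_dir="."):
    """Save one PNG per stock, assuming columns alternate price, SMA."""
    saved = []
    for i in range(0, len(df.columns), 2):
        pair = list(df.columns[i:i + 2])
        fig, ax = plt.subplots()
        df.plot(ax=ax, y=pair, title=pair[0])
        # e.g. "ANZ Price" -> "ANZ_Price.png"
        path = os.path.join(out_dir, pair[0].replace(" ", "_") + ".png")
        fig.savefig(path)
        plt.close(fig)  # free the figure before the next iteration
        saved.append(path)
    return saved


# Synthetic stand-in for the downloaded prices (hypothetical values)
dates = pd.date_range("2006-01-02", periods=30, freq="B")
demo = pd.DataFrame(index=dates)
for ticker, start in [("ANZ", 24.0), ("CBA", 42.0)]:
    demo[ticker + " Price"] = start + np.arange(len(dates)) * 0.1
    demo[ticker + " 3 day SMA"] = demo[ticker + " Price"].rolling(3).mean()
```

Calling save_pair_plots(demo) would write ANZ_Price.png and CBA_Price.png to the current directory. Passing an explicit ax to df.plot keeps the figure that is saved and the figure that is closed the same object.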

Subplots

fig, axes = plt.subplots(nrows=2, ncols=3)

df.plot(ax=axes[0,0],y=[0,1])
df.plot(ax=axes[0,1],y=[2,3])
df.plot(ax=axes[0,2],y=[4,5])
df.plot(ax=axes[1,0],y=[6,7])
df.plot(ax=axes[1,1],y=[8,9])

plt.show()  
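
The 2 x 3 grid above has six cells for five stocks, so one cell stays empty. A possible loop-based version that sizes the grid from the number of column pairs and hides the unused cells is sketched below. The function plot_grid, its ncols parameter and the synthetic demo frame are hypothetical, and the sketch again assumes columns alternate price, SMA.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_grid(df, ncols=3):
    """Draw each (price, SMA) column pair in its own subplot."""
    n_plots = len(df.columns) // 2
    nrows = math.ceil(n_plots / ncols)
    # squeeze=False keeps axes two-dimensional even for a single row
    fig, axes = plt.subplots(nrows=nrows, ncols=ncols, squeeze=False,
                             figsize=(5 * ncols, 3 * nrows))
    flat = axes.ravel()
    for k in range(n_plots):
        df.plot(ax=flat[k], y=list(df.columns[2 * k:2 * k + 2]))
    for ax in flat[n_plots:]:
        ax.set_visible(False)  # hide grid cells with no stock to show
    fig.tight_layout()
    return fig, flat


# Synthetic stand-in for the five-stock frame (hypothetical values)
dates = pd.date_range("2006-01-02", periods=30, freq="B")
demo = pd.DataFrame(index=dates)
for n, ticker in enumerate(["ANZ", "CBA", "MQG", "NAB", "WBC"]):
    demo[ticker + " Price"] = 20.0 + 10 * n + np.arange(len(dates)) * 0.1
    demo[ticker + " 3 day SMA"] = demo[ticker + " Price"].rolling(3).mean()
```

With the demo frame, plot_grid(demo) returns a figure with six axes, the last of which is hidden; fig.savefig("all_stocks.png") would then write the whole grid to one file.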

See https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.plot.html for options to customize your plot(s).

Stef
  • could this be done with a for loop and save the individuals plots as the loop gets iterated? if so how? – 3kstc Aug 20 '18 at 03:27
  • @3kstc Yes, you can save the plots. I added a simple example to my answer – Stef Aug 20 '18 at 08:36

The best approach is to write a function driven by the sizes of the two lists (x stocks and y SMA periods). The function is as follows:

def generate_SMA_graphs(df):
    # Relies on the module-level stock_list and on moving_average_period_list
    # (a list of SMA windows, e.g. [3]); df columns must be ordered as
    # price, SMA, SMA, ... for each stock in turn.
    columnNames = list(df.columns)
    print("CN:\t", columnNames)
    print(len(columnNames))

    count = 0
    for stock in stock_list:
        stock_iter = count * (len(moving_average_period_list) + 1)
        sma_iter = stock_iter + 1
        for moving_average_period in moving_average_period_list:
            # Plot on an explicit Axes so the figure saved is the one closed below
            fig, ax = plt.subplots()
            df.plot(ax=ax, y=[columnNames[stock_iter], columnNames[sma_iter]])
            plt.xlabel('Time')
            plt.ylabel('Price ($)')
            graph_title = columnNames[stock_iter] + " vs. " + columnNames[sma_iter]
            plt.title(graph_title)
            plt.grid(True)
            fig.savefig(graph_title.replace(" ", "") + ".png")
            print("\t\t\t\tCompleted: ", graph_title)
            plt.close(fig)
            sma_iter += 1
        count += 1

With the code above, however long either list is (the stock list for x or the SMA list for y), the function generates one graph per stock and SMA pair, comparing the original price with that SMA.
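
The function above locates columns by position. As an alternative sketch, columns can be grouped by the ticker at the start of each name, which gives one graph per stock holding the price and every SMA line. The helpers group_columns_by_stock and save_graph_per_stock, the out_dir parameter and the synthetic demo frame are illustrative; the sketch assumes every column name begins with the ticker followed by a space, as in "ANZ Price" and "ANZ 3 day SMA".

```python
import os

import matplotlib
matplotlib.use("Agg")  # assumption: plots are only written to disk
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def group_columns_by_stock(columns):
    """Map each ticker (the text before the first space) to its columns."""
    groups = {}
    for col in columns:
        groups.setdefault(col.split(" ", 1)[0], []).append(col)
    return groups


def save_graph_per_stock(df, out_dir="."):
    """Save one PNG per stock holding its price and every SMA line."""
    paths = []
    for ticker, cols in group_columns_by_stock(df.columns).items():
        fig, ax = plt.subplots()
        df.plot(ax=ax, y=cols, grid=True)
        ax.set_xlabel("Time")
        ax.set_ylabel("Price ($)")
        ax.set_title(ticker)
        path = os.path.join(out_dir, ticker + ".png")
        fig.savefig(path)
        plt.close(fig)  # close the same figure that was just saved
        paths.append(path)
    return paths


# Synthetic stand-in with two SMA windows per stock (hypothetical values)
dates = pd.date_range("2006-01-02", periods=40, freq="B")
demo = pd.DataFrame(index=dates)
for ticker, start in [("ANZ", 24.0), ("WBC", 22.5)]:
    demo[ticker + " Price"] = start + np.sin(np.arange(len(dates)) / 5.0)
    for window in (3, 10):
        demo[f"{ticker} {window} day SMA"] = demo[ticker + " Price"].rolling(window).mean()
```

Grouping by name does not depend on column order or on knowing how many SMA windows were computed, at the cost of relying on the naming convention.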

3kstc