I'm looking for the name of a pattern in which the output of one function is handled by several others (I'm still trying to find better words for my problem). Some pseudocode or actual code would be really helpful.
I have written the following code:
    def read_data():
        # read data from a file
        # create df
        return df

    def parse_data():
        sorted_df = read_data()
        # count lines
        # sort by date
        return sorted_df

    def add_new_column():
        new_column_df = parse_data()
        # add new column
        return new_column_df

    def create_plot():
        plot_data = add_new_column()
        # create a plot
        # display chart
What I'm trying to understand is how to skip a function, e.g. to create the following chain: read_data() -> parse_data() -> create_plot().
As the code looks right now (because of the return values and how they are passed between functions), skipping a step requires me to change the input data in the last function, create_plot().
I suspect that I'm creating logically incorrect code.
Any thoughts?
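To make the goal concrete, here is a minimal sketch of the decoupled chain I have in mind, assuming each step takes its input as an argument instead of calling the previous step itself. The row dictionaries, the `value` and `doubled` fields, and the return value of `create_plot()` are hypothetical stand-ins for the real data frame and chart:

```python
from datetime import datetime

def read_data():
    # Stand-in for reading a file: returns a list of row dicts (hypothetical data).
    return [
        {"CreationDate": "2021-03-02 10:00:00", "value": 3},
        {"CreationDate": "2021-03-01 09:30:00", "value": 5},
    ]

def parse_data(rows):
    # Convert the date strings to datetime and sort by date; returns a new list.
    parsed = [
        {**row, "CreationDate": datetime.strptime(row["CreationDate"], "%Y-%m-%d %H:%M:%S")}
        for row in rows
    ]
    return sorted(parsed, key=lambda row: row["CreationDate"])

def add_new_column(rows):
    # Add a derived column (hypothetical: the value doubled).
    return [{**row, "doubled": row["value"] * 2} for row in rows]

def create_plot(rows):
    # Stand-in for plotting: returns the (x, y) pairs that would be drawn.
    return [(row["CreationDate"], row["value"]) for row in rows]

# Full chain: read_data() -> parse_data() -> add_new_column() -> create_plot()
full = create_plot(add_new_column(parse_data(read_data())))

# Shorter chain that skips add_new_column(); no function body has to change.
short = create_plot(parse_data(read_data()))
```

Because no function calls another one internally, the caller alone decides which steps run and in what order.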
Original code:
    import pandas as pd
    import matplotlib.pyplot as plt

    # Read the CSV file into a data frame
    def read_data():
        raw_data = pd.read_csv('C:/testdata.csv', sep=',', engine='python', encoding='utf-8-sig').replace({'{':'', '}':'', '"':'', ',':' '}, regex=True)
        return raw_data

    def parse_data(raw_data):
        ...
        # Convert CreationDate column into datetime
        raw_data['CreationDate'] = pd.to_datetime(raw_data['CreationDate'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
        # Sort rows chronologically
        raw_data.sort_values(by=['CreationDate'], inplace=True, ascending=True)
        parsed_data = raw_data
        return parsed_data

    # Each step receives the previous step's result as an argument
    raw_data = read_data()
    parsed = parse_data(raw_data)
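Passing each result on as an argument like this is commonly called a pipeline, or function composition. A rough sketch of how the list of steps could be made explicit follows. `run_pipeline` is a hypothetical helper, not part of pandas, and the step functions are simplified stand-ins that work on a plain list of numbers rather than a data frame:

```python
from functools import reduce

def run_pipeline(data, steps):
    # Feed data through each step in order; every step takes one argument
    # and returns the input for the next step.
    return reduce(lambda acc, step: step(acc), steps, data)

# Hypothetical stand-ins for the real steps.
def parse_data(values):
    # Stand-in for converting and sorting by date.
    return sorted(values)

def add_new_column(values):
    # Stand-in for adding a column: pair each value with its double.
    return [(v, v * 2) for v in values]

raw = [3, 1, 2]

# Run every step.
with_column = run_pipeline(raw, [parse_data, add_new_column])

# Skip add_new_column by leaving it out of the list.
without_column = run_pipeline(raw, [parse_data])
```

Skipping a step then means removing it from the list, provided the later steps do not depend on what the skipped step would have added.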