How to apply Euclidean distance to each row of a DataFrame

Question

Please help. I have been stuck on this problem for about two weeks and still can't figure it out.

I want to use "apply" on a DataFrame that I got from the Alphavantage API, so that the Euclidean distance is calculated for each row of the DataFrame.

import math
import numpy as np
import pandas as pd
from scipy.spatial import distance
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from sklearn.neighbors import KNeighborsRegressor
from alpha_vantage.timeseries import TimeSeries
from services.KEY import getApiKey


ts = TimeSeries(key=getApiKey(), output_format='pandas')

This is the chart I get:

My chart (sorry, I can't post the image because of my reputation)

My code:

stock, meta_data = ts.get_daily_adjusted(symbol, outputsize='full')
stock = stock.sort_values('date')

open = stock['1. open'].values
low = stock['3. low'].values
high = stock['2. high'].values
close = stock['4. close'].values
sorted_date = stock.index.get_level_values(level='date')

stock_numpy_format = np.stack((sorted_date, open, low
                               ,high, close), axis=1)
df = pd.DataFrame(stock_numpy_format, columns=['date', 'open', 'low', 'high', 'close'])

df = df[df['open']>0]
df = df[(df['date'] >= "2016-01-01") & (df['date'] <= "2018-12-31")]
df = df.reset_index(drop=True)

df['close_next'] = df['close'].shift(-1)
df['daily_return'] = df['close'].pct_change(1)
df['daily_return'] = df['daily_return'].fillna(0)
# Selecting several columns needs a list, hence the double brackets
stock_numeric_close_dailyreturn = df[['close', 'daily_return']]
stock_normalized = (stock_numeric_close_dailyreturn - stock_numeric_close_dailyreturn.mean()) / stock_numeric_close_dailyreturn.std()

euclidean_distances = stock_normalized.apply(lambda row: distance.euclidean(row, date_normalized) , axis=1)
distance_frame = pd.DataFrame(data={"dist": euclidean_distances, "idx":euclidean_distances.index})
distance_frame.sort_values("dist", inplace=True)
second_smallest = distance_frame.iloc[1]["idx"]
most_similar_to_date = df.loc[int(second_smallest)]["date"]
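
For comparison, here is a minimal, self-contained sketch of the same steps. It assumes synthetic prices in place of the Alphavantage data, and it assumes "2016-06-29" (the date used in the reference code further down) as the reference day, since `date_normalized` is not defined in the snippet above. The reference row is taken as a 1-D Series with `.iloc[0]` so that `distance.euclidean` receives two vectors.

```python
import pandas as pd
from scipy.spatial import distance

# Synthetic stand-in for the Alphavantage data (values are made up for illustration).
df = pd.DataFrame({
    "date": pd.date_range("2016-06-27", periods=6, freq="D"),
    "close": [10.0, 10.5, 10.4, 11.0, 10.6, 10.45],
})
df["daily_return"] = df["close"].pct_change(1).fillna(0)

# Double brackets select several columns at once.
features = df[["close", "daily_return"]]
stock_normalized = (features - features.mean()) / features.std()

# Reference row as a 1-D Series (the date is an assumption for this sketch).
date_normalized = stock_normalized[df["date"] == "2016-06-29"].iloc[0]

# Distance from every row to the reference row.
euclidean_distances = stock_normalized.apply(
    lambda row: distance.euclidean(row, date_normalized), axis=1
)
distance_frame = pd.DataFrame(
    {"dist": euclidean_distances, "idx": euclidean_distances.index}
).sort_values("dist")

# The first entry is the reference day itself (distance 0),
# so the second entry is the most similar other day.
second_smallest = distance_frame.iloc[1]["idx"]
most_similar_to_date = df.loc[int(second_smallest), "date"]
print(most_similar_to_date)
```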

I want my chart to look like this:

The chart that I want

This is the code that produced that chart:

distance_columns = ['Close', 'DailyReturn']
stock_numeric = stock[distance_columns]
stock_normalized = (stock_numeric - stock_numeric.mean()) / stock_numeric.std()
stock_normalized.fillna(0, inplace = True)
date_normalized = stock_normalized[stock["Date"] == "2016-06-29"]
euclidean_distances = stock_normalized.apply(lambda row: distance.euclidean(row, date_normalized), axis = 1)
distance_frame = pandas.DataFrame(data = {"dist": euclidean_distances, "idx": euclidean_distances.index})
distance_frame.sort_values("dist", inplace=True)
second_smallest = distance_frame.iloc[1]["idx"]
most_similar_to_date = stock.loc[int(second_smallest)]["Date"]

I tried to figure it out: it seems that "apply" behaves differently on the DataFrame I build from the API data than on a DataFrame loaded with the pandas CSV reader. Is there an alternative that gives the same output for both sources (API DataFrame and CSV)?
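
One way to check this, sketched below with made-up numbers: build a small DataFrame in memory, push it through a CSV round trip with `pd.read_csv`, and run the same `apply` on both. The helper `distances_to_first_row` is a hypothetical name introduced only for this example. If the two results agree, any difference in the real data probably comes from the column dtypes or values rather than from `apply` itself, so comparing `df.dtypes` for both sources may be worth a look.

```python
import io

import numpy as np
import pandas as pd


def distances_to_first_row(frame):
    """Euclidean distance from every row to the first row, computed with apply."""
    reference = frame.iloc[0]
    return frame.apply(lambda row: np.sqrt(((row - reference) ** 2).sum()), axis=1)


# Hypothetical numeric data built directly in memory.
in_memory = pd.DataFrame(
    {"close": [10.0, 10.5, 10.4], "daily_return": [0.0, 0.05, -0.01]}
)

# The same data after being written to CSV text and read back.
from_csv = pd.read_csv(io.StringIO(in_memory.to_csv(index=False)))

same_dtypes = list(in_memory.dtypes) == list(from_csv.dtypes)
same_distances = np.allclose(
    distances_to_first_row(in_memory), distances_to_first_row(from_csv)
)
print(same_dtypes, same_distances)
```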

Thank you!

N.B.: Sorry if my English is bad.

1) What are stock_normalized, df and stock? 2) What is the actual question: are you getting ts with output_format='csv'? 3) Most likely you import pandas as pd, and the CSV reader from pandas generates a DataFrame, so it should have the exact same apply. Are you sure your predicted saturated value (limited to at most +/- 2450) is coming from this part of the code? – B. Go, Apr 11 '19 at 17:58
Any reason why you need to use `.apply()` at all? `distance.euclidean` should also work on a vector. Better yet, use `numpy.linalg.norm(a-b)` as per [this question](https://stackoverflow.com/questions/1401712/how-can-the-euclidean-distance-be-calculated-with-numpy/21986532) – Aditya Santoso, Apr 12 '19 at 06:18
Also, I think it's best to clarify that you have several questions lumped into one. On one hand, you have a problem with `.apply()`; on the other hand, you have an issue with your chart not looking like the example. – Aditya Santoso, Apr 12 '19 at 06:23
OK, I get it. In the end it was because of the stock I used. I tried several different stock codes and some of them give good results. Thank you, Mr. @B.Go – Julius Tanuwijaya, Apr 12 '19 at 15:15
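
The `numpy.linalg.norm` suggestion from the comments can be sketched as follows. The normalized values are made up, and the choice of row 1 as the reference day is an assumption for illustration. Subtracting the reference row broadcasts across all rows, and `axis=1` gives one norm per row, so no `.apply()` is needed.

```python
import numpy as np
import pandas as pd

# Hypothetical already-normalized features (values invented for this sketch).
stock_normalized = pd.DataFrame(
    {"close": [-1.0, 0.0, 0.5, 2.0], "daily_return": [0.0, 1.0, 0.8, -1.0]}
)

# Reference row as a 1-D NumPy array (row 1 is an arbitrary choice).
reference = stock_normalized.iloc[1].to_numpy()

# Broadcasting subtracts the reference from every row; axis=1 yields one distance per row.
dist = np.linalg.norm(stock_normalized.to_numpy() - reference, axis=1)
euclidean_distances = pd.Series(dist, index=stock_normalized.index)

# The smallest distance belongs to the reference row itself, so take the second entry.
nearest_other = euclidean_distances.sort_values().index[1]
print(nearest_other)
```

This should give the same distances as the row-wise `apply`, just computed in one array operation.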
