I'm working on a project and have run into a problem with the Hilbert curve. Here are the details: I'm trying to take one column of 50 numerical values from a pandas DataFrame and turn it into an image by plotting the 50 values onto an order-4 Hilbert curve. The issue is that some values are plotted at the same coordinates, so the earlier value is overwritten by the latest one. No data can be lost, and colliding values cannot be summed or averaged either. I will use these images in deep learning image classification models. Is there a solution or method that prevents the data from overlapping?
I was advised to scale and pad the data. I have already scaled all the values, which reduced the overlap, but most of it remains. I also tried padding the data by adding rows between all the original rows, but it didn't work; the overlap is still there. Here is my code for plotting the Hilbert curve:
import numpy as np
from hilbertcurve.hilbertcurve import HilbertCurve
from sklearn.preprocessing import MinMaxScaler

def plot_hilbert_curve(hilbert_order, df, column):
    p = hilbert_order
    N = 2 ** p                        # image side length (16 for p = 4)
    hilbert_curve = HilbertCurve(p, 2)
    scaler = MinMaxScaler()
    image = np.zeros((N, N))
    scale_df = df.copy()
    corrdinate = []
    # Each value is used directly as the distance along the curve
    for value in df[column]:
        corr = list(hilbert_curve.point_from_distance(value))
        corrdinate.append(corr)
    # Pixel intensities: the same column rescaled to [0, 1]
    scale_df[column] = scaler.fit_transform(scale_df[column].values.reshape(-1, 1))
    count_n = 1
    for i, corr in enumerate(corrdinate):
        new_value = scale_df[column].iloc[i]
        current_value = image[corr[0]][corr[1]]
        # Report every write that changes the value already stored in the cell
        if new_value != current_value:
            print(f'{count_n} At coordinates {corr}, current value is {current_value} and new value is {new_value}')
            count_n += 1
        image[corr[0]][corr[1]] = new_value
    return image
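To narrow down where the overlap comes from, here is a minimal sketch of a check. It relies on one observation about the function above: the value itself is passed as the distance along the curve, and each distance corresponds to exactly one cell, so two rows can only land on the same cell when their distances are equal. The helper name `colliding_distances` and the sample list are hypothetical, standing in for the scaled column.

```python
from collections import Counter

def colliding_distances(distances):
    """Return {distance: number of rows using it} for distances used more than once."""
    counts = Counter(distances)
    return {d: n for d, n in counts.items() if n > 1}

# Hypothetical sample standing in for the scaled 'count' column.
sample = [0, 17, 17, 34, 255, 34, 17]
print(colliding_distances(sample))    # {17: 3, 34: 2}

# Distinct distances, such as row positions 0..49, never collide.
print(colliding_distances(range(50)))  # {}
```

If this reports repeated distances for the real column, scaling or padding alone would not be expected to separate them, because rows with equal values still get equal distances.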
Here is my data-to-image code:
def hilbert_to_image(df_input, row, p, multi, padding, value):
    # Keep the first `row` rows and add padding rows between them
    df = add_row_padding(df_input[:row], padding)
    columns = df.columns
    # Scale values to [0, 2^p - 1]; multiplying by multi (17 for p = 4)
    # then stretches them to [0, 255], which is 2^(2p) - 1
    scaler = MinMaxScaler(feature_range=(0, (2 ** p) - 1))
    df.loc[:, columns] = scaler.fit_transform(df.loc[:, columns])
    df = df.multiply(multi)
    df = df[columns]
    image_show(plot_hilbert_curve(p, df, value), value)
And this is the code I start from:
row = 50
p = 4
multi = 17
padding = 1
value = 'count'
df_load = read_newDB()  # My dataframe
hilbert_to_image(df_load, row, p, multi, padding, value)
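For comparison, here is a sketch of one possible alternative, not a confirmed fix: use the row position as the distance along the curve and keep the scaled value only as the pixel intensity. An order-4 curve has 256 cells, so 50 rows fit without sharing a cell. The names `hilbert_d2xy` and `values_to_hilbert_image` are hypothetical, and the conversion is the common iterative distance-to-coordinate routine written in plain Python rather than the `hilbertcurve` package, so the curve's orientation may differ from what `point_from_distance` returns.

```python
def hilbert_d2xy(order, d):
    """Map distance d along a 2-D Hilbert curve of the given order to (x, y)."""
    n = 2 ** order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def values_to_hilbert_image(values, order, fill=0.0):
    """Place values[i] at distance i on the curve; intensities are min-max scaled to [0, 1]."""
    n = 2 ** order
    if len(values) > n * n:
        raise ValueError("more values than cells on the curve")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0           # avoid dividing by zero for a constant column
    image = [[fill] * n for _ in range(n)]
    for i, v in enumerate(values):
        x, y = hilbert_d2xy(order, i)  # the row index, not the value, picks the cell
        image[x][y] = (v - lo) / span
    return image

# Hypothetical data: 50 values, including repeats, on an order-4 (16 x 16) grid.
values = [10, 10, 25, 40, 25] * 10
image = values_to_hilbert_image(values, 4)
```

With this layout, repeated values end up in different cells because their row positions differ. One caveat of the sketch: the smallest value scales to 0.0, which matches the default background, so a different `fill` (for example a negative number) may be preferable if empty cells need to be told apart.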