Overlapping data processing using Hilbert's curve

Question

I'm working on a project and have run into a problem with the Hilbert curve. I'm trying to convert 1 column x 50 rows of numerical data in a pandas DataFrame into an image by plotting the data (50 values) onto an order-4 Hilbert curve. The issue is that some of the values are plotted at the same coordinates, so the earlier value is overwritten by the latest one. No data can be lost, and colliding values cannot be summed or averaged. I will use these images in image-classification deep learning models. Is there any solution or method that prevents the overlapping?

I was advised to scale and pad the data. I have already scaled all the values, which reduced the overlap, but most of the overlaps remain. I also tried padding the data by adding rows between all the original rows, but it didn't work; the overlap still remains. Here is my Hilbert curve plotting code:

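import numpy as np
from hilbertcurve.hilbertcurve import HilbertCurve
from sklearn.preprocessing import MinMaxScaler

def plot_hilbert_curve(hilbert_order, df, column):
    p = hilbert_order
    N = 2 ** p  # side length of the image

    hilbert_curve = HilbertCurve(p, 2)
    scaler = MinMaxScaler()
    image = np.zeros((N, N))
    scale_df = df.copy()
    corrdinate = []

    # Each (already scaled) data value is used as the distance along the curve,
    # so the value itself decides which pixel it lands on.
    for value in df[column]:
        corr = list(hilbert_curve.point_from_distance(value))
        corrdinate.append(corr)

    # Rescale the column to [0, 1] to get the pixel intensities.
    scale_df[column] = scaler.fit_transform(scale_df[column].values.reshape(-1,1))
    count_n=1
    for i, corr in enumerate(corrdinate):

        new_value = scale_df[column].iloc[i]
        current_value = image[corr[0]][corr[1]]
        # Report every write where the cell content differs from the incoming value
        # (this also fires on the first write to an empty cell unless the value is 0).
        if new_value != current_value:
            print(f'{count_n} At coordinates {corr}, current value is {current_value} and new value is {new_value}')
            count_n+=1
        # The latest value overwrites whatever was stored at these coordinates.
        image[corr[0]][corr[1]] = new_value

    return image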
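
The overlap described above follows from using the data value itself as the curve distance: each integer distance corresponds to exactly one cell, so two rows collide whenever their scaled values reduce to the same integer. The following is a minimal sketch of that effect, assuming the distance is truncated to an integer before the lookup; `count_collisions` and the sample values are hypothetical and not part of the code above.

```python
from collections import Counter

def count_collisions(distances):
    """Count values that land on a cell already used by an earlier value."""
    # Assumption: a float distance is truncated to an integer cell index.
    cells = Counter(int(d) for d in distances)
    return sum(n - 1 for n in cells.values() if n > 1)

# Hypothetical scaled values: 3.2 and 3.9 share cell 3; 17.0, 17.4, 17.8 share cell 17.
sample = [0.0, 3.2, 3.9, 17.0, 17.4, 17.8, 255.0]
print(count_collisions(sample))  # prints 3
```

Under this reading, rescaling can only spread the values out; rows with equal or near-equal values still share a cell no matter how the range is stretched.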

Here's my data-to-image code:

def hilbert_to_image(df_input, row, p, multi, padding, value):
    df = add_row_padding(df_input[:row], padding)
    columns = df.columns
    
    scaler = MinMaxScaler(feature_range=(0, (2 ** p) - 1))  # scale values to the range [0, 2^p - 1]; multiplying by `multi` below stretches the range further
    
    df.loc[:, columns] = scaler.fit_transform(df.loc[:, columns])
    df = df.multiply(multi)
    df = df[columns]
    
    image_show(plot_hilbert_curve(p, df, value), value)

And this is my starting code:

row = 50
p = 4
multi = 17
padding= 1
value = 'count'

df_load = read_newDB() # My dataframe

hilbert_to_image(df_load, row, p, multi, padding, value)
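
One possible way to avoid the overlap, sketched here under stated assumptions rather than offered as a definitive fix, is to use the row index as the distance along the curve and keep the data value only as the pixel intensity. An order-4 curve has 256 cells, so 50 rows each get their own cell. The helper names `d2xy` and `values_to_hilbert_image` are hypothetical; `d2xy` is the standard iterative distance-to-coordinate conversion, and its orientation may differ from the one used by the `hilbertcurve` package.

```python
def d2xy(n, d):
    """Map distance d in [0, n*n) to (x, y) on an n x n Hilbert curve (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so the sub-curves connect end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def values_to_hilbert_image(values, order):
    """Place the i-th value at the i-th cell along the curve; no two rows share a cell."""
    n = 2 ** order
    if len(values) > n * n:
        raise ValueError("more values than cells on the curve")
    image = [[0.0] * n for _ in range(n)]  # plain nested lists; unused cells stay 0
    for i, v in enumerate(values):
        x, y = d2xy(n, i)
        image[x][y] = v
    return image

# Hypothetical stand-in for the 50 scaled values of the 'count' column.
values = [float(i + 1) for i in range(50)]
image = values_to_hilbert_image(values, 4)
```

The nested list can be passed to `np.array(image)` if a NumPy array is needed. Unused cells stay at 0, so if 0 is also a valid data value, scaling the data to a range that excludes 0 keeps the two cases distinguishable.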

Your question needs, at a minimum, sample input, expected output, and actual output in order to reproduce your problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. — itprorh66, Jul 08 '23 at 14:45
