0

I have a pandas DataFrame. One of its columns contains variable-length float arrays. I need to convert them to uint8 arrays because they actually hold grayscale images with values from 0 to 255. Currently each array is one-dimensional. I understand that I could iterate over the rows and convert each array in a loop, but I hope there is an out-of-the-box solution, since this task seems common. I also tried df['grayscale255'] = df['grayscale255'].astype('uint8'), but it doesn't work because of this error:

TypeError: only size-1 arrays can be converted to Python scalars
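
For reference, a minimal sketch that should reproduce the failure. The two-row frame is a hypothetical stand-in for the real data, and the exact exception type and message may differ between pandas/NumPy versions, so the sketch catches both TypeError and ValueError:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: one 1-D float array per row,
# with different lengths so the column cannot be a regular 2-D block.
df = pd.DataFrame({
    "grayscale255": [
        np.array([10.5, 200.0]),
        np.array([0.0, 128.2, 255.0]),
    ]
})

# The column has dtype object: each cell just references an ndarray.
print(df["grayscale255"].dtype)

# Series.astype tries to turn every cell into a single uint8 scalar,
# which cannot work for arrays holding more than one element.
try:
    df["grayscale255"].astype("uint8")
    failed = False
except (TypeError, ValueError) as exc:
    failed = True
    print(type(exc).__name__, exc)
```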

Data snippet: [image not included]

Dmitry
  • Please include a _small_ subset of your data as a __copyable__ piece of code that can be used for testing as well as your expected output for the __provided__ data. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888). – Henry Ecker May 29 '21 at 15:07
  • to kumar: apply('uint8') causes the same error, I think because this is actually a df column and not an array. Sorry, it's not possible to add a comment to your answer, so I had to downvote – Dmitry May 29 '21 at 15:09
  • to Henry: added – Dmitry May 29 '21 at 15:13

2 Answers

2

Use apply + astype, which converts each array individually:

df['grayscale255'] = df['grayscale255'].apply(lambda x: x.astype('uint8'))

Or apply np.ubyte:

df['grayscale255'] = df['grayscale255'].apply(np.ubyte)

Resulting df:

   random_len         grayscale255
0           4    [72, 195, 17, 79]
1           3        [70, 97, 198]
2           4  [161, 129, 163, 48]
3           2            [152, 22]
4           3        [40, 23, 175]
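
A self-contained sketch of the apply + astype approach, for checking the result. The three-row frame and the names `converted` and `rounded` are made up for illustration. One thing worth knowing: astype('uint8') truncates the fractional part rather than rounding, so if nearest-integer values are wanted, rounding (and clipping to 0..255) first is one option:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: variable-length float arrays in one column.
df = pd.DataFrame({
    "grayscale255": [
        np.array([72.7, 195.9, 17.9, 79.2]),
        np.array([70.8, 97.3, 198.2]),
        np.array([152.9, 22.2]),
    ]
})

# Each cell is its own ndarray, so convert cell by cell with apply.
converted = df["grayscale255"].apply(lambda a: a.astype("uint8"))

# Optional variant: round to the nearest integer and clip to the
# valid uint8 range before casting, instead of truncating.
rounded = df["grayscale255"].apply(
    lambda a: np.clip(np.rint(a), 0, 255).astype("uint8")
)

print(converted.iloc[0].dtype)  # uint8
print(converted.iloc[0])        # [ 72 195  17  79]
print(rounded.iloc[0])          # [ 73 196  18  79]
```

The column itself still has dtype object after the conversion. Only the arrays stored in its cells become uint8.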

Sample Data:

import numpy as np
import pandas as pd

np.random.seed(5)

df = pd.DataFrame({'random_len': np.random.randint(1, 5, size=5)})
df['grayscale255'] = df['random_len'].apply(
    lambda x: np.random.random(size=x) * 200
)

Original df (before conversion):

   random_len                                       grayscale255
0           4  [72.74737942011788, 195.8889998322892, 17.9642...
1           3  [70.82760858300028, 97.32759937167242, 198.164...
2           4  [161.6563366066534, 129.89177664881987, 163.89...
3           2           [152.87452173223647, 22.180152347812744]
4           3  [40.830949566118434, 23.81907149565208, 175.58...
Henry Ecker
-1

df['grayscale255'] = df['grayscale255'].apply('uint8')

Try this

  • Take a look at the accepted answer, which came in around the same time as yours. It offers the same basic guidance, but with the level of detail and context we hope for on Stack Overflow. – Jeremy Caney May 30 '21 at 04:43