I have a file.xlsx with numbers in one column. I want to convert it into a list in Python. The problem is that when I run the code, the output is:

[[123], [234], [345], ...]

but I want it to be

["123", "234", "345", ...]

How can I "delete" the additional " [] "?

import pandas as pd


excel_file = pd.read_excel(r'C:\xxx\xxx\xxx\file.xlsx')
excel_list = excel_file.values.tolist()

print(excel_list)
martineau
Next_adik

3 Answers

You can unwrap the 1-item lists with a list comprehension that takes the first element (index 0) of each inner list:

excel_list = [[123], [234], [345]]  # stand-in for your current reading code

excel_list = [item[0] for item in excel_list]  # unwrap the lists
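
The question asks for strings rather than numbers, so the same comprehension can also convert each value on the way out. A minimal sketch, assuming every row holds exactly one value (the name `string_list` is only for illustration):

```python
# Stand-in for the nested list produced by excel_file.values.tolist()
excel_list = [[123], [234], [345]]

# Unwrap each one-item row and convert the number to a string
string_list = [str(row[0]) for row in excel_list]

print(string_list)  # ['123', '234', '345']
```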

However, it's better to tell Pandas that the first row is a header (if it is) and then select that column by name:

import pandas as pd

excel_file = pd.read_excel(r'so71568988.xlsx', header=0)  # header=0 (the default): the first row holds the column names
print(excel_file)
excel_list = excel_file["values"].tolist()
print(excel_list)

outputs

   values
0     123
1     456
2     790
3   35932
[123, 456, 790, 35932]

without extra unwrapping.
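
If you want to try this without an Excel file, here is a rough sketch that builds the same frame in memory. The column name `values` is just the one from my test file, so yours will probably differ; selecting by position with `iloc` avoids depending on the name, and `astype(str)` gives the strings the question asks for.

```python
import pandas as pd

# In-memory stand-in for the sheet; "values" is an assumed column name
df = pd.DataFrame({"values": [123, 456, 790, 35932]})

by_name = df["values"].tolist()                  # select the column by its header
by_position = df.iloc[:, 0].tolist()             # select the first column, whatever it is called
as_strings = df["values"].astype(str).tolist()   # same values, converted to strings

print(by_name)
print(by_position)
print(as_strings)
```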

AKX

You can also flatten the array before converting it to a list.

import pandas as pd


excel_file = pd.read_excel(r'C:\xxx\xxx\xxx\file.xlsx')
excel_list = excel_file.values.flatten().tolist()

print(excel_list)
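
One thing to keep in mind if the sheet ever has more than one column: flattening walks the array row by row, so values from different columns end up interleaved. A small sketch with a made-up two-column frame (the column names `a` and `b` are hypothetical):

```python
import pandas as pd

# Hypothetical two-column frame standing in for a wider sheet
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Default (row-major) order: row 0 first, then row 1
flat = df.to_numpy().flatten().tolist()
print(flat)  # [1, 3, 2, 4]

# Column-major order keeps each column's values together
by_column = df.to_numpy().flatten(order="F").tolist()
print(by_column)  # [1, 2, 3, 4]
```

With a single column both orders give the same list, so this only matters for wider sheets.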
AKX
Olca Orakcı

You can use .flatten() on the DataFrame converted to a NumPy array:

df.to_numpy().flatten().tolist()  # .tolist() turns the flattened NumPy array into a plain Python list

Please refer to the original thread here - python pandas flatten a dataframe to a list

Docs link - https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.ndarray.flatten.html

mohammed_ayaz