
I have some messy data that looks like this:

10 130 31 11 106 27 12 105 31 14 100 24 15 60 24 16 70 25 17 65 18 18 60 23 20 48 21 22 28 14 24 12 18 25 2 14 26 1 10 30 0 9

I want to group the values in threes, so that they look like this:

10, 130, 31    
11, 106, 27   
12, 105, 31  
14, 100, 24  
15, 60, 24  
16, 70, 25  
...

I tried mystr.split() to get a list, and then defined the following function to group the items in threes:

def dataframing(df):
    nd = ["["]
    a = 0
    for data in df:
        if a < 3:
            nd += str(data)
            a = a+1
        else:
            nd+= str("],[")
            a = 0
    nd  += str("]")
            
    return nd

I was pretty sure that was going to work; however, I got the following output:

['[',
 '1',
 '0',
 '1',
 '3',
 '0',
 '3',
 '1',
 ']',
 ',',
 '[',
 '1',
 '0',
 '6',
 '2',
 '7',
 '1',
 '2',
 ']',
 ',',
 '[',
 '3',
 '1',
 '1',
 '4',
 '1',
 '0',
 '0',
 ']',
 ',',
 '[',
 '1',
 '5',
 '6',
 '0',
 '2',
 '4',
 ']',
 ',',
 '[',
 '7',
 '0',
 '2',
 '5',
 '1',
 '7',
 ']',
 ',',
 '[',
 '1',
 '8',
 '1',
 '8',
 '6',
 '0',
 ']',
 ',',
 '[',
 '2',
 '0',
 '4',
 '8',
 '2',
 '1',
 ']',
 ',',
 '[',
 '2',
 '8',
 '1',
 '4',
 '2',
 '4',
 ']',
 ',',
 '[',
 '1',
 '8',
 '2',
 '5',
 '2',
 ']',
 ',',
 '[',
 '2',
 '6',
 '1',
 '1',
 '0',
 ']',
 ',',
 '[',
 '0',
 '9',
 ']']

I don't know why, but I guess my code treated every digit as an individual item rather than as part of a number. Now I'm stuck; any alternative solutions would be greatly appreciated.

Mayank Porwal
Yekta Aktaş
  • In what format is the original data? A string with spaces? – User1010 Jan 23 '21 at 09:53
  • @User1010 Actually, it was in PDF format, but when I copied and pasted the data into Excel, it filled a single cell with all the data, separated by spaces – Yekta Aktaş Jan 23 '21 at 09:59

2 Answers


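Try a list comprehension. Also, avoid list += string: += extends the list with every element of the right-hand side, so a string gets added one character at a time. Use list.append() to add a single item, or list.extend() to add all the items of another list.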

l = "10 130 31 11 106 27 12 105 31 14 100 24 15 60 24 16 70 25 17 65 18 18 60 23 20 48 21 22 28 14 24 12 18 25 2 14 26 1 10 30 0 9"
l = l.split(" ")
l = [int(i) for i in l]

l = [10,130,31,11,106,27,12,105,31,14,100,24,15,60,24,16,70,25,17,65,18,18,60,23,20,48,21,22,28,14,24,12,18,25,2,14,26,1,10,30,0,9]

grouped = [l[i: i+4] for i in range(0,len(l), 4)]
# Use a step of 4 to group in fours, or 3 to group in threes.
# For example, to group in threes:
# grouped = [l[i: i+3] for i in range(0,len(l), 3)]

Output (grouped in fours):

[[10, 130, 31, 11],
 [106, 27, 12, 105],
 [31, 14, 100, 24],
 [15, 60, 24, 16],
 [70, 25, 17, 65],
 [18, 18, 60, 23],
 [20, 48, 21, 22],
 [28, 14, 24, 12],
 [18, 25, 2, 14],
 [26, 1, 10, 30],
 [0, 9]]
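
As for why the loop in the question split the numbers into digits, here is a minimal sketch of the difference between += and append(). The function name group_in_threes is made up for illustration and is not from the question. Note that the original loop also skips every fourth item, because nothing is added in the else branch.

```python
# += on a list extends it with each element of the right-hand side.
# A string is iterable, so its characters are added one at a time.
digits = []
digits += "130"
print(digits)  # ['1', '3', '0']

# append() adds the whole string as a single item.
whole = []
whole.append("130")
print(whole)  # ['130']


def group_in_threes(items):
    """Hypothetical rewrite of the question's loop using append()."""
    rows, row = [], []
    for item in items:
        row.append(item)       # keep the item whole
        if len(row) == 3:      # row is full, start a new one
            rows.append(row)
            row = []
    if row:                    # keep a trailing partial row, if any
        rows.append(row)
    return rows


print(group_in_threes("10 130 31 11 106 27".split()))
# [['10', '130', '31'], ['11', '106', '27']]
```

The list comprehension above does the same job in one line; the explicit loop is only meant to show where the original went wrong.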
Epsi95

Assuming the input is a space-separated string, as it appears in the question, there is no need to use pandas:

data = "10 130 31 11 106 27 12 105 31 14 100 24 15 60 24 16 70 25 17 65 18 18 60 23 20 48 21 22 28 14 24 12 18 25 2 14 26 1 10 30 0 9"

# Split the string into a list of tokens
data_array = data.split()

print(data_array)


# Chunking function, ref: https://stackoverflow.com/a/17483656/8458712
def chunks(array, n):
    return [array[i:i+n] for i in range(0, len(array), n)]


print(chunks(data_array, 3))

Output of chunks(data_array, 3):

[['10', '130', '31'], 
 ['11', '106', '27'], 
 ['12', '105', '31'], 
 ['14', '100', '24'], 
 ['15', '60', '24'],
 ['16', '70', '25'], 
 ['17', '65', '18'], 
 ['18', '60', '23'], 
 ['20', '48', '21'], 
 ['22', '28', '14'], 
 ['24', '12', '18'],
 ['25', '2', '14'], 
 ['26', '1', '10'], 
 ['30', '0', '9']]
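
If the goal is the comma-separated layout shown in the question, with real integers rather than strings, something along these lines might work. This is only a sketch: it uses a shortened copy of the data, and the variable names numbers, rows and lines are chosen here for illustration.

```python
# Shortened sample of the question's data (first nine values only)
data = "10 130 31 11 106 27 12 105 31"

# Convert each token to an int, then slice the list into rows of three
numbers = [int(tok) for tok in data.split()]
rows = [numbers[i:i + 3] for i in range(0, len(numbers), 3)]

# Join each row back into a "a, b, c" line for display
lines = [", ".join(str(n) for n in row) for row in rows]
print("\n".join(lines))
# 10, 130, 31
# 11, 106, 27
# 12, 105, 31
```

Keeping rows as lists of integers means the values can still be used for arithmetic or sorting; the string formatting happens only at the printing step.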
User1010