Group rows by their first column and print each row's last item below the group name

Question

Sort the row names and print the corresponding values below them.

The text file contains:

x 1 asd
x 2 asd
x 3 asd
x 4 asd
x 5 asd
x 5 asd
x 7 asd
b 8 axy
b 9 axc

Required output:

x
asd
asd
asd
asd
asd
asd
asd

b
axy
axc

It is not clear if your table always has a single value in the first column and if the second column is always a nice incrementing index. — Edward Ji, May 21 '22 at 11:24
I have changed the title, I'm not sure how to explain this. I just need to sort out the similar row names in a file and then put the rest of the items in the next column below it. — Captain-Robot, May 21 '22 at 11:27
So you want to group by the first column? Please clarify what you're trying to do. — user2314737, May 21 '22 at 11:30
Items in the first column of the above text file can repeat, while the items that go with them differ. So I want to print x only once, and whatever follows x on each row should be printed below x. — Captain-Robot, May 21 '22 at 11:35

user2314737 · Answer 1 · 2022-05-21T20:11:23.620

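Use csv.reader and print the first column only when it changes:

import csv

with open("file.txt", "r", newline="") as f:
    oldx = ''
    for row in csv.reader(f, delimiter=' '):
        if not row:  # skip blank lines
            continue
        newx = row[0]
        if newx != oldx:
            if oldx:
                print()  # blank line between groups, as in the required output
            print(newx)  # print each name once, when it changes
            oldx = newx
        print(row[-1])  # the last column goes below the name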
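
If rows with the same first column are not guaranteed to be adjacent in the file, the loop above would print the same name more than once. A minimal sketch of one way around that, assuming whitespace-separated columns and that names should appear in first-seen order. The function names `group_last_column` and `format_groups` and the sample data are made up for illustration:

```python
def group_last_column(lines):
    """Map each first-column name to the list of its last-column values."""
    groups = {}  # dicts keep insertion order (Python 3.7+)
    for line in lines:
        tokens = line.split()
        if not tokens:  # skip blank lines
            continue
        groups.setdefault(tokens[0], []).append(tokens[-1])
    return groups


def format_groups(groups):
    """Render each name followed by its values, with a blank line between groups."""
    blocks = ["\n".join([name, *values]) for name, values in groups.items()]
    return "\n\n".join(blocks)


# Hypothetical input in which the x and b rows are interleaved.
sample = ["x 1 asd", "b 8 axy", "x 2 asd", "b 9 axc"]
print(format_groups(group_last_column(sample)))
```

To read from a file instead, the open file object can be passed directly as `lines`, since iterating over it yields one line at a time.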

edited May 21 '22 at 20:11

answered May 21 '22 at 11:29


constantstranger · Accepted Answer · 2022-05-21T15:20:21.460

Here's a way to do it:

with open('infile.txt', 'r', encoding="utf-8") as f:
    # split every line into a list of tokens
    rows = [row.split() for row in f.readlines()]
    print('rows:')
    print(rows)
    # top-left name first, then the last token of every row
    column = [rows[0][0]] + [row[-1] for row in rows]
    print('column:')
    print(column)
    with open('outfile.txt', 'w', encoding="utf-8") as g:
        for item in column:
            g.write(f'{item}\n')

# check the output file:
with open('outfile.txt', 'r', encoding="utf-8") as f:
    print('contents of output file:')
    for row in f.readlines():
        print(row.strip('\n'))

Explanation:

Read all lines from the input file and split each into a list of tokens, creating a list of lists named rows.
Create a list named column whose first element is the top-left element of rows, with the remaining elements coming from the rightmost column of rows.
Write the contents of column to the output file one at a time, each on its own line, using \n as the line terminator.
Read and print the output file to check that it contains the desired output (taking care to strip the final \n, since print() appends its own \n).

Output:

rows:
[['x', '1', 'asd'], ['x', '2', 'asd'], ['x', '3', 'asd'], ['x', '4', 'asd'], ['x', '5', 'asd'], ['x', '5', 'asd'], ['x', '7', 'asd']]
column:
['x', 'asd', 'asd', 'asd', 'asd', 'asd', 'asd', 'asd']
contents of output file:
x
asd
asd
asd
asd
asd
asd
asd

UPDATE: Addressing OP's modified question.

with open('infile.txt', 'r', encoding="utf-8") as f:
    rows = [row.split() for row in f.readlines()]
    print('rows:')
    print(rows)
    columns = []
    col = []
    for iRow, row in enumerate(rows):
        # start a new column whenever the first token changes
        if iRow == 0 or rows[iRow - 1][0] != row[0]:
            if iRow > 0:
                columns.append(col)
            col = [row[0]]
        col.append(row[-1])
    columns.append(col)
    print('columns:')
    print(columns)

    with open('outfile.txt', 'w', encoding="utf-8") as g:
        isFirstCol = True
        for column in columns:
            # separate consecutive groups with a blank line
            if isFirstCol:
                isFirstCol = False
            else:
                g.write('\n')
            for item in column:
                g.write(f'{item}\n')

# check the output file:
with open('outfile.txt', 'r', encoding="utf-8") as f:
    print('contents of output file:')
    for row in f.readlines():
        print(row.strip('\n'))

Output:

rows:
[['x', '1', 'asd'], ['x', '2', 'asd'], ['x', '3', 'asd'], ['x', '4', 'asd'], ['x', '5', 'asd'], ['x', '5', 'asd'], ['x', '7', 'asd'], ['b', '8', 'axy'], ['b', '9', 'axc']]
columns:
[['x', 'asd', 'asd', 'asd', 'asd', 'asd', 'asd', 'asd'], ['b', 'axy', 'axc']]
contents of output file:
x
asd
asd
asd
asd
asd
asd
asd

b
axy
axc
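
The run-detection loop above can also be expressed with `itertools.groupby` from the standard library, which groups consecutive items that share a key. This is only a sketch under the same assumption as the code above, namely that rows with the same first column are adjacent. The function name `group_adjacent` and the shortened sample are hypothetical:

```python
from itertools import groupby


def group_adjacent(lines):
    """Return [[name, value, ...], ...] for runs of equal first columns."""
    rows = [line.split() for line in lines if line.strip()]
    return [
        [name] + [row[-1] for row in run]
        for name, run in groupby(rows, key=lambda row: row[0])
    ]


# Shortened, made-up version of the input file.
sample = ["x 1 asd", "x 2 asd", "b 8 axy", "b 9 axc"]
columns = group_adjacent(sample)
print(columns)
# One item per line, with a blank line between groups.
print("\n\n".join("\n".join(col) for col in columns))
```

Because `groupby` only merges neighbouring rows, a name that reappears later in the file would start a second group, exactly as the explicit loop does.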

Thank you, this is working really well. Just one thing: can we do the same if the file has other entries? (I have updated the question, could you please take a look?) — Captain-Robot, May 21 '22 at 12:33
I see you have updated your question significantly. Normally you should ask a new question in such cases. I (and perhaps others on SO) would be happy to help you with the new question, but I have already invested my time into answering your original question, so I will be more likely to help with your new question if you first mark my answer to your original question as accepted (assuming you agree that it did what your first question asked, if not what you intended to ask). — constantstranger, May 21 '22 at 13:28
Feel free to put a link to your new question in the comments here, so I will know that you have asked it, and I will do my best to answer it as well. — constantstranger, May 21 '22 at 13:31
Hello sure bro. I'll post the new question as well. thanks for your help. — Captain-Robot, May 21 '22 at 17:55
https://stackoverflow.com/questions/72332043/remove-duplicates-from-column-names-and-print-crossponding-items — Captain-Robot, May 21 '22 at 17:59
