
The data in the 2nd row should be appended to the end of the 1st row, and so on, so that every group of 10 rows becomes a single row. For example, a 20x10 dataset should become 2x100.

input:
1 - A B C D E F G
2 - H I J K L M N
.
.
.
10 - O P Q R S T U

Output:
1 - A B C D E F G H I J K L M N . . . . . . . . O P Q R S T U

Ravikumar

2 Answers


I realize you tagged your question with Python, but here's a command line way to do it:

xargs -n10 -d'\n' < yourlistfile.txt

Where yourlistfile.txt is the name of the file you want to parse.

The command, as written, prints to the screen. To save the result to a new file instead, append a redirect such as > your_results.txt to the end of the command, e.g.:

xargs -n10 -d'\n' < yourlistfile.txt > reorganizedlistfile.txt

Check out some other ideas in this post: How to merge every two lines into one from the command line?

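Note: some versions of xargs do not support the -d option; for example, I get an error on macOS saying the option is not supported. For the simple case in which each line holds a single token, the delimiter option is not needed anyway, because xargs splits on newlines as well as spaces by default. If the lines contain several space-separated tokens, as in the question, dropping -d makes -n10 count tokens instead of lines.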
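
For systems where xargs lacks -d, the same grouping can be sketched in Python. This is only a minimal sketch under assumptions, not part of the command above: the function name join_every_n_lines and the sample data are made up for illustration, and blank lines are treated as ordinary (empty) lines.

```python
import io
from itertools import islice

def join_every_n_lines(lines, n=10):
    """Yield one string per group of n input lines, joined with single spaces.

    Hypothetical helper for illustration; a final group shorter than n
    is still emitted.
    """
    stripped = (line.strip() for line in lines)
    while True:
        chunk = list(islice(stripped, n))  # take the next n lines, or fewer at the end
        if not chunk:
            break
        yield ' '.join(chunk)

# Stand-in for an open file; for real data use: with open('yourlistfile.txt') as f:
sample = io.StringIO("A B C\nD E F\nG H I\nJ K L\n")
merged = list(join_every_n_lines(sample, n=2))
print(merged)
```

Each element of merged is one combined line, so writing them out one per line gives the same kind of result as the redirected xargs command.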

Marc

Assuming you want to read from a file, and that you might be relatively new to Python, here is some example code for you to look over. I've tried to add enough comments & safety checks to give you an idea of how it works and how you might extend it.

Note that with this Python version you still have to do something with the results. I would strongly consider @Marc's elegant answer if you can just use that.

There are a lot of assumptions to consider here. How certain are you that every line has the same number of things? I added some logic to check for that; I find being modestly paranoid like that to be helpful.

The example program's output is shown first, followed by its source code:

output

line_cnt=1 #things=3, line="a b c"
line_cnt=2 #things=3, line="d e f"
line_cnt=3 #things=3, line="g h i"
gathered 3 into=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
line_cnt=4 #things=3, line="j k l"
line_cnt=5 #things=3, line="m n o"
line_cnt=6 #things=3, line="p q r"
gathered 3 into=['j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r']
line_cnt=7 #things=3, line="s t v"
line_cnt=8 #things=3, line="u w x"
line_cnt=9 #things=3, line="y z 0"
gathered 3 into=['s', 't', 'v', 'u', 'w', 'x', 'y', 'z', '0']
line_cnt=10 #things=3, line="1 2 3"
line_cnt=11 #things=3, line="4 5 6"
line_cnt=12 #things=3, line="7 8 9"
gathered 3 into=['1', '2', '3', '4', '5', '6', '7', '8', '9']
now have 4 rows
rows=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
['j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r']
['s', 't', 'v', 'u', 'w', 'x', 'y', 'z', '0']
['1', '2', '3', '4', '5', '6', '7', '8', '9']

Process finished with exit code 0

source code

import io

def join_dataset(f, nrows=10):
    temp_row = [ ]
    consolidated_rows = [ ]
    expected_row_size = None
    # enumerate(..., start=1) gives one-based line numbering
    for line_cnt, line in enumerate(f, start=1):
        line = line.strip() # remove surrounding whitespace, including the trailing newline
        things = line.split() # create list based on whitespace
        row_size = len(things) # check how long this row's list is
        if expected_row_size is None:
            expected_row_size = row_size # assume all same size as 1st row
        elif row_size != expected_row_size:
            raise ValueError('Expected {} things but found {} on line# {}'.format(expected_row_size, row_size, line_cnt))
        print('line_cnt={} #things={}, line="{}"'.format(line_cnt, len(things), line))
        # read about append vs extend here https://stackoverflow.com/q/252703/5590742
        temp_row.extend(things)
        # check with %, the mod operator, 1%3 = 1, 2%3 = 2, 3%3 = 0 (even division), 4%3 = 1, 5%3 = 2, etc.
        # We're counting lines from 1, so if we get zero we have that many lines
        if 0 == (line_cnt % nrows):
            print('gathered {} into={}'.format(nrows,temp_row))
            # or write gathered to another file
            consolidated_rows.append(temp_row)
            temp_row = [ ] # start a new list
    if temp_row:
        # at end of file, but make sure we include partial results
        # (if you expect perfect alignment this would
        # be another good place for an error check.)
        consolidated_rows.append(temp_row)
    return consolidated_rows

test_lines="""a b c
d e f
g h i
j k l
m n o
p q r
s t v
u w x
y z 0
1 2 3
4 5 6
7 8 9"""
# if your data is in a file use:
# with open('myfile.txt', 'r') as f:
with io.StringIO(test_lines) as f:
    rows = join_dataset(f, nrows=3)
    # rows = join_dataset(f) # this will default to nrows=10
print('now have {} rows'.format(len(rows)))
print('rows={}'.format('\n'.join([str(row) for row in rows])))
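
Since the results still have to go somewhere, one option is to write each consolidated row back out as a single line. The following is a minimal sketch under assumptions: write_rows is a hypothetical helper that is not part of the program above, and example_rows is stand-in data shaped like the list that join_dataset returns.

```python
import io

def write_rows(rows, out):
    """Write each consolidated row as one space-separated line.

    Hypothetical helper; `out` is any writable text file-like object.
    """
    for row in rows:
        out.write(' '.join(row) + '\n')

# Stand-in data shaped like the return value of join_dataset
example_rows = [['a', 'b', 'c', 'd', 'e', 'f'],
                ['g', 'h', 'i', 'j', 'k', 'l']]

# For a real file use: with open('your_results.txt', 'w') as out:
with io.StringIO() as out:
    write_rows(example_rows, out)
    written = out.getvalue()
print(written, end='')
```

Joining with a single space assumes the things themselves contain no whitespace, which matches how the rows were split in the first place.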
jgreve