I have the following sample data. The last column is a combo of the first three columns separated by a '-'. The "Time" column will always be in an increasing order for each unique "Group" defined in the last column. The data is sorted by "Line", "Blk", "Time".
I'm trying to append two columns to this dataset.
Line Blk Trp Time Group
1 1 1494 826 32280 1-1494-826
2 1 1494 865 32340 1-1494-865
3 1 1494 865 32640 1-1494-865
4 1 1494 865 32940 1-1494-865
5 1 1494 865 33180 1-1494-865
6 1 1494 865 33780 1-1494-865
7 1 1494 829 34080 1-1494-829
8 1 1494 829 34800 1-1494-829
9 1 1494 829 35280 1-1494-829
10 1 1494 829 35580 1-1494-829
11 1 1494 829 35880 1-1494-829
12 2 1495 926 32280 2-1495-926
13 2 1495 965 32340 2-1495-965
14 2 1495 965 32640 2-1495-965
15 2 1495 965 32940 2-1495-965
16 2 1495 965 33180 2-1495-965
17 2 1495 965 33780 2-1495-965
18 2 1495 929 34080 2-1495-929
19 2 1495 929 34800 2-1495-929
20 2 1495 929 35280 2-1495-929
21 2 1495 929 35580 2-1495-929
The output after appending the two columns is shown below.
Column "Sqn" always starts at 1 for each unique "Group" and increases by 1 when the "Time" increases in that "Group".
Column "Rtn" will have only two values (D1,D2). The first time I encounter a new "Line" number, this value is D1 for the first unique "Group". Now, within that specific "Line", this value alternates to D2 and back to D1 and so on for each new "Group".
Line Blk Trp Time Group Sqn Rtn
1 1 1494 826 32280 1-1494-826 1 D1
2 1 1494 865 32340 1-1494-865 1 D2
3 1 1494 865 32640 1-1494-865 2 D2
4 1 1494 865 32940 1-1494-865 3 D2
5 1 1494 865 33180 1-1494-865 4 D2
6 1 1494 865 33780 1-1494-865 5 D2
7 1 1494 829 34080 1-1494-829 1 D1
8 1 1494 829 34800 1-1494-829 2 D1
9 1 1494 829 35280 1-1494-829 3 D1
10 1 1494 829 35580 1-1494-829 4 D1
11 1 1494 829 35880 1-1494-829 5 D1
12 2 1495 926 32280 2-1495-926 1 D1
13 2 1495 965 32340 2-1495-965 1 D2
14 2 1495 965 32640 2-1495-965 2 D2
15 2 1495 965 32940 2-1495-965 3 D2
16 2 1495 965 33180 2-1495-965 4 D2
17 2 1495 965 33780 2-1495-965 5 D2
18 2 1495 929 34080 2-1495-929 1 D1
19 2 1495 929 34800 2-1495-929 2 D1
20 2 1495 929 35280 2-1495-929 3 D1
21 2 1495 929 35580 2-1495-929 4 D1
I did the "Sqn" column using for
loop and quickly realized it's highly inefficient. I tried to see if I can use diff
somehow, but got stuck because "Group" column is not an integer.
Any efficient solution for adding the two columns would be appreciated.