I am new to both R and SO, and after figuring out quite a few things in my dataset, I am stuck on this new challenge. I am working on a .csv dataset and I am using R for data cleaning.
As the sample data below shows, the first column is labelled 'District/Subdistrict'. In that column, district names start with an underscore and subdistrict names are written as is. What I need to do is create a new column at the end (column number 5) of my .csv with the label 'District'. I need to know how to use grepl and/or ifelse to populate that new column based on the first column, as follows. I will use the district name <_A> as the example.
The new column should contain the value <_A> for the row of district <_A> and for the rows of the subdistricts under it, such as <B>, <C> and <D>, in the first column. The same should repeat for the other districts, such as the next district <_E> and its subdistricts.
I know how to load the data in R and set the working directory etc. I just need specific help with the code for this output that I am looking for. Even some sort of a generic form would be helpful. Apologies for the shortcomings in this question.
Sample data:
District/Subdistrict X Y Z
_A 10 12 13
B 8 40 15
C 21 22 23
D 32 40 21
_E 24 94 97
F 56 72 12
G 35 23 12
H 54 23 17
Expected output:
District/Subdistrict X Y Z District
_A 10 12 13 _A
B 8 40 15 _A
C 21 22 23 _A
D 32 40 21 _A
_E 24 94 97 _E
F 56 72 12 _E
G 35 23 12 _E
H 54 23 17 _E
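
For reference, here is a minimal base R sketch of one way this output might be produced. It rebuilds the sample data by hand (the names df, name_col and is_district are only illustrative), uses grepl to flag the rows whose first column starts with an underscore, and then uses cumsum to carry each district name down to its subdistricts. It assumes the very first row is always a district row; if it is not, the indexing step would need adjusting.

```r
# Rebuild the sample data from the question.
# check.names = FALSE keeps the "District/Subdistrict" label unchanged.
df <- data.frame(
  `District/Subdistrict` = c("_A", "B", "C", "D", "_E", "F", "G", "H"),
  X = c(10, 8, 21, 32, 24, 56, 35, 54),
  Y = c(12, 40, 22, 40, 94, 72, 23, 23),
  Z = c(13, 15, 23, 21, 97, 12, 12, 17),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

name_col <- df[["District/Subdistrict"]]

# TRUE for district rows (names starting with "_"), FALSE for subdistricts.
is_district <- grepl("^_", name_col)

# cumsum() counts how many districts have been seen so far, so every row
# gets the index of the most recent district above it (or itself).
# Assumption: the first row is a district, so the index is never 0.
df$District <- name_col[is_district][cumsum(is_district)]

print(df$District)
# "_A" "_A" "_A" "_A" "_E" "_E" "_E" "_E"
```

An ifelse-based variant would also be possible: set the new column to the name on district rows and NA elsewhere, then fill the NAs downwards (for example with tidyr::fill or zoo::na.locf). The cumsum version above avoids the extra package.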