I have a large dataframe (approximately 40000 x 500) that I read from a .CSV file. The dataset contains a list of error code combinations, with each combination specifically listing all the unique codes. I'm searching for the rows that contain a certain piece of string, like "name="
, to see if it is known who listed that combination. The problem is that I do not know in which column this may occur, since the 10th column and onwards are irregular per row, as the number of error codes per combination is irregular. This means that after some point the rows are empty, except for the biggest error code combination. I only know that if the row contains the piece of string I'm looking for, it will be in one of the last 6 columns of a row containing any information. If the piece of string I'm looking for is found in a row, I would like an answer in a different column.
I have extensively looked for a solution, but I couldn't find anything that serves my needs. Also, I'm new to R.
[Fake dataset]
Year Problem_ID Number.of.errors X X.1 X.2 X.3 X.4
1 2005 Server 2 X-4555 X-1222 name=Sara
2 2011 Hardware 3 X-8922 X-7644 X-6485
3 2015 Software 4 X-8494 X-4321 X-7452 X-5321 name=John
4 2018 Hardware 2 X-6901 X-1121
5 2007 Server 3 X-7655 X-6555 X-8999 name=Sara
[Desired output]
Year Problem_ID Number.of.errors Name X X.1 X.2 X.3 X.4
1 2005 Server 2 1 X-4555 X-1222 name=Sara
2 2011 Hardware 3 0 X-8922 X-7644 X-6485
3 2015 Software 4 1 X-8494 X-4321 X-7452 X-5321 name=John
4 2018 Hardware 2 0 X-6901 X-1121
5 2007 Server 3 1 X-7655 X-6555 X-8999 name=Sara