I work with a large data frame containing 'sloppy' strings that mix letters, numbers and punctuation, like this:
cnames <- c("X1_1", "X1_12", "X1_9", "X11_9", "X4_12", "X4_2")
R can't order these strings properly
because the required leading zeros are missing.
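To make the problem concrete, here is a minimal illustration. The short vector `x` is made up for the demo, and `method = "radix"` is used only so that the result does not depend on the locale:

```r
# Made-up demo vector, similar to the names above
x <- c("X1_9", "X1_12", "X4_2", "X11_9")

# Character-by-character comparison in the C locale:
# digits sort before "_", and "1" sorts before "9"
sorted <- sort(x, method = "radix")
print(sorted)
# [1] "X11_9" "X1_12" "X1_9"  "X4_2"
```

The order I actually want is "X1_9", "X1_12", "X4_2", "X11_9".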
I used regular expressions to convert them to:
"X01_01", "X01_12", "X01_09", "X11_09", "X04_12", "X04_02"
which took quite a bit of programming (I was a bit rusty on regex)!
I think I am not the only one who faces this problem, so I wondered:
Is there a package that:
- automatically detects which parts of each string consist of numbers
- detects the maximum length of each numeric part
- pads each number with the right number of leading zeros
- returns the strings in a format that can be ordered logically
If no such package exists, maybe I have found a nice case for writing one.
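For reference, the four steps in the list can be sketched in base R roughly as follows. This is only a sketch: `pad_numbers` is a hypothetical helper name (not a function from any package), and it assumes every string contains the same number of digit runs. It is applied here to names like the ones above:

```r
# Hypothetical helper: zero-pad every run of digits so that a plain
# sort of the result gives the logical order.
pad_numbers <- function(x) {
  if (length(x) == 0L) return(x)

  # Step 1: locate every run of digits in each string
  m <- gregexpr("[0-9]+", x)
  runs <- regmatches(x, m)

  # Assumption: all strings have the same number of numeric parts
  n_runs <- unique(lengths(runs))
  stopifnot(length(n_runs) == 1L)
  if (n_runs == 0L) return(x)

  # Step 2: maximum width of each numeric part across all strings
  widths <- vapply(seq_len(n_runs), function(i) {
    max(nchar(vapply(runs, `[`, character(1), i)))
  }, integer(1))

  # Step 3: left-pad each run with zeros up to the width of its position
  regmatches(x, m) <- lapply(runs, function(r) {
    paste0(strrep("0", widths - nchar(r)), r)
  })

  # Step 4: return the padded strings
  x
}

cnames <- c("X1_1", "X1_12", "X1_9", "X11_9", "X4_12", "X4_2")
padded <- pad_numbers(cnames)
print(padded)
# [1] "X01_01" "X01_12" "X01_09" "X11_09" "X04_12" "X04_02"

# Original names in logical order
ordered <- cnames[order(padded)]
print(ordered)
```

The padded strings are only used as a sort key here, so the original names stay untouched.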