I have a dataframe with ~34,000 rows. It has an identifier column containing ~2,300 unique values, each of which is repeated an arbitrary number of times.
I need to recode these values into something shorter. So far, I can only find examples that show how to recode unique values manually, which isn't really practical in this case.
A tibble: 2300 × 1
ID
<dbl>
24650010203
24650010203
24650010203
24650010203
24650010304
24650010405
24650010405
24650010405
24650010405
24650010506
etc...
What's the quickest and easiest way to recode all these values so that rows sharing an identifier all get the same new code? It can be something as simple as an integer range from 0001 to 2300, though I'd like all IDs to have the same number of digits.
E.g.
24650010203 --> 0001
24650010203 --> 0001
24650010203 --> 0001
24650010203 --> 0001
24650010304 --> 0002
24650010405 --> 0003
24650010405 --> 0003
24650010405 --> 0003
24650010405 --> 0003
24650010506 --> 0004
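To make the target concrete, here is a minimal base R sketch of the mapping I'm after. It is not a tested solution for the full data. It assumes the new codes should follow the order in which each ID first appears, and the names `df` and `new_id` are just placeholders for this example.

```r
# Toy data mirroring the sample above (the real data has ~34,000 rows).
df <- data.frame(
  ID = c(24650010203, 24650010203, 24650010203, 24650010203,
         24650010304,
         24650010405, 24650010405, 24650010405, 24650010405,
         24650010506)
)

# match() returns, for each row, the position of its ID among the unique IDs,
# so identical IDs always receive the same integer (1, 2, 3, ...).
idx <- match(df$ID, unique(df$ID))

# Zero-pad to a fixed width of 4 digits, which is enough for ~2,300 codes.
# new_id is a hypothetical column name; the result is a character vector.
df$new_id <- sprintf("%04d", idx)

print(df)
```

Since `new_id` is stored as character to keep the leading zeros, it would need `as.integer()` if numeric codes were wanted later.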