Hello everyone and many thanks in advance for your help,
As can be seen above, I need to match the numerical values in column B to those in column A, so that the number 4193 in column B is paired with the number 1 in column A every time it occurs, the number 15 in column B is paired with the number 2 in column A every time it occurs, and so on (this is just a small sample presented as an example; the real dataset is extremely large). This would be straightforward, except that the two columns have different lengths: column A is much longer than column B.
I've spent hours and hours trying to do this by myself, as well as browsing forums, but I haven't found any similar question on how to get around this problem. Also, because the dataset I'm working with is extremely large, there is no way I could do this manually.
The main idea is to have each number in column B repeated so that it sits next to its corresponding number in column A (as explained previously). I don't know how to do this in code, but logically the idea is either to extend column B to the length of column A, or to put the repeated numbers into a new column, say, column C.
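To make the idea concrete, here is a minimal sketch on toy data. It assumes that column A holds integer codes starting at 1 and that the pairing is positional, i.e. the first value of B belongs to code 1, the second to code 2, and so on. The vectors below are made up for illustration (only 4193 and 15 come from my sample), and the names `C` and `result` are hypothetical.

```r
# Toy stand-ins for the real columns (ASSUMPTION: values are illustrative only)
A <- c(1, 2, 2, 3, 1, 3, 3, 2)   # long column of codes
B <- c(4193, 15, 27)             # short column: B[1] pairs with code 1, B[2] with code 2, ...

# ASSUMPTION: the pairing is positional, so A can be used as an index into B.
# This repeats each value of B once for every occurrence of its code in A.
C <- B[A]

# Two numeric columns of equal length, side by side
result <- data.frame(A = A, C = C)
print(result)
```

If the codes in A are not consecutive integers starting at 1, this positional indexing would not apply as written, and an explicit lookup table of code/value pairs would be needed instead.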
I'd be enormously grateful if anyone could help me out with this. A gist on how to do this in R would be much appreciated.
Many thanks once again!
ADDENDA:
DATA
df <- structure(list(A = c(rep(list(1048575))), B = c(rep(list(10571)))), class = "data.frame", row.names = c(-10571L, -1048575L))
After introducing this data and trying to run the code, I get the following error: "Internal error in df_slice(): Columns must match the data frame size."
CODE
The closest I have got so far to the desired 1048575 observations in Column B is by using the following code:
Result <- match(A, unique(B))
However, the given output is a matrix of 'NA's, while what I'm looking for is to have two numerical columns of 1048575 observations each.
Similarly, this other code does give a numerical output, but it is still in matrix form and the order is incorrect:
Result <- match((A %in% unique(B)), B, table(A)[seq_along(B)])
The desired output should look like this:
Thanks once again!