
I need help in R with applying what unique() does, but to a whole table, row by row. My data are organized by rows, with no column names. The table is big, 130 × 180. What I need is a list of the unique elements in each row, plus how many times each element appears in that row.

example of my data:

Row1, F, M2, E, E, H, E, E, H, E, E, M, 21, E, M, L, L, L, L, L, E, H, L, L, L, L, L, L, L, L, L, L, E, H, L, L, L, L, L, L, L, L, L, E, H, L, L, L, L, L, L, L, L, L  
Row2, A3, A3, V, R, A3, A3, V, R, A3, A3, V, R, A3, A3, R, A3, A3, V, R, A3, A3, V, R, A3, A3, V, V, W, 12, N, N, N, W, N, N, N, 21, N, N, W, N, N, N, 21, N, N, W    
Row3, I, M, A1, A1, H, A1, A1, H, A1, A1, H, A1, A1, H, M, D2, M, L, L, L, L, D2, M, L, L, L, L, D2, M2, G, M2, G, M2, G, R, K, E, R, K, E, R, K, E, R, K, E  
Row4, H, A1, A1, H, A1, A1, H, A1, A1, M, A1, A1, H, A1, A1, A1, W, N, N, 21, N, N, W, N, 21, W, 21, W, W, Q, Q, Q, Q, Q, Q, L, F, D, Q, Q, Q, Q, Q, F, D, Q 

Which can be found as a .txt file here

The correct answer for Row1 would be (plus the frequency of elements, which I don't know how to count):

> unique(Row1)
[1] " F"  " M2" " E"  " H"  " M"  " 21" " L" 

But when I try to apply it to the whole table, it counts by columns, and I need the answer by rows.

Ben Bolker
Divna Djokic
  • I understand that you're frustrated with the closure of your previous question, but it's better practice to try to edit your previous question (and hope for re-opening) rather than post the same question again (if you must, it's polite to link to the previous attempt ...) – Ben Bolker Oct 25 '20 at 19:36
  • Hi Ben, sorry, I actually tried that several times, but nothing seemed to change and the question stayed closed. I was not sure whether that was permanent. Excuse me, I am new to Stack Overflow and still need to learn how things work over here. Thank you for your patience and advice, and sorry for the extra trouble I gave you! – Divna Djokic Oct 25 '20 at 19:49
  • Fair enough. Maybe go back and delete the old question now ... – Ben Bolker Oct 25 '20 at 19:54
  • Fair enough! :) – Divna Djokic Oct 25 '20 at 20:36

1 Answer


We can use apply to loop over the rows of the data and get the unique values in each

apply(df1[-1], 1, unique)
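The second argument of apply is MARGIN: 1 means "by row", 2 means "by column". A minimal sketch on a toy matrix (the values are made up for illustration) shows the difference. It also shows that apply only returns a list when the per-row results differ in length, and otherwise simplifies to a matrix:

```r
# Toy matrix standing in for the real table (illustrative values only)
m <- matrix(c("a", "b", "a",
              "c", "c", "c"),
            nrow = 2, byrow = TRUE)

by_row <- apply(m, 1, unique)  # MARGIN = 1: unique values of each row
by_col <- apply(m, 2, unique)  # MARGIN = 2: unique values of each column

by_row[[1]]  # unique values of row 1
by_row[[2]]  # unique values of row 2

# Rows give results of different lengths, so by_row is a list.
# Every column happens to have 2 unique values, so by_col is a matrix.
is.list(by_row)
is.matrix(by_col)
```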

If there are leading or trailing spaces, use trimws to remove them

apply(df1[-1], 1, function(x) unique(trimws(x)))
#[[1]]
#[1] "F"  "M2" "E"  "H"  "M"  "21" "L" 

#[[2]]
#[1] "A3" "V"  "R"  "W"  "12" "N"  "21" ""  

#[[3]]
# [1] "I"  "M"  "A1" "H"  "D2" "L"  "M2" "G"  "R"  "K"  "E"  ""  

#[[4]]
# [1] "H"  "A1" "M"  "W"  "N"  "21" "Q"  "L"  "F"  "D"  ""  
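The "" entries in rows 2 to 4 look like padding: fill = TRUE pads shorter rows with blank fields so every row has the same number of columns. A sketch of one way to drop them, using a small made-up stand-in for the file; the colClasses argument and the lapply loop are my own choices here, not part of the answer above:

```r
# Small stand-in for the real file; rows have unequal length on purpose
txt <- "Row1, F, M2, E, E, H
Row2, A3, A3, V
Row3, I, M, A1, A1"

# colClasses = "character" is an assumption: it keeps values such as 21 as text
df1 <- read.csv(text = txt, fill = TRUE, header = FALSE,
                colClasses = "character")

cells <- as.matrix(df1[-1])  # drop the row-label column

# lapply always returns a list, even if all rows give equally long results
uniq_by_row <- lapply(seq_len(nrow(cells)), function(i) {
  x <- trimws(unname(cells[i, ]))  # remove leading/trailing spaces
  unique(x[nzchar(x)])             # drop the "" padding, then deduplicate
})
names(uniq_by_row) <- trimws(df1[[1]])

uniq_by_row
```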

If the frequency of each element is needed, use table

apply(df1[-1], 1, table)
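Applied to the raw columns, table counts the untrimmed strings and also counts the "" padding as an element. A hedged variant that trims first and skips the padding might look like this (same made-up stand-in data; the names txt, cells and freq_by_row are illustrative):

```r
# Small stand-in for the real file; rows have unequal length on purpose
txt <- "Row1, F, M2, E, E, H
Row2, A3, A3, V
Row3, I, M, A1, A1"

# colClasses = "character" is an assumption: it keeps every cell as text
df1 <- read.csv(text = txt, fill = TRUE, header = FALSE,
                colClasses = "character")

cells <- as.matrix(df1[-1])  # drop the row-label column

# One frequency table per row, ignoring the "" padding
freq_by_row <- lapply(seq_len(nrow(cells)), function(i) {
  x <- trimws(unname(cells[i, ]))
  table(x[nzchar(x)])
})
names(freq_by_row) <- trimws(df1[[1]])

freq_by_row$Row1       # counts for every element of Row1
freq_by_row$Row1["E"]  # count of a single element
```

The names of each table are the unique elements of that row, so this one result covers both parts of the question.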

data

df1 <- read.csv("Data_exmpl.txt", fill = TRUE, header = FALSE)
akrun
  • Hi Akrun, thank you, this seems to work just fine! Can you please explain to me how you made unique() work by row and not by column? Thank you! – Divna Djokic Oct 25 '20 at 19:25
  • @DivnaDjokic it is the `MARGIN = 1` in `apply` that makes it loop over rows. If you use `2`, it will loop over columns – akrun Oct 25 '20 at 19:26
  • Okay, that's great, thank you so much! Just one remark: I now have a problem exporting the results, as it is a list of lists, which are pretty long (when working with the original data), so either it doesn't save everything, or there are numbers I do not understand. I tried all the answers given in this question. Do you happen to have any idea how to solve this? – Divna Djokic Oct 25 '20 at 19:35
  • @DivnaDjokic What should the output look like? Should it be a single dataset or a `list`? – akrun Oct 25 '20 at 19:36
  • @DivnaDjokic if you want to create a txt output with the list, you can use `capture.output(out, file = 'file.txt')` – akrun Oct 25 '20 at 19:40
  • Thank you for the idea! It worked, with the slight modification `cat(capture.output(print(List), file="file.txt"))`. Thank you so much for your help, cheers! – Divna Djokic Oct 25 '20 at 23:21
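The export step from the comments can be sketched as follows. The list out and the temporary file names are made up for illustration. The first variant saves the printed form of the list, as suggested above. The second writes one comma-separated line per row, which may be easier to read back in and does not depend on how print lays out long output:

```r
# Illustrative result list; in practice this would be the output of apply()
out <- list(Row1 = c("F", "M2", "E"), Row2 = c("A3", "V"))

# Variant 1: save exactly what print(out) would show on the console
txt_file <- tempfile(fileext = ".txt")
capture.output(print(out), file = txt_file)

# Variant 2: one line per row, in the form "name, value, value, ..."
row_lines <- vapply(names(out), function(nm) {
  paste(c(nm, out[[nm]]), collapse = ", ")
}, character(1))

csv_file <- tempfile(fileext = ".txt")
writeLines(row_lines, csv_file)

readLines(csv_file)
```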