
I have read a txt file into an R data.frame. Some lines are duplicated. How can I write the data to a new file without duplicates, keeping only one copy of each duplicated line? For example:

A; a
B; a
C; a
A; a
C; a
A; b

The new file should contain:

A; a
B; a
C; a
A; b

Here is what I tried so far:

# read the file
t <- read.table('/home/BigClaster.txt', sep = ';', header = FALSE)
........

The file is large: about 1,269,821 lines. When I read it, the RStudio Environment pane shows the row count without duplicates (1,095,079).

But when I write the data to a new file, the duplicate lines are still there.

Dossanov

1 Answer

Base R

 t[!duplicated(t), ]

dplyr

library(dplyr)
t %>% distinct(.keep_all = TRUE)

Result

  V1 V2
1  A  a
2  B  a
3  C  a
6  A  b
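
Since the question is about writing the result to a new file, here is a minimal end-to-end sketch in base R. The file names and the temporary directory are hypothetical and used only for illustration, and `strip.white = TRUE` is an assumption based on the sample data, where a space follows each `;`. Without it, values such as "a" and " a" would count as different.

```r
# Hypothetical file names in a temporary directory, used only for illustration.
in_path  <- file.path(tempdir(), "input.txt")
out_path <- file.path(tempdir(), "deduplicated.txt")

# Recreate the sample data from the question.
writeLines(c("A; a", "B; a", "C; a", "A; a", "C; a", "A; b"), in_path)

# strip.white = TRUE drops the blank after ';' so "a" and " a" compare equal.
df <- read.table(in_path, sep = ";", header = FALSE,
                 strip.white = TRUE, stringsAsFactors = FALSE)

# Keep the first occurrence of every distinct row.
deduped <- df[!duplicated(df), ]

# Write without row names, column names, or quotes.
write.table(deduped, out_path, sep = ";",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Read the new file back to check what was written.
result <- readLines(out_path)
print(result)
```

The key point is to pass the deduplicated object (`deduped` here), not the original data.frame, to `write.table`. Writing the original object would reproduce the duplicates in the new file.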
Rana Usman