I'm looking for help with a problem I'm trying to solve in R.

I have DNA alignment results in a large dataset (more than 200,000 rows and 20 columns). I want to clean it by deleting the non-specific sequences, so that in the end I have just one DNA sequence name per species.

I've tried the unique, duplicated and distinct functions, but they always keep the first of the duplicate rows, and I don't want that. I would like to delete ALL the duplicate rows.

Do you have an idea how to solve my problem?
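
To make the problem concrete, here is a minimal sketch with made-up data. The data frame `hits` and the columns `seq_name` and `species` are assumptions for illustration only, not the real alignment table, and treating `seq_name` as the column that defines a duplicate is also an assumption. It shows how base R's `duplicated()` keeps the first copy, and one possible way to flag every copy by also scanning with `fromLast = TRUE`.

```r
# Toy data; the column names are hypothetical stand-ins for the real table
hits <- data.frame(
  seq_name = c("seqA", "seqB", "seqB", "seqC", "seqD", "seqD", "seqD"),
  species  = c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7"),
  stringsAsFactors = FALSE
)

# duplicated() alone marks only the 2nd and later copies,
# so the first copy of each repeated name survives
first_kept <- hits[!duplicated(hits$seq_name), ]

# Scanning from both ends marks every row whose name occurs more than once
is_dup <- duplicated(hits$seq_name) | duplicated(hits$seq_name, fromLast = TRUE)
all_removed <- hits[!is_dup, ]

print(first_kept$seq_name)
print(all_removed$seq_name)
```

In this toy case only the names that occur exactly once remain in `all_removed`. If a duplicate should instead mean an identical whole row, the same pattern should work with `duplicated(hits)` in place of `duplicated(hits$seq_name)`.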

user4157124
  • 1
    Welcome to stack. Please try to give a reproducible example, see [mcve], so people can reproduce your problem and give a solution that others can try – denis Dec 11 '20 at 15:43
  • 1
    Does this answer your question? [Remove all copies of rows with duplicate values in R](https://stackoverflow.com/questions/35507787/remove-all-copies-of-rows-with-duplicate-values-in-r) – denis Dec 11 '20 at 15:45
  • 1
    Welcome to SO, Lpr! Beyond denis' recommendation for a minimal reprex, here are two other links on how to provide a self-contained question, an expectation on SO: https://stackoverflow.com/q/5963269 and https://stackoverflow.com/tags/r/info. It also will help (when you add sample data and code you've attempted) to ensure the question is formatted reasonably well, a good reference for that is https://stackoverflow.com/editing-help. (If you don't get the formatting perfectly, we can help, but we can't help with missing data/code.) Thanks! – r2evans Dec 11 '20 at 15:49

0 Answers