Questions tagged [duplicate-detection]

18 questions
12
votes
2 answers

Near-duplicate video detection

I'm looking for an open source project that is able to solve the near-duplicate video detection problem. The best that I've found so far is SOTU, but its source is closed. So, are there any open source solutions? Also, I will be very grateful for some…
Nelson Tatius
  • 7,693
  • 8
  • 47
  • 70
8
votes
2 answers

How to find duplicates including the first occurrence

I have this vector vector <- c("www.one","www.two","www.one","www.three") and I want to find all duplicates, including the first occurrence of the duplicated value. If I do dup <- duplicated(vector) I get dup # [1] FALSE FALSE TRUE FALSE while…
CptNemo
  • 6,455
  • 16
  • 58
  • 107
1
vote
2 answers

Find duplicated code in overridden method

Is there any program which can find duplicated code in a base method and overridden methods in inherited classes? I have a base class for 20 classes that has about 30 virtual methods (I didn't write this code). I found one method that has almost…
Jacek
  • 11,661
  • 23
  • 69
  • 123
1
vote
2 answers

C# - Looking for the list of duplicated rows (need optimization)

I would like to optimize this code in C#, if possible. When there are fewer than 1000 lines, it's fine. But when we have at least 10000, it starts to take some time... Here is a little benchmark: 5000 lines => ~2s 15000 lines => ~20s 25000…
1
vote
1 answer

R: detecting duplicates of *specific* columns

How do I detect duplicates of a specific column in R? I know the duplicated() function, but it gives any duplicates, while I'm interested only if one specific column is duplicated. Example: > x = 1:5 > y=6:10 > z=11:15 > mat=cbind(x,y,x,x,y,z) >…
Ruslan
  • 911
  • 2
  • 11
  • 28
1
vote
2 answers

Avoid comma expressions and duplicate declarations

In order to clean up my code I have been paying attention to all of the hints in WebStorm. The following duplicate declaration errors have confused me. In the following code, is the double use of var necessary (or advised) to prevent a global…
Startec
  • 12,496
  • 23
  • 93
  • 160
1
vote
1 answer

R: subset a set of duplicates

Imagine this is my df >df gen A B C D M1 1 2 3 4 M1 8 6 5 3 M1 4 8 6 0 M1 8 5 6 3 M2 8 5 6 0 M2 0 2 8 6 M3 3 8 9 2 M3 8 9 5 6 M4 3 7 8 5 M4 5 6 3 2 Here, how to subset set…
ramesh
  • 1,187
  • 7
  • 19
  • 42
1
vote
2 answers

Create/Update Duplicate Detection - Only check against contacts with fieldX = true

We have a lot of contacts in CRM 2011 which are imported to support a legacy application. All these contacts have a field which is set to be true to indicate that we don't show these on any of the views. I am looking at a way to exclude these from…
Andrew
  • 9,967
  • 10
  • 64
  • 103
0
votes
2 answers

duplicate email addresses with ID column

My table consists of duplicate email addresses. Each email address has a unique create date and a unique ID. I want to identify the email address with the most recent create date and its associated ID and show the duplicate ID with its create date…
sqlbg
  • 73
  • 1
  • 11
0
votes
1 answer

Stored procedure duplicated? in DB2

I've created a stored procedure in DB2, and I've modified it a couple of times, but in my DB manager (DBeaver) and RazorSQL the same stored procedure appears twice. How can I determine which is the latest version?
0
votes
1 answer

retrieving unique entries from a mysql table with distinct

Inside a MySQL table there are several classifieds with the columns ID, title, advertiser_id. In order to omit duplicated content inside my sitemap I am trying to retrieve a list where title and advertiser are unique. My SQL statement looks like…
merlin
  • 2,717
  • 3
  • 29
  • 59
0
votes
2 answers

Linux command or script counting duplicated bunch of lines in a text file?

I am looking for something like this, but instead of counting the number of duplicated lines I need to count the number of duplicated groups of lines. For the sake of clarification, I have a file like…
pafede2
  • 1,626
  • 4
  • 23
  • 40
0
votes
1 answer

How do I get my code to find whether a number repeats in an array in Java?

I've already written the code for this, but it didn't work. If it had worked, the run time complexity would have been very high. for (int collumnInput=0; collumnInput < 3; collumnInput++) { for (int rowInput = 0; rowInput < 3;…
cluemein
  • 884
  • 13
  • 27
0
votes
1 answer

Can R detect duplicate sentences in a word file?

I have a Word document that contains 100 pages and want to detect duplicate sentences. Is there any way to do this automatically in R? 1- convert to a txt file 2- read: tx=readLines("C:\\Users\\paper-2013.txt")
sacvf
  • 2,463
  • 5
  • 36
  • 54
0
votes
1 answer

How to use SSIS to extract data from Excel files to OLE DB without extracting the duplicated data

I'd like to extract data from Excel files to OLE DB by using SSIS. The source files are several Excel files. Each entry is identified by its date. There are some duplicated data between these files. For example, in file1, the date of entries are…
Samual
  • 1
  • 1