Questions tagged [duplicate-detection]
18 questions
12
votes
2 answers
Near-duplicate video detection
I'm looking for an open-source project that can solve the near-duplicate video detection problem. The best I've found so far is SOTU, but its source is closed. Are there any open-source solutions?
Also, I will be very grateful for some…

Nelson Tatius
- 7,693
- 8
- 47
- 70
8
votes
2 answers
How to find duplicates including the first occurrence
I have this vector
vector <- c("www.one","www.two","www.one","www.three")
and I want to find all duplicates, including the first occurrence of the duplicated value. If I do
dup <- duplicated(vector)
I get
dup
# [1] FALSE FALSE TRUE FALSE
while…

CptNemo
- 6,455
- 16
- 58
- 107
1
vote
2 answers
Find duplicated code in overridden method
Is there any program which can find duplicated code in a base method and overridden methods in inherited classes?
I have a base class for 20 classes that has about 30 virtual methods (I didn't write this code). I found one method that has almost…

Jacek
- 11,661
- 23
- 69
- 123
1
vote
2 answers
C# - Looking for the list of duplicated rows (need optimization)
I would like to optimize this C# code, if possible.
With fewer than 1000 lines, it's fine. But with 10000 or more, it starts to take some time...
Here is a small benchmark:
5000 lines => ~2s
15000 lines => ~20s
25000…

PublicDisplayName
- 13
- 3
1
vote
1 answer
R: detecting duplicates of *specific* columns
How do I detect duplicates of specific columns in R? I know the duplicated() function, but it reports any duplicate, while I'm interested only in whether one specific column is duplicated. Example:
> x = 1:5
> y=6:10
> z=11:15
> mat=cbind(x,y,x,x,y,z)
>…

Ruslan
- 911
- 2
- 11
- 28
1
vote
2 answers
Avoid comma expressions and duplicate declarations
In order to clean up my code I have been paying attention to all of the hints in WebStorm. The following duplicate declaration errors have confused me.
In the following code, is the double use of var necessary (or advised) to prevent a global…

Startec
- 12,496
- 23
- 93
- 160
1
vote
1 answer
R: subset a set of duplicates
Imagine this is my df
>df
gen A B C D
M1 1 2 3 4
M1 8 6 5 3
M1 4 8 6 0
M1 8 5 6 3
M2 8 5 6 0
M2 0 2 8 6
M3 3 8 9 2
M3 8 9 5 6
M4 3 7 8 5
M4 5 6 3 2
Here, how to subset set…

ramesh
- 1,187
- 7
- 19
- 42
1
vote
2 answers
Create/Update Duplicate Detection - Only check against contacts with fieldX = true
We have a lot of contacts in CRM 2011 which are imported to support a legacy application. All these contacts have a field which is set to be true to indicate that we don't show these on any of the views.
I am looking at a way to exclude these from…

Andrew
- 9,967
- 10
- 64
- 103
0
votes
2 answers
duplicate email addresses with ID column
My table contains duplicate email addresses. Each email address has a unique create date and a unique ID. I want to identify the email address with the most recent create date and its associated ID and show the duplicate ID with its create date…

sqlbg
- 73
- 1
- 11
0
votes
1 answer
Stored procedure duplicated in DB2?
I've created a stored procedure in DB2, and I've modified it a couple of times, but in my DB manager (DBeaver) and in RazorSQL the same stored procedure appears twice. How can I determine which is the latest version?

Josue Estrada
- 3
- 1
0
votes
1 answer
Retrieving unique entries from a MySQL table with DISTINCT
Inside a MySQL table there are several classifieds with the columns ID, title, advertiser_id
In order to omit duplicated content from my sitemap, I am trying to retrieve a list where title and advertiser are unique.
My SQL statement looks like…

merlin
- 2,717
- 3
- 29
- 59
0
votes
2 answers
Linux command or script counting duplicated blocks of lines in a text file?
I am looking for something like this, but instead of counting the number of duplicated lines I need to count the number of duplicated blocks of lines.
For the sake of clarification, I have a file like…

pafede2
- 1,626
- 4
- 23
- 40
0
votes
1 answer
How do I get my code to find whether a number repeats in an array in Java?
I've already written the code for this, but it didn't work. If it had worked, the run time complexity would have been very high.
for (int collumnInput=0; collumnInput < 3; collumnInput++)
{
for (int rowInput = 0; rowInput < 3;…

cluemein
- 884
- 13
- 27
0
votes
1 answer
Can R detect duplicate sentences in a Word file?
I have a 100-page Word document and want to detect duplicate sentences.
Is there any way to do this automatically in R?
1. Convert to a txt file
2. Read:
tx=readLines("C:\\Users\\paper-2013.txt")

sacvf
- 2,463
- 5
- 36
- 54
0
votes
1 answer
How to use SSIS to extract data from Excel files to OLE DB without extracting the duplicated data
I'd like to extract data from Excel files to OLE DB using SSIS. The source is several Excel files. Each entry is identified by its date. There is some duplicated data between these files.
For example
In file1, the dates of the entries are…

Samual
- 1
- 1