Questions tagged [dedupeplugin]

9 questions
8
votes
1 answer

Dedupe in Python

While going through the examples for the Python Dedupe library, which is used for record deduplication, I found that it creates a Cluster Id column in the output file, which according to the documentation indicates which records refer to each…
Arnab
  • 1,037
  • 3
  • 16
  • 31
3
votes
1 answer

Dedupe one new row against existing dataset

I'm using the dedupe Python library. Any code sample will do, for example this. Let's say I have a trained deduper and used it to deduplicate a dataset successfully. Now I add one new row to the dataset. I want to check if this new row is a duplicate…
rfg
  • 1,331
  • 1
  • 8
  • 24
2
votes
2 answers

AttributeError: 'Dedupe' object has no attribute 'sample'

I was running csv_example.py from dedupe-examples and got the error message below: File "csv_example.py", line 111, in deduper.sample(data_d, 15000) AttributeError: 'Dedupe' object has no attribute 'sample' Any help would be…
1
vote
1 answer

Clustering Components

When clustering, I receive the following warning: UserWarning: A component contained 77760 elements. Components larger than 30000 are re-filtered. The threshold for this filtering is 4.08109134074e-15 What does this mean? My original threshold…
Rtab
  • 123
  • 10
1
vote
1 answer

De-Duplicate libraries in app within deeply nested node modules

I have an app to which I can add modules as node_modules. These modules and the app both use a library XYZ as a node module. The modules also have other node modules, which in turn have their own copy of library XYZ as a node module. So, this is roughly how the…
Cute_Ninja
  • 4,742
  • 4
  • 39
  • 63
0
votes
0 answers

SAS Array Dedupe

I have a question about the SAS code below. I am new to arrays and unsure what exactly the code is doing. My understanding is that there are two indices. I believe this is deduping the SAS data set by the two indices, but I am not sure. Thanks…
user9016406
0
votes
1 answer

Webpack dedupe webpack bundle

Will webpack dedupe packages that have already been bundled with webpack? For example,

| Webpack bundle 1 |
|------------------|
| react@15.5 |
| jquery@3.0 |

| Webpack app bundle |
|--------------------|
| react@15.5 |
|…
Matt
  • 59
  • 6
0
votes
0 answers

Using React components bundled with Webpack causes duplication of submodules

We have 4 React components bundled with Webpack (version 1): A, B, C and D. The dependency tree looks like this: A B D C D We want each component to be reusable, so we use webpack to generate a UMD module. The generated bundle…
jaime
  • 520
  • 5
  • 15
-1
votes
2 answers

SQL: Trying and failing to sort data to display certain months

I'm working on a homework assignment, and all has been well until I got to this point. My professor wants me to pull only dates in MARCH, APRIL, and MAY, without using the BETWEEN operator. NOTE: I'm not getting any errors. I am using EDUPE, which…
Sierra
  • 327
  • 4
  • 11