My dataset looks something like this. Note that the dataset below is hypothetical.
Objective: A sales employee has to go to a particular location and verify the houses/stores/buildings, and the device captures the information shown below.
Sr.No. | Store_Name | Phone-No. | Agent_id | Area | Lat-Long |
---|---|---|---|---|---|
1 | ABC Stores | 89099090 | 121 | Bay Area | 23.909090,89.878798 |
2 | Wuhan Masks | 45453434 | 122 | Santa Fe | 24.452134,78.123243 |
3 | Twitter Cafe | 67556090 | 123 | Middle East | 11.889766,23.334483 |
4 | abc | 33445569 | 121 | Santa Cruz | 23.345678,89.234213 |
5 | Silver Gym | 11004110 | 234 | Worli Sea Link | 56.564311, 78.909087 |
6 | CK Clothings | 00908876 | 223 | 90 th Street | 34.445887, 12.887654 |
Facts: #1 A unique identifier is needed for finding duplicates. For example, Sr.No. 1 and 4 are basically the same store.
In this dummy dataset, every column can be manipulated for the same store/house/building outlet:
a) Since the name is entered manually, the name of the same house/store can be changed when it is entered into the system, so multiple visits can be recorded.
b) The mobile number can also be manipulated; a different number can be associated with the same outlet.
c) The lat-long captured by the agent's device can also be fudged by standing closer to or somewhere near the building.
Problem:
How can I use the lat-long data as the unique identifier for finding duplicates in this huge dataset, keeping point c) above in mind?
Deploying QR codes is not very helpful either, as these can also be tampered with.
The goal is to stop fraudulent practice by employees (the same employee can visit the same store/outlet again, or a different employee can visit it again, to increase the visit count).
Right now I can only think of using the lat-long column as the UID. Please feel free to suggest anything else that could be used.
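
To make the question concrete, here is a minimal sketch of what I mean by matching on lat-long with some tolerance. It treats two records as candidate duplicates when they lie within a fixed radius of each other. The 50 m radius, the function names `haversine_m` and `candidate_duplicates`, and the second coordinate pair are all assumptions made up for illustration, not part of the real data.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def candidate_duplicates(visits, radius_m=50.0):
    """Return pairs of Sr.No. whose coordinates lie within radius_m of each other.

    radius_m is a hypothetical tolerance; it would need tuning to GPS accuracy
    and to how densely outlets are packed.
    """
    pairs = []
    for i in range(len(visits)):
        for j in range(i + 1, len(visits)):
            a, b = visits[i], visits[j]
            if haversine_m(a[1], a[2], b[1], b[2]) <= radius_m:
                pairs.append((a[0], b[0]))
    return pairs

# Hypothetical visits as (Sr.No., lat, lon); record 2 is record 1 shifted by
# 0.0001 degrees in each direction, i.e. roughly 15 m away.
visits = [
    (1, 23.909090, 89.878798),
    (2, 23.909190, 89.878898),
    (3, 24.452134, 78.123243),
]
print(candidate_duplicates(visits))  # [(1, 2)]
```

This compares every pair of rows, so it would not scale to a huge dataset as written. It also cannot tell apart two genuinely different outlets that sit within the radius of each other, which is part of what I am unsure about.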