We have written de-duplication logic for Contact records in which we call a batch job from a trigger (yes, it sounds weird, but it is the only approach that works for us, as the matching criteria vary for each account). To work around the limit of 5 concurrent batch jobs, we are using Data Loader with the Bulk API enabled and the batch size set to 1000, so that we can upload 5000 records without hitting the limit. When I test with 3000 Contact records, say named Test0001 through Test3000, I observe some strange behavior.
For 3000 records, 3 batch jobs start to run (as the batch size is 1000). I pass the newly inserted records as a parameter to the stateful batch class. What I expect is that 1000 records will be passed to each of the 3 batch jobs and compared against existing records for duplicates (which I query in the batch's start method). Instead I only get Test0001 to Test0200: of each batch of 1000 records inserted via the Data Loader API, only the FIRST 200 records are passed to the batch class and the remaining 800 are not. This is strange, as it means only the first 200 records are processed when I insert with a batch size of 1000 through Data Loader with the Bulk API enabled.
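To see exactly how many records each trigger invocation receives, a debug line in the trigger helps (a minimal sketch; ContactAfterInsert is a placeholder name, the real trigger delegates to the handler shown below):

trigger ContactAfterInsert on Contact (after insert) {
    // Logs the number of records in this particular trigger invocation
    System.debug('## Trigger.new.size = ' + Trigger.new.size());
    // ... build accountIdContactMap and call the handler here ...
}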
Has anyone encountered this issue, or does anyone have ideas on how to deal with it? I can share code as well, but I think the question is more conceptual. Any help is much appreciated.
Thanks
EDIT: Here is my code:
This is the call from the after insert trigger -->
ContactTriggerHandler trgHandler = new ContactTriggerHandler();
trgHandler.deDupAndCreateOfficebyBatch(accountIdContactMap);
// accountIdContactMap maps each Account Id to the list of newly inserted Contacts for that Account
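For context, the map is built from Trigger.new roughly along these lines (a sketch; the actual construction may differ slightly):

Map<String, List<Contact>> accountIdContactMap = new Map<String, List<Contact>>();
for (Contact c : Trigger.new) {
    if (c.AccountId != null) {
        // Group each newly inserted Contact under its parent Account Id
        if (!accountIdContactMap.containsKey(c.AccountId)) {
            accountIdContactMap.put(c.AccountId, new List<Contact>());
        }
        accountIdContactMap.get(c.AccountId).add(c);
    }
}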
This is the method in the handler class -->
public void deDupAndCreateOfficebyBatch(Map<String, List<Contact>> accountIdContactMap){
    ContactDeDuplicationBatch batchObj = new ContactDeDuplicationBatch(accountIdContactMap);
    Id jobId = Database.executeBatch(batchObj, 100);
}
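Since the batch has nothing to do when the map is empty, one option is to guard the launch in the handler (a sketch):

public void deDupAndCreateOfficebyBatch(Map<String, List<Contact>> accountIdContactMap){
    // Only launch the batch when there are new Contacts to process
    if (accountIdContactMap != null && !accountIdContactMap.isEmpty()) {
        ContactDeDuplicationBatch batchObj = new ContactDeDuplicationBatch(accountIdContactMap);
        Id jobId = Database.executeBatch(batchObj, 100);
    }
}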
This is the batch -->
global class ContactDeDuplicationBatch implements Database.Batchable<sObject>, Database.Stateful {

    // Set of duplicate Contacts to delete.
    // Note: members of a Set<sObject> must not be modified after insertion,
    // since sObject hashing is based on field values.
    global Set<Contact> duplicateContactSet;

    // Map of lists of new Contacts, keyed by Account Id
    global Map<String, List<Contact>> newAccIdContactMap;

    /* Constructor */
    public ContactDeDuplicationBatch(Map<String, List<Contact>> accountIdContactMap){
        System.debug('## accountIdContactMap size = ' + accountIdContactMap.keySet().size());
        newAccIdContactMap = accountIdContactMap;
        duplicateContactSet = new Set<Contact>();
    }

    /* Start method */
    global Database.QueryLocator start(Database.BatchableContext BC){
        System.debug('## newAccIdContactMap size = ' + newAccIdContactMap.keySet().size());
        if(!newAccIdContactMap.isEmpty()){
            // Existing Contacts on the same Accounts are the candidates for
            // duplicate matching; binding the key set is simpler and safer
            // than concatenating quoted Ids into the query string.
            Set<String> accountIds = newAccIdContactMap.keySet();
            String query = 'SELECT Id, AccountId FROM Contact '
                + 'WHERE Target_Type__c <> \'Office\' AND AccountId IN :accountIds';
            return Database.getQueryLocator(query);
        } else {
            // Avoid returning null from start(); returning a locator that
            // matches no rows is the safer pattern.
            return Database.getQueryLocator('SELECT Id FROM Contact WHERE Id = null');
        }
    }

    /* Execute method */
    global void execute(Database.BatchableContext BC, List<sObject> scope){
        System.debug('## scope.size ' + scope.size());
        System.debug('## newAccIdContactMap.size ' + newAccIdContactMap.size());
        // In my execute method I get only 200 records in newAccIdContactMap per batch
    }

    /* Finish method */
    global void finish(Database.BatchableContext BC){
        // Some logic using the two global variables
    }
}
In my execute method I get only 200 records in newAccIdContactMap per batch.
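For what it is worth, the batch can also be exercised directly from Anonymous Apex, which takes the trigger out of the picture while testing (a sketch; assumes an Account with some Contacts already exists in the org):

// Feed the batch one Account's Contacts directly, bypassing the trigger
Account acc = [SELECT Id FROM Account LIMIT 1];
List<Contact> cons = [SELECT Id, AccountId FROM Contact WHERE AccountId = :acc.Id];
String accId = acc.Id;
Map<String, List<Contact>> testMap = new Map<String, List<Contact>>{ accId => cons };
Database.executeBatch(new ContactDeDuplicationBatch(testMap), 100);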
Thanks