Google Script App Delete Duplicate Rows with a Specific Value in Specific Column in Google Sheet

Question

I have a google sheet where people submit tasks they are working on and they'll submit a status update which is either 'in progress' or 'complete' for let's say task A. Task A is in column D[4] and Status is in column E[5]. I'm trying to get this code to only delete the 'in progress' row for Task A when there is duplicate Task A with the status being 'Complete'. This works for identifying the duplicate and removes the duplicate rows in order, but I'm not sure how to get it to only delete duplicate 'Task A' 'In Progress' rows when there is a 'Task A' 'Complete' row. Any help would be much appreciated!

function removeDuplicates() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet2");
  var data = sheet.getDataRange().getValues();
  var newData = [];
  for (i in data) {
    var row = data[i];
    var duplicate = false;
    for (j in newData) {
      if(row[3] == newData[j][3]){
  duplicate = true;
}
    }
    if (!duplicate) {
      newData.push(row);
    }
  }
  sheet.clearContents();
  sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
}

can you share your spreadsheet please, I'll be glad to help. — JSmith, Oct 26 '18 at 17:52
@JSmith [link](https://docs.google.com/spreadsheets/d/1wAFJvAFpB8GkRUR5V4qZ7AfFIBddb1gaYjo_kO9B7F0/edit?usp=sharing) Thanks! — Ricky Adams, Oct 26 '18 at 20:49

JSmith · Accepted Answer · 2018-10-26T22:29:18.827

I would try

function removeDuplicates()
{
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("test");
  var dataRange = sheet.getDataRange();
  var data = dataRange.getValues();
  var indexOfStatusColumn = 4;
  var newData = [];
  data.forEach(function (row, rowI){
    var isDuplicate = false
    if (newData.length)
    newData.forEach(function (dup, dupI)
    {
      if (dup[3] == row[3])
      {
        if (row[indexOfStatusColumn] == "Complete" && dup[indexOfStatusColumn] != "Complete")
          newData[dupI][indexOfStatusColumn] = "Complete";
        isDuplicate = true;
        return (false);
      }
    })
    if (!isDuplicate)
      newData.push(row);
  });
  dataRange = sheet.getRange(2, 1, dataRange.getLastRow() - 1, dataRange.getLastColumn());
  dataRange.clearContent();
  sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
}

few things

I use forEach()loop it's more efficient
break loop by returning falseto avoid parsing trough all your newDatauselessly
when I find a duplicate I perform some of this actions
1. Check if duplicate is Complete on data and not on newData if so
2. change the value in newDatato "Complete" this way it will keep the complete status (be careful if there's different datas on both rows it will probably twist datas).
also use clearContent() from a range to avoid removing all the content of the sheet but only a specific portion of it. Here I've rebuilded it to keep the header

REFERENCES

forEach()

Tanaike benchmark

clearContent()

thank you for the help, unfortunalty this did not seem to work. Here is the spreadsheet [link](https://docs.google.com/spreadsheets/d/1wAFJvAFpB8GkRUR5V4qZ7AfFIBddb1gaYjo_kO9B7F0/edit?usp=sharing). I hope this helps! I really appreciate the help! — Ricky Adams, Oct 26 '18 at 20:49
I just tried running it again and for some reason it's still not deleting the in-progress row. Instead, it's deleting the duplicate task with the complete status. It also makes a copy of the first row and inserts it into the second row. Any ideas why this might be? Sorry! and Thanks! — Ricky Adams, Oct 26 '18 at 22:11

Google Script App Delete Duplicate Rows with a Specific Value in Specific Column in Google Sheet

1 Answers1

Linked