I have a list of filenames like:

helloworld#123.xml
hi.xml
test#1.xml
thisguyrighthere.xml

The program I'm designing will compare this list (newFileList) against another list (existingFileList) to find duplicates. When I run the program, it searches existingFileList with a binary search (they are actually large lists) and removes matching entries from newFileList as they are found. After newFileList has been trimmed down, the remaining elements are added to existingFileList. So if I ran the program twice with the exact same newFileList, newFileList should be empty at the end of the second run.

The issue I'm having (code shown below) is that the first element is not being removed from newFileList, so it is repeatedly added to existingFileList. This produces a file containing these lines (the last line is repeated depending on how many times the program is run):

helloworld#123.xml
hi.xml
test#1.xml
thisguyrighthere.xml
helloworld#123.xml

Here are the relevant code snippets:

public class FileName : IComparable<FileName>
{
    public string fName { get; set; }
    public int CompareTo(FileName other)
    {
        return fName.CompareTo(other.fName);
    }
}

public static void CheckLists(List<FileName> newFileList, List<FileName> existingFileList)
{
    for (int i = newFileList.Count - 1; i > -1; i--)
    {
        if (existingFileList.BinarySearch(newFileList[i]) > 0)
        {
            newFileList.Remove(newFileList[i]);
        }
    }
}

The purpose of this process is to grab a list of files from one FTP server and copy them to another FTP server while preventing duplicates. If someone can think of a better way (I've tried a couple, and this seemed to be the fastest so far), I'd be open to changing how this all works. Any help would be greatly appreciated!

John Boling
  • See top answer here - http://stackoverflow.com/questions/47752/remove-duplicates-from-a-listt-in-c-sharp - is that what you want? – PaulF Jul 02 '15 at 15:03
  • I'm unable to test the code right now so I could be way off the mark (hence comments), but inside the CheckLists loop I think you need to take a copy of the value of i, then use the copy in the following statements. – Equalsk Jul 02 '15 at 15:17
  • I need to be able to iterate through the list; can HashSets do that? – John Boling Jul 02 '15 at 18:33
  • @Equalsk If you get a chance to show me what you mean, could you? I've tried doing what you said and it didn't work. Maybe I just didn't get the exact idea of what you are saying. – John Boling Jul 02 '15 at 18:38
  • Ah, I can see that you solved it. Nice one. Just in case anyone else is curious, I meant that inside your loop you would have put `int index = i` and then you would use `newFileList[index]` instead. – Equalsk Jul 02 '15 at 19:24

2 Answers

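Why not use LINQ? Is this what you want?

// Requires using System.Linq; compares by fName because FileName does not override Equals.
newFileList.RemoveAll(item => existingFileList.Any(existing => existing.fName == item.fName));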
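On the HashSet question from the comments: a `HashSet<T>` can be iterated with `foreach` like any other collection, and its `Contains` is a hash lookup rather than a scan of the whole list. Below is a minimal sketch of that idea, assuming the names are handled as plain strings rather than `FileName` objects and compared case-sensitively. `DuplicateFilter` and `MergeNewFiles` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateFilter
{
    // Hypothetical helper: removes names that already exist, then records the rest.
    // Assumes file names are plain strings compared with ordinal (case-sensitive) rules.
    public static void MergeNewFiles(List<string> newNames, HashSet<string> existingNames)
    {
        // Drop every new name that is already known.
        newNames.RemoveAll(name => existingNames.Contains(name));

        // Whatever is left is genuinely new, so remember it for the next run.
        existingNames.UnionWith(newNames);
    }
}
```

With this shape, calling `MergeNewFiles` a second time with the same names should leave the new list empty, which is the behavior described in the question.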
maraaaaaaaa

I found that this worked:

public static void CheckLists(List<FileName> sourceFileList, List<FileName> targetFileList)
{
    for (int i = targetFileList.Count - 1; i > -1; i--)
    {
        sourceFileList.RemoveAll(x => x.fName == targetFileList[i].fName);
    }
}
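A possible explanation for why the binary-search version in the question skipped the first element: `List<T>.BinarySearch` returns the zero-based index of the item when it is found and a negative number when it is not, so a match at index 0 fails a `> 0` test. The sketch below keeps the binary search and changes only that comparison (and uses `RemoveAt` for the removal). It assumes the existing list is already sorted with the same `CompareTo` ordering. `ListChecker` is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;

public class FileName : IComparable<FileName>
{
    public string fName { get; set; }

    public int CompareTo(FileName other)
    {
        return fName.CompareTo(other.fName);
    }
}

public static class ListChecker
{
    // Assumption: existingFileList is sorted by FileName.CompareTo before this is called.
    public static void CheckLists(List<FileName> newFileList, List<FileName> existingFileList)
    {
        // Walk backwards so removals do not shift the items still to be visited.
        for (int i = newFileList.Count - 1; i >= 0; i--)
        {
            // A result of 0 is a valid hit (the first element); only negatives mean "not found".
            if (existingFileList.BinarySearch(newFileList[i]) >= 0)
            {
                newFileList.RemoveAt(i);
            }
        }
    }
}
```

If the remaining names are appended to the existing list afterwards, that list would need to be sorted again before the next run, since a binary search on an unsorted list gives unreliable results.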
John Boling