I have a strange problem where some code I am writing modifies both the copy and the original List. I have boiled the problem down as much as I can, so that the error shows in a single file. My real-world example is a lot more complex, but at the root of it all this is the problem.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace TestingRandomShit
{
    class Program
    {
        private static string rawInput;
        private static List<string> rawList;
        private static List<string> modifiedList;

        static void Main(string[] args)
        {
            rawInput = "this is a listing of cats";

            rawList = new List<string>();
            rawList.Add("this");
            rawList.Add("is");
            rawList.Add("a");
            rawList.Add("listing");
            rawList.Add("of");
            rawList.Add("cats");

            PrintAll();

            modifiedList = ModIt(rawList);

            Console.WriteLine("\n\n**** Mod List Code has been run **** \n\n");
            PrintAll();
        }

        public static List<string> ModIt(List<string> wordlist)
        {

            List<string> huh = new List<string>();
            huh = wordlist;

            for (int i = 0; i < huh.Count; i++)
            {
                huh[i] = "wtf?";
            }
            return huh;
        }

//****************************************************************************************************************
//Below is just a print function.. all the action is above this line


        public static void PrintAll()
        {
            Console.WriteLine(": Raw Input :");
            Console.WriteLine(rawInput);

            if (rawList != null)
            {
                Console.WriteLine("\n: Original List :");
                foreach (string line in rawList)
                {
                    Console.WriteLine(line);
                }
            }

            if (modifiedList != null)
            {
                Console.WriteLine("\n: Modified List :");
                foreach (string wtf in modifiedList)
                {
                    Console.WriteLine(wtf);
                }
                Console.ReadKey();
            }
        }
    }
}

Basically, I have three variables: a string and two Lists. The original code does some tokenisation on the string, but for this demo I simply use List.Add() to fake it and keep it simple to read.

So I now have a string and a List with a single word in each element.

This is the confusing part that I do not understand. I know it has something to do with references, but I cannot work out how to fix it.

There is a method I have called ModIt(). It simply takes in a List, makes a completely new List called huh, copies the original list over the new list, and then changes every line in huh to "wtf?".

Now as I understand it.. I should end up with 3 variables...

1) a string
2) a List with a different word in each element
3) a List of the same length as the other with each element being "wtf?"

But what happens is that if I try to print out both Lists, they BOTH have every element set to "wtf?"... so yeah, wtf man? I am super confused. I mean, in ModIt I even build an entirely new List rather than modifying the one being passed in, but it doesn't seem to affect anything.

This is the output...

: Raw Input :
this is a listing of cats

: Original List :
this
is
a
listing
of
cats

**** Mod List Code has been run ****

: Raw Input :
this is a listing of cats

: Original List :
wtf?
wtf?
wtf?
wtf?
wtf?
wtf?

: Modified List :
wtf?
wtf?
wtf?
wtf?
wtf?
wtf?

ProgrammingLlama
aJynks

3 Answers


huh = wordlist; doesn't copy the items of wordlist into a new list. It copies the reference, so huh and wordlist then point at the same List object in memory, and the list created by new List<string>() on the previous line is simply discarded.

If you want a copy, the simplest way to produce one is using LINQ:

List<string> huh = wordlist.ToList();

Note that this will be a "shallow copy". If your list stores objects of a reference type, both the old and new lists will store references to the same objects.

See here for more reading on value vs reference types, and then here if you need a deep copy.

Since all you're doing is replacing the value at an index of the list, I imagine a shallow copy is fine.
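As a minimal sketch of how ModIt could look with that one-line change applied (the class name CopyDemo and the local name copy are placeholders, not part of the original code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class CopyDemo
{
    // Returns a modified copy; the caller's list is left untouched.
    public static List<string> ModIt(List<string> wordlist)
    {
        // ToList() allocates a new List<string> that holds the same string references.
        List<string> copy = wordlist.ToList();

        for (int i = 0; i < copy.Count; i++)
        {
            copy[i] = "wtf?"; // replaces the slot in the copy only
        }
        return copy;
    }

    static void Main()
    {
        var rawList = new List<string> { "this", "is", "a", "listing", "of", "cats" };
        List<string> modifiedList = ModIt(rawList);

        Console.WriteLine(string.Join(" ", rawList));      // this is a listing of cats
        Console.WriteLine(string.Join(" ", modifiedList)); // wtf? wtf? wtf? wtf? wtf? wtf?
        Console.WriteLine(object.ReferenceEquals(rawList, modifiedList)); // False
    }
}
```

The last line is the quick check that the two variables really are different list objects now.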

ProgrammingLlama
  • Just a side note: due to the immutable nature of strings in .NET, this will work as intended, but it can give some interesting results if the list contains objects that have, for example, a string property. – Dimitar Jan 07 '19 at 06:43
  • Thanks guys. I cannot accept an answer yet as it was posted so quickly, but this works and the link you gave me is helpful. So thanks! – aJynks Jan 07 '19 at 06:48

John's already commented on the faulting code:

        List<string> huh = new List<string>();
        huh = wordlist;

Here you make a new list, then throw it away and attach your reference huh to your old list, so both huh and wordlist refer to the same thing.

I just wanted to point out the non LINQ way of copying a list:

        List<string> huh = new List<string>(wordlist);

Pass the old list into the new list's constructor; List<T> has a constructor that takes a collection of objects to store in the new list.

You now have two lists, and initially they both refer to the same strings. Because strings cannot be altered, "changing" a string inside one list (rather than just shuffling items or removing them from the list) really means putting a new string into that slot, so the other list is unaffected.

It's worth pointing out, though, that you'll have 2 lists pointing to the same objects. If, in the future, you have the same scenario with objects that can be changed, and you change an object through one list, the change will also be visible through the other list:

//imagine the list stores people, the age of the first
//person in the list is 27, and we increment it
list1[0].PersonAge++;

//list2 is a different list but refers to the same people objects
//this will print 28
Console.Out.WriteLine(list2[0].PersonAge);

That's what we mean by a shallow copy.
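To make that concrete, here is a self-contained sketch of the people scenario. The Person class, its Name property and the sample values are made up for illustration; only PersonAge comes from the snippet above. The last part shows one possible way to get a deep copy, by constructing a new Person for every element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mutable reference type, used only for this illustration.
class Person
{
    public string Name { get; set; }
    public int PersonAge { get; set; }
}

class ShallowCopyDemo
{
    static void Main()
    {
        var list1 = new List<Person> { new Person { Name = "Ann", PersonAge = 27 } };

        // Shallow copy: a new list, but it holds the same Person object.
        var list2 = new List<Person>(list1);

        // Deep copy (one possible approach): build a new Person per element.
        var list3 = list1
            .Select(p => new Person { Name = p.Name, PersonAge = p.PersonAge })
            .ToList();

        list1[0].PersonAge++; // mutates the object shared by list1 and list2

        Console.WriteLine(list2[0].PersonAge); // 28, same object as list1[0]
        Console.WriteLine(list3[0].PersonAge); // 27, independent copy

        // Replacing a slot only affects the list it is done on.
        list1[0] = new Person { Name = "Bob", PersonAge = 40 };
        Console.WriteLine(list2[0].Name); // Ann
    }
}
```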

Caius Jard

Your problem comes from the fact that in C# we have reference types and value types.

With value types, the assignment operator (=) copies the value itself, but for reference types it is different. A reference-type variable does not store the actual data; it stores the location in memory where the data is held, much like a pointer if you come from the C world. Assigning one such variable to another therefore copies only that location, not the data.
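A small sketch of that difference, using int as the value type and List<string> as the reference type (the class and variable names are invented for the example):

```csharp
using System;
using System.Collections.Generic;

class AssignmentDemo
{
    static void Main()
    {
        // Value type: assignment copies the value.
        int a = 1;
        int b = a;
        b = 2;
        Console.WriteLine(a); // 1, a is unaffected

        // Reference type: assignment copies the reference, not the list.
        var first = new List<string> { "cats" };
        List<string> second = first;
        second[0] = "dogs";
        Console.WriteLine(first[0]); // dogs, both names refer to one list
        Console.WriteLine(object.ReferenceEquals(first, second)); // True
    }
}
```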

Have a look into ICloneable. Also read Parameter passing by Jon Skeet; it gives a good description of value and reference types.

mehmetseckin
navbor