I am modifying the original code, which uses 2 classes (true, false), to use 4 classes (unacc, acc, good, vgood). The original code assumes a boolean target, so to work around that I need to compare strings.

For some reason, it throws the following error: "Object reference not set to an instance of an object". I have googled it, and it reportedly happens when something has not been initialized, which I don't think is the case here.

Error:

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.
   at AA.TreeNode..ctor(Attribute attribute) in c:\VS\Program.cs:line 117
   at AA.DecisionTreeID3.internalMountTree(DataTable samples, String targetAttribute, Attribute[] attributes) in c:\VS\Program.cs:line 574
   at AA.DecisionTreeID3.mountTree(DataTable samples, String targetAttribute, Attribute[] attributes) in c:\VS\Program.cs:line 626
   at AA.DecisionTreeID3.internalMountTree(DataTable samples, String targetAttribute, Attribute[] attributes) in c:\VS\Program.cs:line 607
   at AA.DecisionTreeID3.mountTree(DataTable samples, String targetAttribute, Attribute[] attributes) in c:\VS\Program.cs:line 626
   at AA.DecisionTreeID3.internalMountTree(DataTable samples, String targetAttribute, Attribute[] attributes) in c:\VS\Program.cs:line 607
   at AA.DecisionTreeID3.mountTree(DataTable samples, String targetAttribute, Attribute[] attributes) in c:\VS\Program.cs:line 626
   at AA.ID3Sample.Main(String[] args) in c:\VS\Program.cs:line 718

The original code/solution is available here

My code is below. Could the error come from comparing two strings? It is possible to declare a function that returns a string, right? If not, I could change the functions to return an int with the values 0, 1, 2 and 3.

Could anyone help me or share ideas on how I could solve this?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
using System.Data;

namespace AA
{
    /// <summary>
    /// Class that represents an attribute used in the decision class
    /// </summary>
    public class Attribute
    {
        ArrayList mValues;
        string mName;
        object mLabel;

        /// <summary>
        /// Initializes a new instance of the Attribute class
        /// </summary>
        /// <param name="name">Name of the attribute</param>
        /// <param name="values">Possible values for the attribute</param>
        public Attribute(string name, string[] values)
        {
            mName = name;
            mValues = new ArrayList(values);
            mValues.Sort();
        }

        public Attribute(object Label)
        {
            mLabel = Label;
            mName = string.Empty;
            mValues = null;
        }

        /// <summary>
        /// Name of the attribute
        /// </summary>
        public string AttributeName
        {
            get
            {
                return mName;
            }
        }

        /// <summary>
        /// Returns an array with the attribute's values
        /// </summary>
        public string[] values
        {
            get
            {
                if (mValues != null)
                    return (string[])mValues.ToArray(typeof(string));
                else
                    return null;
            }
        }

        /// <summary>
        /// Indicates whether a value is allowed for this attribute
        /// </summary>
        /// <param name="value"></param>
        /// <returns></returns>
        public bool isValidValue(string value)
        {
            return indexValue(value) >= 0;
        }

        /// <summary>
        /// Returns the index of a value
        /// </summary>
        /// <param name="value">Value to look up</param>
        /// <returns>The index of the position at which the value is found</returns>
        public int indexValue(string value)
        {
            if (mValues != null)
                return mValues.BinarySearch(value);
            else
                return -1;
        }

        /// <summary>
        /// 
        /// </summary>
        /// <returns></returns>
        public override string ToString()
        {
            if (mName != string.Empty)
            {
                return mName;
            }
            else
            {
                return mLabel.ToString();
            }
        }
    }

    /// <summary>
    /// Class that represents the assembled decision tree
    /// </summary>
    public class TreeNode
    {
        private ArrayList mChilds = null;
        private Attribute mAttribute;

        /// <summary>
        /// Initializes a new instance of TreeNode
        /// </summary>
        /// <param name="attribute">Attribute to which the node is attached</param>
        public TreeNode(Attribute attribute)
        {
            if (attribute.values != null)
            {
                mChilds = new ArrayList(attribute.values.Length);
                for (int i = 0; i < attribute.values.Length; i++)
                    mChilds.Add(null);
            }
            else
            {
                mChilds = new ArrayList(1);
                mChilds.Add(null);
            }
            mAttribute = attribute;
        }

        /// <summary>
        /// Adds a child TreeNode to this node on the branch named by ValueName
        /// </summary>
        /// <param name="treeNode">Child TreeNode to add</param>
        /// <param name="ValueName">Name of the branch on which the TreeNode is created</param>
        public void AddTreeNode(TreeNode treeNode, string ValueName)
        {
            int index = mAttribute.indexValue(ValueName);
            mChilds[index] = treeNode;
        }

        /// <summary>
        /// Returns the total number of children of the node
        /// </summary>
        public int totalChilds
        {
            get
            {
                return mChilds.Count;
            }
        }

        /// <summary>
        /// Returns a child node of this node
        /// </summary>
        /// <param name="index">Index of the child node</param>
        /// <returns>A TreeNode object representing the node</returns>
        public TreeNode getChild(int index)
        {
            return (TreeNode)mChilds[index];
        }

        /// <summary>
        /// Attribute attached to the node
        /// </summary>
        public Attribute attribute
        {
            get
            {
                return mAttribute;
            }
        }

        /// <summary>
        /// Returns a child of the node by the name of the branch leading to it
        /// </summary>
        /// <param name="branchName">Name of the branch</param>
        /// <returns>The node</returns>
        public TreeNode getChildByBranchName(string branchName)
        {
            int index = mAttribute.indexValue(branchName);
            return (TreeNode)mChilds[index];
        }
    }

    /// <summary>
    /// Class that implements a decision tree using the ID3 algorithm
    /// </summary>
    public class DecisionTreeID3
    {
        private DataTable mSamples;
        private int mTotalUnacc = 0;
        private int mTotalAcc = 0;
        private int mTotalGood = 0;
        private int mTotalVgood = 0;
        private int mTotal = 0;
        private string mTargetAttribute = "result";
        private double mEntropySet = 0.0;

        /// <summary>
        /// Returns the number of "unacc" samples in a sample table
        /// </summary>
        /// <param name="samples">DataTable with the samples</param>
        /// <returns>The total number of "unacc" samples</returns>
        /// 

        private int countTotalUnacc(DataTable samples)
        {
            int result = 0;

            foreach (DataRow aRow in samples.Rows)
            {
                if (aRow[mTargetAttribute].ToString().ToUpper().Trim() == "UNACC")
                    result++;
            }

            return result;
        }

        private int countTotalAcc(DataTable samples)
        {
            int result = 0;

            foreach (DataRow aRow in samples.Rows)
            {
                if (aRow[mTargetAttribute].ToString().ToUpper().Trim() == "ACC")
                    result++;
            }

            return result;
        }

        private int countTotalGood(DataTable samples)
        {
            int result = 0;

            foreach (DataRow aRow in samples.Rows)
            {
                if (aRow[mTargetAttribute].ToString().ToUpper().Trim() == "GOOD")
                    result++;
            }

            return result;
        }

        private int countTotalVgood(DataTable samples)
        {
            int result = 0;

            foreach (DataRow aRow in samples.Rows)
            {
                if (aRow[mTargetAttribute].ToString().ToUpper().Trim() == "VGOOD")
                    result++;
            }

            return result;
        }

        /// <summary>
        /// Computes the entropy as the sum of -p * log2(p) over the four classes,
        /// where p is the proportion of samples in each class
        /// </summary>
        /// <param name="unacc">Number of "unacc" samples</param>
        /// <param name="acc">Number of "acc" samples</param>
        /// <param name="good">Number of "good" samples</param>
        /// <param name="vgood">Number of "vgood" samples</param>
        /// <returns>The entropy value</returns>
        private double calcEntropy(int unacc, int acc,int good,int vgood)
        {
            //int total = positives + negatives;
            int total = unacc + acc + good + vgood;
            double ratioUnacc = (double)unacc / total;
            double ratioAcc = (double)acc / total;
            double ratioGood = (double)good / total;
            double ratioVgood = (double)vgood / total;

            if (ratioUnacc != 0)
                ratioUnacc = -(ratioUnacc) * System.Math.Log(ratioUnacc, 2);

            if (ratioAcc != 0)
                ratioAcc = -(ratioAcc) * System.Math.Log(ratioAcc, 2);

            if (ratioGood != 0)
                ratioGood = -(ratioGood) * System.Math.Log(ratioGood, 2);

            if (ratioVgood != 0)
                ratioVgood = -(ratioVgood) * System.Math.Log(ratioVgood, 2);

            double result = ratioUnacc + ratioAcc + ratioGood + ratioVgood;

            return result;
        }

        /// <summary>
        /// Scans the sample table and counts, per result class, the rows that have the given attribute value
        /// </summary>
        /// <param name="samples">DataTable with the samples</param>
        /// <param name="attribute">Attribute to look up</param>
        /// <param name="value">Allowed value for the attribute</param>
        /// <param name="unacc">Receives the number of matching rows with result "unacc"</param>
        /// <param name="acc">Receives the number of matching rows with result "acc"</param>
        /// <param name="good">Receives the number of matching rows with result "good"</param>
        /// <param name="vgood">Receives the number of matching rows with any other result</param>
        private void getValuesToAttribute(DataTable samples, Attribute attribute, string value, out int unacc, out int acc, out int good, out int vgood)
        {
            unacc = 0;
            acc = 0;
            good = 0;
            vgood = 0;

            foreach (DataRow aRow in samples.Rows)
            {
                if ((string)aRow[attribute.AttributeName] == value)
                {
                    if ((string)aRow[mTargetAttribute] == "unacc")
                        unacc++;
                    else if ((string)aRow[mTargetAttribute] == "acc")
                        acc++;
                    else if ((string)aRow[mTargetAttribute] == "good")
                        good++;
                    else
                        vgood++;
                }
            }
        }

        /// <summary>
        /// Computes the gain of an attribute
        /// </summary>
        /// <param name="attribute">Attribute to evaluate</param>
        /// <returns>The gain of the attribute</returns>
        private double gain(DataTable samples, Attribute attribute)
        {
            string[] values = attribute.values;
            double sum = 0.0;

            for (int i = 0; i < values.Length; i++)
            {
                int unacc, acc, good, vgood;
                unacc = acc = good = vgood = 0;
                getValuesToAttribute(samples, attribute, values[i], out unacc, out acc, out good, out vgood);
                double entropy = calcEntropy(unacc, acc,good,vgood);
                sum += -(double)(unacc + acc + good + vgood) / mTotal * entropy;
            }
            return mEntropySet + sum;
        }

        /// <summary>
        /// Returns the best attribute.
        /// </summary>
        /// <param name="attributes">An array with the attributes</param>
        /// <returns>The attribute with the highest gain</returns>
        private Attribute getBestAttribute(DataTable samples, Attribute[] attributes)
        {
            double maxGain = 0.0;
            Attribute result = null;

            foreach (Attribute attribute in attributes)
            {
                double aux = gain(samples, attribute);
                if (aux > maxGain)
                {
                    maxGain = aux;
                    result = attribute;
                }
            }
            return result;
        }

        /// <summary>
        /// Returns "unacc" if every sample is "unacc"; otherwise returns the first other class found
        /// </summary>
        /// <param name="samples">DataTable with the samples</param>
        /// <param name="targetAttribute">Attribute (column) of the table to check</param>
        /// <returns>"unacc" if every sample is "unacc"</returns>
        private string allSamplesUnacc(DataTable samples, string targetAttribute)
        {
            foreach (DataRow row in samples.Rows)
            { // change
                if (row[targetAttribute].ToString() == "acc")
                    return "acc";
                if (row[targetAttribute].ToString() == "good")
                    return "good";
                if (row[targetAttribute].ToString() == "vgood")
                    return "vgood";
            }

            return "unacc";
        }

        private string allSamplesAcc(DataTable samples, string targetAttribute)
        {
            foreach (DataRow row in samples.Rows)
            { // change
                if (row[targetAttribute].ToString() == "unacc")
                    return "unacc";
                if (row[targetAttribute].ToString() == "good")
                    return "good";
                if (row[targetAttribute].ToString() == "vgood")
                    return "vgood";
            }

            return "acc";
        }
        private string allSamplesGood(DataTable samples, string targetAttribute)
        {
            foreach (DataRow row in samples.Rows)
            { // change
                if (row[targetAttribute].ToString() == "unacc")
                    return "unacc";
                if (row[targetAttribute].ToString() == "acc")
                    return "acc";
                if (row[targetAttribute].ToString() == "vgood")
                    return "vgood";
            }

            return "good";
        }
        private string allSamplesVgood(DataTable samples, string targetAttribute)
        {
            foreach (DataRow row in samples.Rows)
            { // change
                if (row[targetAttribute].ToString() == "unacc")
                    return "unacc";
                if (row[targetAttribute].ToString() == "acc")
                    return "acc";
                if (row[targetAttribute].ToString() == "good")
                    return "good";
            }

            return "vgood";
        }

        /// <summary>
        /// Returns a list of all distinct values in a sample table
        /// </summary>
        /// <param name="samples">DataTable with the samples</param>
        /// <param name="targetAttribute">Attribute (column) of the table to check</param>
        /// <returns>An ArrayList with the distinct values</returns>
        private ArrayList getDistinctValues(DataTable samples, string targetAttribute)
        {
            ArrayList distinctValues = new ArrayList(samples.Rows.Count);

            foreach (DataRow row in samples.Rows)
            {
                if (distinctValues.IndexOf(row[targetAttribute]) == -1)
                    distinctValues.Add(row[targetAttribute]);
            }

            return distinctValues;
        }

        /// <summary>
        /// Returns the most common value in a sample set
        /// </summary>
        /// <param name="samples">DataTable with the samples</param>
        /// <param name="targetAttribute">Attribute (column) of the table to check</param>
        /// <returns>The object with the highest incidence in the sample table</returns>
        private object getMostCommonValue(DataTable samples, string targetAttribute)
        {
            ArrayList distinctValues = getDistinctValues(samples, targetAttribute);
            int[] count = new int[distinctValues.Count];

            foreach (DataRow row in samples.Rows)
            {
                int index = distinctValues.IndexOf(row[targetAttribute]);
                count[index]++;
            }

            int MaxIndex = 0;
            int MaxCount = 0;

            for (int i = 0; i < count.Length; i++)
            {
                if (count[i] > MaxCount)
                {
                    MaxCount = count[i];
                    MaxIndex = i;
                }
            }

            return distinctValues[MaxIndex];
        }

        /// <summary>
        /// Builds a decision tree based on the given samples
        /// </summary>
        /// <param name="samples">Table with the samples used to build the tree</param>
        /// <param name="targetAttribute">Name of the table column that holds the class value
        /// of each sample</param>
        /// <returns>The root of the built decision tree</returns>
        private TreeNode internalMountTree(DataTable samples, string targetAttribute, Attribute[] attributes)
        {
            // change
            if (allSamplesUnacc(samples, targetAttribute) == "unacc")
                return new TreeNode(new Attribute("unacc"));

            if (allSamplesAcc(samples, targetAttribute) == "acc")
                return new TreeNode(new Attribute("acc"));

            if (allSamplesGood(samples, targetAttribute) == "good")
                return new TreeNode(new Attribute("good"));

            if (allSamplesVgood(samples, targetAttribute) == "vgood")
                return new TreeNode(new Attribute("vgood"));

            if (attributes.Length == 0)
                return new TreeNode(new Attribute(getMostCommonValue(samples, targetAttribute)));

            mTotal = samples.Rows.Count;
            mTargetAttribute = targetAttribute;
            mTotalUnacc = countTotalUnacc(samples);
            mTotalAcc = countTotalAcc(samples);
            mTotalGood = countTotalGood(samples);
            mTotalVgood = countTotalVgood(samples);

            mEntropySet = calcEntropy(mTotalUnacc, mTotalAcc, mTotalGood, mTotalVgood);

            Attribute bestAttribute = getBestAttribute(samples, attributes);

            TreeNode root = new TreeNode(bestAttribute);

            DataTable aSample = samples.Clone();

            foreach (string value in bestAttribute.values)
            {
                // Select all the elements with this value of the attribute
                aSample.Rows.Clear();

                DataRow[] rows = samples.Select(bestAttribute.AttributeName + " = " + "'" + value + "'");

                foreach (DataRow row in rows)
                {
                    aSample.Rows.Add(row.ItemArray);
                }
                // End: select all the elements with this value of the attribute

                // Create a new attribute list without the current (best) attribute
                ArrayList aAttributes = new ArrayList(attributes.Length - 1);
                for (int i = 0; i < attributes.Length; i++)
                {
                    if (attributes[i].AttributeName != bestAttribute.AttributeName)
                        aAttributes.Add(attributes[i]);
                }
                // End: create a new attribute list without the current (best) attribute

                if (aSample.Rows.Count == 0)
                {
                    return new TreeNode(new Attribute(getMostCommonValue(aSample, targetAttribute)));
                }
                else
                {
                    DecisionTreeID3 dc3 = new DecisionTreeID3();
                    TreeNode ChildNode = dc3.mountTree(aSample, targetAttribute, (Attribute[])aAttributes.ToArray(typeof(Attribute)));
                    root.AddTreeNode(ChildNode, value);
                }
            }

            return root;
        }


        /// <summary>
        /// Builds a decision tree based on the given samples
        /// </summary>
        /// <param name="samples">Table with the samples used to build the tree</param>
        /// <param name="targetAttribute">Name of the table column that holds the class value
        /// of each sample</param>
        /// <returns>The root of the built decision tree</returns>
        public TreeNode mountTree(DataTable samples, string targetAttribute, Attribute[] attributes)
        {
            mSamples = samples;
            return internalMountTree(mSamples, targetAttribute, attributes);
        }
    }

    /// <summary>
    /// Class that demonstrates the use of ID3
    /// </summary>
    class ID3Sample
    {

        public static void printNode(TreeNode root, string tabs)
        {
            Console.WriteLine(tabs + '|' + root.attribute + '|');

            if (root.attribute.values != null)
            {
                for (int i = 0; i < root.attribute.values.Length; i++)
                {
                    Console.WriteLine(tabs + "\t" + "<" + root.attribute.values[i] + ">");
                    TreeNode childNode = root.getChildByBranchName(root.attribute.values[i]);
                    printNode(childNode, "\t" + tabs);
                }
            }
        }


        static DataTable getDataTable()
        {
            DataTable result = new DataTable("samples");
            DataColumn column = result.Columns.Add("buying");
            column.DataType = typeof(string);

            column = result.Columns.Add("maint");
            column.DataType = typeof(string);

            column = result.Columns.Add("doors");
            column.DataType = typeof(string);

            column = result.Columns.Add("persons");
            column.DataType = typeof(string);

            column = result.Columns.Add("lugboot");
            column.DataType = typeof(string);

            column = result.Columns.Add("safety");
            column.DataType = typeof(string);

            column = result.Columns.Add("result");
            column.DataType = typeof(string);

            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "small", "high", "acc" });
            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "med", "low", "unacc" });
            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "med", "med", "unacc" });
            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "med", "high", "acc" });
            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "big", "low", "unacc" });
            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "big", "med", "acc" });
            result.Rows.Add(new object[] { "vhigh", "med", "2", "4", "big", "high", "acc" });
            result.Rows.Add(new object[] { "med", "low", "2", "4", "big", "high", "vgood" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "small", "low", "unacc" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "small", "med", "unacc" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "small", "high", "unacc" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "med", "low", "unacc" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "med", "med", "acc" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "med", "high", "good" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "big", "low", "unacc" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "big", "med", "good" });
            result.Rows.Add(new object[] { "med", "low", "2", "more", "big", "high", "vgood" });

            return result;

        }

        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        /// 
        [STAThread]
        static void Main(string[] args)
        {

            Attribute buying = new Attribute("buying", new string[] { "vhigh", "high", "med", "low" });
            Attribute maint = new Attribute("maint", new string[] { "vhigh", "high", "med", "low" });
            Attribute doors = new Attribute("doors", new string[] { "2", "3", "4", "5more" });
            Attribute persons = new Attribute("persons", new string[] { "2", "4", "more" });
            Attribute lugboot = new Attribute("lugboot", new string[] { "small", "med", "big" });
            Attribute safety = new Attribute("safety", new string[] { "low", "med", "high" });

            Attribute[] attributes = new Attribute[] { buying, maint, doors, persons, lugboot, safety };

            DataTable samples = getDataTable();

            DecisionTreeID3 id3 = new DecisionTreeID3();
            TreeNode root = id3.mountTree(samples, "result", attributes);

            printNode(root, "");

        }
    }

}
Mark Rotteveel
Rafael Cardoso
  • possible duplicate of [What is a NullReferenceException and how do I fix it?](http://stackoverflow.com/questions/4660142/what-is-a-nullreferenceexception-and-how-do-i-fix-it) – Dmitry Jan 23 '15 at 01:52
  • And now we should debug your program instead of you ? Narrow your problem to the specific code snippet. – mybirthname Jan 23 '15 at 02:17

1 Answer

Your stack trace has the information you need:

AA.TreeNode..ctor(Attribute attribute) in c:\VS\Program.cs:line 117

'ctor' is fancy compiler slang for constructor, so we'll take a look at TreeNode(Attribute).

This likely means that attribute (the argument to the constructor) is null. How did that happen? Since we have all the code out in front of us, let's focus on the one call that doesn't explicitly create a new Attribute object to pass to the TreeNode constructor:

Attribute bestAttribute = getBestAttribute(samples, attributes);
TreeNode root = new TreeNode(bestAttribute);

getBestAttribute returns null if none of the attributes has a gain greater than zero, or if there are no attributes. To fix the exception, you'll need to fix either the input or the logic of that method.
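One way to handle this is sketched below. It is a minimal illustration under assumptions, not the original author's code: `Id3GuardSketch`, `Entropy`, `PickBest`, `MostCommon` and `Decide` are hypothetical names, and plain collections stand in for the `DataTable`. It shows two things. First, a guard for empty subsets: in the posted `calcEntropy`, an attribute value that matches no rows gives a total of 0, the division `(double)0 / 0` evaluates to `NaN`, and a `NaN` gain never passes the `aux > maxGain` test. Second, a fallback to a leaf holding the most common label when no attribute is selected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; none of these names come from the posted code.
public static class Id3GuardSketch
{
    // Entropy over class counts. An empty subset returns 0 instead of NaN.
    public static double Entropy(params int[] counts)
    {
        int total = counts.Sum();
        if (total == 0)
            return 0.0; // unguarded, (double)0 / 0 evaluates to double.NaN

        double result = 0.0;
        foreach (int count in counts)
        {
            if (count == 0)
                continue; // the term 0 * log2(0) is treated as 0
            double p = (double)count / total;
            result -= p * Math.Log(p, 2);
        }
        return result;
    }

    // Same selection rule as the posted getBestAttribute: the result stays
    // null when no gain is strictly greater than zero (a NaN gain never is).
    public static string PickBest(IDictionary<string, double> gains)
    {
        string best = null;
        double maxGain = 0.0;
        foreach (KeyValuePair<string, double> pair in gains)
        {
            if (pair.Value > maxGain)
            {
                maxGain = pair.Value;
                best = pair.Key;
            }
        }
        return best;
    }

    // Most frequent label; on a tie, the label seen first wins.
    public static string MostCommon(IEnumerable<string> labels)
    {
        return labels.GroupBy(label => label)
                     .OrderByDescending(group => group.Count())
                     .First().Key;
    }

    // Null guard: fall back to a leaf when no attribute can be chosen.
    public static string Decide(IDictionary<string, double> gains, IList<string> labels)
    {
        string best = PickBest(gains);
        if (best == null)
            return "leaf:" + MostCommon(labels);
        return "split:" + best;
    }
}
```

In the posted code, an equivalent guard would probably sit between the call to `getBestAttribute` and `new TreeNode(bestAttribute)`, returning a leaf built from `getMostCommonValue(samples, targetAttribute)` when `bestAttribute` is null.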

Steve Howard