So I'm in the middle of writing a decision tree program. Let's say I have a dataset of 1000 instances. As I understand it, with 10-fold cross-validation I split the dataset into 900/100 groups, each time using a different 900-instance set to build the tree and the remaining 100 instances to test it.
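To make the procedure I have in mind concrete, here's a minimal sketch of the splitting I described above. It uses scikit-learn's `KFold` and `DecisionTreeClassifier` purely for illustration (my own program doesn't use them), and the data is just a placeholder:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Placeholder dataset of 1000 instances, just for illustration.
X = np.random.rand(1000, 5)
y = np.random.randint(0, 2, size=1000)

kf = KFold(n_splits=10, shuffle=True, random_state=0)
fold_errors = []
for train_idx, test_idx in kf.split(X):
    # 900 instances build the tree, the remaining 100 test it.
    tree = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
    pred = tree.predict(X[test_idx])
    fold_errors.append(1 - accuracy_score(y[test_idx], pred))

print("cross-validated error estimate:", np.mean(fold_errors))
```

This gives me 10 different trees and an averaged error, which is exactly where my questions start.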
What I don't understand is:

1. Which tree do I use as my final decision tree? (Choosing the one with the least error doesn't seem like a good option, because I guess its low error could just be due to over-fitting.)
2. Is cross-validation used only to estimate the error of the final tree?
3. I've found several different cross-validation algorithms; some use the same splitting criterion and some use different ones in order to choose the best tree. Can you point me to a good source of information so I can figure out exactly what I need, or explain it yourself?
Thank you!