I ran a grid search to find the optimal number of nodes in my deep network, using AUROC as the measure of optimality. Say that 100 nodes in the first hidden layer gives the highest AUROC, 0.7. When I add a second hidden layer, can I assume that 100 nodes in the first hidden layer will still give the best model? I would rather not grid-search the second hidden layer while also varying the first, because the number of combinations, and with it the run time, grows exponentially as I add hidden layers.

g00glechr0me
  • "Can I assume that when adding a second hidden layer that having 100 nodes in the first hidden layer will lead to the best model?" -> **Of course not!** There is so much to say about this wrong assumption, but I just recommend you read a basic introduction to neural networks (given some nonlinearities). I'm also pretty sure our world would be very different if this assumption were true (NN architecture optimization in polynomial time, which for me is an implication of your assumption). – sascha Sep 02 '16 at 01:46
  • Ahh I see. So then other than doing a grid search, how would I go about doing this in an efficient manner? – g00glechr0me Sep 02 '16 at 02:02
  • This is more of a http://stats.stackexchange.com/ question, as it is a pure machine learning question and not a software engineering question. – Jules G.M. Sep 02 '16 at 19:06
  • Also, giving us a better description of your model would allow us to give you a better answer. – Jules G.M. Sep 02 '16 at 19:07
  • http://stackoverflow.com/questions/10565868/multi-layer-perceptron-mlp-architecture-criteria-for-choosing-number-of-hidde?rq=1 – Jules G.M. Sep 02 '16 at 19:12

1 Answer

"Of course not! There is so much to say about this wrong assumption, but I just recommend you read a basic introduction to neural networks (given some nonlinearities). I'm also pretty sure our world would be very different if this assumption were true (NN architecture optimization in polynomial time, which for me is an implication of your assumption)." Courtesy of sascha. I don't mean to plagiarize your answer; I just want others to see this as the correct answer!
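
To make the point concrete, here is a minimal sketch of why the single-layer optimum need not carry over. Everything in it is an assumption for illustration: `toy_auroc` is a made-up stand-in for "train the network and return validation AUROC", and `WIDTHS` is a hypothetical list of candidate widths; none of the numbers are real results. It compares tuning layer by layer against searching both widths jointly, and also shows random search over joint configurations, which is one common way to cap the budget instead of running the full grid.

```python
import itertools
import random

WIDTHS = [25, 50, 100, 200]  # hypothetical candidate layer widths


def toy_auroc(n1, n2=0):
    """Made-up stand-in for training a network and returning its AUROC.

    n2 == 0 means "no second hidden layer". The surface is invented so
    that the best first-layer width changes once a second layer exists.
    """
    if n2 == 0:
        # One hidden layer: peaks at 100 nodes with AUROC 0.70.
        return 0.70 - abs(n1 - 100) / 1000
    # Two hidden layers: peaks at (50, 50) with AUROC 0.80.
    return 0.80 - abs(n1 - 50) / 1000 - abs(n2 - 50) / 1000


# 1) Tune the first layer on its own (the situation in the question).
best_n1 = max(WIDTHS, key=toy_auroc)

# 2) Greedy: freeze that width, then tune only the second layer.
greedy = (best_n1, max(WIDTHS, key=lambda n2: toy_auroc(best_n1, n2)))

# 3) Joint grid: try every (n1, n2) pair. Cost grows as len(WIDTHS) ** layers.
grid = list(itertools.product(WIDTHS, WIDTHS))
joint = max(grid, key=lambda cfg: toy_auroc(*cfg))

# 4) Random search: score a fixed budget of joint configurations instead.
rng = random.Random(0)  # seeded so the run is repeatable
samples = rng.sample(grid, 8)
sampled_best = max(samples, key=lambda cfg: toy_auroc(*cfg))

print("single-layer best n1:", best_n1)
print("greedy (n1, n2):", greedy, round(toy_auroc(*greedy), 3))
print("joint  (n1, n2):", joint, round(toy_auroc(*joint), 3))
print("random (n1, n2):", sampled_best, round(toy_auroc(*sampled_best), 3))
```

On this invented surface the greedy choice keeps 100 nodes in the first layer and ends up below the joint optimum, which uses a different first-layer width. With a real model you would replace `toy_auroc` with your own training and evaluation routine; whether greedy tuning loses much in practice depends on the data and architecture.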

g00glechr0me