Given a parameter space and the task of finding an optimum, grid search is probably the easiest thing you can do: discretize the parameter space, check all combinations by brute force, and return the parameter combination that yielded the best result.
This works, but as you can imagine, it does not scale well: the number of combinations grows exponentially with the number of parameters, so for high-dimensional optimization problems it is simply not feasible.
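As a concrete sketch, here is what such a brute-force grid search might look like in Python. The objective function and the two parameters are made up purely for illustration:

```python
import itertools

# Hypothetical objective with a known peak at (0.3, 0.7); replace with your own.
def objective(a, b):
    return -(a - 0.3) ** 2 - (b - 0.7) ** 2

# Discretize each parameter axis into candidate values.
a_values = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
b_values = [i / 10 for i in range(11)]

# Brute force: evaluate every combination and keep the best one.
best_params, best_score = None, float("-inf")
for a, b in itertools.product(a_values, b_values):
    score = objective(a, b)
    if score > best_score:
        best_params, best_score = (a, b), score

print(best_params, best_score)  # (0.3, 0.7) with score 0.0
```

With 11 candidate values per axis, two parameters already mean 121 evaluations, and d parameters mean 11^d, which is exactly why this breaks down in high dimensions.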
Strategies to improve on this depend on what additional information you have. In the best case you are optimizing a smooth and differentiable function, and then you can use numerical optimization.
Numerical optimization routines exploit the fact that the gradient of a function points in the direction of steepest ascent. So if you want to increase the function value, you follow the gradient a little bit, and you will improve as long as the gradient is not zero and the step is small enough.
This powerful concept is exploited in most of scipy's routines. This way you can optimize high-dimensional functions by exploiting the additional information you get about the neighborhood of your current position.
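For illustration, here is a minimal sketch of gradient-based minimization with scipy.optimize.minimize. The toy quadratic objective and its gradient are invented for this example; note that scipy minimizes, so you would negate a function you want to maximize:

```python
import numpy as np
from scipy.optimize import minimize

# Toy smooth objective: a quadratic bowl with its minimum at (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

# Its gradient, supplied so the optimizer can follow the steepest direction.
def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)])

result = minimize(f, x0=np.zeros(2), jac=grad_f, method="BFGS")
print(result.x)  # close to [1.0, -2.0]
```

Passing the gradient via `jac` spares the optimizer from approximating it by finite differences, which matters once the dimension gets large.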
So if you do not have a smooth and differentiable function, scipy's gradient-based routines cannot be used.
Note that exploiting the information in the neighborhood of your current parameter vector works in non-smooth optimization as well. The idea is the same: you check a window around your current estimate and try to improve by finding a better value inside that window.
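A minimal sketch of such a window-based search, assuming a toy non-smooth objective (absolute-value kinks, where gradients are unreliable) and a simple shrinking-window rule; this is one possible variant, not a prescribed algorithm:

```python
import itertools

# Non-smooth toy objective: |.| kinks, with its minimum at (0.5, -1.0).
def objective(x, y):
    return abs(x - 0.5) + abs(y + 1.0)

def pattern_search(start, step=1.0, tol=1e-6):
    current, best = start, objective(*start)
    while step > tol:
        # Probe a window of neighbors around the current estimate.
        neighbors = [
            (current[0] + dx, current[1] + dy)
            for dx, dy in itertools.product((-step, 0.0, step), repeat=2)
        ]
        candidate = min(neighbors, key=lambda p: objective(*p))
        if objective(*candidate) < best:
            current, best = candidate, objective(*candidate)
        else:
            step /= 2.0  # no neighbor improves: shrink the window

    return current, best

print(pattern_search((0.0, 0.0)))  # near (0.5, -1.0) with value ~0.0
```

Each iteration either moves to the best point in the window or shrinks the window, so the search homes in on a local optimum without ever touching a gradient.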