If you are trying to maximize the height of a tree while minimizing the average depth of all the nodes of the tree, the unambiguous best shape would be an "umbrella" shape, e.g. a full binary tree with k nodes and height = lg k, where 0 < k < n, along with a single path, or "tail", of n-k nodes coming out of one of the leaves of the full part. The height of this tree is roughly lg k + n - k.
Now let's compute the total depth of all the nodes. The sum of the depths of the nodes of the full part is sum[ j * 2^j ], where the sum is taken from j=0 to j=lg k. By some algebra, the dominant term of the result is 2k lg k.
Next, the sum of the depths of the tail part is given by sum[i + lg k], where the sum is taken from i=0 to i=n-k. By some algebra, the result is approximately (n-k)lg k + (1/2)(n-k)^2.
Hence, summing the two parts above together and dividing by n, the average depth of all the nodes is (1 + k/n) lg k + (n-k)^2 / (2n). Note that because 0 < k < n, the first term here is O(lg n) no matter what k we choose. Hence, we need only make sure the second term is O(lg n). To do so, we require that (n-k)^2 = O(n lg n), or k = n - O(sqrt(n lg n)). With this choice, the height of the tree is
lg k + n - k = O( sqrt(n lg n) )
this is asymptotically larger than the ordinary O(lg n), and is asymptotically the tallest you can make the tree while keeping the average depth to be O(lg n)