Obviously the naive solution is to perform binary search on each row (or column) which will result in a runtime complexity of O (n log n).
The main problem I see in directly adapting binary search to the whole matrix is that there is no complete ordering. This makes it hard to define a "split element" where all elements to the "left" are smaller and all elements to the "right" are larger.
My approach for a direct adaptation would be closer to a spatial index like a quad-tree. The basic observation is the following: For each sub-matrix of the original matrix you can define bounds of the elements by looking at the top-left (lower bound) and bottom-right (upper-bound) element.
Now you can basically perform depth first search, recursively splitting the matrix in 4 sub matrices, computing upper and lower bounds for each sub matrix and discarding or exploring it depending if the key is within its bounds.