Not sure this will help. It depends on how often you do this operation, and also on how big and how deep your tree is.
But basically, my suggestion, if you need to accelerate this, would be to precompute everything for every node; then you can do it numpy-style:
```python
import numpy as np  # assuming tree is the (n, 2) array of [parent, value] rows

preList = [tree]
idx = tree[:, 0]
while (preList[-1][:, 0] != 0).any():
    preList.append(preList[-1][idx])
pre = np.stack(preList)

# Values of the 6th node
pre[:, 5][:, 1]
# array([7, 3, 4, 6])
```
Note that it will always give 4 values, repeating the root value if needed. But you can stop at the first entry of pre[:,5][:,0] that is 0 (root).
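For concreteness, here is a self-contained sketch on a hypothetical 6-node tree (the node values are made up for illustration; rows are [parent index, value], and the root's parent index points back to itself):

```python
import numpy as np

# Hypothetical 6-node tree (values made up for illustration):
# row i = [parent index, value]; row 0 is the root and its parent
# index points back to itself.
tree = np.array([
    [0, 7],  # node 0: root
    [0, 3],  # node 1: child of 0
    [1, 4],  # node 2: child of 1
    [1, 6],  # node 3: child of 1
    [2, 5],  # node 4: child of 2
    [2, 9],  # node 5: child of 2
])

preList = [tree]
idx = tree[:, 0]
while (preList[-1][:, 0] != 0).any():
    preList.append(preList[-1][idx])
pre = np.stack(preList)  # shape (depth, n_nodes, 2)

# Ancestor-chain values of node 5 (the node itself first):
print(pre[:, 5, 1])  # [9 4 3]

# A shallow node repeats the root value once its chain is exhausted:
print(pre[:, 1, 1])  # [3 7 7]

# Trim the repetition: keep rows up to the first whose parent is 0.
stop = int(np.argmax(pre[:, 1, 0] == 0)) + 1
print(pre[:stop, 1, 1])  # [3]
```

The trim works because once a chain reaches a row whose parent index is 0, every subsequent level just re-reads row 0.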
This is just the same thing you are doing (from a row #i = [j, v], fetching the parent row #j), only done once for all nodes, to get a 3D matrix whose first axis is the ancestor axis.
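For comparison, the per-request walk being replaced presumably looks something like this (a sketch, assuming the same hypothetical [parent, value] row layout; `ancestor_values` is a name I made up):

```python
import numpy as np

# Same hypothetical tree as above: row i = [parent index, value].
tree = np.array([[0, 7], [0, 3], [1, 4], [1, 6], [2, 5], [2, 9]])

def ancestor_values(tree, i):
    """Walk parent pointers from row i toward the root, one request at a time."""
    values = [int(tree[i, 1])]
    while tree[i, 0] != 0:
        i = int(tree[i, 0])
        values.append(int(tree[i, 1]))
    return values

print(ancestor_values(tree, 5))  # [9, 4, 3]
```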
Note that if the timing of your current algorithm were unbearable, meaning your tree is very deep, then chances are mine would suffer from heavy memory usage, since my 3D matrix has the size of your 2D matrix times the tree depth.
For CPU usage though, even if, from a strict number-of-operations point of view, all I do is precompute your algorithm for every node, it is probably worth it even when you don't need that much computation, because numpy array indexing is obviously faster.
With a tree as small as yours, it takes 46 such requests before my method is cheaper than yours (46 requests to absorb the cost of precomputation), which is not great, considering that you have only 6 nodes.
But for a 13-node tree, precomputation takes 76 µs, your code needs 3.12 µs/request, mine 350 ns. So the number of requests before it pays off drops to 27, still more than the number of nodes (13).
For a 27-node tree, precomputation takes 84 µs, your code needs 3.81 µs/request, mine still 350 ns. So precomputation becomes profitable from 24 requests on.
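If you want to estimate the break-even point on your own tree, a rough timing harness could look like this (a sketch on the hypothetical tree from above; `precompute`, `walk` and the measured numbers are illustrative, and timings will vary by machine):

```python
import numpy as np
from timeit import timeit

# Hypothetical tree: row i = [parent index, value].
tree = np.array([[0, 7], [0, 3], [1, 4], [1, 6], [2, 5], [2, 9]])

def precompute(tree):
    """Build the (depth, n_nodes, 2) ancestor matrix in one go."""
    preList = [tree]
    idx = tree[:, 0]
    while (preList[-1][:, 0] != 0).any():
        preList.append(preList[-1][idx])
    return np.stack(preList)

def walk(tree, i):
    """Per-request parent-pointer walk (the approach being replaced)."""
    values = [int(tree[i, 1])]
    while tree[i, 0] != 0:
        i = int(tree[i, 0])
        values.append(int(tree[i, 1]))
    return values

pre = precompute(tree)

t_pre  = timeit(lambda: precompute(tree), number=1000) / 1000
t_walk = timeit(lambda: walk(tree, 5), number=1000) / 1000
t_req  = timeit(lambda: pre[:, 5, 1], number=1000) / 1000

# Break-even: smallest k where t_pre + k*t_req < k*t_walk
# (guard against a non-positive denominator on tiny trees).
k = t_pre / max(t_walk - t_req, 1e-12)
print(f"precompute={t_pre:.2e}s walk={t_walk:.2e}s request={t_req:.2e}s break-even~{k:.0f}")
```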
In CPU time, precomputation is O(n log n), your code is O(log n) per request, and my request is O(1). So, for k requests, that is O(n log n + k) on my side and O(k log n) on yours, which become equivalent when k ~ n. As I said, it is just like calling your code on all possible nodes.
But because of numpy's efficiency, precomputation costs less than calling your code n times. So it pays off even when k is smaller than n.