I'm not sure your example demonstrates a bug in the nan_policy
parameter: it refers to NaNs in the inputs, not the outputs, and there
are no NaNs in your input.
You're getting nan because your samples are too short for meaningful statistics. Technically you are probably right, though: a p-value should always be finite, so it is a bug.
That said, unless I totally misunderstand what Spearman's rank correlation coefficient is, the function does return wrong p-values, e.g.:
>>> stats.spearmanr(np.arange(4.),np.arange(4.))
SpearmanrResult(correlation=1.0, pvalue=0.0)
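Having four samples with the same rank order really isn't that unlikely: under the null hypothesis all 4! = 24 orderings are equally likely, so the chance of a perfect match is 1/24 (about 0.04), or 2/24 if a perfectly reversed order counts as equally extreme. Either way it is not 0.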
Edit: The above smells to me like they are using an approximation formula for the distribution of the rank correlation coefficient, one which doesn't work too well for small n.

So what can you do? If your n is small, don't use this function. Sorry, I can't be more constructive; you could compute the distribution of the rank correlation coefficient by brute force and then calculate the p-value yourself. If your actual samples are large you're probably fine, but I would cross-check a few examples against some other stats software.
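
For what it's worth, the brute-force route could look roughly like the sketch below. It is only an illustration, not what scipy does internally: the function names (ranks, spearman_rho, exact_spearman_pvalue) are made up for this answer, it assumes there are no ties in either sample, and it enumerates all n! permutations, so it is only practical for small n (up to about 9 or 10).

```python
from itertools import permutations
from math import factorial


def ranks(values):
    """Return ranks 1..n for the given values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(rx, ry):
    """Spearman's rho from two rank vectors (no-ties formula)."""
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


def exact_spearman_pvalue(x, y):
    """Two-sided exact p-value: the fraction of all n! rank orderings
    whose |rho| is at least as large as the observed |rho|."""
    rx, ry = ranks(x), ranks(y)
    observed = spearman_rho(rx, ry)
    # Small tolerance so floating-point noise does not drop exact ties.
    hits = sum(
        1
        for perm in permutations(ry)
        if abs(spearman_rho(rx, perm)) >= abs(observed) - 1e-12
    )
    return observed, hits / factorial(len(rx))


rho, p = exact_spearman_pvalue([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])
print(rho, p)
```

For the four-point example above this gives rho = 1.0 with a two-sided p-value of 2/24 (about 0.083) rather than 0, since only the identical and the fully reversed ordering are that extreme.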