
I have two questions. First, I'm trying to print my array, which contains 1004 elements, but only the first 29 elements are printed before the output jumps to element 974 and continues from there. How can I print the full array of 1004 elements?

This is my code:

paired_data = []
for x in data:
    closest, ignored = pairwise_distances_argmin_min(x, result)
    paired_data.append([x, result[closest]])
#print paired_data
S = pd.DataFrame(paired_data, columns=['x','center'])
print S
# distance
Y = pdist(S, 'euclidean')
print Y

Second, I want to calculate the distance between the two elements in each row of the array. For example:

0 [5, 4] [3, 2]

1 [22, -10] [78, 90]

I want to calculate the Euclidean distance between [5, 4] and [3, 2], and so on for every remaining row of the array.

Micheal
  • Welcome to Stack Overflow! To keep things from getting confusing, can you please edit your question so it only has one question in it, then ask another question for your second question? – David Wolever Mar 03 '15 at 19:57
  • I can't ask two questions on the same day, sorry! – Micheal Mar 03 '15 at 19:58
  • Oh dang, I didn't realize that was a restriction! Well, check out this answer which addresses your first question: http://stackoverflow.com/questions/1987694/print-the-full-numpy-array – David Wolever Mar 03 '15 at 19:59
  • Have you tried `from scipy.spatial import distance`? – Joran Beasley Mar 03 '15 at 19:59
  • You might also find http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html helpful. – Joran Beasley Mar 03 '15 at 20:01
  • I will search for a `from scipy.spatial import distance` example. If you already have one, I would appreciate it. – Micheal Mar 03 '15 at 20:11
  • I edited my code to calculate the distance with `from scipy.spatial import distance`, but it didn't work. – Micheal Mar 03 '15 at 20:34
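
For a plain NumPy array, the approach from the linked answer could look roughly like the sketch below. It assumes a hypothetical 1004-element array named `arr` (not the asker's actual data) and relies on NumPy summarizing arrays with more than 1000 elements by default.

```python
import sys
import numpy as np

# Hypothetical stand-in for the 1004-element array from the question.
arr = np.arange(1004)

# With default print options, arrays above 1000 elements are summarized with "...".
summary = np.array2string(arr)

# Raising the threshold above the array size disables the summary.
full = np.array2string(arr, threshold=sys.maxsize)

# The same setting can be applied temporarily to ordinary print calls.
with np.printoptions(threshold=sys.maxsize):
    print(arr)
```

The context manager restores the previous print options on exit, so other output in the same session is unaffected.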

1 Answer


Another solution to #1:

print(S.to_string())    # print the entire table
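
As a rough illustration of why this works, here is a minimal sketch using a hypothetical 1004-row frame in place of the asker's `S`. It also shows `pd.option_context` as an alternative that lifts the row limit only temporarily; the column contents are made up for the example.

```python
import pandas as pd

# Hypothetical 1004-row frame standing in for S from the question.
S = pd.DataFrame({"x": range(1004), "center": range(1004)})

# to_string() renders every row regardless of the display settings.
full_text = S.to_string()

# Alternatively, lift the row limit only inside a with-block.
with pd.option_context("display.max_rows", None):
    untruncated = repr(S)

# One header line plus one line per row.
print(len(full_text.splitlines()))
```

`to_string()` is convenient for a one-off print, while `option_context` may suit code that prints several frames in a row.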

And to get the distances:

# assumes Python 3
from functools import partial

def dist(row, col1, col2):
    # n-dimensional Euclidean distance between the points in row[col1] and row[col2]
    return sum((c2 - c1) ** 2 for c1, c2 in zip(row[col1], row[col2])) ** 0.5

# compose a one-argument function (name the columns it applies to)
s_dist = partial(dist, col1="x", col2="center")
# apply it to every row
S["dist"] = S.apply(s_dist, axis=1)
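
If every row holds points of the same dimension, the same row-wise distance could probably also be computed in one vectorized step with NumPy. The sketch below is an alternative to the `apply` call, not a replacement endorsed by the answer; it rebuilds `S` from the two example rows in the question and assumes the `x` and `center` columns hold equal-length lists.

```python
import numpy as np
import pandas as pd

# Example rows taken from the question.
S = pd.DataFrame({"x": [[5, 4], [22, -10]], "center": [[3, 2], [78, 90]]})

# Stack the per-row lists into (n_rows, n_dims) arrays.
x = np.array(S["x"].tolist(), dtype=float)
center = np.array(S["center"].tolist(), dtype=float)

# Row-wise Euclidean norm of the differences.
S["dist"] = np.linalg.norm(x - center, axis=1)
print(S)
```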
Hugh Bothwell
  • @Micheal: I think it should work as is, but I do not have anything but 3.4 on this machine to test with. – Hugh Bothwell Mar 03 '15 at 20:35
  • If I want to square (^2) each of these distances (1004 distance values) and then sum all the squares, how can I do that? – Micheal Mar 03 '15 at 20:45
  • @Micheal: it is an n-dimensional Euclidean distance. If you want sum of squares, try `sq_sum = (S["dist"]**2).sum()`. – Hugh Bothwell Mar 03 '15 at 20:50
  • So why do you multiply it by ** 0.5? – Micheal Mar 03 '15 at 22:30
  • `**` is exponentiation, not multiplication, so `x ** 0.5` is square root of x. – Hugh Bothwell Mar 03 '15 at 22:47
  • Why am I getting this error when running the distance code above: ValueError: Wrong number of items passed 2, placement implies 1 – Micheal Mar 04 '15 at 16:10
  • @Micheal: Without seeing your exact code, it's hard to tell, but at a guess you've changed the `s_dist = partial` line (which takes a 3-argument function and returns a one-argument function) and then tried to pass it two arguments. – Hugh Bothwell Mar 04 '15 at 16:41
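
The points raised in these comments (what `partial` returns, what `** 0.5` does, and how to sum the squared distances) can be illustrated without pandas. This is a minimal sketch using the two example rows from the question as plain dicts; the `rows` list is a stand-in for the DataFrame and is an assumption made for the example.

```python
from functools import partial

def dist(row, col1, col2):
    # Sum of squared differences, then ** 0.5 takes the square root.
    return sum((c2 - c1) ** 2 for c1, c2 in zip(row[col1], row[col2])) ** 0.5

# partial fixes col1 and col2, leaving a function of one argument (the row).
s_dist = partial(dist, col1="x", col2="center")

# Example rows from the question, as plain dicts.
rows = [
    {"x": [5, 4], "center": [3, 2]},
    {"x": [22, -10], "center": [78, 90]},
]
dists = [s_dist(row) for row in rows]

# Squaring each distance undoes the square root; then add them up.
sq_sum = sum(d ** 2 for d in dists)
print(dists, sq_sum)
```

Calling `s_dist` with more than the single row argument would raise a `TypeError`, since the two column names are already bound.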