
Wondering Why Scipy.spatial.distance.sqeuclidean Is Twice Slower Than Numpy.sum((y1-y2)**2)

Here is my code:

import numpy as np
import time
from scipy.spatial import distance

y1=np.array([0,0,0,0,1,0,0,0,0,0])
y2=np.array([0. , 0.1, 0. , 0. , 0.7, 0.2, 0. , 0. , 0. , 0. ])

Solution 1:

Here is a more comprehensive comparison (credit to @Divakar's benchit package):

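import numpy as np
from scipy.spatial import distance

# SciPy's squared Euclidean distance
def m1(y1,y2):
  return distance.sqeuclidean(y1,y2)

# Plain NumPy: square the elementwise difference, then sum
def m2(y1,y2):
  return np.sum((y1-y2)**2)

# Benchmark inputs: one pair of random vectors per array length n
in_ = {n:[np.random.rand(n), np.random.rand(n)] for n in [10,100,1000,10000,20000]}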


SciPy becomes relatively more efficient as the arrays grow. For small arrays, the fixed per-call overhead of the wrapper function most likely outweighs its benefit. According to the source, SciPy computes np.dot(y1-y2, y1-y2).
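To check this trend on your own machine, here is a minimal sketch that uses the standard-library timeit module rather than benchit. The helper name time_call, the array sizes, and the repeat count are illustrative assumptions, and absolute timings will vary with hardware and library versions.

```python
import timeit

import numpy as np
from scipy.spatial import distance


def time_call(func, y1, y2, number=1000):
    """Return the mean seconds per call of func(y1, y2) (hypothetical helper)."""
    return timeit.timeit(lambda: func(y1, y2), number=number) / number


def numpy_sum(y1, y2):
    # Same expression as in the question title
    return np.sum((y1 - y2) ** 2)


rng = np.random.default_rng(0)
for n in (10, 1000, 20000):
    y1, y2 = rng.random(n), rng.random(n)
    t_scipy = time_call(distance.sqeuclidean, y1, y2)
    t_numpy = time_call(numpy_sum, y1, y2)
    print(f"n={n:>6}  scipy={t_scipy:.2e} s  numpy={t_numpy:.2e} s")
```

Comparing the two columns across sizes shows whether the fixed call overhead or the per-element work dominates on your setup.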

If you want an even faster solution, call np.dot directly and skip the extra function-call overhead of the SciPy wrapper:

def m3(y1,y2):
  y_d = y1-y2
  return np.dot(y_d,y_d)
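Before swapping one form for the other, it is worth confirming that they agree numerically. The sketch below does that on the two vectors from the question; the function names sq_sum and sq_dot are chosen here for illustration only.

```python
import numpy as np

# The two vectors from the question
y1 = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
y2 = np.array([0.0, 0.1, 0.0, 0.0, 0.7, 0.2, 0.0, 0.0, 0.0, 0.0])


def sq_sum(a, b):
    # Square the elementwise difference, then sum
    return np.sum((a - b) ** 2)


def sq_dot(a, b):
    # Dot product of the difference with itself
    d = a - b
    return np.dot(d, d)


# Compare with a tolerance, since the two forms may round differently
print(np.isclose(sq_sum(y1, y2), sq_dot(y1, y2)))
```

Both forms compute the same sum of squared differences, so any difference between them is floating-point rounding only.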

