Wondering Why scipy.spatial.distance.sqeuclidean Is Twice as Slow as numpy.sum((y1-y2)**2)

September 29, 2023

Here is my code:

import numpy as np
import time
from scipy.spatial import distance

y1 = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
y2 = np.array([0., 0.1, 0., 0., 0.7, 0.2, 0., 0., 0., 0.])

Solution 1:

Here is a more comprehensive comparison (credit to @Divakar's benchit package):

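import numpy as np
from scipy.spatial import distance

# SciPy's squared Euclidean distance
def m1(y1, y2):
    return distance.sqeuclidean(y1, y2)

# Plain NumPy: sum of squared differences
def m2(y1, y2):
    return np.sum((y1 - y2) ** 2)

# Benchmark inputs: pairs of random vectors of increasing length
in_ = {n: [np.random.rand(n), np.random.rand(n)] for n in [10, 100, 1000, 10000, 20000]}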

SciPy becomes relatively more efficient as the arrays grow. For small arrays, the fixed overhead of calling the function most likely outweighs its benefit. According to the source code, SciPy computes np.dot(y1-y2, y1-y2).
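To make the overhead argument concrete, here is a minimal sketch of a function that validates its inputs before taking the dot product of the difference. `sqeuclidean_sketch` is a hypothetical stand-in written for illustration, not SciPy's actual code, and the validation steps are assumptions about the kind of per-call work a general-purpose library function tends to do.

```python
import numpy as np

def sqeuclidean_sketch(u, v):
    """Hypothetical stand-in: validate inputs, then dot the difference with itself."""
    # Per-call bookkeeping (assumed): convert to float arrays and check shapes.
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    if u.ndim != 1 or v.ndim != 1:
        raise ValueError("inputs must be 1-D")
    if u.shape != v.shape:
        raise ValueError("inputs must have the same length")
    # The actual arithmetic: one subtraction and one dot product.
    d = u - v
    return np.dot(d, d)

y1 = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
y2 = np.array([0.0, 0.1, 0.0, 0.0, 0.7, 0.2, 0.0, 0.0, 0.0, 0.0])

# Both forms compute the same squared Euclidean distance.
print(np.isclose(sqeuclidean_sketch(y1, y2), np.sum((y1 - y2) ** 2)))
```

For a 10-element vector the arithmetic is tiny, so the conversion and shape checks can plausibly account for a large share of the call time. As the vectors grow, that fixed cost matters less.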

If you want an even faster solution, call np.dot directly and skip the overhead of the extra lines and the additional function call:

# Direct NumPy: dot product of the difference with itself
def m3(y1, y2):
    y_d = y1 - y2
    return np.dot(y_d, y_d)
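The original comparison used benchit. If you want to check the trend on your own machine, a rough sketch with the standard library's `timeit` could look like the following. The helper `time_per_call` and the function names `sum_sq` and `dot_diff` are made up for this example, and the absolute numbers will depend on your hardware and NumPy build.

```python
import timeit
import numpy as np

def sum_sq(y1, y2):
    # Sum of squared differences
    return np.sum((y1 - y2) ** 2)

def dot_diff(y1, y2):
    # Dot product of the difference with itself
    d = y1 - y2
    return np.dot(d, d)

def time_per_call(func, y1, y2, number=1000, repeat=5):
    """Return the best-of-`repeat` average time per call, in seconds."""
    timer = timeit.Timer(lambda: func(y1, y2))
    return min(timer.repeat(repeat=repeat, number=number)) / number

rng = np.random.default_rng(0)
results = {}
for n in [10, 100, 1000, 10000]:
    y1, y2 = rng.random(n), rng.random(n)
    results[n] = {f.__name__: time_per_call(f, y1, y2) for f in (sum_sq, dot_diff)}

# Print per-call times in microseconds for each array length.
for n, row in results.items():
    print(n, {name: f"{t * 1e6:.2f} us" for name, t in row.items()})
```

To include SciPy in the comparison, pass `distance.sqeuclidean` (after `from scipy.spatial import distance`) to `time_per_call` in the same way.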
