Calculating Correlations Between Every Item In A List
I'm trying to calculate the Pearson correlation between every item in my list. I'm trying to get the correlations between data[0] and data[1], data[0] and data[2], and
Solution 1:
range yields a single int on each iteration, but your code tries to unpack each one as if it were a tuple. Try itertools.combinations instead:
from scipy import stats
from itertools import combinations

data = [[1, 2, 4], [9, 5, 1], [8, 3, 3]]

def pearson(x, y):
    # x and y are indices into data
    series1 = data[x]
    series2 = data[y]
    if x != y:
        return stats.pearsonr(series1, series2)

# combinations needs an iterable of indices, not the bare length
h = [pearson(x, y) for x, y in combinations(range(len(data)), 2)]
Or as @Marius suggested:
h = [stats.pearsonr(data[x], data[y]) for x, y in combinations(range(len(data)), 2)]
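The list h above loses track of which pair each result belongs to. A minimal sketch of one way to keep that association follows. Note that pearson_r here is a hypothetical pure-Python stand-in for scipy.stats.pearsonr so the snippet runs without SciPy, and it returns only the coefficient, not the p-value.

```python
from itertools import combinations
from math import sqrt

data = [[1, 2, 4], [9, 5, 1], [8, 3, 3]]

def pearson_r(a, b):
    # Hypothetical stand-in for scipy.stats.pearsonr;
    # returns only the coefficient r, not the p-value.
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Key each coefficient by the pair of indices it was computed from.
pair_r = {(i, j): pearson_r(data[i], data[j])
          for i, j in combinations(range(len(data)), 2)}

for pair, r in pair_r.items():
    print(pair, round(r, 4))
```

With the real library you would probably swap pearson_r for stats.pearsonr and store the (r, p-value) result it returns.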
Solution 2:
Why not use numpy.corrcoef? It computes the whole correlation matrix in one call, treating each row of data as one variable.
import numpy as np
data = [[1, 2, 4], [9, 5, 1], [8, 3, 3]]
Result:
>>> np.corrcoef(data)
array([[ 1.        , -0.98198051, -0.75592895],
       [-0.98198051,  1.        ,  0.8660254 ],
       [-0.75592895,  0.8660254 ,  1.        ]])
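The matrix is symmetric with ones on the diagonal, so the distinct pairwise values sit in the upper triangle. A small sketch of pulling them out by index pair, assuming NumPy is installed (pair_r is just an illustrative name):

```python
import numpy as np

data = [[1, 2, 4], [9, 5, 1], [8, 3, 3]]

# corr[i, j] is the correlation between data[i] and data[j].
corr = np.corrcoef(data)

# Row and column indices of the upper triangle, skipping the diagonal (k=1).
rows, cols = np.triu_indices(len(data), k=1)

# Map each index pair to its coefficient.
pair_r = {(int(i), int(j)): float(corr[i, j]) for i, j in zip(rows, cols)}
print(pair_r)
```

Unlike scipy.stats.pearsonr, this gives only the coefficients, with no p-values.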
Solution 3:
The range() function gives you only a single int on each iteration, and you can't unpack a single int into a pair of names.
If you want to go through every possible pair of ints in that range, you could try:
import itertools
h = [pearson(x,y) for x,y in itertools.product(range(len(data)), repeat=2)]
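That produces every ordered pair of indices in the given range as a 2-tuple, including pairs where x == y.
Remember that with the pearson function you defined, you will get None whenever x == y. To avoid that, you could use permutations, which skips x == y but still yields both (x, y) and (y, x):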
import itertools
h = [pearson(x,y) for x,y in itertools.permutations(range(len(data)), 2)]
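To see how the three itertools helpers mentioned in these answers differ, here is a quick sketch that counts the index pairs each one produces. The variable names are illustrative, and range(3) stands in for range(len(data)) with three series.

```python
from itertools import combinations, permutations, product

indices = range(3)  # stands in for range(len(data)) with three series

all_pairs = list(product(indices, repeat=2))      # includes (x, x) and both orders
ordered_pairs = list(permutations(indices, 2))    # no (x, x), but both orders
unordered_pairs = list(combinations(indices, 2))  # each pair once, x < y

print(len(all_pairs), len(ordered_pairs), len(unordered_pairs))
```

Since Pearson correlation is symmetric, combinations is probably the one to reach for when each pair should be computed only once.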