Why do np.corrcoef(x) and df.corr() give different results?
Why are the NumPy and pandas correlation coefficient matrices different when computed with np.corrcoef(x) and df.corr()?
import numpy as np
import pandas as pd
x = np.array([[0, 2, 7], [1, 1, 9], [2, 0, 13]]).T
x_df = pd.DataFrame(x)
print("matrix:")
print(x)
print()
print("df:")
print(x_df)
print()
print("np correlation matrix: ")
print(np.corrcoef(x))
print()
print("pd correlation matrix: ")
print(x_df.corr())
print()
This gives me the output:
matrix:
[[ 0  1  2]
 [ 2  1  0]
 [ 7  9 13]]

df:
   0  1   2
0  0  1   2
1  2  1   0
2  7  9  13

np correlation matrix:
[[ 1.         -1.          0.98198051]
 [-1.          1.         -0.98198051]
 [ 0.98198051 -0.98198051  1.        ]]

pd correlation matrix:
          0         1         2
0  1.000000  0.960769  0.911293
1  0.960769  1.000000  0.989743
2  0.911293  0.989743  1.000000
I'm guessing they are different types of correlation coefficients?
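One way to test that guess is to check which axis each function treats as the variables. The sketch below is not part of the original post. It relies on np.corrcoef defaulting to rowvar=True (each row is a variable) and on DataFrame.corr computing pairwise Pearson correlations between columns by default. The names np_by_columns and pd_by_columns are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
x = np.array([[0, 2, 7], [1, 1, 9], [2, 0, 13]]).T
x_df = pd.DataFrame(x)

# np.corrcoef defaults to rowvar=True, so each ROW is treated as a variable.
# DataFrame.corr correlates COLUMNS, so ask NumPy to do the same.
np_by_columns = np.corrcoef(x, rowvar=False)
pd_by_columns = x_df.corr().to_numpy()

# Compare the two matrices element-wise, within floating-point tolerance.
print(np.allclose(np_by_columns, pd_by_columns))

# Transposing the input is an equivalent way to correlate columns.
print(np.allclose(np.corrcoef(x.T), pd_by_columns))
```

If both checks print True, the two functions are computing the same kind of coefficient (Pearson), and the difference in the question comes only from rows versus columns being treated as the variables.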
from Recent Questions - Stack Overflow https://ift.tt/2KYnk2O