# Using Python to compute the correlation between second-hand housing price indices in first-tier cities

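Python offers many ways to compute correlation: scipy ships with its own statistical tools, and pandas makes correlation analysis across multiple variables very convenient. Today we'll walk through how to use both.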

## 1. Data collection

```
# Beijing
bj = [100.3,100.4,99.9,100.1,99.8,99.9,100.1,100,99.6,99.5,99.3,99.2,99.1,99.8,100.2,100.4,99.9,100.2,100.3,100.3,100.1,100,100.3,101,101,102.2,103.1,102,101.7,101.3,101.4,101.2,101.3,101.1,101.2,100.6,99.9,100,100.2,99.8,99.1,98.7,99.2,99.1,98.6,100.3,100.7,100.2,100,99.9,100.5,102.1,104.3,102.3,102.6,102,101.4,101.1,101.4,101.7,102.3,103.2,106.3,103.7,102.3,101.4,101.6,103.9,105.7,101.1,100.2,100.2,100.8,101.3,102.2,100,99.1,98.9,99.2,99.1,99.4,99.5,99.5,99.6,99.4,99.5,99.8,99.9,100.3,100.1,100.4,100,99.8,99.8,99.4,99.8,99.9,100.2,100.4,100.6,100,100,99.7,99.6,99.5,99.4]
# Guangzhou
gz = [101.2,100.6,99.5,101,99.8,100.1,100.2,100.7,100.6,99.5,99.2,99.6,99.6,99.6,99.8,99.6,99.9,100.5,100.7,100.9,100.6,100.4,100.5,100.5,100.4,101.7,101.5,100.7,101.1,100.9,101,101,100.4,101,101.2,100.6,101,100.3,100.2,100.7,100.1,99.7,98.9,98.6,98.7,100,100,100.2,99.8,99.7,100,101.1,102.3,101.8,101.3,101,101.2,101.1,100.7,101,101.3,101.2,103.5,102.6,101.9,101.6,101.4,102.8,103.3,101.6,100.8,101.3,101.6,102.7,103.3,101,100.5,100.8,100.1,100,100.2,99.7,100.1,99.6,99.9,100.2,100.2,100.5,101,100.3,100.3,100.6,100.2,99.8,99.7,99.6,99.7,99.8,99.5,99.6,99.7,100,100.4,100,99.7,99.9]
# Shanghai
sh = [100.5,100.4,100.4,100.6,100.2,100.2,100.3,100.1,100.1,99.8,99.5,99.6,99.3,99.7,99.5,100.1,100.3,100.2,100.2,100.3,100.2,100.2,100.2,100.4,100.8,101.6,102.6,101.3,100.9,101.1,100.8,100.8,101,100.9,100.7,100.5,100.1,100.6,100.2,100,99.8,99.3,99.1,99.3,99.2,100,100,100.4,100.3,100.1,100,100.6,102.2,101.2,101.6,101.1,101,100.8,101,101.2,102.7,105.3,106.2,102.5,101.4,102.2,102,103.7,103.4,100.3,99.8,99.5,99.6,100.2,100.7,100.8,100,99.9,99.6,99.8,99.9,100.3,99.7,99.9,100.1,99.6,99.4,99.8,99.7,99.7,99.9,99.9,99.8,99.8,99.9,99.7,100,99.9,100.3,100.5,100.1,99.9,100.4,100,100.6,99.8]
# Shenzhen
sz = [100.6,102.6,100.6,100.5,100.3,100,99.5,100,99.8,100,99.2,99.6,99.2,100,100.1,100,100,100.2,100.2,100.1,100.1,100.4,100.3,100.6,100.5,101.4,102.3,101.1,101,101.3,101,101.6,101.3,100.9,100.8,100.7,100.8,100.8,101.1,100.1,100.2,99.4,99.4,99.5,99.3,100,100.4,100.7,100.6,100.3,100.5,102.4,106.3,106.9,105.3,104.4,103.3,101,101.9,103.3,105.7,103.3,104.7,99.6,100,100.8,101.8,102,101.8,99.4,99.3,99.8,99.9,99.3,100.3,100.8,100.3,99.7,100.6,99.8,99.9,100.4,100.1,100.4,100.9,101.3,100.7,100.2,100.8,100.3,100.6,101.1,100,99.4,99.8,99.7,99.7,100.5,100.7,101.1,100,99.9,100.7,100.2,101.3,101]
```

## 2. Preparation

```
pip install scipy
pip install pandas
```

## 3. Writing the code

### 3.1 Computing correlation with scipy

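Computing correlation with scipy is really quite simple. Import the package's stats module:

```
import scipy.stats as stats
```

Then pass the two lists to `pearsonr`:

```
# Compute the correlation between the Guangzhou and Shenzhen second-hand housing price indices
print(stats.pearsonr(gz, sz))
```

Output:

```
F:\push\20191130>python 1.py
(0.4673289851643741, 4.4100775485723706e-07)
```

The first value is the Pearson correlation coefficient and the second is the two-sided p-value.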
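To make it clearer what `pearsonr` reports as its first value, here is a minimal sketch of the Pearson coefficient written out by hand with only the standard library. The function name `pearson_r` is made up for this illustration, and the two short lists are just the first five values of `gz` and `sz` from section 1, so the printed number will not match the full-series result.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper for illustration; not part of scipy or pandas.
    Raises ZeroDivisionError if either sequence is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each value from its own mean
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    # Sum of products of deviations, divided by the product of their norms
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return cov / norm


# First five values of gz and sz, used only to keep the example short
gz_head = [101.2, 100.6, 99.5, 101, 99.8]
sz_head = [100.6, 102.6, 100.6, 100.5, 100.3]
print(pearson_r(gz_head, sz_head))
```

Calling `pearson_r(gz, sz)` on the full lists should agree with the first value returned by `stats.pearsonr(gz, sz)` up to floating-point rounding.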

### 3.2 Computing all pairwise correlations at once with pandas

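```
import pandas as pd

# One column per city: Beijing, Shanghai, Guangzhou, Shenzhen
df = pd.DataFrame()
df['北京'] = bj
df['上海'] = sh
df['广州'] = gz
df['深圳'] = sz

# Pairwise correlation matrix (Pearson by default)
print(df.corr())
```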
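With more than a handful of columns, the correlation matrix gets hard to read by eye. One possible way to pick out the column pair with the lowest coefficient is sketched below. The frame `toy` and its columns `a`, `b`, `c` are invented for illustration and are not the housing data; "lowest" here means the most negative value, which is the same as the weakest relationship only when all coefficients are positive.

```python
from itertools import combinations

import pandas as pd

# Toy data, invented for illustration only (not the housing indices)
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [1, 2, 3, 4, 6],
    "c": [5, 3, 4, 1, 2],
})

# Pearson correlation for every unordered pair of columns
pairs = {
    (left, right): toy[left].corr(toy[right])
    for left, right in combinations(toy.columns, 2)
}

# Pair with the lowest (most negative) coefficient
lowest = min(pairs, key=pairs.get)
print(lowest, pairs[lowest])
```

Swapping `toy` for the `df` built from the four city lists would apply the same ranking to the city pairs.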

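Wow, it looks like Shenzhen's second-hand housing prices really are in a league of their own. And judging from the chart below, Shenzhen's second-hand housing prices have indeed diverged from Beijing's.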
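A single coefficient over the whole period cannot show when two series start to diverge. One way to look at that, offered here only as a rough sketch, is a rolling correlation over a sliding window. The toy columns `a` and `b` and the window of 4 are arbitrary choices for illustration: `b` tracks `a` in the first half and moves against it in the second half.

```python
import pandas as pd

# Toy data, invented for illustration only (not the housing indices)
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [1, 2, 3, 4, 4, 3, 2, 1],
})

# Pearson correlation of a and b over each window of 4 consecutive rows;
# the first 3 entries are NaN because the window is not yet full
rolling_corr = toy["a"].rolling(window=4).corr(toy["b"])
print(rolling_corr)
```

On the real data the same call would look like `df['北京'].rolling(window=12).corr(df['深圳'])`, where the window of 12 observations is just an example value.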

