Drawing Attractive Analytical Charts in Python with matplotlib


Preface

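A good analyst still needs a few tricks for making charts look polished, so that the results are something to be proud of when shared. Today's tip walks through the common chart types and how to draw each of them.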

Loading the datasets

First, load the datasets. We use the same two as before: Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity. (To get the datasets, reply "plot" to the official account.)

# Import the commonly used packages
import pandas as pd
import numpy as np
import seaborn as sns

# Jupyter magic: render figures inline in the notebook
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
plt.style.use('fivethirtyeight')

# Needed for displaying Chinese characters on a Mac
from matplotlib.font_manager import FontProperties

# List the styles available in the local matplotlib installation
print(plt.style.available)
# Pick one of the available styles; ggplot looks good, so it is used here
# (it is applied on top of the fivethirtyeight style set above)
mpl.style.use(['ggplot'])

# Output on the author's machine (the list depends on the matplotlib version):
# ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']

# Load the datasets

# Dataset 1: Salary_Ranges_by_Job_Classification
salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')

# Dataset 2: GlobalLandTemperaturesByCity
climate = pd.read_csv('./data/GlobalLandTemperaturesByCity.csv')
# Drop rows with missing values
climate.dropna(axis=0, inplace=True)
# Convert dt to a datetime and extract the year (note the use of map)
climate['dt'] = pd.to_datetime(climate['dt'])
climate['year'] = climate['dt'].map(lambda value: value.year)
# Keep only China; .copy() avoids a SettingWithCopyWarning on the next assignment
climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x / 100 + 1))
climate.head()
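
The year and century columns can also be derived without `map`. The following is a minimal sketch on a small made-up frame (the rows are invented for illustration and do not come from the dataset); it assumes only the `dt` and `Country` column names used above.

```python
import pandas as pd

# Made-up sample rows that mimic the columns used above (illustrative only)
sample = pd.DataFrame({
    'dt': ['1899-12-01', '1900-01-01', '2013-08-01'],
    'Country': ['China', 'China', 'Denmark'],
    'AverageTemperature': [3.2, 1.5, 17.0],
})

sample['dt'] = pd.to_datetime(sample['dt'])
# The .dt accessor extracts the year for the whole column at once
sample['year'] = sample['dt'].dt.year

# .copy() makes the subset an independent frame, so adding a column is safe
china = sample.loc[sample['Country'] == 'China'].copy()
# Same formula as int(x / 100 + 1), written with integer division
china['Century'] = china['year'] // 100 + 1

print(china[['year', 'Century']])
```

Note that this formula assigns the year 1900 to century 20, exactly as the original `map` version does.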


Line chart

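A line chart is one of the simpler charts, and there is not much to optimize; a pleasing color is enough. A table of color names found online is a useful place to pick one from.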
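
The color names can also be looked up from matplotlib itself. As a small sketch, `matplotlib.colors.CSS4_COLORS` holds the named colors, `to_hex` resolves a name to its hex code, and `is_color_like` checks whether a string is a usable color:

```python
import matplotlib.colors as mcolors

# CSS4_COLORS maps every named color (e.g. 'lime') to its hex code
named = sorted(mcolors.CSS4_COLORS)
print(len(named), 'named colors, for example:', named[:5])

# to_hex resolves any valid color name to a '#rrggbb' string
lime_hex = mcolors.to_hex('lime')
print('lime ->', lime_hex)

# is_color_like is a quick check before passing a name to a plot call
print(mcolors.is_color_like('steelblue'), mcolors.is_color_like('not-a-color'))
```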


# Select part of the Shanghai temperature data
df1 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shanghai') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .set_index('dt')
df1.head()


# Line chart
df1.plot(color=['lime'])  # the keyword is color, not colors
plt.title('AverageTemperature Of ShangHai')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()


The chart above has a single line. Several lines can be drawn as well; just add more columns.

# Multi-line chart: one column per city
df1 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shanghai') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'SH'})
df2 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Tianjin') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'TJ'})
df3 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shenyang') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'SY'})
# Merge the three frames on the date
df123 = df1.merge(df2, how='inner', on=['dt'])\
    .merge(df3, how='inner', on=['dt'])\
    .set_index(['dt'])
df123.head()


# Multi-line chart
df123.plot()
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
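
The one-column-per-city table built with repeated merges can also be produced in a single step with `pivot`. This is a sketch on made-up rows (the values are invented; only the column names `dt`, `City` and `AverageTemperature` are taken from the code above), and it assumes each date and city pair appears only once:

```python
import pandas as pd

# Made-up long-format rows with the same column names as the climate data
long_df = pd.DataFrame({
    'dt': pd.to_datetime(['2010-01-01', '2010-01-01', '2010-01-01',
                          '2010-02-01', '2010-02-01', '2010-02-01']),
    'City': ['Shanghai', 'Tianjin', 'Shenyang'] * 2,
    'AverageTemperature': [4.0, -3.0, -12.0, 6.0, 0.5, -7.0],
})

cities = ['Shanghai', 'Tianjin', 'Shenyang']
wide = (
    long_df[long_df['City'].isin(cities)]
    # One row per date, one column per city
    .pivot(index='dt', columns='City', values='AverageTemperature')
    .rename(columns={'Shanghai': 'SH', 'Tianjin': 'TJ', 'Shenyang': 'SY'})
)
print(wide)
```

Calling `wide.plot()` would then draw one line per column, as in the merged version.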


Pie chart

Next comes the pie chart, where there is more to tune, such as how far the slices are pulled apart. We start with a "basic" pie chart.

# Sum the numeric columns per SetID (numeric_only skips the text columns)
df1 = salary_ranges.groupby('SetID').sum(numeric_only=True)


# "Basic" pie chart
df1['Step'].plot(kind='pie', figsize=(7, 7),
                 autopct='%1.1f%%',
                 shadow=True)
plt.axis('equal')
plt.show()


# "Upgraded" pie chart
# Slice colors, one per slice; other options: 'pink', 'purple', 'grey', 'gold'
colors = ['lightgreen', 'lightblue']
# How far each slice is pulled out, one value per slice; larger means further apart
explode = [0, 0.2]

df1['Step'].plot(kind='pie', figsize=(7, 7),
                 autopct='%1.1f%%', startangle=90,
                 shadow=True, labels=None, pctdistance=1.12, colors=colors, explode=explode)
plt.axis('equal')
plt.legend(labels=df1.index, loc='upper right', fontsize=14)
plt.show()
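
The `colors` and `explode` lists above are hard-coded for two slices. As a hedged sketch, the `explode` list can instead be computed so that only the largest slice is pulled out, whatever the number of slices. The `totals` series below is made up for illustration, and the `Agg` backend is selected only so the sketch runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up totals standing in for the grouped data (illustrative only)
totals = pd.Series([120, 45, 30], index=['A', 'B', 'C'], name='Step')

# Pull out only the largest slice; every other slice stays in place
largest = totals.idxmax()
explode = [0.2 if label == largest else 0 for label in totals.index]

fig, ax = plt.subplots(figsize=(7, 7))
result = ax.pie(totals, explode=explode, autopct='%1.1f%%',
                startangle=90, pctdistance=1.12)
wedges = result[0]  # the first returned item holds the wedge patches
ax.axis('equal')
# Passing the wedges explicitly ties each legend entry to its slice
ax.legend(wedges, list(totals.index), loc='upper right', fontsize=14)
plt.close(fig)
```

In a notebook, replace `plt.close(fig)` with `plt.show()` to display the figure.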


Scatter plot

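There is less to tune in a scatter plot. The colors of the ggplot style already look good; as the saying goes, pick a good style and you save a lot of work!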

# Select part of the Shanghai and Shenyang temperature data
df1 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shanghai') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'SH'})

df2 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shenyang') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'SY'})
# Merge the two frames on the date
df12 = df1.merge(df2, how='inner', on=['dt'])
df12.head()


# Scatter plot
df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
plt.title('Average Temperature Between ShangHai - ShenYang')
plt.xlabel('ShangHai')
plt.ylabel('ShenYang')
plt.show()


Area chart

# Prepare the data for three cities, one column per city
df1 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shanghai') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'SH'})
df2 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Tianjin') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'TJ'})
df3 = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shenyang') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .rename(columns={'AverageTemperature': 'SY'})
# Merge the three frames on the date
df123 = df1.merge(df2, how='inner', on=['dt'])\
    .merge(df3, how='inner', on=['dt'])\
    .set_index(['dt'])
df123.head()


# Area colors, one per column; other options: 'lightgreen', 'lightblue', 'purple', 'grey', 'gold'
colors = ['red', 'pink', 'blue']
df123.plot(kind='area', stacked=False,
           figsize=(20, 10), color=colors)  # the keyword is color, not colors
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()


Histogram

# Select part of the Shanghai temperature data
df = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shanghai') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .set_index('dt')
df.head()


# The simplest histogram
df['AverageTemperature'].plot(kind='hist', figsize=(8, 5), color='grey')  # color, not colors
plt.title('ShangHai AverageTemperature Of 2010-2013')  # add a title to the histogram
plt.ylabel('Number of month')  # add y-label
plt.xlabel('AverageTemperature')  # add x-label
plt.show()
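
To see the numbers behind the bars, the same binning can be reproduced with `np.histogram`. The sketch below uses made-up monthly values (not taken from the dataset) and 10 equal-width bins, which is the default bin count of the pandas `hist` plot:

```python
import numpy as np

# Made-up monthly averages in degrees Celsius, for illustration only
temps = np.array([4.0, 6.0, 10.0, 15.0, 20.0, 24.0,
                  28.0, 28.0, 24.0, 19.0, 13.0, 6.0])

# 10 equal-width bins spanning the minimum to the maximum value
counts, edges = np.histogram(temps, bins=10)

# Each bar of the histogram is the number of months falling in one bin
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f'{left:5.1f} to {right:5.1f}: {n} month(s)')
```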


Bar chart

# Select part of the Shanghai temperature data
df = climate.loc[(climate['Country'] == 'China') & (climate['City'] == 'Shanghai') & (climate['dt'] >= '2010-01-01')]\
    .loc[:, ['dt', 'AverageTemperature']]\
    .set_index('dt')
df.head()


# Vertical bar chart
df.plot(kind='bar', figsize=(10, 6))
plt.xlabel('Month')
plt.ylabel('AverageTemperature')
plt.title('AverageTemperature of shanghai')
plt.show()


# Horizontal bar chart
df.plot(kind='barh', figsize=(12, 16), color='steelblue')
plt.xlabel('AverageTemperature')
plt.ylabel('Month')
plt.title('AverageTemperature of shanghai')
plt.show()
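
With a datetime index, bar charts label every bar with a full timestamp, which can crowd the axis. One possible tweak, sketched here on made-up values, is to convert the index to short `YYYY-MM` strings before plotting. The `monthly` frame is hypothetical, and the `Agg` backend is used only so the sketch runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly values with a datetime index (illustrative only)
monthly = pd.DataFrame(
    {'AverageTemperature': [4.0, 6.0, 10.0]},
    index=pd.to_datetime(['2010-01-01', '2010-02-01', '2010-03-01']),
)
# Replace the full timestamps with short 'YYYY-MM' strings
monthly.index = monthly.index.strftime('%Y-%m')

ax = monthly.plot(kind='bar', figsize=(10, 6), color='steelblue', legend=False)
ax.set_xlabel('Month')
ax.set_ylabel('AverageTemperature')
ax.set_title('AverageTemperature of shanghai')
plt.close(ax.figure)
```

In a notebook, replace the last line with `plt.show()` to display the chart.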

