[04. Area Chart] 005. Ridgeline Plot

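A ridgeline plot shows the distribution of a numeric variable for several groups. It looks like a series of histograms or density curves stacked on a shared horizontal axis, with neighboring rows partially overlapping.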
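Each ridge in the script below is drawn with sns.kdeplot, i.e. a kernel density estimate of one group's values. As a rough sketch of what such a curve is, the following standard-library code evaluates a Gaussian KDE by hand. The function name gaussian_kde, the sample temperatures, and the grid are all made up for illustration, and the bandwidth uses Scott's rule scaled by bw_adjust, which is assumed here to mirror seaborn's default behavior rather than reproduce it exactly.

```python
import math


def gaussian_kde(samples, grid, bw_adjust=1.0):
    """Evaluate a 1-D Gaussian kernel density estimate at each point of `grid`.

    Bandwidth (assumption): Scott's rule, std * n**(-1/5), scaled by bw_adjust.
    """
    n = len(samples)
    mean = sum(samples) / n
    # Sample standard deviation (divides by n - 1)
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    h = bw_adjust * std * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))
    # Sum one Gaussian bump per sample at every grid point
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in samples)
        for g in grid
    ]


# Hypothetical daily mean temperatures (degrees Celsius) for one group
samples = [2.0, 3.5, 4.0, 4.5, 5.0, 6.5, 7.0]
step = 0.1
grid = [i * step for i in range(-200, 301)]  # -20 .. 30

density = gaussian_kde(samples, grid)
smoother = gaussian_kde(samples, grid, bw_adjust=2.0)

# Riemann-sum approximation of the area under the curve
area = sum(density) * step
print(round(area, 3), max(density) > max(smoother))
```

A larger bw_adjust widens every bump, so the curve becomes smoother and its peak lower. A ridgeline plot repeats this estimate once per group and offsets the curves vertically.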

 

 

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

if __name__ == '__main__':
    sns.set_theme(style="white", rc={"axes.facecolor": (0, 0, 0, 0)})
    # Load the Seattle weather data from plotly's GitHub datasets repository
    temp = pd.read_csv(
        'https://raw.githubusercontent.com/plotly/datasets/master/2016-weather-data-seattle.csv')
    # Store the month number in a separate column
    temp['month'] = pd.to_datetime(temp['Date']).dt.month
    month_dict = {1: 'january',
                  2: 'february',
                  3: 'march',
                  4: 'april',
                  5: 'may',
                  6: 'june',
                  7: 'july',
                  8: 'august',
                  9: 'september',
                  10: 'october',
                  11: 'november',
                  12: 'december'}
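    # Replace month numbers with month names
    temp['month'] = temp['month'].map(month_dict)
    # Compute each month's mean temperature and attach it to every row (used for coloring)
    month_mean_serie = temp.groupby('month')['Mean_TemperatureC'].mean()
    temp['mean_month'] = temp['month'].map(month_mean_serie)
    print(temp.head())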
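    # One color per month; hue='mean_month' assigns them in order of monthly mean,
    # so colder months come out bluer and warmer months redder
    pal = sns.color_palette(palette='coolwarm', n_colors=12)
    # One wide, short facet row per month
    g = sns.FacetGrid(temp, row='month', hue='mean_month', aspect=15, height=0.75, palette=pal)
    # Filled density curve for each month
    g.map(sns.kdeplot, 'Mean_TemperatureC', bw_adjust=1, clip_on=False, fill=True, alpha=1, linewidth=1.5)
    # White outline so overlapping ridges stay distinguishable
    g.map(sns.kdeplot, 'Mean_TemperatureC', bw_adjust=1, clip_on=False, color="w", lw=2)
    # Baseline for each ridge
    g.map(plt.axhline, y=0, lw=2, clip_on=False)
    # Negative vertical spacing makes the rows overlap
    g.fig.subplots_adjust(hspace=-0.3)
    # Remove y labels, facet titles, y ticks, and spines
    g.set_ylabels('')
    g.set_titles('')
    g.set(yticks=[])
    g.despine(bottom=True, left=True)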

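    # Label each ridge with its month name in the ridge's own color.
    # This relies on the rows being in calendar order, which holds here
    # because the data start in January.
    for i, ax in enumerate(g.axes.flat):
        ax.text(-15, 0.02, month_dict[i + 1], fontweight='bold', fontsize=15, color=ax.lines[-1].get_color())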

    plt.xlabel('Temperature in degree Celsius', fontweight='bold', fontsize=15)
    g.fig.suptitle('Daily average temperature in Seattle per month', fontsize=20, fontweight='bold')
    plt.show()


Output
       Date  Max_TemperatureC  ...    month  mean_month
0  1/1/1948                10  ...  january    4.493982
1  1/2/1948                 6  ...  january    4.493982
2  1/3/1948                 7  ...  january    4.493982
3  1/4/1948                 7  ...  january    4.493982
4  1/5/1948                 7  ...  january    4.493982
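The hand-written month_dict in the script could also be generated from Python's standard calendar module. This is only a sketch of an alternative, not what the script does: calendar.month_name follows the active locale, so English month names are an assumption here.

```python
import calendar

# calendar.month_name[0] is an empty string, so the months start at index 1.
# Assumption: the active locale yields English month names.
month_dict = {i: calendar.month_name[i].lower() for i in range(1, 13)}

print(month_dict[1], month_dict[12])
```

The resulting dictionary can be passed to temp['month'].map(month_dict) exactly as before.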

 
