Data Visualization with Seaborn (Part #3)

⭐️ Part #3 of a 3-Part Series


Continuing from Part 1 and Part 2 of my seaborn series, we'll now cover plots for data in three or more dimensions.


This notebook is a reorganization of the many ideas shared in this GitHub repo and this blog post. What you see here is a modified version that works for me and that I hope will work for you as well. Also, enjoy the cat GIFs.


seaborn visualizations

3D: Visualizing Data in Three Dimensions

Visualizing data in up to two dimensions is quite straightforward. Things get more complicated as the number of dimensions (or attributes) increases, because we are bound by the two dimensions of our display mediums and our environment.

For 3D data, we can introduce a fake notion of depth by adding a z-axis to the chart, or we can leverage subplots and facets. For data with more than three dimensions, however, a z-axis alone is no longer enough.

The best way to go higher than 3D is to combine plot facets, color, shapes, sizes, depth and so on. You can also use time as a dimension by animating a plot of the other attributes over time.

For the following plot, we'll use color (i.e. hue) as the third dimension to represent wine_type.

# Attributes of interest
cols = ['density', 
        'residual sugar', 
        'total sulfur dioxide', 
        'fixed acidity', 
        'wine_type']
        
pp = sns.pairplot(data=wines[cols], 
                  hue='wine_type', # <== 😀 Look here!
                  height=1.8, aspect=1.8, # 'size' was renamed to 'height' in seaborn 0.9
                  palette={"red": "#FF9999", "white": "#FFE888"},
                  plot_kws=dict(edgecolor="black", linewidth=0.5))
fig = pp.fig 
fig.subplots_adjust(top=0.93, wspace=0.3)
fig.suptitle('Wine Attributes Pairwise Plots', fontsize=14)

output_114_0


3D: Three Continuous Numeric Attributes

[💔] The traditional way — using matplotlib:

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')

xs = wines['residual sugar']
ys = wines['fixed acidity']
zs = wines['alcohol']
ax.scatter(xs, ys, zs, s=50, alpha=0.6, edgecolors='w')

ax.set_xlabel('Residual Sugar')
ax.set_ylabel('Fixed Acidity')
ax.set_zlabel('Alcohol')

plt.show()

output_117_0

[💚] The better alternative: stay in 2D (still plain matplotlib here) and encode the third attribute as marker size via the s parameter:

plt.scatter(x = wines['fixed acidity'], 
            y = wines['alcohol'], 
            s = wines['residual sugar']*25, # <== 😀 Look here!
            alpha=0.4, 
            edgecolors='w')

plt.xlabel('Fixed Acidity')
plt.ylabel('Alcohol')
plt.title('Wine Alcohol Content - Fixed Acidity - Residual Sugar', y=1.05)

output_118_1
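
Since matplotlib reads s as the marker area in points squared, multiplying the raw values by a constant (the *25 above) can let a few large values dominate the chart. As a rough sketch, and not something from the original notebook, a hypothetical helper scale_marker_sizes could min-max rescale any numeric column into a fixed area range. The toy values are made up for illustration.

```python
import numpy as np

def scale_marker_sizes(values, min_area=20.0, max_area=400.0):
    """Linearly rescale values into [min_area, max_area] (area in points^2).

    Hypothetical helper for illustration; not part of matplotlib or seaborn.
    """
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal: give every marker the midpoint area.
        return np.full(values.shape, (min_area + max_area) / 2.0)
    normalized = (values - values.min()) / span
    return min_area + normalized * (max_area - min_area)

# Made-up residual sugar values, only to show the mapping.
toy_sugar = [1.0, 2.0, 6.0, 11.0]
sizes = scale_marker_sizes(toy_sugar)
print(sizes)
```

The resulting array can be passed directly as plt.scatter(..., s=sizes).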


3D: Three Discrete Categorical Attributes

Using catplot() (called factorplot() before seaborn 0.9):

fc = sns.catplot(x="quality", 
                 hue="wine_type", 
                 col="quality_label", # <== 😀 Look here!
                 data=wines, 
                 kind="count",
                 palette={"red": "#FF9999", "white": "#FFE888"})
  1. The attribute quality is represented via the x-axis.
  2. The attribute wine_type is represented by the color.
  3. The attribute quality_label is split into 3 columns: low, medium, and high.

output_121_0
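
The count plot above is essentially a picture of a three-way frequency table. Here is a minimal sketch of the same tally in pandas, using a tiny invented DataFrame in place of wines (the rows are made up and do not reflect the real dataset):

```python
import pandas as pd

# Invented toy rows standing in for the real wines DataFrame.
toy = pd.DataFrame({
    "quality":       [5, 5, 6, 6, 7, 7, 7],
    "quality_label": ["low", "low", "medium", "medium",
                      "medium", "medium", "medium"],
    "wine_type":     ["red", "white", "red", "white",
                      "white", "white", "red"],
})

# One row per (panel, x-axis position, color) combination, with its bar height.
counts = (
    toy.groupby(["quality_label", "quality", "wine_type"])
       .size()
       .rename("count")
       .reset_index()
)
print(counts)
```

Each row of counts corresponds to one bar: quality_label picks the panel, quality the x position, and wine_type the color.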


3D Mixed Attributes (Numeric & Categorical)

Using sns.pairplot():

# Plot pairwise relationships in a dataset.
jp = sns.pairplot(data=wines, 
                  x_vars=["sulphates"], 
                  y_vars=["alcohol"], 
                  height=4.5, # 'size' was renamed to 'height' in seaborn 0.9
                  hue="wine_type", # <== 😀 Look here!
                  palette={"red": "#FF9999", "white": "#FFE888"},
                  plot_kws=dict(edgecolor="k", linewidth=0.5))
  1. The attribute sulphates is represented via the x-axis.
  2. The attribute alcohol is represented via the y-axis.
  3. The attribute wine_type is represented by the color.

output_124_0


Using sns.lmplot() to fit linear regression models to the scatter plots:

# Plot data and regression model fits across a FacetGrid.
lp = sns.lmplot(data=wines,
                x='sulphates', 
                y='alcohol', 
                hue='wine_type', # <== 😀 Look here!
                palette={"red": "#FF9999", "white": "#FFE888"},
                fit_reg=True, # <== 😀 Look here!
                legend=True,
                scatter_kws=dict(edgecolor="k", linewidth=0.5))

output_125_0
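
With hue set, lmplot() fits one regression line per wine type. A rough numpy sketch of that idea, on invented points rather than the real data, is to fit a degree-1 polynomial per group. This mirrors the concept only and is not how seaborn computes or draws its fits (by default seaborn also adds a bootstrapped confidence band).

```python
import numpy as np
import pandas as pd

# Invented points; the real wines data will give different numbers.
toy = pd.DataFrame({
    "sulphates": [0.4, 0.5, 0.6, 0.7, 0.4, 0.5, 0.6, 0.7],
    "alcohol":   [9.0, 9.5, 10.0, 10.5, 11.0, 10.8, 10.6, 10.4],
    "wine_type": ["red"] * 4 + ["white"] * 4,
})

fits = {}
for wine_type, group in toy.groupby("wine_type"):
    # Least-squares straight line: alcohol = slope * sulphates + intercept
    slope, intercept = np.polyfit(group["sulphates"], group["alcohol"], deg=1)
    fits[wine_type] = (slope, intercept)
    print(f"{wine_type}: alcohol = {slope:.2f} * sulphates + {intercept:.2f}")
```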


Using sns.kdeplot():

# seaborn >= 0.11: pass x= / y= by keyword; fill / thresh replace shade / shade_lowest
ax = sns.kdeplot(x=white_wine['sulphates'], # <== 😀 Look here!
                 y=white_wine['alcohol'],   # <== 😀 Look here!
                 cmap="YlOrBr", 
                 fill=True, thresh=0.05)

ax = sns.kdeplot(x=red_wine['sulphates'], # <== 😀 Look here!
                 y=red_wine['alcohol'],   # <== 😀 Look here!
                 cmap="Reds", 
                 fill=True, thresh=0.05)

output_126_0
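
A 2D KDE can be thought of as a smoothed cousin of a 2D histogram. As a crude numpy-only sketch of the underlying counting step (no smoothing, and invented points rather than the wine data):

```python
import numpy as np

# Invented points clustered in two opposite corners of the unit square.
x = np.array([0.1, 0.2, 0.8, 0.9])
y = np.array([0.1, 0.2, 0.8, 0.9])

# Count how many points fall into each cell of a 2 x 2 grid.
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=2, range=[[0, 1], [0, 1]]
)
print(counts)
```

A KDE replaces these hard bin counts with a smooth density surface, which is what the filled contours in the plot above represent.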


For box plots [📦] and violin plots [🎻], we can split them based on wine_type:

f, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 4))
f.suptitle('Wine Type - Quality - Acidity', fontsize=14)

#############
# Left Plot #
#############
sns.violinplot(data=wines, 
               x="quality",
               y="volatile acidity",
               inner="quart", linewidth=1.3,
               ax=ax1)

ax1.set_xlabel("Wine Quality",size=12,alpha=0.8)
ax1.set_ylabel("Wine Volatile Acidity",size=12,alpha=0.8)

##############
# Right Plot #
##############
sns.violinplot(data=wines,
               x="quality", 
               y="volatile acidity", 
               hue="wine_type", # <== 😀 Look here!
               split=True,      # <== 😀 Look here!
               palette={"red": "#FF9999",  # <== 😀 Look here!
                        "white": "white"}, # <== 😀 Look here!
               inner="quart", linewidth=1.3,
               ax=ax2)

ax2.set_xlabel("Wine Quality",size=12,alpha=0.8)
ax2.set_ylabel("Wine Volatile Acidity",size=12,alpha=0.8)
plt.legend(loc='upper right', title='Wine Type')

output_127_0


f, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 4))
f.suptitle('Wine Type - Quality - Alcohol Content', fontsize=14)

#############
# Left Plot #
#############
sns.boxplot(data=wines, 
            x="quality",
            y="alcohol", 
            hue="wine_type", # <== 😀 Look here!
            palette={"red": "#FF9999",  # <== 😀 Look here!
                     "white": "white"}, # <== 😀 Look here!
            ax=ax1)

ax1.set_xlabel("Wine Quality",size=12,alpha=0.8)
ax1.set_ylabel("Wine Alcohol %",size=12,alpha=0.8)

##############
# Right Plot #
##############
sns.boxplot(data=wines, 
            x="quality_label",
            y="alcohol", 
            hue="wine_type", # <== 😀 Look here!
            palette={"red": "#FF9999",  # <== 😀 Look here!
                     "white": "white"}, # <== 😀 Look here!
            ax=ax2)

ax2.set_xlabel("Wine Quality Class",size=12,alpha=0.8)
ax2.set_ylabel("Wine Alcohol %",size=12,alpha=0.8)
plt.legend(loc='best', title='Wine Type')

output_128_0


💥 More Dimensions?! 💥

cat-asking-for-more


4D: Visualizing Data in Four Dimensions

Factors:
  1. X-axis
  2. Y-axis
  3. Size
  4. Color
size = wines['residual sugar']*25
fill_colors = ['#FF9999' if wt=='red' else '#FFE888' for wt in list(wines['wine_type'])]
edge_colors = ['red' if wt=='red' else 'orange' for wt in list(wines['wine_type'])]

plt.scatter(wines['fixed acidity'], # <== 😀 1st DIMENSION
            wines['alcohol'],       # <== 😀 2nd DIMENSION
            s=size,                 # <== 😀 3rd DIMENSION
            color=fill_colors,      # <== 😀 4th DIMENSION             
            edgecolors=edge_colors,
            alpha=0.4)

plt.xlabel('Fixed Acidity')
plt.ylabel('Alcohol')
plt.title('Wine Alcohol Content - Fixed Acidity - Residual Sugar - Type',y=1.05)

output_134_1
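
The list comprehensions above hard-code two wine types. As an alternative sketch, a dictionary lookup with pandas Series.map does the same job and extends to more categories. The dictionary names are my own and the toy Series is invented.

```python
import pandas as pd

# Lookup tables: one color per category (names are illustrative).
fill_palette = {"red": "#FF9999", "white": "#FFE888"}
edge_palette = {"red": "red", "white": "orange"}

# Invented stand-in for wines['wine_type'].
toy_types = pd.Series(["red", "white", "white", "red"])

fill_colors = toy_types.map(fill_palette).tolist()
edge_colors = toy_types.map(edge_palette).tolist()
print(fill_colors)
print(edge_colors)
```

One difference to keep in mind: a category missing from the dictionary becomes NaN here, whereas the if/else version silently falls back to the "white" colors.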


Factors:
  1. X-axis
  2. Y-axis
  3. Color
  4. n-Columns ← 😀
g = sns.FacetGrid(wines, 
                  col="wine_type",            # 😀 TWO COLUMNS coz there're TWO "wine types"
                  col_order=['red', 'white'], # -> Specify the labels
                  hue='quality_label',        # ADD COLOR
                  hue_order=['low', 'medium', 'high'],
                  aspect=1.2, 
                  height=3.5, # 'size' was renamed to 'height' in seaborn 0.9
                  palette=sns.light_palette('navy', 4)[1:])

g.map(plt.scatter, 
      "volatile acidity", # <== x-axis
      "alcohol",          # <== y-axis
      alpha=0.9, 
      edgecolor='white', linewidth=0.5, s=100)

fig = g.fig 
fig.subplots_adjust(top=0.8, wspace=0.3)
fig.suptitle('Wine Type - Alcohol - Quality - Acidity', fontsize=14)
g.add_legend(title='Wine Quality Class')

output_136_0
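
Conceptually, FacetGrid splits the DataFrame into one subset per column value and then draws each hue level separately inside that panel. A rough pandas sketch of that partitioning, on invented rows (this illustrates the idea, not seaborn's internals):

```python
import pandas as pd

# Invented toy rows standing in for the real wines DataFrame.
toy = pd.DataFrame({
    "wine_type":     ["red", "red", "white", "white", "white"],
    "quality_label": ["low", "medium", "low", "low", "high"],
    "alcohol":       [9.4, 10.1, 9.0, 9.8, 12.5],
})

# One group per (panel, hue level) pair; record how many points each gets.
panels = {}
for (wine_type, label), subset in toy.groupby(["wine_type", "quality_label"]):
    panels[(wine_type, label)] = len(subset)
    print(f"panel={wine_type!r:8} hue={label!r:9} points={len(subset)}")
```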


Factors (same as the plot before):
  1. X-axis
  2. Y-axis
  3. Color
  4. n-Columns ← 😀
g = sns.FacetGrid(wines, 
                  col="wine_type",            # 😀 TWO COLUMNS coz there're TWO "wine types"
                  col_order=['red', 'white'], # -> Specify the labels
                  hue='quality_label',        # ADD COLOR
                  hue_order=['low', 'medium', 'high'],
                  aspect=1.2, 
                  height=3.5, # 'size' was renamed to 'height' in seaborn 0.9
                  palette=sns.light_palette('green', 4)[1:])

g.map(plt.scatter, 
      "volatile acidity",     # <== x-axis
      "total sulfur dioxide", # <== y-axis
      alpha=0.9, 
      edgecolor='white', linewidth=0.5, s=100)

fig = g.fig 
fig.subplots_adjust(top=0.8, wspace=0.3)
fig.suptitle('Wine Type - Sulfur Dioxide - Acidity - Quality', fontsize=14)
g.add_legend(title='Wine Quality Class')

output_137_0


5D: Visualizing Data in Five Dimensions

Factors:
  1. X-axis
  2. Y-axis
  3. Color
  4. n-Columns
  5. Size ← 😀
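g = sns.FacetGrid(wines.assign(marker_size=wines['total sulfur dioxide']*2), # <== 😀 Size column
                  col="wine_type",            # TWO COLUMNS coz there're TWO "wine types"
                  col_order=['red', 'white'], # -> Specify the labels
                  hue='quality_label',        # ADD COLOR
                  hue_order=['low', 'medium', 'high'],
                  aspect=1.2, 
                  height=3.5)                 # 'size' was renamed to 'height' in seaborn 0.9

# Pass the size column by name so each facet only gets its own rows;
# a full-length s=... keyword would not line up with the per-facet subsets.
g.map(plt.scatter, 
      "residual sugar", # <== x-axis
      "alcohol",        # <== y-axis
      "marker_size",    # <== 😀 Adjust the size (3rd positional arg of plt.scatter is s)
      alpha=0.5, 
      edgecolor='white', 
      linewidth=0.5)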

fig = g.fig 
fig.subplots_adjust(top=0.8, wspace=0.3)
fig.suptitle('Wine Type - Sulfur Dioxide - Residual Sugar - Alcohol - Quality', fontsize=14)
g.add_legend(title='Wine Quality Class')

output_141_0
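
One thing to watch when sizing markers inside a FacetGrid: g.map() hands keyword arguments unchanged to every panel, while each panel only receives the rows of its own facet. A size array therefore needs to match that subset, not the whole DataFrame. A small pandas sketch of the mismatch, on invented rows:

```python
import pandas as pd

# Invented toy rows standing in for the real wines DataFrame.
toy = pd.DataFrame({
    "wine_type": ["red", "white", "white", "red", "white"],
    "total sulfur dioxide": [34.0, 170.0, 132.0, 60.0, 97.0],
})

# Sizes computed on the full DataFrame: one value per row overall.
all_sizes = toy["total sulfur dioxide"] * 2

# Rows that would end up in the "white" panel.
white = toy[toy["wine_type"] == "white"]

# Index-aligned sizes for just this facet.
white_sizes = all_sizes.loc[white.index]

print(len(all_sizes), len(white), white_sizes.tolist())
```

Passing the size column by name as a positional argument to g.map() lets seaborn do this subsetting for you.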


6D: Visualizing Data in Six Dimensions

Factors:
  1. X-axis
  2. Y-axis
  3. Color
  4. n-Columns
  5. Size
  6. m-Rows ← 😀
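g = sns.FacetGrid(wines.assign(marker_size=wines['total sulfur dioxide']*2), # size column
                  row='wine_type',     # <== 1) 😀 ROW
                  col="quality",       # <== 2) 😀 COLUMN
                  hue='quality_label', # <== 3) 😀 COLOR
                  height=4)            # 'size' was renamed to 'height' in seaborn 0.9

# Pass the size column by name so each facet only gets its own rows.
g.map(plt.scatter,  
      "residual sugar", # <== 4) 😀 x-axis
      "alcohol",        # <== 5) 😀 y-axis
      "marker_size",    # <== 6) 😀 Size (3rd positional arg of plt.scatter is s)
      alpha=0.5, 
      edgecolor='k', 
      linewidth=0.5)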

fig = g.fig 
fig.set_size_inches(18, 8)
fig.subplots_adjust(top=0.85, wspace=0.3)
fig.suptitle('Wine Type - Sulfur Dioxide - Residual Sugar - Alcohol - Quality Class - Quality Rating', fontsize=14)
g.add_legend(title='Wine Quality Class')

output_145_0


🎉 Congrats on Completing the Series! 🎉

Badass Cat


~ The Complete Seaborn Series ~

Part #1 (1D)

Part #2 (2D)

Part #3 (📍)


If you enjoyed this post and want to buy me a cup of coffee...

The thing is, I'll always accept a cup of coffee. So feel free to buy me one.

Cheers! ☕️