Introduction to Matplotlib
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Its pyplot module provides a MATLAB-like interface. In this lesson, we'll explore the basics of Matplotlib and how to create various types of plots.
What is Matplotlib?
Matplotlib is a plotting library for Python that offers a wide variety of plot types. It works directly with NumPy arrays and is often used alongside libraries such as Pandas and SciPy. It is particularly useful for:
- Creating publication-quality plots
- Making interactive figures that can respond to user input
- Embedding plots in graphical user interfaces
- Generating plots for web applications
Installing Matplotlib
You can install Matplotlib using pip:
pip install matplotlib
Or using conda:
conda install matplotlib
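One quick way to confirm that the installation worked is to import the package and print its version. This is only a sanity check, and the variable name `version` is arbitrary:

```python
import matplotlib

# The version string is worth knowing: some style names and keyword
# arguments differ between Matplotlib releases
version = matplotlib.__version__
print("Matplotlib version:", version)
```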
Importing Matplotlib
The conventional way to import Matplotlib is:
import matplotlib.pyplot as plt
Basic Plotting
Let's start with some basic plotting examples.
Line Plot
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Create a figure (plt.plot below adds the axes automatically)
plt.figure(figsize=(8, 4))
# Plot data
plt.plot(x, y)
# Add labels and title
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('Sine Wave')
# Add grid
plt.grid(True)
# Show the plot
plt.show()
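The pyplot calls above act on an implicit "current" figure. The same plot can also be built with explicit Figure and Axes objects, which is the style the subplot examples later in this lesson rely on. The following is a minimal sketch; the helper name `make_sine_figure` is made up for this example:

```python
import matplotlib.pyplot as plt
import numpy as np


def make_sine_figure():
    """Build the sine-wave plot using explicit Figure and Axes objects."""
    x = np.linspace(0, 10, 100)
    fig, ax = plt.subplots(figsize=(8, 4))  # one Figure containing one Axes
    ax.plot(x, np.sin(x))
    ax.set_xlabel('x')        # Axes methods use a set_ prefix
    ax.set_ylabel('sin(x)')
    ax.set_title('Sine Wave')
    ax.grid(True)
    return fig, ax


fig, ax = make_sine_figure()
```

Calling plt.show() afterwards displays the figure exactly as before. Holding on to `fig` and `ax` tends to pay off once a script manages more than one plot at a time.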
Multiple Lines
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 2*np.pi, 100)
y1 = np.sin(x)
y2 = np.cos(x)
# Create a figure
plt.figure(figsize=(8, 4))
# Plot data
plt.plot(x, y1, label='sin(x)')
plt.plot(x, y2, label='cos(x)')
# Add labels and title
plt.xlabel('x')
plt.ylabel('y')
plt.title('Sine and Cosine Waves')
# Add legend
plt.legend()
# Add grid
plt.grid(True)
# Show the plot
plt.show()
Customizing Line Properties
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
# Create a figure
plt.figure(figsize=(8, 4))
# Plot data with custom properties
plt.plot(x, y1, color='blue', linestyle='-', linewidth=2, marker='o', markersize=4, label='sin(x)')
plt.plot(x, y2, color='red', linestyle='--', linewidth=2, marker='s', markersize=4, label='cos(x)')
# Add labels and title
plt.xlabel('x')
plt.ylabel('y')
plt.title('Customized Line Plot')
# Add legend
plt.legend()
# Show the plot
plt.show()
Different Types of Plots
Matplotlib supports many different types of plots.
Scatter Plot
import matplotlib.pyplot as plt
import numpy as np
# Create random data
np.random.seed(42)
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)
sizes = 1000 * np.random.rand(50)
# Create a scatter plot
plt.figure(figsize=(8, 6))
plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='viridis')
# Add labels and title
plt.xlabel('x')
plt.ylabel('y')
plt.title('Scatter Plot')
# Add colorbar
plt.colorbar()
# Show the plot
plt.show()
Bar Plot
import matplotlib.pyplot as plt
import numpy as np
# Create data
categories = ['A', 'B', 'C', 'D', 'E']
values = [25, 40, 30, 55, 15]
# Create a bar plot
plt.figure(figsize=(8, 6))
plt.bar(categories, values, color='skyblue')
# Add labels and title
plt.xlabel('Category')
plt.ylabel('Value')
plt.title('Bar Plot')
# Show the plot
plt.show()
Horizontal Bar Plot
import matplotlib.pyplot as plt
import numpy as np
# Create data
categories = ['A', 'B', 'C', 'D', 'E']
values = [25, 40, 30, 55, 15]
# Create a horizontal bar plot
plt.figure(figsize=(8, 6))
plt.barh(categories, values, color='salmon')
# Add labels and title
plt.xlabel('Value')
plt.ylabel('Category')
plt.title('Horizontal Bar Plot')
# Show the plot
plt.show()
Histogram
import matplotlib.pyplot as plt
import numpy as np
# Create random data
np.random.seed(42)
data = np.random.randn(1000)
# Create a histogram
plt.figure(figsize=(8, 6))
plt.hist(data, bins=30, color='skyblue', edgecolor='black', alpha=0.7)
# Add labels and title
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
# Show the plot
plt.show()
Pie Chart
import matplotlib.pyplot as plt
import numpy as np
# Create data
labels = ['A', 'B', 'C', 'D', 'E']
sizes = [15, 30, 25, 10, 20]
explode = (0, 0.1, 0, 0, 0) # explode the 2nd slice
# Create a pie chart
plt.figure(figsize=(8, 8))
plt.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%',
        shadow=True, startangle=90)
# Equal aspect ratio ensures that pie is drawn as a circle
plt.axis('equal')
# Add title
plt.title('Pie Chart')
# Show the plot
plt.show()
Box Plot
import matplotlib.pyplot as plt
import numpy as np
# Create random data
np.random.seed(42)
data = [np.random.normal(0, std, 100) for std in range(1, 6)]
# Create a box plot
plt.figure(figsize=(8, 6))
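# Label the boxes through the x-ticks: the boxplot `labels` keyword was renamed
# to `tick_labels` in Matplotlib 3.9, while setting the ticks works on any version
plt.boxplot(data)
plt.xticks(range(1, 6), ['A', 'B', 'C', 'D', 'E'])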
# Add labels and title
plt.xlabel('Category')
plt.ylabel('Value')
plt.title('Box Plot')
# Show the plot
plt.show()
Subplots
Matplotlib allows you to create multiple plots in a single figure using subplots.
Basic Subplots
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
y3 = np.tan(x)
y4 = np.exp(-x/10)
# Create a figure and subplots
fig, axs = plt.subplots(2, 2, figsize=(10, 8))
# Plot on each subplot
axs[0, 0].plot(x, y1)
axs[0, 0].set_title('Sine')
axs[0, 0].set_xlabel('x')
axs[0, 0].set_ylabel('sin(x)')
axs[0, 1].plot(x, y2)
axs[0, 1].set_title('Cosine')
axs[0, 1].set_xlabel('x')
axs[0, 1].set_ylabel('cos(x)')
axs[1, 0].plot(x, y3)
axs[1, 0].set_ylim(-10, 10)  # tan(x) blows up near odd multiples of pi/2, so clip the view
axs[1, 0].set_title('Tangent')
axs[1, 0].set_xlabel('x')
axs[1, 0].set_ylabel('tan(x)')
axs[1, 1].plot(x, y4)
axs[1, 1].set_title('Exponential')
axs[1, 1].set_xlabel('x')
axs[1, 1].set_ylabel('exp(-x/10)')
# Adjust layout
plt.tight_layout()
# Show the plot
plt.show()
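When stacked panels are plotted against the same x values, plt.subplots can link their axes through its `sharex` argument. Here is a small sketch using the same sine and cosine data; the names `ax_top` and `ax_bottom` are chosen only for this example:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Two rows, one column; sharex=True links the x-axis limits of both panels
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, sharex=True, figsize=(8, 6))

ax_top.plot(x, np.sin(x))
ax_top.set_ylabel('sin(x)')

ax_bottom.plot(x, np.cos(x))
ax_bottom.set_ylabel('cos(x)')
ax_bottom.set_xlabel('x')  # only the bottom panel needs an x label

# Changing the limits of one panel now changes the other as well
ax_bottom.set_xlim(2, 8)

fig.tight_layout()
```

Call plt.show() afterwards to display the figure.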
Different Types of Plots in Subplots
import matplotlib.pyplot as plt
import numpy as np
# Create data
np.random.seed(42)
x = np.linspace(0, 10, 100)
y = np.sin(x)
data = np.random.randn(1000)
categories = ['A', 'B', 'C', 'D']
values = [25, 40, 30, 55]
# Create a figure and subplots
fig, axs = plt.subplots(2, 2, figsize=(10, 8))
# Line plot
axs[0, 0].plot(x, y)
axs[0, 0].set_title('Line Plot')
axs[0, 0].set_xlabel('x')
axs[0, 0].set_ylabel('sin(x)')
# Histogram
axs[0, 1].hist(data, bins=30, color='skyblue', edgecolor='black')
axs[0, 1].set_title('Histogram')
axs[0, 1].set_xlabel('Value')
axs[0, 1].set_ylabel('Frequency')
# Bar plot
axs[1, 0].bar(categories, values, color='salmon')
axs[1, 0].set_title('Bar Plot')
axs[1, 0].set_xlabel('Category')
axs[1, 0].set_ylabel('Value')
# Scatter plot
scatter_x = np.random.rand(50)
scatter_y = np.random.rand(50)
colors = np.random.rand(50)
sizes = 500 * np.random.rand(50)
axs[1, 1].scatter(scatter_x, scatter_y, c=colors, s=sizes, alpha=0.5, cmap='viridis')
axs[1, 1].set_title('Scatter Plot')
axs[1, 1].set_xlabel('x')
axs[1, 1].set_ylabel('y')
# Adjust layout
plt.tight_layout()
# Show the plot
plt.show()
Customizing Plots
Matplotlib provides many options for customizing plots.
Colors and Styles
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)
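# List the available styles (the names vary between Matplotlib versions)
print(plt.style.available)
# Set style. The seaborn styles gained a 'seaborn-v0_8-' prefix in Matplotlib 3.6;
# on older versions use 'seaborn-darkgrid' instead
plt.style.use('seaborn-v0_8-darkgrid')
# Create a figure
plt.figure(figsize=(8, 4))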
# Plot with custom colors
plt.plot(x, y, color='#FF5733', linewidth=2)
# Add labels and title
plt.xlabel('x', fontsize=12)
plt.ylabel('sin(x)', fontsize=12)
plt.title('Customized Plot', fontsize=14, fontweight='bold')
# Customize ticks
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
# Add grid with custom properties
plt.grid(True, linestyle='--', alpha=0.7)
# Show the plot
plt.show()
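Note that plt.style.use changes the style for every figure created afterwards. To limit a style to a single figure, one option is the plt.style.context context manager. The sketch below assumes the built-in 'ggplot' style, which turns the grid on; the variable names are only illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
default_grid = plt.rcParams['axes.grid']  # remember the setting before styling

with plt.style.context('ggplot'):
    # The style applies only to figures created inside this block
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(x, np.sin(x), linewidth=2)
    ax.set_title('Styled only inside the with-block')
    grid_inside = plt.rcParams['axes.grid']

# Leaving the block restores the previous settings
grid_after = plt.rcParams['axes.grid']
print(grid_inside, grid_after)
```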
Adding Text and Annotations
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Create a figure
plt.figure(figsize=(8, 4))
# Plot data
plt.plot(x, y)
# Add text
plt.text(4, 0.8, 'sin(x) function', fontsize=12, color='blue')
# Add annotation with arrow
plt.annotate('Maximum', xy=(np.pi/2, 1), xytext=(np.pi/2 + 1, 0.8),
             arrowprops=dict(facecolor='black', shrink=0.05))
# Add labels and title
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('Text and Annotations')
# Show the plot
plt.show()
Customizing Legends
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)
# Create a figure
plt.figure(figsize=(8, 4))
# Plot data
plt.plot(x, y1, label='sin(x)')
plt.plot(x, y2, label='cos(x)')
# Add labels and title
plt.xlabel('x')
plt.ylabel('y')
plt.title('Customized Legend')
# Add legend with custom properties
plt.legend(loc='lower left', fontsize=12, frameon=True, facecolor='lightgray', edgecolor='black')
# Show the plot
plt.show()
Saving Plots
Matplotlib allows you to save plots in various formats.
import matplotlib.pyplot as plt
import numpy as np
# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Create a figure
plt.figure(figsize=(8, 4))
# Plot data
plt.plot(x, y)
# Add labels and title
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('Saving Plots')
# Save the plot in different formats
plt.savefig('plot.png', dpi=300, bbox_inches='tight') # PNG format
plt.savefig('plot.pdf', bbox_inches='tight') # PDF format
plt.savefig('plot.svg', bbox_inches='tight') # SVG format
# Show the plot
plt.show()
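savefig also accepts a file-like object, which can be handy when a plot should be returned as bytes (for example from a web application) instead of being written to disk. The following sketch assumes the non-interactive Agg backend; the helper name `render_sine_png` is made up for this example:

```python
import io

import matplotlib
matplotlib.use('Agg')  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt
import numpy as np


def render_sine_png(dpi=100):
    """Return a sine-wave plot as PNG bytes (hypothetical helper)."""
    x = np.linspace(0, 10, 100)
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.plot(x, np.sin(x))
    ax.set_xlabel('x')
    ax.set_ylabel('sin(x)')

    buffer = io.BytesIO()
    fig.savefig(buffer, format='png', dpi=dpi, bbox_inches='tight')
    plt.close(fig)  # release the figure so repeated calls do not pile up in memory
    return buffer.getvalue()


png_bytes = render_sine_png()
print(len(png_bytes), 'bytes')
```

The `format` argument is given explicitly because there is no file extension for Matplotlib to infer it from.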
3D Plotting
Matplotlib also supports 3D plotting through the mplot3d toolkit.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # only required on Matplotlib < 3.2 to register the '3d' projection
import numpy as np
# Create data
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))
# Create a figure and 3D axis
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
# Create a surface plot
surf = ax.plot_surface(X, Y, Z, cmap='viridis', edgecolor='none')
# Add labels and title
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.set_title('3D Surface Plot')
# Add colorbar
fig.colorbar(surf, ax=ax, shrink=0.5, aspect=5)
# Show the plot
plt.show()
Interactive Plotting
Matplotlib can be used in interactive mode, which is useful for exploratory data analysis.
import matplotlib.pyplot as plt
import numpy as np
# Turn on interactive mode
plt.ion()
# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Create a figure and axis
fig, ax = plt.subplots()
line, = ax.plot(x, y)
# Add labels and title
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')
ax.set_title('Interactive Plot')
# Update the plot
for i in range(100):
    line.set_ydata(np.sin(x + i/10))  # shift the phase a little on each pass
    fig.canvas.draw()
    fig.canvas.flush_events()
    plt.pause(0.1)  # let the GUI event loop run so the window refreshes
# Turn off interactive mode
plt.ioff()
plt.show()
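The loop above redraws the figure by hand. Matplotlib also ships an animation module, a different mechanism from interactive mode, whose FuncAnimation class calls an update function once per frame. Here is a hedged sketch of the same moving sine wave; the function name `update` is just a choice made here:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
line, = ax.plot(x, np.sin(x))
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')


def update(frame):
    """Shift the phase of the sine wave for the given frame number."""
    line.set_ydata(np.sin(x + frame / 10))
    return (line,)


# 100 frames, 100 ms apart; keep a reference so the animation is not garbage collected
anim = FuncAnimation(fig, update, frames=100, interval=100)
```

Calling plt.show() afterwards plays the animation in a window. Compared with the manual loop, the timing is handled by the GUI event loop rather than by plt.pause.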
Integration with Pandas
Matplotlib integrates well with Pandas, making it easy to plot data from DataFrames.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Create a DataFrame
np.random.seed(42)
df = pd.DataFrame({
    'A': np.random.randn(100).cumsum(),
    'B': np.random.randn(100).cumsum(),
    'C': np.random.randn(100).cumsum(),
    'D': np.random.randn(100).cumsum()
})
# Plot using Pandas
df.plot(figsize=(10, 6))
plt.title('Line Plot from DataFrame')
plt.xlabel('Index')
plt.ylabel('Value')
plt.grid(True)
plt.show()
# Bar plot
df.iloc[::10].plot.bar(figsize=(10, 6))
plt.title('Bar Plot from DataFrame')
plt.xlabel('Index')
plt.ylabel('Value')
plt.show()
# Histogram
df.plot.hist(bins=20, alpha=0.5, figsize=(10, 6))
plt.title('Histogram from DataFrame')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
# Scatter plot
df.plot.scatter(x='A', y='B', c='C', cmap='viridis', s=df['D'].abs()*100, alpha=0.5, figsize=(10, 6))
plt.title('Scatter Plot from DataFrame')
plt.show()
Best Practices for Data Visualization
Here are some best practices to follow when creating visualizations:
- Keep it simple: Avoid cluttering your plots with unnecessary elements
- Choose appropriate plot types for your data
- Use clear and descriptive titles, labels, and legends
- Use color effectively, but be mindful of color blindness
- Ensure text is readable (appropriate font sizes and styles)
- Include units of measurement when applicable
- Use consistent styling across related plots
- Consider your audience when designing visualizations
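As a rough illustration of several of these points at once, the sketch below uses the built-in 'tableau-colorblind10' style, varies line style as well as color, and puts units in the axis labels. The sensor data is made up purely for the example:

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up measurements, for illustration only
time_s = np.linspace(0, 10, 50)
voltage_a = np.sin(time_s)
voltage_b = 0.5 * np.cos(time_s)

with plt.style.context('tableau-colorblind10'):
    fig, ax = plt.subplots(figsize=(8, 4))
    # Different line styles keep the series distinguishable even without color
    ax.plot(time_s, voltage_a, linestyle='-', label='Sensor A')
    ax.plot(time_s, voltage_b, linestyle='--', label='Sensor B')
    # Descriptive labels that include units
    ax.set_xlabel('Time (s)', fontsize=12)
    ax.set_ylabel('Voltage (V)', fontsize=12)
    ax.set_title('Sensor Output Over Time', fontsize=14)
    ax.legend()
```

Call plt.show() afterwards to display the figure.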