The foundational library for creating static, animated, and interactive visualizations in Python. Highly customizable and the industry standard for publication-quality figures. Use for 2D plotting, scientific data visualization, heatmaps, contours, vector fields, multi-panel figures, LaTeX-formatted plots, custom visualization tools, and plotting from NumPy arrays or Pandas DataFrames.
Install
Install via the SkillsCat registry:
npx skillscat add tondevrel/scientific-agent-skills/matplotlib
Matplotlib - Data Visualization
The most widely used library for 2D (and basic 3D) plotting. It provides full control over every element of a figure, from line styles to axis spines.
When to Use
- Creating publication-quality 2D plots (Line, Scatter, Bar, Hist)
- Visualizing scientific data (Heatmaps, Contours, Vector fields)
- Generating complex multi-panel figures
- Fine-tuning plots for papers/reports (LaTeX support)
- Building custom visualization tools and dashboards
- Plotting data directly from NumPy arrays or Pandas DataFrames
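As a minimal sketch of the last point, the `data=` keyword of `ax.plot` lets string arguments refer to DataFrame columns. The column names `time_s` and `signal` and the sample values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
t = np.linspace(0, 10, 50)
df = pd.DataFrame({"time_s": t, "signal": np.sin(t)})

fig, ax = plt.subplots(figsize=(6, 4))
# With data=df, the two string arguments are looked up as columns of df.
(line,) = ax.plot("time_s", "signal", data=df, label="signal")
ax.set_xlabel("Time (s)")
ax.set_ylabel("Amplitude")
ax.legend()
```

Passing `df["time_s"]` and `df["signal"]` directly as positional arguments should give the same line.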
Reference Documentation
Official docs: https://matplotlib.org/stable/index.html
Gallery: https://matplotlib.org/stable/gallery/index.html (Essential for finding examples)
Search patterns: plt.subplots, ax.set_title, ax.legend, plt.savefig, matplotlib.colors
Core Principles
Two Interfaces: Choose Wisely
| Interface | Method | Use Case |
|---|---|---|
| Object-Oriented (OO) | fig, ax = plt.subplots() | Recommended. Best for complex, reproducible plots. |
| Pyplot (State-based) | plt.plot(x, y) | Quick interactive checks. Avoid for scripts/modules. |
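The difference between the two interfaces can be sketched with a two-panel figure. This is an illustrative example, not part of the original table; the names `ax_left` and `ax_right` are arbitrary. The pyplot call draws on the implicit "current" axes, while the OO call names its target.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# Pyplot call: draws on the "current" axes. After plt.subplots(1, 2)
# that is the most recently created axes, i.e. the right-hand panel.
plt.plot(x, np.sin(x))

# OO call: the target axes is explicit, so there is nothing to guess.
ax_left.plot(x, np.cos(x))

current_is_right = plt.gca() is ax_right
lines_left = len(ax_left.lines)
lines_right = len(ax_right.lines)
```

In a longer script the current axes can change whenever a new figure or axes is created, which is why the OO form is easier to reason about.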
Use Matplotlib For
- High-level control over figure layout.
- Precise styling for publication.
- Embedding plots in GUI applications.
Do NOT Use For
- Interactive web dashboards (use Plotly or Bokeh).
- Rapid statistical exploration (use Seaborn — it's built on Matplotlib but simpler for stats).
- Very large datasets (>1M points) in real-time (use Datashader or VisPy).
Quick Reference
Installation
pip install matplotlib
Standard Imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import gridspec
Basic Pattern - The OO Interface (The "Proper" Way)
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
y = np.sin(x)
# 1. Create Figure and Axis objects
fig, ax = plt.subplots(figsize=(8, 5))
# 2. Plot data
ax.plot(x, y, label='Sine Wave', color='tab:blue', linewidth=2)
# 3. Customize
ax.set_xlabel('Time (s)')
ax.set_ylabel('Amplitude')
ax.set_title('Oscillation Example')
ax.legend()
ax.grid(True, linestyle='--')
# 4. Show or Save
plt.show()
# fig.savefig('plot.pdf', dpi=300, bbox_inches='tight')
Critical Rules
✅ DO
- Use the OO interface (ax.method()) - It prevents errors in multi-plot scripts.
- Use bbox_inches='tight' - When saving, to ensure labels aren't cut off.
- Set dpi - Use 300+ for print, 72-100 for web.
- Close figures - Use plt.close('all') in loops to avoid memory leaks.
- Label everything - Every axis must have a label and units.
- Vector formats - Save as .pdf or .svg for academic papers (lossless scaling).
- Colorblind-friendly - Use tab10 or viridis colormaps.
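Several of these rules can be combined in one save loop. The following is a minimal sketch, assuming a temporary output directory and made-up file names (`wave_1.png` and so on); it is not a prescribed workflow.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)
out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = []

for k in range(1, 4):
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, np.sin(k * x), label=f"sin({k}x)")
    ax.set_xlabel("Time (s)")           # label and units on every axis
    ax.set_ylabel("Amplitude (a.u.)")
    ax.legend()
    path = os.path.join(out_dir, f"wave_{k}.png")
    fig.savefig(path, dpi=300, bbox_inches="tight")  # print-quality raster
    plt.close(fig)                      # release the figure before the next pass
    saved.append(path)

open_figures = plt.get_fignums()  # figure numbers still registered with pyplot
```

Changing the extension to `.pdf` or `.svg` would produce vector output instead; `dpi` then only affects any rasterized elements.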
❌ DON'T
- Mix plt. and ax. - It leads to "hidden state" bugs.
- Use plt.show() in loops - It blocks execution; use fig.savefig() instead.
- Manual legend placement - Let ax.legend(loc='best') try first.
- Hardcode font sizes - Use plt.rcParams.update({'font.size': 12}) for consistency.
- Use "Rainbow" (Jet) - It creates false gradients; use perceptually uniform maps like magma or inferno.
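The font-size and colormap advice can be sketched with `plt.rc_context`, a scoped variant of `plt.rcParams.update` that restores the previous settings on exit. The random data and the title are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((20, 20))  # placeholder data

size_before = plt.rcParams["font.size"]
with plt.rc_context({"font.size": 12, "image.cmap": "magma"}):
    fig, ax = plt.subplots()
    im = ax.imshow(data)  # no cmap argument: the rcParams default is used
    ax.set_title("Placeholder field")
    fig.colorbar(im, ax=ax, label="Value (a.u.)")
    cmap_name = im.get_cmap().name
    size_inside = plt.rcParams["font.size"]
size_after = plt.rcParams["font.size"]  # restored when the block exits
```

For a whole script, a single `plt.rcParams.update({...})` at the top has the same effect without the scoping.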
Anti-Patterns (NEVER)
# ❌ BAD: Mixing interfaces (State-based + OO)
plt.figure()
ax = plt.gca()
plt.plot(x, y) # Confusing state
ax.set_title('Test')
# ✅ GOOD: Consistent OO interface
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('Test')
# ❌ BAD: Overlapping subplots
fig, axs = plt.subplots(2, 2)
# Plots look squashed and titles overlap
# ✅ GOOD: Use constrained_layout or tight_layout
fig, axs = plt.subplots(2, 2, constrained_layout=True)
Anatomy of a Plot
Labels, Ticks, and Styles
fig, ax = plt.subplots()
ax.plot(x, y, 'o-', color='red', markersize=4, alpha=0.7)
# Explicitly setting limits
ax.set_xlim(0, 10)
ax.set_ylim(-1.5, 1.5)
# Controlling Ticks
ax.set_xticks([0, 2.5, 5, 7.5, 10])
ax.set_xticklabels(['Start', '1/4', 'Mid', '3/4', 'End'])
# Spines (Box around the plot)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# Adding text and arrows
ax.annotate('Local Max', xy=(1.5, 1), xytext=(3, 1.2),
            arrowprops=dict(facecolor='black', shrink=0.05))
Advanced Layouts
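For uneven multi-panel layouts, one option in recent Matplotlib releases is `plt.subplot_mosaic`, which names each panel instead of indexing a grid. The sketch below is an addition for illustration; the panel names `top`, `left`, and `right` and the random data are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)

# Each string names a panel; repeating a name makes that panel span cells.
fig, axd = plt.subplot_mosaic(
    [["top", "top"],
     ["left", "right"]],
    figsize=(8, 6),
    constrained_layout=True,
)
axd["top"].plot(x, np.sin(x))
axd["left"].hist(rng.normal(size=500), bins=20)
axd["right"].scatter(rng.random(50), rng.random(50), s=10)
for name, ax in axd.items():
    ax.set_title(name)
```

The result is a dictionary of axes keyed by name, so later code can refer to `axd["top"]` rather than a grid position.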
Subplots and GridSpec
# Simple 2x2 grid
fig, axs = plt.subplots(2, 2, figsize=(10, 10))
axs[0, 0].plot(x, y) # Top left
axs[1, 1].scatter(x, y) # Bottom right
# Complex grid (Uneven sizes)
fig = plt.figure(figsize=(10, 6))
gs = gridspec.GridSpec(2, 2, width_ratios=[2, 1], height_ratios=[1, 2])
ax1 = fig.add_subplot(gs[0, 0]) # Top left (large width)
ax2 = fig.add_subplot(gs[0, 1]) # Top right
ax3 = fig.add_subplot(gs[1, :]) # Bottom spanning all columns
Scientific Plot Types
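Contours and vector fields are listed among the use cases, so here is a hedged sketch of both on one axes. The Gaussian test function, the grid size, and the arrow spacing `step` are illustrative choices, not values from the original text.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2, 2, 41)
y = np.linspace(-2, 2, 41)
X, Y = np.meshgrid(x, y)          # Z[i, j] belongs to (x[j], y[i])
Z = np.exp(-(X**2 + Y**2))        # illustrative scalar field

# np.gradient returns one array per axis: axis 0 is y, axis 1 is x.
dZdy, dZdx = np.gradient(Z, y, x)

fig, ax = plt.subplots(figsize=(6, 5))
cs = ax.contourf(X, Y, Z, levels=10, cmap="viridis")
fig.colorbar(cs, ax=ax, label="Z (a.u.)")

step = 4  # draw every 4th arrow so the field stays readable
ax.quiver(X[::step, ::step], Y[::step, ::step],
          dZdx[::step, ::step], dZdy[::step, ::step])
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_aspect("equal")
```

The arrows point up the gradient, toward the peak at the origin.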
Heatmaps and Colorbars
data = np.random.rand(10, 10)
fig, ax = plt.subplots()
im = ax.imshow(data, cmap='viridis', interpolation='nearest')
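# Option 1: add a colorbar next to the axes
cbar = fig.colorbar(im, ax=ax, label='Intensity [a.u.]')
# Option 2 (use instead of Option 1, otherwise two colorbars are drawn):
# a colorbar whose height matches the image
# from mpl_toolkits.axes_grid1 import make_axes_locatable
# divider = make_axes_locatable(ax)
# cax = divider.append_axes("right", size="5%", pad=0.05)
# fig.colorbar(im, cax=cax, label='Intensity [a.u.]')
Histograms and Error Bars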
# Histogram
data = np.random.normal(0, 1, 1000)
fig, ax = plt.subplots()
ax.hist(data, bins=30, density=True, alpha=0.6, color='g', edgecolor='black')
# Error bars (drawn on a separate figure)
x = np.arange(10)
y = x**2
yerr = np.sqrt(y)
fig, ax = plt.subplots()
ax.errorbar(x, y, yerr=yerr, fmt='o', capsize=5, label='Data with noise')
3D Plotting
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)
surf = ax.plot_surface(X, Y, Z, cmap='coolwarm', linewidth=0, antialiased=False)
fig.colorbar(surf, shrink=0.5, aspect=5)
Formatting for Publication
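When a LaTeX installation is not available, Matplotlib's built-in mathtext can render simple math labels. The sketch below also sizes the figure to a column width; the 3.5 inch width and the 8 pt font are assumptions for illustration, so check the target journal's guidelines.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

COLUMN_WIDTH_IN = 3.5            # hypothetical single-column width in inches
GOLDEN = (1 + 5 ** 0.5) / 2      # a common width-to-height ratio

rc = {"font.size": 8, "axes.labelsize": 8, "text.usetex": False}
with plt.rc_context(rc):
    fig, ax = plt.subplots(
        figsize=(COLUMN_WIDTH_IN, COLUMN_WIDTH_IN / GOLDEN),
        constrained_layout=True,
    )
    t = np.linspace(0, 10, 200)
    ax.plot(t, np.sin(2 * t))
    ax.set_xlabel(r"$t$ (s)")                  # rendered by mathtext
    ax.set_ylabel(r"$\beta \sin(\omega t)$")
    fig.canvas.draw()                          # forces the labels to be laid out

width_in, height_in = fig.get_size_inches()
```

Creating the figure at its final printed size means the font sizes in the file match the sizes on the page.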
Using LaTeX and RcParams
# Global styling
plt.style.use('seaborn-v0_8-paper') # or 'ggplot', 'bmh'
# LaTeX for labels (text.usetex requires a working LaTeX installation)
plt.rcParams.update({
"text.usetex": True,
"font.family": "serif",
"font.serif": ["Computer Modern Roman"],
"axes.labelsize": 14,
})
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel(r'$\alpha_{i} + \beta \sin(\omega t)$') # LaTeX string
Practical Workflows
1. Multi-dataset Comparison Workflow
def plot_comparison(datasets, labels):
    fig, ax = plt.subplots(figsize=(10, 6))
    colors = plt.cm.viridis(np.linspace(0, 1, len(datasets)))
    for data, label, color in zip(datasets, labels, colors):
        ax.plot(data['x'], data['y'], label=label, color=color, lw=1.5)
        ax.fill_between(data['x'], data['y']-data['std'], data['y']+data['std'],
                        alpha=0.2, color=color)
    ax.set_title('Experiment Results Comparison')
    ax.legend(frameon=False)
    return fig, ax
2. Monitoring Real-time Data (Interactive)
# Use this in a Jupyter environment or script
plt.ion() # Interactive mode on
fig, ax = plt.subplots()
line, = ax.plot([], [])
for i in range(100):
    new_data = np.random.rand(10)
    line.set_data(np.arange(len(new_data)), new_data)
    ax.relim()
    ax.autoscale_view()
    fig.canvas.draw()
    fig.canvas.flush_events()
    plt.pause(0.1)
3. Creating a Cluster Map / Correlation Matrix
import pandas as pd
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
corr = df.corr()
fig, ax = plt.subplots()
im = ax.imshow(corr, cmap='RdBu_r', vmin=-1, vmax=1)
ax.set_xticks(np.arange(len(corr.columns)), labels=corr.columns)
ax.set_yticks(np.arange(len(corr.index)), labels=corr.index)
# Loop over data dimensions and create text annotations.
for i in range(len(corr.index)):
    for j in range(len(corr.columns)):
        text = ax.text(j, i, f"{corr.iloc[i, j]:.2f}",
                       ha="center", va="center", color="black")
Performance Optimization
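One further performance option for dense plots saved to vector formats is the `rasterized=True` artist keyword, which embeds the heavy artist as an image while axes and text stay vector. This is a sketch under assumed values: the 20,000 random points and `dpi=200` are arbitrary.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20_000)
y = rng.normal(size=20_000)

fig, ax = plt.subplots(figsize=(6, 4))
# Only the scatter is rasterized; labels and ticks remain vector graphics.
points = ax.scatter(x, y, s=1, rasterized=True)
ax.set_xlabel("x (a.u.)")
ax.set_ylabel("y (a.u.)")

buf = io.BytesIO()                        # in-memory file instead of a path
fig.savefig(buf, format="pdf", dpi=200)   # dpi sets the raster resolution
pdf_bytes = buf.getvalue()
plt.close(fig)
```

Whether this actually shrinks the file depends on the data and the chosen dpi, so it is worth comparing both variants for a given figure.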
Plotting Large Data
# 1. Use the 'Agg' backend for non-interactive rendering (select it before creating figures)
import matplotlib
matplotlib.use('Agg')
# 2. ax.scatter() builds a PathCollection with per-point size and color; slow for 1M points
ax.scatter(x, y, s=1)
# For uniformly styled points, ax.plot() with a marker and no line is faster
ax.plot(x, y, ',')
# 3. Draw lines without markers for speed
ax.plot(x, y, marker='')
# 4. Decimate data before plotting
ax.plot(x[::10], y[::10]) # Plot every 10th point
Common Pitfalls and Solutions
Date/Time Axis issues
# ❌ Problem: Dates look like a black blob
# ✅ Solution: Set an explicit date locator and formatter
# (mdates.AutoDateLocator chooses the tick spacing automatically instead)
import matplotlib.dates as mdates
fig, ax = plt.subplots()
ax.plot(dates, values) # dates: sequence of datetime objects; values: matching numbers
ax.xaxis.set_major_locator(mdates.MonthLocator()) # One tick per month
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
fig.autofmt_xdate() # Rotates labels
Multiple Legends on one plot
# ❌ Problem: Calling ax.legend() twice replaces the first one
# ✅ Solution: Manually add the first artist back
fig, ax = plt.subplots()
line1, = ax.plot([1, 2], [1, 2], label='Line 1')
line2, = ax.plot([1, 2], [2, 1], label='Line 2')
first_legend = ax.legend(handles=[line1], loc='upper left')
ax.add_artist(first_legend) # Add back
ax.legend(handles=[line2], loc='lower right')
Image Saving Quality (Clipping)
# ❌ Problem: Legend or Axis title is cut off in the .png file
# ✅ Solution:
fig.savefig('output.png', bbox_inches='tight')
Best Practices
- Always use the OO interface (fig, ax = plt.subplots()) for scripts and modules
- Save figures with appropriate formats - Use PDF/SVG for publications, PNG for web
- Set DPI appropriately - 300+ for print, 72-100 for screen
- Use bbox_inches='tight' when saving to prevent clipping
- Close figures in loops to prevent memory leaks
- Use colorblind-friendly colormaps - Avoid 'jet', prefer 'viridis', 'plasma', 'inferno'
- Label all axes with descriptive names and units
- Use constrained_layout=True for subplots to prevent overlap
- Configure global styles with plt.rcParams for consistency
- Test plots at target resolution before finalizing
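The final point can be checked in code: the saved pixel size is the figure size in inches multiplied by the dpi. This is a minimal sketch, assuming a 6 by 4 inch figure at 300 dpi and an in-memory buffer in place of a real file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 1], [0, 1])
ax.set_xlabel("x (a.u.)")
ax.set_ylabel("y (a.u.)")

dpi = 300
width_in, height_in = fig.get_size_inches()
expected_px = (int(round(width_in * dpi)), int(round(height_in * dpi)))

# Save without bbox_inches='tight', which would crop and change the size.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=dpi)
buf.seek(0)
img = plt.imread(buf)          # array of shape (height, width, channels)
actual_px = (img.shape[1], img.shape[0])
plt.close(fig)
```

Opening the saved file at 100% zoom in an image viewer is a quick way to confirm that labels are still readable at that size.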