Jupyter Notebook Development Skill

A Claude Code skill for building interactive data science and analysis workflows with Jupyter Notebooks — covering notebook architecture, kernel management, visualization, reproducibility, and collaboration patterns.

When to Use This Skill

Choose this skill when:

Creating data analysis or exploration notebooks
Building reproducible research workflows
Setting up Jupyter environments for teams
Integrating notebooks into CI/CD and reporting pipelines
Optimizing notebook performance for large datasets
Converting notebooks to scripts, reports, or presentations

Consider alternatives when:

You need production data pipelines (use Airflow, Dagster)
You need a web application for data display (use Streamlit, Dash)
You need database administration (use a DBA tool)

Quick Start


# Install JupyterLab and the libraries used in the examples below
pip install jupyterlab ipykernel pandas matplotlib seaborn plotly

# Register the active virtual environment as a Jupyter kernel
python -m ipykernel install --user --name=myproject

# Launch JupyterLab
jupyter lab


# Standard notebook imports and configuration
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Configure display options
pd.set_option('display.max_columns', 50)
pd.set_option('display.max_rows', 100)
plt.rcParams['figure.figsize'] = (12, 6)
sns.set_theme(style='whitegrid')

# Load and preview data
df = pd.read_csv('data/sales.csv', parse_dates=['date'])
df.head()

Core Concepts

Notebook Architecture

Section	Purpose	Cell Type
Header	Title, description, date	Markdown
Setup	Imports, configuration, data loading	Code
Exploration	Summary statistics, distributions	Code + Markdown
Analysis	Core analysis logic	Code + Markdown
Visualization	Charts and plots	Code
Conclusions	Key findings and next steps	Markdown
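
The section layout in the table can also be scaffolded programmatically. The sketch below builds a skeleton notebook as a plain dictionary in the version 4 notebook JSON layout, using only the standard library. The helper names (`markdown_cell`, `code_cell`, `build_skeleton`, `write_skeleton`) and the placeholder cell contents are illustrative assumptions; in practice the `nbformat` package is the usual way to create and validate notebook files.

```python
import json

# (section name, needs a code cell) pairs following the architecture table.
SECTIONS = [
    ("Setup", True),
    ("Exploration", True),
    ("Analysis", True),
    ("Visualization", True),
    ("Conclusions", False),
]


def markdown_cell(text):
    """Return a minimal markdown cell."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(text):
    """Return a minimal, not-yet-executed code cell."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": text,
    }


def build_skeleton(title):
    """Build a notebook dict with a header cell plus one heading per section."""
    cells = [markdown_cell(f"# {title}")]
    for name, needs_code in SECTIONS:
        cells.append(markdown_cell(f"## {name}"))
        if needs_code:
            cells.append(code_cell(f"# TODO: {name.lower()} code"))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_skeleton(path, title):
    """Write the skeleton to an .ipynb file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_skeleton(title), fh, indent=1)
```

Calling `write_skeleton("analysis.ipynb", "Sales Analysis")` would produce a file that opens in JupyterLab with one heading per section, ready to fill in.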

Visualization Patterns


# Interactive plots with Plotly
import plotly.express as px

fig = px.scatter(df, x='revenue', y='growth',
                 color='category', size='users',
                 hover_data=['product_name'],
                 title='Revenue vs Growth by Category')
fig.show()

# Multi-panel analysis
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

df['revenue'].hist(ax=axes[0, 0], bins=30)
axes[0, 0].set_title('Revenue Distribution')

df.groupby('category')['revenue'].mean().plot.bar(ax=axes[0, 1])
axes[0, 1].set_title('Average Revenue by Category')

df.plot.scatter(x='users', y='revenue', ax=axes[1, 0], alpha=0.5)
axes[1, 0].set_title('Users vs Revenue')

# Derive the month from the parsed 'date' column
df.groupby(df['date'].dt.to_period('M'))['revenue'].sum().plot(ax=axes[1, 1])
axes[1, 1].set_title('Monthly Revenue Trend')

plt.tight_layout()
plt.show()

Reproducibility


# Pin random seeds for reproducibility
import random
random.seed(42)
np.random.seed(42)

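# Record the environment of the running kernel, not whichever pip is first on PATH
import sys
!{sys.executable} -m pip freeze > requirements.txt

# Use watermark for session info (install first with: pip install watermark)
%load_ext watermark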
%watermark -v -p pandas,numpy,matplotlib,scikit-learn
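
Where the watermark extension is not available, a similar record can be produced with the standard library. This is a minimal sketch, not a replacement for `%watermark`; the function name `snapshot_versions` is hypothetical, and the package names passed to it are whatever the notebook actually depends on.

```python
import platform
from importlib import metadata


def snapshot_versions(packages):
    """Return {name: version} for the interpreter and each named distribution.

    Distributions that are not installed map to None instead of raising.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Example: print one line per package at the top of a notebook.
for name, version in snapshot_versions(["pandas", "numpy", "matplotlib"]).items():
    print(f"{name}: {version or 'not installed'}")
```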

Configuration

Parameter	Type	Default	Description
`kernel`	string	`"python3"`	Jupyter kernel: python3, R, julia
`lab_version`	string	`"4"`	JupyterLab major version
`extensions`	array	`[]`	JupyterLab extensions to install
`autosave_interval`	number	`120`	Autosave interval in seconds
`max_output_lines`	number	`1000`	Maximum output lines per cell
`inline_plots`	boolean	`true`	Display plots inline
`export_format`	string	`"html"`	Default export: html, pdf, slides

Best Practices

Structure notebooks with clear section headers — use markdown cells with ## headers to divide notebooks into Setup, Exploration, Analysis, and Conclusions; this makes notebooks navigable and reviewable.
Keep cells small and focused — each code cell should do one thing; splitting analysis into small cells makes debugging easier and allows selective re-execution.
Run cells top-to-bottom before sharing — use "Restart Kernel and Run All" to verify the notebook executes in order; notebooks with out-of-order execution state are unreproducible.
Use %matplotlib inline for static reports, plotly for interactive exploration — static plots are better for exported HTML/PDF reports; interactive plots are better for live exploration sessions.
Extract reusable functions into .py modules — when notebook code becomes production logic, move it to importable Python modules; notebooks should orchestrate, not implement complex algorithms.

Common Issues

Notebook runs out of memory on large datasets: Load data in chunks with pd.read_csv(path, chunksize=10000), pass dtype specifications (for example float32 or category) to reduce memory, or use Dask for out-of-core computation.
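
The chunked approach can be sketched as follows. The example aggregates revenue per category without holding the full file in memory; the column names `category` and `revenue` follow the earlier examples, the function name `total_revenue_by_category` is an illustrative assumption, and the in-memory CSV stands in for a large file on disk.

```python
import io

import pandas as pd


def total_revenue_by_category(csv_source, chunksize=10_000):
    """Sum revenue per category while reading the CSV one chunk at a time."""
    reader = pd.read_csv(
        csv_source,
        usecols=["category", "revenue"],  # skip columns the aggregate does not need
        dtype={"revenue": "float32"},     # smaller dtype to reduce memory per chunk
        chunksize=chunksize,
    )
    totals = None
    for chunk in reader:
        partial = chunk.groupby("category")["revenue"].sum()
        # Align on category; categories missing from one side count as 0.
        totals = partial if totals is None else totals.add(partial, fill_value=0)
    return totals


# Tiny stand-in dataset; chunksize=2 forces several chunks.
sample = io.StringIO("category,revenue\na,10\nb,20\na,30\nc,5\n")
totals = total_revenue_by_category(sample, chunksize=2)
print(totals)
```

Only one chunk and the running totals are in memory at any time, so peak usage depends on `chunksize` rather than on file size.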

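Cells execute out of order, causing stale state: Variables left over from earlier or deleted cells keep their values in the kernel, so results can depend on the order in which cells were run. Use "Restart Kernel and Run All" regularly, and avoid cells that mutate shared objects in place, since re-running such a cell changes the result.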
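
One way to detect this before sharing is to inspect the execution counts stored in the .ipynb file. The sketch below is a rough heuristic, assuming the version 4 notebook JSON layout; the function names are hypothetical. A notebook that was run once from a fresh kernel has code cells numbered 1, 2, 3, ... in document order, and empty code cells are skipped because "Run All" does not execute them.

```python
import json


def load_notebook(path):
    """Read an .ipynb file into a dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def execution_counts(notebook):
    """Return execution counts of non-empty code cells, in document order."""
    counts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        # "source" may be a single string or a list of lines.
        if not "".join(cell.get("source", "")).strip():
            continue
        counts.append(cell.get("execution_count"))
    return counts


def ran_top_to_bottom(notebook):
    """True if the code cells were executed exactly once, in order."""
    counts = execution_counts(notebook)
    return counts == list(range(1, len(counts) + 1))
```

A check such as `ran_top_to_bottom(load_notebook("analysis.ipynb"))` could run in a pre-commit hook or CI job; it flags out-of-order runs but does not prove the outputs are current.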

Notebooks are hard to code review: Raw .ipynb files are JSON with embedded outputs, so diffs are noisy and difficult to read. Use nbstripout to remove outputs before committing, and use jupytext to maintain a paired .py file for readable diffs.
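
To make the idea behind output stripping concrete, the sketch below clears outputs and execution counts from a notebook dictionary. It is a simplified illustration under the version 4 notebook JSON layout, not nbstripout's actual implementation, and the function name `strip_outputs` is an assumption; for real repositories, prefer nbstripout itself.

```python
import copy


def strip_outputs(notebook):
    """Return a copy of the notebook with code-cell outputs and counts cleared."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Example with a tiny in-memory notebook.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Demo"},
        {
            "cell_type": "code",
            "metadata": {},
            "source": "1 + 1",
            "execution_count": 7,
            "outputs": [{"output_type": "execute_result", "data": {"text/plain": "2"}}],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}
cleaned = strip_outputs(notebook)
```

After stripping, a diff between two commits shows only changes to cell sources and metadata, which is what a reviewer usually needs to see.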
