Comprehensive Jupyter Module
A skill for creating and scaffolding Jupyter notebooks when a user asks. Includes structured workflows, validation checks, and reusable patterns for development.
Jupyter Notebook Development Skill
A Claude Code skill for building interactive data science and analysis workflows with Jupyter Notebooks — covering notebook architecture, kernel management, visualization, reproducibility, and collaboration patterns.
When to Use This Skill
Choose this skill when:
- Creating data analysis or exploration notebooks
- Building reproducible research workflows
- Setting up Jupyter environments for teams
- Integrating notebooks into CI/CD and reporting pipelines
- Optimizing notebook performance for large datasets
- Converting notebooks to scripts, reports, or presentations
Consider alternatives when:
- You need production data pipelines (use Airflow, Dagster)
- You need a web application for data display (use Streamlit, Dash)
- You need database administration (use a DBA tool)
Quick Start
Install JupyterLab and register a kernel from a shell:

    # Install JupyterLab and the libraries used in the examples
    pip install jupyterlab ipykernel pandas matplotlib seaborn

    # Launch JupyterLab
    jupyter lab

    # Create a kernel for your virtual environment
    python -m ipykernel install --user --name=myproject

Then, in the first notebook cell:

    # Standard notebook imports and configuration
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Configure display options
    pd.set_option('display.max_columns', 50)
    pd.set_option('display.max_rows', 100)
    plt.rcParams['figure.figsize'] = (12, 6)
    sns.set_theme(style='whitegrid')

    # Load and preview data
    df = pd.read_csv('data/sales.csv', parse_dates=['date'])
    df.head()
Core Concepts
Notebook Architecture
| Section | Purpose | Cell Type |
|---|---|---|
| Header | Title, description, date | Markdown |
| Setup | Imports, configuration, data loading | Code |
| Exploration | Summary statistics, distributions | Code + Markdown |
| Analysis | Core analysis logic | Code + Markdown |
| Visualization | Charts and plots | Code |
| Conclusions | Key findings and next steps | Markdown |
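The section layout in the table above can be turned into a starter notebook. The following is a minimal sketch using only the standard library, assuming the nbformat 4.4 JSON layout for `.ipynb` files. The function names `make_cell` and `build_skeleton` and the placeholder cell text are illustrative, not part of this skill.

```python
import json

# (section name, cell type, placeholder source), following the table above.
# The placeholder text is illustrative only.
SECTIONS = [
    ("Header", "markdown", "# Title\n\nDescription and date."),
    ("Setup", "code", "# Imports, configuration, data loading"),
    ("Exploration", "code", "# Summary statistics, distributions"),
    ("Analysis", "code", "# Core analysis logic"),
    ("Visualization", "code", "# Charts and plots"),
    ("Conclusions", "markdown", "## Conclusions\n\nKey findings and next steps."),
]


def make_cell(cell_type, source):
    """Return one nbformat-4 cell as a plain dict."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally carry an execution counter and outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def build_skeleton():
    """Build a notebook dict with a '## <section>' heading before each code section."""
    cells = []
    for name, cell_type, source in SECTIONS:
        if cell_type == "code":
            cells.append(make_cell("markdown", f"## {name}"))
        cells.append(make_cell(cell_type, source))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = build_skeleton()
notebook_json = json.dumps(notebook, indent=1)
```

Writing `notebook_json` to a file with an `.ipynb` extension should give a notebook that opens in JupyterLab with the six sections already in place.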
Visualization Patterns

    # Assumes df has the columns used below:
    # revenue, growth, category, users, product_name, month

    # Interactive plots with Plotly
    import plotly.express as px

    fig = px.scatter(df, x='revenue', y='growth',
                     color='category', size='users',
                     hover_data=['product_name'],
                     title='Revenue vs Growth by Category')
    fig.show()

    # Multi-panel analysis
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))

    df['revenue'].hist(ax=axes[0, 0], bins=30)
    axes[0, 0].set_title('Revenue Distribution')

    df.groupby('category')['revenue'].mean().plot.bar(ax=axes[0, 1])
    axes[0, 1].set_title('Average Revenue by Category')

    df.plot.scatter(x='users', y='revenue', ax=axes[1, 0], alpha=0.5)
    axes[1, 0].set_title('Users vs Revenue')

    df.groupby('month')['revenue'].sum().plot(ax=axes[1, 1])
    axes[1, 1].set_title('Monthly Revenue Trend')

    plt.tight_layout()
    plt.show()
Reproducibility

    # Pin random seeds for reproducibility
    import random
    import numpy as np

    random.seed(42)
    np.random.seed(42)

    # Record the environment
    !pip freeze > requirements.txt

    # Use watermark for session info (requires: pip install watermark)
    %load_ext watermark
    %watermark -v -p pandas,numpy,matplotlib,scikit-learn
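Where the `watermark` extension is not installed, similar session information can be gathered with the standard library. This is a rough sketch and not a replacement for `watermark`; the helper name `session_info` is hypothetical, and the package list is whatever the notebook author chooses to record.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def session_info(packages):
    """Return the Python version, platform, and installed versions of `packages`."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = version(name)
        except PackageNotFoundError:
            # Record the gap explicitly instead of failing the whole cell.
            info[name] = "not installed"
    return info


for key, value in session_info(["pandas", "numpy", "matplotlib"]).items():
    print(f"{key}: {value}")
```

Printing this in the last cell keeps a record of the environment alongside the results, even when the notebook is exported.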
Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| kernel | string | "python3" | Jupyter kernel: python3, R, julia |
| lab_version | string | "4" | JupyterLab major version |
| extensions | array | [] | JupyterLab extensions to install |
| autosave_interval | number | 120 | Autosave interval in seconds |
| max_output_lines | number | 1000 | Maximum output lines per cell |
| inline_plots | boolean | true | Display plots inline |
| export_format | string | "html" | Default export: html, pdf, slides |
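The source does not say how these parameters are consumed, so the following is only a sketch of how the defaults and allowed values in the table might be checked before use. The class name `NotebookConfig` and its validation rules are assumptions drawn from the table's descriptions.

```python
from dataclasses import dataclass, field

# Allowed values taken from the export_format row of the table.
ALLOWED_EXPORT_FORMATS = {"html", "pdf", "slides"}


@dataclass
class NotebookConfig:
    """Defaults mirror the Configuration table; the validation rules are assumed."""

    kernel: str = "python3"
    lab_version: str = "4"
    extensions: list = field(default_factory=list)
    autosave_interval: int = 120  # seconds
    max_output_lines: int = 1000
    inline_plots: bool = True
    export_format: str = "html"

    def __post_init__(self):
        if self.export_format not in ALLOWED_EXPORT_FORMATS:
            raise ValueError(f"unsupported export_format: {self.export_format!r}")
        if self.autosave_interval <= 0:
            raise ValueError("autosave_interval must be a positive number of seconds")
        if self.max_output_lines <= 0:
            raise ValueError("max_output_lines must be positive")


default_config = NotebookConfig()
slides_config = NotebookConfig(export_format="slides", autosave_interval=60)
```

Failing early on an unknown export format or a non-positive interval is a design choice here; a real integration might prefer to fall back to the defaults instead.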
Best Practices
- Structure notebooks with clear section headers: use markdown cells with `##` headers to divide notebooks into Setup, Exploration, Analysis, and Conclusions; this makes notebooks navigable and reviewable.
- Keep cells small and focused: each code cell should do one thing; splitting analysis into small cells makes debugging easier and allows selective re-execution.
- Run cells top-to-bottom before sharing: use "Restart Kernel and Run All" to verify the notebook executes in order; notebooks with out-of-order execution state are unreproducible.
- Use `%matplotlib inline` for static reports, `plotly` for interactive exploration: static plots are better for exported HTML/PDF reports; interactive plots are better for live exploration sessions.
- Extract reusable functions into `.py` modules: when notebook code becomes production logic, move it to importable Python modules; notebooks should orchestrate, not implement complex algorithms.
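The first practice in this list can be checked mechanically. As a sketch, assuming the notebook has been loaded into a dict (for example with `json.load`) and that section names appear as `##` headings in markdown cells, the hypothetical helper `missing_sections` reports which expected sections are absent.

```python
REQUIRED_SECTIONS = ("Setup", "Exploration", "Analysis", "Conclusions")


def cell_source(cell):
    """nbformat allows 'source' to be a string or a list of lines."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def missing_sections(notebook, required=REQUIRED_SECTIONS):
    """Return the required section names that have no matching '## ' heading."""
    found = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        for line in cell_source(cell).splitlines():
            if line.startswith("## "):
                found.add(line[3:].strip())
    return [name for name in required if name not in found]


example = {
    "cells": [
        {"cell_type": "markdown", "source": ["## Setup\n", "Imports and data loading."]},
        {"cell_type": "code", "source": "import pandas as pd"},
        {"cell_type": "markdown", "source": "## Analysis"},
    ]
}
print(missing_sections(example))  # ['Exploration', 'Conclusions']
```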
Common Issues
Notebook runs out of memory on large datasets: load data in chunks with `pd.read_csv(path, chunksize=10000)`, pass narrower types through the `dtype` argument to reduce memory, or use Dask for out-of-core computation.
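Chunked loading could look like the sketch below, which computes a per-category mean without holding the whole file in memory. It assumes pandas is installed; the column names `category` and `revenue` follow the earlier examples, while the function name and the in-memory sample data are illustrative.

```python
import io

import pandas as pd


def mean_revenue_by_category(csv_source, chunksize=10_000):
    """Stream a CSV in chunks and return {category: mean revenue}."""
    totals, counts = {}, {}
    reader = pd.read_csv(
        csv_source,
        usecols=["category", "revenue"],  # read only the columns needed
        # Smaller dtypes cut memory; float32 trades away some precision.
        dtype={"category": "category", "revenue": "float32"},
        chunksize=chunksize,
    )
    for chunk in reader:
        grouped = chunk.groupby("category", observed=True)["revenue"]
        for name, subtotal in grouped.sum().items():
            totals[name] = totals.get(name, 0.0) + float(subtotal)
        for name, count in grouped.count().items():
            counts[name] = counts.get(name, 0) + int(count)
    # Skip categories whose revenue values were all missing.
    return {name: totals[name] / counts[name] for name in totals if counts.get(name)}


# Small in-memory stand-in for a large file on disk.
sample = io.StringIO("category,revenue\na,10\nb,20\na,30\nb,60\na,50\n")
means = mean_revenue_by_category(sample, chunksize=2)
```

Only running totals and counts persist between chunks, so peak memory is bounded by the chunk size and not by the file size.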
Cells execute out of order, causing stale state: variables left over from earlier or deleted cells can make results depend on execution history. Use "Restart Kernel and Run All" regularly, and avoid modifying global variables in place.
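Out-of-order execution leaves a trace in the saved file, because each executed code cell records an `execution_count`. The sketch below is a heuristic, assuming the notebook is loaded as a dict and treating a clean top-to-bottom run as counts 1, 2, 3, ... over the non-empty code cells. The helper name `ran_top_to_bottom` is hypothetical.

```python
def ran_top_to_bottom(notebook):
    """True if non-empty code cells carry execution counts 1, 2, 3, ... in order."""
    counts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        if not source.strip():
            continue  # empty cells are not executed and keep a null count
        counts.append(cell.get("execution_count"))
    return counts == list(range(1, len(counts) + 1))


clean_run = {
    "cells": [
        {"cell_type": "code", "source": "x = 1", "execution_count": 1},
        {"cell_type": "markdown", "source": "## Notes"},
        {"cell_type": "code", "source": "x += 1", "execution_count": 2},
    ]
}
stale_run = {
    "cells": [
        {"cell_type": "code", "source": "x = 1", "execution_count": 5},
        {"cell_type": "code", "source": "x += 1", "execution_count": 3},
    ]
}
print(ran_top_to_bottom(clean_run), ran_top_to_bottom(stale_run))  # True False
```

A check like this could run before a notebook is shared, flagging files that were saved with re-run or skipped cells.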
Notebooks are impossible to code review: raw notebook JSON diffs are hard to read because outputs and metadata are mixed in with the code. Use `nbstripout` to remove outputs before committing, and use `jupytext` to maintain a paired `.py` file for readable diffs.
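To see why stripping outputs helps diffs, the following sketch clears outputs and execution counts from a notebook dict using only the standard library. It is a simplified illustration of the idea, not how `nbstripout` is implemented, and the function name `strip_outputs` is hypothetical; for real repositories the tool itself is the better choice.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of the notebook with code-cell outputs and counts cleared."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


executed = {
    "cells": [
        {
            "cell_type": "code",
            "source": "1 + 1",
            "execution_count": 7,
            "outputs": [{"output_type": "execute_result", "data": {"text/plain": "2"}}],
            "metadata": {},
        },
        {"cell_type": "markdown", "source": "## Notes", "metadata": {}},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

clean = strip_outputs(executed)
# The stripped JSON changes only when the code or markdown changes.
clean_json = json.dumps(clean, indent=1, sort_keys=True)
```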