Skill · scientific · v1.0.0 · MIT License

NetworkX Smart

Build, analyze, and visualize complex networks and graphs using NetworkX. This skill covers graph construction, centrality analysis, community detection, shortest paths, network metrics, and visualization for social networks, biological networks, knowledge graphs, and more.

When to Use This Skill

Choose NetworkX Smart when you need to:

  • Construct and manipulate graph data structures (directed, undirected, multi-graphs)
  • Compute network metrics (centrality, clustering, shortest paths, connectivity)
  • Detect communities and identify important nodes in networks
  • Visualize small to medium networks with customizable layouts

Consider alternatives when:

  • You need to process graphs with millions of nodes (use graph-tool or NetworKit)
  • You need GPU-accelerated graph algorithms (use cuGraph or PyTorch Geometric)
  • You need interactive web-based network visualizations (use Cytoscape.js or D3)

Quick Start

pip install networkx matplotlib
import networkx as nx
import matplotlib.pyplot as plt

# Create a graph
G = nx.Graph()
G.add_edges_from([
    ("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Dave"),
    ("Carol", "Dave"), ("Dave", "Eve"), ("Eve", "Frank")
])

# Basic analysis
print(f"Nodes: {G.number_of_nodes()}")
print(f"Edges: {G.number_of_edges()}")
print(f"Density: {nx.density(G):.3f}")

# Centrality
centrality = nx.betweenness_centrality(G)
for node, score in sorted(centrality.items(), key=lambda x: -x[1]):
    print(f"{node}: {score:.3f}")

Core Concepts

Graph Types and Algorithms

Algorithm | Function | Description
Degree centrality | nx.degree_centrality(G) | Node importance by connections
Betweenness centrality | nx.betweenness_centrality(G) | Bridge nodes between communities
PageRank | nx.pagerank(G) | Recursive importance measure
Shortest path | nx.shortest_path(G, s, t) | Minimum-hop path between nodes
Connected components | nx.connected_components(G) | Maximal connected subgraphs
Community detection | nx.community.louvain_communities(G) | Group detection
Clustering coefficient | nx.clustering(G) | Local connectivity density
Minimum spanning tree | nx.minimum_spanning_tree(G) | Lightest connected subgraph
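A few of the table's path- and structure-oriented algorithms on a small weighted graph (the edges and weights here are illustrative):

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 1.0), ("B", "C", 2.0), ("A", "C", 4.0), ("C", "D", 1.0)
])

# Minimum-hop path; pass weight="weight" to minimize total edge weight instead
hop_path = nx.shortest_path(G, "A", "D")
weighted_path = nx.shortest_path(G, "A", "D", weight="weight")

# Connected components and minimum spanning tree
components = list(nx.connected_components(G))
mst = nx.minimum_spanning_tree(G)

print(hop_path, weighted_path, len(components), sorted(mst.edges()))
```

Note that the hop-count path and the weight-minimizing path can differ: here A→C→D has fewer hops, while A→B→C→D has lower total weight.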

Network Analysis Pipeline

import networkx as nx

def analyze_network(G):
    """Comprehensive network analysis report."""
    report = {
        "nodes": G.number_of_nodes(),
        "edges": G.number_of_edges(),
        "density": nx.density(G),
        "avg_clustering": nx.average_clustering(G),
        "components": nx.number_connected_components(G),
    }

    # Only compute diameter for connected graphs
    if nx.is_connected(G):
        report["diameter"] = nx.diameter(G)
        report["avg_shortest_path"] = nx.average_shortest_path_length(G)

    # Centrality analysis
    degree = nx.degree_centrality(G)
    between = nx.betweenness_centrality(G)
    pagerank = nx.pagerank(G)

    # Top nodes by each metric
    report["top_degree"] = sorted(degree.items(), key=lambda x: -x[1])[:5]
    report["top_betweenness"] = sorted(between.items(), key=lambda x: -x[1])[:5]
    report["top_pagerank"] = sorted(pagerank.items(), key=lambda x: -x[1])[:5]

    # Community detection
    communities = list(nx.community.louvain_communities(G, seed=42))
    report["n_communities"] = len(communities)
    report["community_sizes"] = [len(c) for c in communities]

    return report

# Example: analyze a social network
G = nx.karate_club_graph()
report = analyze_network(G)
for key, value in report.items():
    print(f"{key}: {value}")

Graph Construction from Data

import networkx as nx
import numpy as np
import pandas as pd

# From edge list DataFrame
edges_df = pd.DataFrame({
    "source": ["A", "A", "B", "C", "D"],
    "target": ["B", "C", "C", "D", "E"],
    "weight": [1.0, 2.5, 1.5, 3.0, 0.8]
})
G = nx.from_pandas_edgelist(edges_df, "source", "target", edge_attr="weight")

# From adjacency matrix
adj_matrix = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0]
])
G = nx.from_numpy_array(adj_matrix)

# Directed graph from transactions
DG = nx.DiGraph()
transactions = [
    ("User1", "User2", {"amount": 100, "date": "2024-01-15"}),
    ("User2", "User3", {"amount": 50, "date": "2024-01-16"}),
    ("User3", "User1", {"amount": 75, "date": "2024-01-17"})
]
DG.add_edges_from(transactions)

# Bipartite graph
B = nx.Graph()
B.add_nodes_from(["User1", "User2", "User3"], bipartite=0)
B.add_nodes_from(["Product_A", "Product_B"], bipartite=1)
B.add_edges_from([
    ("User1", "Product_A"), ("User1", "Product_B"),
    ("User2", "Product_A"), ("User3", "Product_B")
])

Configuration

Parameter | Description | Default
graph_type | Graph class (Graph, DiGraph, MultiGraph) | nx.Graph
weight_attr | Edge attribute for weighted algorithms | "weight"
seed | Random seed for stochastic algorithms | None
max_iter | Iteration limit for iterative algorithms | 100
tol | Convergence tolerance | 1e-06
layout | Node positioning algorithm | "spring"
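How these parameters typically appear in real calls (the values shown are the defaults from the table; requires the numpy/scipy extras that ship with a default NetworkX install):

```python
import networkx as nx

G = nx.karate_club_graph()

# max_iter and tol bound PageRank's power iteration; weight_attr selects the edge weight
pr = nx.pagerank(G, max_iter=100, tol=1e-06, weight="weight")

# seed makes stochastic algorithms reproducible across runs
communities = nx.community.louvain_communities(G, seed=42)
pos = nx.spring_layout(G, seed=42)  # the "spring" layout

print(len(pr), len(communities), len(pos))
```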

Best Practices

  1. Choose the right graph type upfront — Use nx.Graph() for undirected relationships, nx.DiGraph() for directed flows, and nx.MultiGraph() when multiple edges can connect the same nodes. Converting between types later loses information.

  2. Use generators for large graphs — For graphs with known structure (Erdos-Renyi, Barabasi-Albert, Watts-Strogatz), use NetworkX generators like nx.barabasi_albert_graph(1000, 3) instead of constructing edges manually. These are optimized and produce well-characterized topologies.

  3. Store node and edge attributes directly in the graph — Use G.nodes[n]['attr'] = value and G.edges[u, v]['weight'] = w rather than maintaining separate lookup dictionaries. This keeps data co-located with the graph structure and simplifies serialization.

  4. Pre-compute centrality for repeated queries — Betweenness centrality is O(VE) and expensive on large graphs. Compute it once, store results as node attributes, and query the cached values rather than recomputing for every analysis.

  5. Export to specialized formats for large-scale processing — For graphs with more than 50,000 nodes, export to GraphML or edge list format and switch to graph-tool, NetworKit, or SNAP for analysis. NetworkX is optimized for flexibility and correctness, not performance on massive graphs.
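Practices 3 and 4 can be sketched together — a minimal example (attribute names are illustrative) that keeps data on the graph itself and caches an expensive centrality as a node attribute:

```python
import networkx as nx

G = nx.karate_club_graph()

# Practice 3: store attributes directly on nodes and edges
G.nodes[0]["role"] = "instructor"
G.edges[0, 1]["weight"] = 2.0

# Practice 4: compute betweenness once, cache it as a node attribute
betweenness = nx.betweenness_centrality(G)
nx.set_node_attributes(G, betweenness, "betweenness")

# Later queries read the cached attribute instead of recomputing
top = max(G.nodes, key=lambda n: G.nodes[n]["betweenness"])
print(top, G.nodes[top]["betweenness"])
```

Caching this way also means the values survive serialization to formats like GraphML alongside the structure itself.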

Common Issues

Algorithms fail on disconnected graphs — Functions like nx.diameter() and nx.average_shortest_path_length() raise errors on disconnected graphs. Check connectivity with nx.is_connected(G) first and either analyze the largest component G.subgraph(max(nx.connected_components(G), key=len)) or handle components individually.
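The largest-component workaround in concrete form, using a deliberately disconnected toy graph:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("X", "Y")])  # two components

# nx.diameter(G) would raise NetworkXError here
assert not nx.is_connected(G)

# Restrict analysis to the largest connected component
largest = max(nx.connected_components(G), key=len)
H = G.subgraph(largest)
print(nx.diameter(H))  # → 2
```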

Memory exhaustion on dense graphs — NetworkX stores graphs as nested dictionaries, which uses far more memory per edge than an adjacency matrix. A complete graph with 10,000 nodes has roughly 50 million edges, and each is stored in both endpoints' adjacency dicts, so about 100 million entries. For dense graphs, use scipy.sparse adjacency matrices or switch to a more memory-efficient library.

Visualization is unreadable for large networks — Matplotlib-based visualization becomes a hairball above ~200 nodes. For larger networks, filter to show only important nodes (top centrality), use community-based layout coloring, or export to Gephi or Cytoscape for interactive exploration.
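One way to tame the hairball is to filter to the most important nodes before drawing — here the top 10 by betweenness (the cutoff of 10 is an arbitrary choice):

```python
import networkx as nx

G = nx.karate_club_graph()

# Keep only the 10 most "bridging" nodes and the edges among them
betweenness = nx.betweenness_centrality(G)
top10 = sorted(betweenness, key=betweenness.get, reverse=True)[:10]
H = G.subgraph(top10)

print(H.number_of_nodes(), H.number_of_edges())
# Then draw the reduced graph, e.g. nx.draw(H, with_labels=True)  (needs matplotlib)
```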
