Skill · scientific · v1.0.0 · MIT License

NetworkX Smart

Build, analyze, and visualize complex networks and graphs using NetworkX. This skill covers graph construction, centrality analysis, community detection, shortest paths, network metrics, and visualization for social networks, biological networks, knowledge graphs, and more.

When to Use This Skill

Choose NetworkX Smart when you need to:

  • Construct and manipulate graph data structures (directed, undirected, multi-graphs)
  • Compute network metrics (centrality, clustering, shortest paths, connectivity)
  • Detect communities and identify important nodes in networks
  • Visualize small to medium networks with customizable layouts

Consider alternatives when:

  • You need to process graphs with millions of nodes (use graph-tool or NetworKit)
  • You need GPU-accelerated graph algorithms (use cuGraph or PyTorch Geometric)
  • You need interactive web-based network visualizations (use Cytoscape.js or D3)

Quick Start

pip install networkx matplotlib
import networkx as nx
import matplotlib.pyplot as plt

# Create a graph
G = nx.Graph()
G.add_edges_from([
    ("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Dave"),
    ("Carol", "Dave"), ("Dave", "Eve"), ("Eve", "Frank")
])

# Basic analysis
print(f"Nodes: {G.number_of_nodes()}")
print(f"Edges: {G.number_of_edges()}")
print(f"Density: {nx.density(G):.3f}")

# Centrality
centrality = nx.betweenness_centrality(G)
for node, score in sorted(centrality.items(), key=lambda x: -x[1]):
    print(f"{node}: {score:.3f}")

Core Concepts

Graph Types and Algorithms

Algorithm | Function | Description
Degree centrality | nx.degree_centrality(G) | Node importance by connections
Betweenness centrality | nx.betweenness_centrality(G) | Bridge nodes between communities
PageRank | nx.pagerank(G) | Recursive importance measure
Shortest path | nx.shortest_path(G, s, t) | Minimum-hop path between nodes
Connected components | nx.connected_components(G) | Maximal connected subgraphs
Community detection | nx.community.louvain_communities(G) | Group detection
Clustering coefficient | nx.clustering(G) | Local connectivity density
Minimum spanning tree | nx.minimum_spanning_tree(G) | Lightest connected subgraph
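A few of the table's path- and structure-oriented algorithms on a small weighted graph (the edges and weights here are illustrative):

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 1.0), ("B", "C", 2.0), ("A", "C", 4.0), ("C", "D", 1.0)
])

# Minimum-hop path; pass weight="weight" to minimize total edge weight instead
hop_path = nx.shortest_path(G, "A", "D")
weighted_path = nx.shortest_path(G, "A", "D", weight="weight")

# Connected components and minimum spanning tree
components = list(nx.connected_components(G))
mst = nx.minimum_spanning_tree(G)

print(hop_path, weighted_path, len(components), sorted(mst.edges()))
```

Note that the hop-count path and the weight-minimizing path can differ: here A→C→D has fewer hops, while A→B→C→D has lower total weight.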

Network Analysis Pipeline

import networkx as nx

def analyze_network(G):
    """Comprehensive network analysis report."""
    report = {
        "nodes": G.number_of_nodes(),
        "edges": G.number_of_edges(),
        "density": nx.density(G),
        "avg_clustering": nx.average_clustering(G),
        "components": nx.number_connected_components(G),
    }

    # Only compute diameter for connected graphs
    if nx.is_connected(G):
        report["diameter"] = nx.diameter(G)
        report["avg_shortest_path"] = nx.average_shortest_path_length(G)

    # Centrality analysis
    degree = nx.degree_centrality(G)
    between = nx.betweenness_centrality(G)
    pagerank = nx.pagerank(G)

    # Top nodes by each metric
    report["top_degree"] = sorted(degree.items(), key=lambda x: -x[1])[:5]
    report["top_betweenness"] = sorted(between.items(), key=lambda x: -x[1])[:5]
    report["top_pagerank"] = sorted(pagerank.items(), key=lambda x: -x[1])[:5]

    # Community detection
    communities = list(nx.community.louvain_communities(G, seed=42))
    report["n_communities"] = len(communities)
    report["community_sizes"] = [len(c) for c in communities]

    return report

# Example: analyze a social network
G = nx.karate_club_graph()
report = analyze_network(G)
for key, value in report.items():
    print(f"{key}: {value}")

Graph Construction from Data

import networkx as nx
import numpy as np
import pandas as pd

# From edge list DataFrame
edges_df = pd.DataFrame({
    "source": ["A", "A", "B", "C", "D"],
    "target": ["B", "C", "C", "D", "E"],
    "weight": [1.0, 2.5, 1.5, 3.0, 0.8]
})
G = nx.from_pandas_edgelist(edges_df, "source", "target", edge_attr="weight")

# From adjacency matrix
adj_matrix = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0]
])
G = nx.from_numpy_array(adj_matrix)

# Directed graph from transactions
DG = nx.DiGraph()
transactions = [
    ("User1", "User2", {"amount": 100, "date": "2024-01-15"}),
    ("User2", "User3", {"amount": 50, "date": "2024-01-16"}),
    ("User3", "User1", {"amount": 75, "date": "2024-01-17"})
]
DG.add_edges_from(transactions)

# Bipartite graph
B = nx.Graph()
B.add_nodes_from(["User1", "User2", "User3"], bipartite=0)
B.add_nodes_from(["Product_A", "Product_B"], bipartite=1)
B.add_edges_from([
    ("User1", "Product_A"), ("User1", "Product_B"),
    ("User2", "Product_A"), ("User3", "Product_B")
])

Configuration

Parameter | Description | Default
graph_type | Graph class (Graph, DiGraph, MultiGraph) | nx.Graph
weight_attr | Edge attribute for weighted algorithms | "weight"
seed | Random seed for stochastic algorithms | None
max_iter | Iteration limit for iterative algorithms | 100
tol | Convergence tolerance | 1e-06
layout | Node positioning algorithm | "spring"
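How these parameters typically appear in real calls (the values shown are the defaults from the table; requires the numpy/scipy extras that ship with a default NetworkX install):

```python
import networkx as nx

G = nx.karate_club_graph()

# max_iter and tol bound PageRank's power iteration; weight_attr selects the edge weight
pr = nx.pagerank(G, max_iter=100, tol=1e-06, weight="weight")

# seed makes stochastic algorithms reproducible across runs
communities = nx.community.louvain_communities(G, seed=42)
pos = nx.spring_layout(G, seed=42)  # the "spring" layout

print(len(pr), len(communities), len(pos))
```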

Best Practices

  1. Choose the right graph type upfront — Use nx.Graph() for undirected relationships, nx.DiGraph() for directed flows, and nx.MultiGraph() when multiple edges can connect the same nodes. Converting between types later loses information.

  2. Use generators for large graphs — For graphs with known structure (Erdos-Renyi, Barabasi-Albert, Watts-Strogatz), use NetworkX generators like nx.barabasi_albert_graph(1000, 3) instead of constructing edges manually. These are optimized and produce well-characterized topologies.

  3. Store node and edge attributes directly in the graph — Use G.nodes[n]['attr'] = value and G.edges[u, v]['weight'] = w rather than maintaining separate lookup dictionaries. This keeps data co-located with the graph structure and simplifies serialization.

  4. Pre-compute centrality for repeated queries — Betweenness centrality is O(VE) and expensive on large graphs. Compute it once, store results as node attributes, and query the cached values rather than recomputing for every analysis.

  5. Export to specialized formats for large-scale processing — For graphs with more than 50,000 nodes, export to GraphML or edge list format and switch to graph-tool, NetworKit, or SNAP for analysis. NetworkX is optimized for flexibility and correctness, not performance on massive graphs.
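Practices 3 and 4 can be sketched together — a minimal example (attribute names are illustrative) that keeps data on the graph itself and caches an expensive centrality as a node attribute:

```python
import networkx as nx

G = nx.karate_club_graph()

# Practice 3: store attributes directly on nodes and edges
G.nodes[0]["role"] = "instructor"
G.edges[0, 1]["weight"] = 2.0

# Practice 4: compute betweenness once, cache it as a node attribute
betweenness = nx.betweenness_centrality(G)
nx.set_node_attributes(G, betweenness, "betweenness")

# Later queries read the cached attribute instead of recomputing
top = max(G.nodes, key=lambda n: G.nodes[n]["betweenness"])
print(top, G.nodes[top]["betweenness"])
```

Caching this way also means the values survive serialization to formats like GraphML alongside the structure itself.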

Common Issues

Algorithms fail on disconnected graphs — Functions like nx.diameter() and nx.average_shortest_path_length() raise errors on disconnected graphs. Check connectivity with nx.is_connected(G) first and either analyze the largest component G.subgraph(max(nx.connected_components(G), key=len)) or handle components individually.
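The largest-component workaround in concrete form, using a deliberately disconnected toy graph:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("X", "Y")])  # two components

# nx.diameter(G) would raise NetworkXError here
assert not nx.is_connected(G)

# Restrict analysis to the largest connected component
largest = max(nx.connected_components(G), key=len)
H = G.subgraph(largest)
print(nx.diameter(H))  # → 2
```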

Memory exhaustion on dense graphs — NetworkX stores graphs as nested dictionaries, which uses far more memory per edge than an adjacency matrix. A complete graph with 10,000 nodes has roughly 50 million edges, and each is stored in both endpoints' adjacency dicts, so about 100 million entries. For dense graphs, use scipy.sparse adjacency matrices or switch to a more memory-efficient library.

Visualization is unreadable for large networks — Matplotlib-based visualization becomes a hairball above ~200 nodes. For larger networks, filter to show only important nodes (top centrality), use community-based layout coloring, or export to Gephi or Cytoscape for interactive exploration.
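One way to tame the hairball is to filter to the most important nodes before drawing — here the top 10 by betweenness (the cutoff of 10 is an arbitrary choice):

```python
import networkx as nx

G = nx.karate_club_graph()

# Keep only the 10 most "bridging" nodes and the edges among them
betweenness = nx.betweenness_centrality(G)
top10 = sorted(betweenness, key=betweenness.get, reverse=True)[:10]
H = G.subgraph(top10)

print(H.number_of_nodes(), H.number_of_edges())
# Then draw the reduced graph, e.g. nx.draw(H, with_labels=True)  (needs matplotlib)
```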
