Network Engineer Companion

A comprehensive agent for designing, optimizing, and troubleshooting networks. Includes structured workflows, validation checks, and reusable patterns for DevOps infrastructure.

Agent · Cliptics · devops infrastructure · v1.0.0 · MIT

A senior network engineer with expertise in designing and managing complex network infrastructures across cloud and on-premises environments, covering network architecture, security, troubleshooting, and automation.

When to Use This Agent

Choose Network Engineer Companion when:

  • Designing network architectures (VPCs, VNets, subnets, peering)
  • Configuring firewalls, load balancers, and DNS
  • Troubleshooting network connectivity and performance issues
  • Implementing network security (NSGs, ACLs, WAF, DDoS protection)
  • Automating network configuration with IaC and scripts

Consider alternatives when:

  • Building application code (use a development agent)
  • Designing overall cloud architecture (use a cloud architect agent)
  • Managing Kubernetes networking specifically (use a Kubernetes specialist)

Quick Start

    # .claude/agents/network-engineer-companion.yml
    name: Network Engineer Companion
    description: Design and manage network infrastructure
    model: claude-sonnet
    tools:
      - Read
      - Write
      - Edit
      - Bash
      - Glob
      - Grep

Example invocation:

claude "Design a hub-spoke network topology for our multi-account AWS environment with transit gateway, VPN connectivity, and proper security group configuration"

Core Concepts

Network Architecture Patterns

| Pattern | Description | Use Case |
|---|---|---|
| Hub-Spoke | Central hub connects spoke VPCs | Multi-account cloud |
| Full Mesh | All VPCs connected to each other | Low-latency inter-service |
| Transit Gateway | Centralized routing hub | Complex multi-VPC |
| Service Mesh | Application-layer networking | Microservices communication |
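To make the trade-off between the first two patterns concrete, here is a minimal sketch that counts the connections each one needs. The function names are illustrative, and the hub-spoke count assumes a single hub with one attachment per spoke.

```python
def full_mesh_links(n: int) -> int:
    """Peering connections needed to connect every pair of n VPCs."""
    return n * (n - 1) // 2


def hub_spoke_links(n: int) -> int:
    """Attachments needed when each of n spoke VPCs connects to one hub."""
    return n


for n in (3, 10, 50):
    print(f"{n} VPCs: full mesh={full_mesh_links(n)}, hub-spoke={hub_spoke_links(n)}")
```

The quadratic growth of the full mesh is one reason a centralized hub or transit gateway tends to be preferred as the number of VPCs rises.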

Network Troubleshooting Methodology

    # Layer-by-layer diagnosis

    # L1: Physical/Virtual - Is the interface up?
    ip link show

    # L2: Data Link - Is ARP resolution working?
    arp -a
    ip neigh show

    # L3: Network - Can we route to the destination?
    ping -c 3 10.0.1.5
    traceroute 10.0.1.5
    mtr 10.0.1.5

    # L4: Transport - Is the port open and reachable?
    nc -zv 10.0.1.5 443
    telnet 10.0.1.5 443

    # L7: Application - Is the service responding?
    curl -v https://api.internal.example.com/health

    # DNS resolution
    dig api.internal.example.com
    nslookup api.internal.example.com

    # Cloud-specific
    # AWS: VPC Flow Logs, Reachability Analyzer
    # (eni-src and eni-dst are placeholders for real resource IDs)
    aws ec2 create-network-insights-path \
      --source eni-src --destination eni-dst \
      --protocol tcp --destination-port 443

    # Azure: Network Watcher, NSG Flow Logs
    az network watcher test-connectivity \
      --source-resource vm1 --dest-resource vm2 --dest-port 443
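The DNS and transport steps of this methodology can also be scripted. The following is a minimal sketch using only Python's standard `socket` module. The function names `resolve`, `tcp_port_open`, and `diagnose` are illustrative, and the script covers only name resolution and TCP reachability, not the lower layers.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """DNS step: return the distinct addresses a name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """L4 step: True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(hostname: str, port: int) -> str:
    """Report the first step that fails, checking DNS before transport."""
    addresses = resolve(hostname)
    if not addresses:
        return "dns: name does not resolve"
    if not any(tcp_port_open(addr, port) for addr in addresses):
        return "transport: no resolved address accepts connections on the port"
    return "ok: name resolves and port is reachable"
```

Calling `diagnose("api.internal.example.com", 443)` would report the first failing step, which mirrors the order of the manual checks.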

Security Group / NSG Design

    # Terraform: layered security group design
    # Assumes aws_vpc.main, aws_security_group.alb, and aws_security_group.db_tier
    # are defined elsewhere.
    # Note: web_tier and app_tier reference each other through inline rules, which
    # Terraform reports as a dependency cycle. In a real configuration, declare one
    # direction as a separate rule resource.
    resource "aws_security_group" "web_tier" {
      name_prefix = "web-tier-"
      vpc_id      = aws_vpc.main.id

      ingress {
        description     = "HTTPS from ALB"
        from_port       = 443
        to_port         = 443
        protocol        = "tcp"
        security_groups = [aws_security_group.alb.id]
      }

      egress {
        description     = "To app tier only"
        from_port       = 8080
        to_port         = 8080
        protocol        = "tcp"
        security_groups = [aws_security_group.app_tier.id]
      }
    }

    resource "aws_security_group" "app_tier" {
      name_prefix = "app-tier-"
      vpc_id      = aws_vpc.main.id

      ingress {
        description     = "From web tier"
        from_port       = 8080
        to_port         = 8080
        protocol        = "tcp"
        security_groups = [aws_security_group.web_tier.id]
      }

      egress {
        description     = "To database tier only"
        from_port       = 5432
        to_port         = 5432
        protocol        = "tcp"
        security_groups = [aws_security_group.db_tier.id]
      }
    }
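The intent of the layered design is that each tier accepts traffic only from the tier directly in front of it. A minimal sketch of that policy as data, useful for reviewing a proposed flow before writing rules, is shown below. `ALLOWED_FLOWS` and `is_allowed` are hypothetical names, and the tier labels simply mirror the resource names used above.

```python
# Hypothetical model of the tiered rules: (source tier, destination tier) -> allowed port.
ALLOWED_FLOWS = {
    ("alb", "web_tier"): 443,
    ("web_tier", "app_tier"): 8080,
    ("app_tier", "db_tier"): 5432,
}


def is_allowed(src: str, dst: str, port: int) -> bool:
    """True only if the flow matches an explicitly allowed adjacent-tier rule."""
    return ALLOWED_FLOWS.get((src, dst)) == port


print(is_allowed("web_tier", "app_tier", 8080))  # adjacent tiers on the expected port
print(is_allowed("web_tier", "db_tier", 5432))   # skips a tier, so it is not allowed
```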

Configuration

| Parameter | Description | Default |
|---|---|---|
| cloud_provider | Primary cloud (aws, azure, gcp, hybrid) | Auto-detect |
| network_model | Network topology (hub-spoke, flat, mesh) | hub-spoke |
| ip_plan | IP address planning strategy | RFC 1918 |
| dns_provider | DNS service (route53, azure-dns, cloud-dns) | Cloud-native |
| firewall_type | Firewall solution (nsg, security-groups, palo-alto) | Cloud-native |
| vpn_type | VPN technology (ipsec, wireguard, cloud-vpn) | Cloud-native |

Best Practices

  1. Plan IP address space with growth in mind before creating any VPC. Changing CIDR ranges after creation is difficult or impossible. Allocate /16 per major environment, /24 per subnet, and leave room for additional subnets. Use non-overlapping ranges across all environments and regions to enable VPC peering and VPN connectivity. Document the IP plan in a shared spreadsheet and treat it as a controlled resource.

  2. Implement defense-in-depth with security at every network layer. Use NSGs/security groups at the instance level, NACLs at the subnet level, WAF at the edge, and network policies at the application level. Each layer catches threats that other layers miss. A WAF blocks SQL injection, NSGs restrict port access, and NACLs provide a secondary subnet-level barrier. No single layer is sufficient alone.

  3. Use private subnets for all application and data tier resources. Only load balancers and bastion hosts should be in public subnets with internet-facing IPs. Application servers, databases, and caches belong in private subnets that can access the internet through NAT gateways (for outbound only). This prevents direct inbound access to sensitive resources from the internet.

  4. Automate all network configuration with Infrastructure as Code. Network misconfigurations are a common cause of outages and security incidents. IaC (Terraform, Bicep, CloudFormation) makes network configs reviewable, testable, and auditable. Avoid making firewall rule changes through the console in production: a manual typo in an NSG rule can block all traffic instantly.

  5. Monitor network flow logs and set alerts for anomalous traffic patterns. Enable VPC Flow Logs (AWS), NSG Flow Logs (Azure), or VPC Flow Logs (GCP) for all production subnets. Analyze logs for unusual patterns: traffic to unexpected destinations, high volume to a single IP, connections from unusual geographies. These patterns indicate either misconfiguration or security incidents, both of which need investigation.
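The IP planning advice in practice 1 can be checked mechanically before any VPC is created. The sketch below uses Python's standard `ipaddress` module. The `PLAN` mapping, its CIDR ranges, and the helper names are hypothetical examples, not values from this template.

```python
import ipaddress
from itertools import combinations, islice

# Hypothetical plan: one /16 per environment, all inside 10.0.0.0/8 (RFC 1918).
PLAN = {
    "prod": ipaddress.ip_network("10.0.0.0/16"),
    "staging": ipaddress.ip_network("10.1.0.0/16"),
    "dev": ipaddress.ip_network("10.2.0.0/16"),
}


def overlapping_pairs(plan):
    """Return environment pairs whose CIDR ranges overlap (these cannot be peered)."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(plan.items(), 2)
        if net_a.overlaps(net_b)
    ]


def carve_subnets(network, count, prefix=24):
    """Return up to `count` subnets of the given prefix length, lowest addresses first."""
    return list(islice(network.subnets(new_prefix=prefix), count))


print(overlapping_pairs(PLAN))
print([str(s) for s in carve_subnets(PLAN["prod"], 3)])
```

A /16 yields 256 possible /24 subnets, so carving only a few at first leaves room for growth.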

Common Issues

Application cannot connect to a service in another VPC/VNet. Check the full connectivity chain: VPC peering or transit gateway is established, route tables include the remote CIDR, security groups allow the traffic, NACLs do not block it, and DNS resolves the remote service's address. Use cloud-native tools (VPC Reachability Analyzer, Network Watcher) to diagnose which layer is blocking. The most common cause is a missing route table entry.
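Since a missing route table entry is named as the most common cause, a quick longest-prefix-match check can show which route a destination would actually use. This is a minimal sketch: the `routes` table and the target labels (`local`, `nat-gateway`, `pcx-example`) are hypothetical placeholders rather than output from a cloud API.

```python
import ipaddress


def matching_route(route_table, destination_ip):
    """Longest-prefix match: return (cidr, target) for the most specific route, or None."""
    ip = ipaddress.ip_address(destination_ip)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None
    network, target = max(candidates, key=lambda c: c[0].prefixlen)
    return str(network), target


# Hypothetical route table for a VPC at 10.0.0.0/16 that should reach a peer at 10.1.0.0/16.
routes = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "nat-gateway",
}

# Without a peering route, traffic to the peer falls through to the default route.
print(matching_route(routes, "10.1.4.7"))

# After adding the remote CIDR, the more specific peering route wins.
routes["10.1.0.0/16"] = "pcx-example"
print(matching_route(routes, "10.1.4.7"))
```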

DNS resolution fails intermittently in cloud environments. Cloud DNS services have rate limits and can fail under high query volumes. Implement DNS caching at the application level or use a local resolver (CoreDNS in Kubernetes). For hybrid environments, configure conditional forwarding so cloud DNS queries go to cloud resolvers and on-premises queries go to on-premises DNS. Check that DHCP options or resolv.conf are configured correctly.
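Application-level DNS caching can be sketched as a small wrapper around a lookup function. The class below is an illustration under stated assumptions: `CachingResolver` is a hypothetical name, it uses one fixed TTL rather than each record's own TTL, and it defaults to `socket.gethostbyname` for the upstream lookup.

```python
import socket
import time


class CachingResolver:
    """Cache lookups for a fixed TTL so repeated queries skip the upstream resolver."""

    def __init__(self, lookup=socket.gethostbyname, ttl_seconds=30.0, clock=time.monotonic):
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # hostname -> (expiry time, address)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no upstream query
        address = self._lookup(hostname)
        self._cache[hostname] = (now + self._ttl, address)
        return address
```

A short TTL limits query volume while keeping the window for stale answers small. Failed lookups are not cached here, so errors propagate to the caller on every attempt.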

Network performance degrades when traffic crosses availability zones. Cross-AZ traffic incurs latency (typically 1-2ms) and bandwidth charges. For latency-sensitive applications, use AZ-aware routing to prefer same-zone communication. Deploy services in the same AZ as their primary data store. If cross-AZ communication is necessary for resilience, accept the latency as a trade-off and optimize the application to minimize cross-AZ call frequency.
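AZ-aware routing amounts to preferring endpoints in the caller's own zone and falling back to other zones only when none are local. A minimal sketch follows. The endpoint dictionaries, zone names, and `pick_endpoint` are hypothetical, and a real setup would typically take the zone from instance metadata or service discovery.

```python
import random


def pick_endpoint(endpoints, local_az, rng=random):
    """Prefer endpoints in the caller's AZ; fall back to any endpoint if none are local."""
    if not endpoints:
        raise ValueError("no endpoints available")
    local = [e for e in endpoints if e["az"] == local_az]
    return rng.choice(local or endpoints)


# Hypothetical endpoint list for one service spread across two zones.
endpoints = [
    {"host": "10.0.1.10", "az": "us-east-1a"},
    {"host": "10.0.2.10", "az": "us-east-1b"},
    {"host": "10.0.1.11", "az": "us-east-1a"},
]

print(pick_endpoint(endpoints, "us-east-1a")["az"])
```

The fallback keeps the service reachable during a zonal failure, at the cost of the cross-AZ latency described above.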
