Senior SecOps System

A skill for senior security operations (SecOps) work. Includes structured workflows, validation checks, and reusable patterns for development.

Skill · Cliptics · development · v1.0.0 · MIT

A comprehensive skill for senior security operations engineers covering threat detection, incident response, security monitoring, vulnerability management, and security automation in cloud-native environments.

When to Use This Skill

Choose this skill when:

  • Setting up security monitoring and SIEM/SOAR pipelines
  • Building automated threat detection and response workflows
  • Implementing vulnerability scanning and patch management processes
  • Designing security incident response procedures and runbooks
  • Hardening cloud infrastructure and container security

Consider alternatives when:

  • Writing secure application code → use a security compliance skill
  • Doing penetration testing → use a security testing skill
  • Setting up authentication/authorization → use an auth skill
  • Managing network firewalls only → use a network security skill

Quick Start

    # Security monitoring pipeline with Falco rules
    # Assumes Falco's default macros (spawned_process, open_read) are loaded and
    # that a list named allowed_parent_processes is defined elsewhere in the rules.

    - rule: Detect shell in container
      desc: Alert when a shell is spawned inside a container
      condition: >
        spawned_process and container.id != host
        and proc.name in (bash, sh, zsh, csh)
        and not proc.pname in (allowed_parent_processes)
      output: "Shell spawned in container (user=%user.name container=%container.name image=%container.image.repository cmd=%proc.cmdline)"
      priority: WARNING
      tags: [container, shell]

    - rule: Detect sensitive file access
      desc: Alert on read access to sensitive files
      condition: >
        open_read
        and fd.name in (/etc/shadow, /etc/passwd, /root/.ssh/authorized_keys)
        and not proc.name in (sshd, login, passwd)
      output: "Sensitive file access (user=%user.name file=%fd.name proc=%proc.name)"
      priority: CRITICAL
      tags: [filesystem, sensitive]
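
Once rules like these fire, something has to decide what to do with each alert. The sketch below shows one way to parse Falco's JSON output (available when `json_output` is enabled) and route events by priority. The `route` policy and its "page" / "ticket" / "log" destinations are hypothetical and not part of Falco.

```python
import json

# Falco priority levels, most severe first.
FALCO_PRIORITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def severity_rank(priority: str) -> int:
    """Lower rank means more severe; unknown values are treated as debug."""
    p = priority.lower()
    if p in FALCO_PRIORITIES:
        return FALCO_PRIORITIES.index(p)
    return len(FALCO_PRIORITIES) - 1


def parse_falco_event(line: str) -> dict:
    """Parse one JSON line emitted by Falco."""
    event = json.loads(line)
    return {
        "rule": event.get("rule", ""),
        "priority": str(event.get("priority", "debug")).lower(),
        "output": event.get("output", ""),
        "fields": event.get("output_fields", {}),
    }


def route(event: dict) -> str:
    """Hypothetical policy: page on critical or worse, ticket on warning or worse."""
    rank = severity_rank(event["priority"])
    if rank <= severity_rank("critical"):
        return "page"
    if rank <= severity_rank("warning"):
        return "ticket"
    return "log"
```

With this policy, the shell-in-container rule (WARNING) would open a ticket, while the sensitive-file rule (CRITICAL) would page the on-call engineer.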

Core Concepts

Security Operations Framework

Domain             | Tools                      | Objective
-------------------|----------------------------|------------------------------------
Detection          | SIEM, IDS/IPS, Falco       | Identify threats in real-time
Response           | SOAR, runbooks, IR plan    | Contain and remediate incidents
Vulnerability Mgmt | Trivy, Grype, Snyk         | Find and fix known vulnerabilities
Compliance         | CIS Benchmarks, policies   | Meet regulatory requirements
Threat Intel       | MITRE ATT&CK, feeds        | Understand adversary tactics
Forensics          | Log analysis, memory dumps | Post-incident investigation

Automated Incident Response

    # SOAR playbook for compromised credentials
    from dataclasses import dataclass


    # Minimal stand-ins for the alert and report types the playbook uses.
    @dataclass
    class SecurityAlert:
        metadata: dict  # must contain 'username'


    @dataclass
    class IncidentReport:
        status: str
        user: str


    class CredentialCompromisePlaybook:
        # The three clients are assumed to expose the async methods called
        # below; they do not correspond to any specific vendor SDK.
        def __init__(self, siem_client, iam_client, notify_client):
            self.siem = siem_client
            self.iam = iam_client
            self.notify = notify_client

        async def execute(self, alert: SecurityAlert) -> IncidentReport:
            # 1. Containment: disable the compromised account
            user = alert.metadata['username']
            await self.iam.disable_user(user)
            await self.iam.revoke_all_sessions(user)

            # 2. Investigation: gather evidence
            login_history = await self.siem.query(
                f'user:{user} AND event_type:login', timerange='7d'
            )
            anomalous_ips = [event for event in login_history if event.geo_risk > 0.7]
            affected_resources = await self.siem.query(
                f'user:{user} AND event_type:data_access', timerange='24h'
            )

            # 3. Notification
            await self.notify.send_incident(
                severity='HIGH',
                title=f'Compromised credentials: {user}',
                details={
                    'anomalous_logins': len(anomalous_ips),
                    'resources_accessed': len(affected_resources),
                    'containment_status': 'Account disabled, sessions revoked',
                },
            )

            # 4. Remediation: force a password reset
            await self.iam.force_password_reset(user)
            return IncidentReport(status='contained', user=user)

Vulnerability Management Pipeline

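    # Container image scanning in CI/CD

    # Scan with Trivy; fail the build on HIGH/CRITICAL findings that have a fix
    trivy image --severity HIGH,CRITICAL --exit-code 1 \
      --ignore-unfixed "myapp:$TAG"

    # Dependency, secret, and misconfiguration scanning of the working tree
    # (older Trivy releases spell this --security-checks vuln,secret,config)
    trivy fs --scanners vuln,secret,misconfig \
      --severity HIGH,CRITICAL .

    # Infrastructure scanning with custom policies
    # (older Trivy releases name this flag --config-policy or --policy)
    trivy config --severity HIGH,CRITICAL \
      --config-check ./security-policies/ ./terraform/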
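
When the exit code alone is not enough (for example, to post a summary on a pull request), the scan can be run with `--format json` and the report summarized in a script. The sketch below counts findings per severity from a parsed report. It assumes the report's `Results[].Vulnerabilities[].Severity` layout; the function names and the blocking policy are illustrative.

```python
from collections import Counter


def summarize_trivy_report(report: dict) -> Counter:
    """Count vulnerabilities per severity in a parsed Trivy JSON report."""
    counts = Counter()
    # "Results" and "Vulnerabilities" can be absent or null when nothing is found.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def should_block(counts: Counter, blocking=("HIGH", "CRITICAL")) -> bool:
    """Hypothetical merge gate: block when any blocking severity is present."""
    return any(counts[severity] > 0 for severity in blocking)
```

In CI, the report would typically be loaded with `json.load` from the file written by `trivy image --format json --output report.json ...`.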

Configuration

Parameter           | Type   | Default                        | Description
--------------------|--------|--------------------------------|------------------------------------------
siemPlatform        | string | 'elastic'                      | SIEM: elastic, splunk, or sentinel
scanSchedule        | string | 'daily'                        | Vulnerability scan frequency
severityThreshold   | string | 'HIGH'                         | Minimum severity to alert on
responseTimeTarget  | object | {critical: '15m', high: '1h'}  | Incident response SLA
complianceFramework | string | 'cis'                          | Compliance: CIS, SOC2, PCI-DSS, or HIPAA
retentionDays       | number | 365                            | Security log retention period
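
As a rough illustration of how these parameters might be loaded and checked, the sketch below mirrors the table's names and defaults in a dataclass. The set of accepted severity values and the duration format (`m` for minutes, `h` for hours) are assumptions, since the table does not spell them out.

```python
from dataclasses import dataclass, field

VALID_SIEM = {"elastic", "splunk", "sentinel"}
VALID_FRAMEWORKS = {"cis", "soc2", "pci-dss", "hipaa"}
# Assumption: severity levels follow the usual LOW..CRITICAL scale.
VALID_SEVERITY = {"LOW", "MEDIUM", "HIGH", "CRITICAL"}


def parse_duration(text: str) -> int:
    """Convert '15m' or '1h' to seconds (assumes m = minutes, h = hours)."""
    units = {"m": 60, "h": 3600}
    if len(text) < 2 or text[-1] not in units or not text[:-1].isdigit():
        raise ValueError(f"unsupported duration: {text!r}")
    return int(text[:-1]) * units[text[-1]]


@dataclass
class SecOpsConfig:
    # Field names keep the table's camelCase so they map 1:1 to the parameters.
    siemPlatform: str = "elastic"
    scanSchedule: str = "daily"
    severityThreshold: str = "HIGH"
    responseTimeTarget: dict = field(
        default_factory=lambda: {"critical": "15m", "high": "1h"}
    )
    complianceFramework: str = "cis"
    retentionDays: int = 365

    def validate(self) -> list:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if self.siemPlatform not in VALID_SIEM:
            errors.append(f"unknown siemPlatform: {self.siemPlatform}")
        if self.severityThreshold not in VALID_SEVERITY:
            errors.append(f"unknown severityThreshold: {self.severityThreshold}")
        if self.complianceFramework.lower() not in VALID_FRAMEWORKS:
            errors.append(f"unknown complianceFramework: {self.complianceFramework}")
        if self.retentionDays <= 0:
            errors.append("retentionDays must be positive")
        for level, target in self.responseTimeTarget.items():
            try:
                parse_duration(target)
            except ValueError as exc:
                errors.append(f"responseTimeTarget[{level}]: {exc}")
        return errors
```

Returning a list of problems instead of raising on the first one lets a CI job report every misconfiguration in a single run.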

Best Practices

  1. Automate detection and containment, escalate investigation — Credential revocation, IP blocking, and container isolation can be automated safely. Investigation and root cause analysis require human judgment. Automate the first 5 minutes of every incident.

  2. Build detection rules from MITRE ATT&CK framework — Map your detection coverage to ATT&CK techniques. Identify gaps in your detection matrix and prioritize rules that cover the most common attack patterns for your industry.

  3. Scan containers and dependencies in CI, not just registries — Catching a vulnerable dependency in a pull request is cheaper than discovering it in production. Block merges with known critical CVEs and provide automated fix suggestions.

  4. Maintain an asset inventory linked to vulnerability data — You can't secure what you don't know about. Automatically discover all cloud resources, containers, and endpoints. Link vulnerability scan results to asset owners for accountability.

  5. Practice incident response through tabletop exercises — Monthly tabletop exercises keep response procedures sharp and reveal gaps in runbooks. Simulate realistic scenarios: ransomware, data breach, supply chain compromise, insider threat.
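
The ATT&CK mapping in practice 2 can start as a simple set comparison. The sketch below lists the priority techniques that no detection rule covers. The rule-to-technique mapping and the priority list are illustrative assumptions, not an authoritative mapping of the Falco rules in the Quick Start.

```python
def coverage_gaps(rule_techniques: dict, priority_techniques: list) -> list:
    """Return priority ATT&CK technique IDs that no detection rule covers."""
    covered = {
        technique
        for techniques in rule_techniques.values()
        for technique in techniques
    }
    # Preserve the caller's ordering so the most important gaps stay first.
    return [t for t in priority_techniques if t not in covered]


# Illustrative mapping of rule names to ATT&CK technique IDs.
RULES = {
    "Detect shell in container": ["T1059"],      # Command and Scripting Interpreter
    "Detect sensitive file access": ["T1003"],   # OS Credential Dumping
}

# Hypothetical priority list for the organization.
PRIORITY = ["T1059", "T1078", "T1003", "T1486"]

gaps = coverage_gaps(RULES, PRIORITY)
```

Here `gaps` contains T1078 (Valid Accounts) and T1486 (Data Encrypted for Impact), which would be the next candidates for new detection rules.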

Common Issues

Alert fatigue from too many low-priority detections — Tune detection rules with contextual enrichment: a failed login from a known VPN IP is different from one from a TOR exit node. Use risk scoring to prioritize alerts and suppress low-confidence detections.
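
A minimal sketch of that kind of contextual risk scoring follows. The IP sets, multipliers, and triage thresholds are all made-up values for illustration (the addresses come from documentation-reserved ranges); a real deployment would pull them from threat intelligence feeds and tune them against alert history.

```python
from dataclasses import dataclass

# Hypothetical enrichment data; real values would come from inventory and feeds.
KNOWN_VPN_IPS = {"203.0.113.10"}
TOR_EXIT_NODES = {"198.51.100.23"}


@dataclass
class LoginAlert:
    user: str
    source_ip: str
    confidence: float  # detector confidence in the range 0..1


def risk_score(alert: LoginAlert) -> float:
    """Adjust detector confidence using context about the source IP."""
    score = alert.confidence
    if alert.source_ip in KNOWN_VPN_IPS:
        score *= 0.3          # expected source, strongly down-weighted
    elif alert.source_ip in TOR_EXIT_NODES:
        score *= 2.0          # anonymizing network, up-weighted
    return min(1.0, score)


def triage(alert: LoginAlert) -> str:
    """Map the score to an action using illustrative thresholds."""
    score = risk_score(alert)
    if score >= 0.8:
        return "escalate"
    if score >= 0.4:
        return "review"
    return "suppress"
```

The same failed-login detection therefore ends up suppressed, queued for review, or escalated depending only on where it came from.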

Vulnerability scan results overwhelm developers — Thousands of CVEs with no prioritization leads to nothing getting fixed. Filter by exploitability (EPSS score), reachability (is the vulnerable function actually called?), and environment exposure (internet-facing vs internal).
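
One way to combine the three filters into a single ordering is sketched below. The weights (0.2 for unreachable code, 0.5 for internal-only assets) are arbitrary illustrative choices, and the finding IDs are placeholders rather than real CVEs.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve_id: str
    epss: float            # EPSS probability of exploitation, 0..1
    reachable: bool        # is the vulnerable function actually called?
    internet_facing: bool  # is the affected asset exposed to the internet?


def priority_score(finding: Finding) -> float:
    """Scale EPSS by reachability and exposure (weights are assumptions)."""
    score = finding.epss
    if not finding.reachable:
        score *= 0.2
    if not finding.internet_facing:
        score *= 0.5
    return score


def prioritize(findings: list) -> list:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority_score, reverse=True)


findings = [
    Finding("PLACEHOLDER-A", epss=0.9, reachable=False, internet_facing=False),
    Finding("PLACEHOLDER-B", epss=0.4, reachable=True, internet_facing=True),
    Finding("PLACEHOLDER-C", epss=0.6, reachable=True, internet_facing=False),
]
ordered_ids = [f.cve_id for f in prioritize(findings)]
```

Note that the finding with the highest raw EPSS lands last here, because it is neither reachable nor exposed.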

Incident response playbooks not tested regularly — Written procedures that haven't been exercised fail under pressure. Schedule quarterly fire drills, rotate incident commanders, and update runbooks after every real incident with lessons learned.
