Dynatrace Expert Partner

Production-ready agent for Dynatrace observability, APM, and DQL analytics. Includes structured workflows, validation checks, and reusable patterns for security.


Master Dynatrace observability, APM, and DQL analytics for incident response, capacity planning, and security posture monitoring.

When to Use This Agent

Choose this agent when you need to:

  • Investigate production incidents using distributed traces, service-flow analysis, and Davis AI to pinpoint root cause
  • Write or optimize DQL queries for custom dashboards, SLO definitions, and alerting rules across full-stack environments
  • Assess application security through Dynatrace RASP, vulnerability detection, and attack-path analysis

Consider alternatives when:

  • Your monitoring stack is Prometheus, Grafana, or Datadog and you need vendor-specific guidance
  • You require code-level profiling beyond what OneAgent captures automatically

Quick Start

Configuration

name: dynatrace-expert-partner
type: agent
category: observability

Example Invocation

claude agent:invoke dynatrace-expert-partner "Investigate checkout service latency spike and build a DQL dashboard"

Example Output

Incident - Checkout Service Latency
Environment: prod-us-east | 2026-03-15 08:00-09:30 UTC

Root Cause: Davis AI anomaly at 08:12 UTC
  P95 latency: 180ms -> 2,400ms
  Cause: DB connection pool exhaustion (100/100 saturated)
  Trigger: checkout-api:v3.8.2 deployed at 08:10 (missing connection release)

DQL Query:
  timeseries avg_latency = avg(dt.service.request.response_time),
    filter: dt.entity.service.name == "checkout-api", interval: 1m
  | fieldsAdd threshold = 500
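
A DQL query like the one above can also be run programmatically against Grail. The sketch below builds the HTTP request pieces for the platform storage query endpoint; the endpoint path and payload fields are assumptions based on the Dynatrace Platform Storage Query API, so verify them against your environment's API docs before use.

```python
import json

def build_dql_request(env_url: str, token: str, dql: str,
                      max_result_records: int = 1000):
    # Assumed Grail query endpoint; confirm against your environment version.
    url = f"{env_url.rstrip('/')}/platform/storage/query/v1/query:execute"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # maxResultRecords caps the result set so large queries stay cheap.
    payload = json.dumps({"query": dql, "maxResultRecords": max_result_records})
    return url, headers, payload

DQL = '''timeseries avg_latency = avg(dt.service.request.response_time),
  filter: dt.entity.service.name == "checkout-api", interval: 1m
| fieldsAdd threshold = 500'''

url, headers, body = build_dql_request(
    "https://abc12345.live.dynatrace.com", "dt0s08.EXAMPLE", DQL)
```

Separating request construction from execution keeps the query testable without network access; pass `url`, `headers`, and `body` to any HTTP client.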

Core Concepts

Dynatrace Observability Overview

| Aspect | Details |
| --- | --- |
| OneAgent | Automatic code-level injection for Java, .NET, Node.js, Go, and Python, providing traces, hotspots, and RUM |
| Davis AI | Causal AI correlating topology, metrics, events, and logs to identify root cause and impact scope |
| DQL | Pipe-based query language for logs, metrics, events, and entities, with fetch, filter, summarize, and timeseries |
| Smartscape | Real-time dependency map spanning hosts, processes, services, and applications with call relationships |
| Grail | Unified data lakehouse for all signals with schema-on-read and retention up to 10 years |
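
The DQL row above names the core pipeline verbs. A minimal sketch combining them, embedded as a Python string so it can be handed to the query API; the service name and `loglevel` filter are illustrative:

```python
# Illustrative DQL pipeline: fetch -> filter -> summarize.
DQL_ERROR_RATE = """
fetch logs
| filter dt.entity.service.name == "checkout-api" and loglevel == "ERROR"
| summarize error_count = count(), by: {bin(timestamp, 5m)}
"""

def pipeline_stages(dql: str):
    # Split a DQL pipeline on its pipe separators for quick inspection.
    return [stage.strip() for stage in dql.strip().split("|")]
```

Building the query stage by stage, as recommended in the Best Practices below, maps directly onto these pipe-separated segments.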

Dynatrace Investigation Architecture

+----------------+     +------------------+     +----------------+
| OneAgent       | --> | Grail Data       | --> | Davis AI       |
| Instrumentation|     | Lakehouse        |     | Correlation    |
+----------------+     +------------------+     +----------------+
        |                       |                       |
        v                       v                       v
+----------------+     +------------------+     +----------------+
| Smartscape     | --> | DQL Queries &    | --> | Dashboards &   |
| Topology       |     | Notebooks        |     | Alerts / SLOs  |
+----------------+     +------------------+     +----------------+

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dt_environment_url | string | - | Environment URL (e.g., https://abc12345.live.dynatrace.com) |
| dt_api_token | string | - | API token with Read metrics, entities, logs, and problems scopes |
| default_timeframe | string | last 2 hours | Default query timeframe for investigations |
| management_zone | string | - | Zone to scope analysis to a specific application or team |
| slo_target | float | 99.9 | Default SLO availability target percentage |
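
A minimal sketch of how these parameters might be held in code; the `AgentConfig` class is hypothetical (not an official SDK type), with field names and defaults taken from the table above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentConfig:
    # Required: environment URL and an API token with read scopes.
    dt_environment_url: str
    dt_api_token: str
    # Optional: defaults mirror the configuration table.
    default_timeframe: str = "last 2 hours"
    management_zone: Optional[str] = None
    slo_target: float = 99.9

cfg = AgentConfig(
    dt_environment_url="https://abc12345.live.dynatrace.com",
    dt_api_token="dt0c01.EXAMPLE",
)
```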

Best Practices

  1. Start with Davis AI problems - Query the problems API first instead of raw metrics. Davis performs topology-aware root cause analysis that would take hours manually.

  2. Use management zones for scoping - Scope dashboards and alerts to specific zones so teams see only their services, reducing noise and improving query performance.

  3. Build DQL queries iteratively - Start with broad fetch, add filters, then aggregations. Test each stage in a Notebook before embedding in dashboard tiles.

  4. Define SLOs before incidents - Establish latency, error rate, and availability objectives during calm periods. Error budget burn rate provides objective severity measurement.

  5. Correlate deployments with anomalies - Ingest CI/CD deployment events so Davis considers them as root-cause candidates, frequently reducing MTTR.
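
Practice 1 can be sketched as a request builder for the Problems API v2. The path and `problemSelector` syntax follow the public API, but verify parameter names against your environment version; the token and zone values are placeholders:

```python
from urllib.parse import urlencode

def problems_request(env_url: str, token: str, time_from: str = "now-2h",
                     management_zone: str = None):
    # Ask Davis for correlated problems before digging into raw metrics.
    selector = ['status("open")']
    if management_zone:
        # Scope to one team's zone (practice 2) to cut noise.
        selector.append(f'managementZones("{management_zone}")')
    params = {"from": time_from, "problemSelector": ",".join(selector)}
    url = f"{env_url.rstrip('/')}/api/v2/problems?{urlencode(params)}"
    headers = {"Authorization": f"Api-Token {token}"}
    return url, headers
```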

Common Issues

  1. DQL query timeouts - Queries on large datasets exceed execution limits. Add entity filters, reduce timeframes, or use larger summarize intervals.

  2. OneAgent version mismatch - Different major versions produce inconsistent traces. Use the deployment API to automate rolling upgrades.

  3. Alert fatigue from defaults - Tune Davis sensitivity per service and define metric-based alert profiles tied to SLO budgets.
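
For issue 1, the usual fix is to scope the query before running it. This hypothetical helper assembles a tightened timeseries query: an entity filter, a shorter timeframe, and a coarser interval; the `from:` parameter usage is an assumption, so check it against the DQL timeseries reference:

```python
def scoped_timeseries(metric: str, service: str, timeframe: str = "2h",
                      interval: str = "5m") -> str:
    # Narrow the entity, timeframe, and resolution to avoid execution limits.
    return (
        f"timeseries avg_val = avg({metric}), "
        f'filter: dt.entity.service.name == "{service}", '
        f"interval: {interval}, from: now()-{timeframe}"
    )

query = scoped_timeseries("dt.service.request.response_time", "checkout-api")
```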
