Claude Platform Development Skill
A comprehensive Claude Code skill for building applications on the Anthropic Claude platform, covering API integration, prompt engineering, tool use, streaming, and production deployment patterns.
When to Use This Skill
Choose this skill when:
- Building applications that integrate with the Claude API
- Designing multi-turn conversational systems with Claude
- Implementing tool use (function calling) with Claude models
- Setting up streaming responses for real-time user experiences
- Optimizing prompts and managing context windows effectively
- Deploying Claude-powered applications to production
Consider alternatives when:
- You need to use a different LLM provider (use a provider-specific skill)
- You need fine-tuning capabilities (Claude uses prompt engineering, not fine-tuning)
- You need local model inference (Claude is a cloud API service)
Quick Start
```bash
# Install the Anthropic SDK
npm install @anthropic-ai/sdk

# Set your API key
export ANTHROPIC_API_KEY="your-key-here"

# Add the skill to your project
claude mcp add claude-platform
```
```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

// Basic message
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello, Claude!' }]
});

console.log(response.content[0].text);
```
Core Concepts
Model Selection Guide
| Model | Best For | Max Tokens | Relative Speed |
|---|---|---|---|
| claude-opus-4-20250514 | Complex reasoning, code generation, analysis | 32K output | Slower |
| claude-sonnet-4-20250514 | Balanced performance, most use cases | 16K output | Medium |
| claude-haiku-3-5-20241022 | Fast responses, simple tasks, high volume | 8K output | Fastest |
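The trade-offs in the table above can be encoded as a small selection helper. The model IDs come from the table; the tier names are illustrative, not part of the Anthropic API:

```typescript
// Map a task tier to a model ID from the table above.
// The tier names here are made up for this sketch.
type TaskTier = 'complex' | 'balanced' | 'high-volume';

function pickModel(tier: TaskTier): string {
  switch (tier) {
    case 'complex':
      return 'claude-opus-4-20250514';    // deep reasoning, slower
    case 'balanced':
      return 'claude-sonnet-4-20250514';  // default for most use cases
    case 'high-volume':
      return 'claude-haiku-3-5-20241022'; // fast, cheap, simple tasks
  }
}

console.log(pickModel('balanced')); // → claude-sonnet-4-20250514
```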
Tool Use Pattern
```typescript
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  tools: [{
    name: 'get_weather',
    description: 'Get current weather for a location',
    input_schema: {
      type: 'object',
      properties: {
        location: { type: 'string', description: 'City name' },
        units: { type: 'string', enum: ['celsius', 'fahrenheit'] }
      },
      required: ['location']
    }
  }],
  messages: [{ role: 'user', content: 'What is the weather in London?' }]
});

// Handle tool use response
for (const block of response.content) {
  if (block.type === 'tool_use') {
    const result = await callWeatherAPI(block.input);
    // Send tool result back to Claude
  }
}
```
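The "send tool result back" step means appending the assistant's tool_use turn and a user turn containing a matching `tool_result` block, then calling the API again. A minimal sketch of building that follow-up message list (the object shapes are simplified relative to the SDK's full types):

```typescript
// Build the messages array for the follow-up request after a tool call:
// echo the assistant's content back, then supply the tool_result block
// whose tool_use_id matches the assistant's tool_use block id.
function buildToolResultMessages(
  priorMessages: Array<{ role: string; content: unknown }>,
  assistantContent: Array<{ type: string; id?: string }>,
  toolUseId: string,
  result: string
) {
  return [
    ...priorMessages,
    { role: 'assistant', content: assistantContent },
    {
      role: 'user',
      content: [
        { type: 'tool_result', tool_use_id: toolUseId, content: result }
      ]
    }
  ];
}
```

The returned array is what you pass as `messages` in the next `anthropic.messages.create` call, along with the same `tools` definition.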
Streaming Responses
```typescript
const stream = await anthropic.messages.stream({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }]
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
```
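The delta-handling logic in the loop above can be factored into a pure function, which makes it easy to test without a live stream. The event shape below mirrors the fields the loop reads and is a simplification of the SDK's event types:

```typescript
// Simplified view of the stream events the text-accumulation logic cares about.
interface StreamEvent {
  type: string;
  delta?: { type: string; text?: string };
}

// Fold content_block_delta text events into a single string,
// ignoring all other event types (message_start, message_stop, etc.).
function collectText(events: StreamEvent[]): string {
  let out = '';
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta') {
      out += event.delta.text ?? '';
    }
  }
  return out;
}
```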
Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | `"claude-sonnet-4-20250514"` | Claude model to use for generation |
| `max_tokens` | number | `1024` | Maximum tokens in the response |
| `temperature` | number | `1.0` | Randomness of output (0.0–1.0) |
| `system` | string | `""` | System prompt for behavior configuration |
| `stop_sequences` | array | `[]` | Sequences that stop generation |
| `top_p` | number | `1.0` | Nucleus sampling threshold |
| `top_k` | number | — | Top-k sampling limit |
| `stream` | boolean | `false` | Enable streaming responses |
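A request combining several of these parameters might look like the following; the specific values are illustrative, not recommendations:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

// Short, near-deterministic answer with a custom stop delimiter.
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 256,            // cap response length
  temperature: 0.2,           // low randomness for factual output
  system: 'You are a concise technical assistant. Answer in one paragraph.',
  stop_sequences: ['\n---'],  // stop generation at a custom delimiter
  messages: [{ role: 'user', content: 'Summarize HTTP caching.' }]
});
```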
Best Practices
- **Use system prompts for consistent behavior** — define the assistant's role, constraints, and output format in the system message rather than repeating instructions in every user message.
- **Implement exponential backoff for rate limits** — the Claude API returns 429 status codes when rate limited; implement retry logic with exponential backoff and jitter to handle traffic spikes gracefully.
- **Structure tool definitions with detailed descriptions** — Claude performs better with tool use when each tool and parameter has a clear, specific description explaining what it does and when to use it.
- **Manage context windows proactively** — track token usage in responses and implement conversation summarization or sliding-window strategies before hitting context limits.
- **Cache prompt prefixes for cost savings** — when sending the same system prompt or document prefix repeatedly, use prompt caching to reduce costs and latency on subsequent requests.
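The backoff-with-jitter practice can be sketched as a generic retry wrapper. The `status === 429` check matches the error shape thrown by the Anthropic SDK on rate limits, but the helper itself is library-agnostic:

```typescript
// Exponential backoff with "full jitter": delay is uniform in
// [0, baseMs * 2^attempt], capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxDelayMs = 30_000): number {
  const ceiling = Math.min(maxDelayMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

// Retry a request on 429-style errors, up to maxRetries additional attempts.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  baseMs = 500
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      // Rethrow anything that isn't a rate limit, or if we're out of retries.
      if (attempt >= maxRetries || err?.status !== 429) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt, baseMs)));
    }
  }
}
```

Usage is a one-line wrap: `await withRetries(() => anthropic.messages.create({...}))`.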
Common Issues
**Responses cut off mid-sentence** — this happens when `max_tokens` is too low for the expected response length. Increase `max_tokens` or check the `stop_reason` field; if it says `max_tokens`, the response was truncated and you need a higher limit.
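The `stop_reason` check can be wrapped in a tiny guard; the `MessageLike` shape below is a simplification of the SDK's response type:

```typescript
// Minimal slice of the API response this check needs.
interface MessageLike {
  stop_reason: string | null;
}

// True when generation stopped because max_tokens was reached,
// i.e. the text was cut off rather than finished naturally.
function wasTruncated(response: MessageLike): boolean {
  return response.stop_reason === 'max_tokens';
}
```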
**Tool use loop without final answer** — Claude may repeatedly call tools without producing a text response. Set a maximum tool-use iteration count (typically 5–10) and include instructions in the system prompt telling Claude to synthesize a final answer after gathering enough information.
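The iteration cap can be enforced with a bounded loop around whatever request function your app already has. In this sketch, `runOneTurn` is a placeholder you would supply; it sends the current messages to the API and returns the response:

```typescript
// Capped tool-use loop: stop as soon as Claude produces a non-tool_use
// response, and bail out if the cap is hit without a final answer.
async function runWithToolCap(
  runOneTurn: (messages: unknown[]) => Promise<{ stop_reason: string | null }>,
  messages: unknown[],
  maxToolIterations = 8
) {
  for (let i = 0; i < maxToolIterations; i++) {
    const response = await runOneTurn(messages);
    if (response.stop_reason !== 'tool_use') return response; // final answer
    // ...execute the requested tools and append tool_result turns to `messages`...
  }
  throw new Error(`No final answer after ${maxToolIterations} tool iterations`);
}
```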
**High latency on complex requests** — for long prompts or complex reasoning, response times can increase significantly. Use streaming to display partial results immediately, and consider breaking complex tasks into smaller sequential requests.