Pro Service
Battle-tested skill for checking service status and renaming services. Includes structured workflows, validation checks, and reusable patterns for Railway.
A Railway skill for managing service lifecycle — checking health, updating properties, scaling instances, and performing advanced service operations. Pro Service handles day-to-day service administration beyond basic deployment.
When to Use This Skill
Choose Pro Service when:
- Checking service health, status, and deployment history
- Updating service properties (name, build settings, restart policies)
- Scaling services with replica count adjustments
- Managing service networking and port configuration
Consider alternatives when:
- Creating a brand new service (use New System)
- Managing database plugins specifically (use database tools)
- Handling domain/SSL configuration (use domain management tools)
Quick Start
claude "Check the health of my Railway services and scale the API to 3 replicas"
    # View service status and details
    railway status

    # View service logs
    railway logs --tail

    # Check deployments
    railway status

    # Scale service (via dashboard or railway.json)
    # Update railway.json: { "deploy": { "numReplicas": 3 } }
    # Then redeploy
    railway up
Core Concepts
Service Properties
| Property | Description | Modifiable |
|---|---|---|
| Name | Service display name | Yes |
| Source | GitHub repo or Docker image | Yes |
| Root Directory | Build context path | Yes |
| Build Command | Custom build instruction | Yes |
| Start Command | Runtime entrypoint | Yes |
| Replicas | Instance count | Yes |
| Restart Policy | Failure handling behavior | Yes |
Service Health States
    ## Health Check Flow

    Healthy:
      Service running → Health endpoint returns 200 → Traffic routed

    Unhealthy:
      Service running → Health endpoint fails → Traffic stopped
      → Restart policy applied → Retry health check

    Crashed:
      Service exited → Restart policy evaluated
      → ON_FAILURE: Restart (up to maxRetries)
      → NEVER: Stay stopped
      → ALWAYS: Restart indefinitely
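The "Crashed" branch of the flow above can be read as a small decision function. The following is a minimal sketch of that reading, not Railway's actual implementation; the function name `nextAction` and the return values `'restart'` and `'stay_stopped'` are hypothetical and exist only for illustration.

```javascript
// Hypothetical model of the restart decision after a service has crashed.
// Illustrative only: this is not how Railway implements restart policies.
function nextAction(policy, restartCount, maxRetries) {
  switch (policy) {
    case 'NEVER':
      // The service stays down until someone redeploys or restarts it.
      return 'stay_stopped';
    case 'ALWAYS':
      // No retry limit: the service is restarted every time it exits.
      return 'restart';
    case 'ON_FAILURE':
      // Restart until the retry budget is used up, then give up.
      return restartCount < maxRetries ? 'restart' : 'stay_stopped';
    default:
      throw new Error(`Unknown restart policy: ${policy}`);
  }
}
```

With `ON_FAILURE` and `maxRetries` set to 5, the first five crashes lead to a restart and the sixth leaves the service stopped, which is what makes a persistent crash visible instead of hiding it.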
Service Scaling
    // railway.json: scaling configuration
    {
      "deploy": {
        "numReplicas": 3,
        "startCommand": "node server.js",
        "healthcheckPath": "/health",
        "restartPolicyType": "ON_FAILURE",
        "restartPolicyMaxRetries": 5
      }
    }
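    // Ensure your app is stateless for multi-replica scaling
    import session from 'express-session';
    import { RedisStore } from 'connect-redis'; // named export in connect-redis v8+; v7 uses a default export
    import Redis from 'ioredis';

    // BAD: In-memory session storage
    const sessions = new Map(); // Lost when the instance restarts, and not shared between replicas

    // GOOD: External session storage (`app` is your existing Express app)
    const redis = new Redis(process.env.REDIS_URL);
    // SESSION_SECRET is an assumed variable name; express-session requires a secret
    app.use(session({ store: new RedisStore({ client: redis }), secret: process.env.SESSION_SECRET }));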
Configuration
| Parameter | Description | Default |
|---|---|---|
| `numReplicas` | Number of service instances | 1 |
| `restartPolicyType` | ON_FAILURE, ALWAYS, NEVER | ON_FAILURE |
| `restartPolicyMaxRetries` | Max restart attempts on failure | 3 |
| `healthcheckPath` | HTTP path for health probes | None |
| `healthcheckTimeout` | Health check timeout in seconds | 30 |
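A quick pre-deploy check of the `deploy` section can catch typos before a redeploy. The sketch below is an assumption-laden example: `validateDeployConfig` is a hypothetical helper, the defaults are copied from the table above, and the validation rules (positive integers, a path starting with `/`) are plausible guesses, not Railway's published schema.

```javascript
// Allowed values taken from the configuration table above.
const RESTART_POLICIES = ['ON_FAILURE', 'ALWAYS', 'NEVER'];

// Hypothetical validator for the "deploy" section of railway.json.
// Returns a list of human-readable problems; an empty list means no issues found.
function validateDeployConfig(deploy = {}) {
  // Defaults mirror the table above; they are not read from Railway itself.
  const {
    numReplicas = 1,
    restartPolicyType = 'ON_FAILURE',
    restartPolicyMaxRetries = 3,
    healthcheckPath,
    healthcheckTimeout = 30,
  } = deploy;
  const errors = [];

  if (!Number.isInteger(numReplicas) || numReplicas < 1) {
    errors.push('numReplicas must be a positive integer');
  }
  if (!RESTART_POLICIES.includes(restartPolicyType)) {
    errors.push(`restartPolicyType must be one of ${RESTART_POLICIES.join(', ')}`);
  }
  if (!Number.isInteger(restartPolicyMaxRetries) || restartPolicyMaxRetries < 0) {
    errors.push('restartPolicyMaxRetries must be a non-negative integer');
  }
  // healthcheckPath is optional (default: None), but if set it should be a path.
  if (healthcheckPath !== undefined &&
      !(typeof healthcheckPath === 'string' && healthcheckPath.startsWith('/'))) {
    errors.push('healthcheckPath must be a string starting with "/"');
  }
  if (typeof healthcheckTimeout !== 'number' || !(healthcheckTimeout > 0)) {
    errors.push('healthcheckTimeout must be a positive number of seconds');
  }
  return errors;
}
```

Calling `validateDeployConfig(JSON.parse(fileContents).deploy)` before `railway up` would surface these problems locally; how the file is read is left to the caller.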
Best Practices
- Make services stateless before scaling. Multi-replica services must not rely on in-memory state (sessions, caches, uploaded files). Move state to external stores (Redis, PostgreSQL, S3) so any replica can handle any request.
- Set appropriate restart policies. Use ON_FAILURE with a retry limit for production services: this handles transient failures without masking persistent crashes. ALWAYS can hide bugs by endlessly restarting a broken service.
- Monitor all replicas, not just one. When running multiple replicas, check logs across all instances. A single unhealthy replica can cause intermittent errors that are hard to reproduce if you only watch one instance's logs.
- Use graceful shutdown handlers. When Railway stops a service (during redeploy or scale-down), it sends SIGTERM. Handle this signal to finish in-progress requests, close database connections, and flush logs before exiting.
- Test scaling on staging first. Scale your staging environment to match production replica count and run load tests. This catches stateful code, race conditions, and connection pool exhaustion before they affect real users.
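The graceful shutdown practice above might look like the following in a Node.js service. This is a sketch under assumptions: `installGracefulShutdown`, `cleanup`, `timeoutMs`, and the injectable `exit` option are hypothetical names for this example, and `server` is assumed to be any object with a Node-style `close(callback)` method, such as the value returned by `app.listen()`.

```javascript
// Sketch of a SIGTERM handler that drains requests before exiting.
// `cleanup` is an async function supplied by the caller (for example, one that
// closes database connections and flushes logs).
function installGracefulShutdown(server, cleanup, { timeoutMs = 10000, exit = process.exit } = {}) {
  let shuttingDown = false;

  function shutdown() {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;

    // Safety net: force a non-zero exit if draining never finishes.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // do not let the timer itself keep the process alive

    // Stop accepting new connections; the callback runs once the server has closed.
    server.close(async (err) => {
      let code = err ? 1 : 0;
      try {
        await cleanup();
      } catch (cleanupError) {
        code = 1; // report failed cleanup through the exit code
      }
      clearTimeout(timer);
      exit(code);
    });
  }

  process.on('SIGTERM', shutdown);
  return shutdown; // returned so it can also be triggered manually or in tests
}
```

The timeout matters because a drain that hangs would otherwise block the redeploy; the value of 10 seconds is an arbitrary choice for the example and should be tuned to the service's longest expected request.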
Common Issues
Requests fail intermittently after scaling to multiple replicas. Your application likely has in-memory state that differs between replicas. Check for in-memory caches, session stores, or file uploads stored on disk. Move all shared state to Redis or a database.
Service restarts in a loop after deploying. Check railway logs for the crash reason. Common causes: missing environment variable (throws on startup), port conflict (another process using the port), and unhandled promise rejection in initialization code. Fix the root cause rather than increasing maxRetries.
Health check passes but service returns 502. The health check endpoint might be too basic — returning 200 without checking dependencies. Enhance your health check to verify database connectivity, Redis connection, and any critical external services. A health check that always returns 200 gives false confidence.
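A dependency-aware health check, as described in the last issue above, could be structured like this. It is a sketch, not a prescribed implementation: `runHealthChecks`, the shape of the `checks` object, and the 2-second default timeout are assumptions made for illustration.

```javascript
// Runs every dependency check in parallel and reports an overall HTTP status.
// `checks` maps a dependency name to a function that resolves when the
// dependency is reachable and throws or rejects when it is not.
async function runHealthChecks(checks, timeoutMs = 2000) {
  const entries = await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      let timer;
      // Reject if the dependency does not answer in time.
      const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('timed out')), timeoutMs);
      });
      try {
        await Promise.race([check(), timeout]);
        return [name, 'ok'];
      } catch (err) {
        return [name, `failed: ${err.message}`];
      } finally {
        clearTimeout(timer);
      }
    })
  );

  const healthy = entries.every(([, status]) => status === 'ok');
  // 503 tells the platform this replica should not receive traffic.
  return { status: healthy ? 200 : 503, results: Object.fromEntries(entries) };
}

// Possible wiring, assuming an Express `app` and an ioredis client `redis`:
// app.get('/health', async (req, res) => {
//   const { status, results } = await runHealthChecks({ redis: () => redis.ping() });
//   res.status(status).json(results);
// });
```

Keeping each check behind a timeout prevents one slow dependency from stalling the probe itself.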