
Pro Service

Battle-tested skill for check, service, status, and rename operations. Includes structured workflows, validation checks, and reusable patterns for Railway.



A Railway skill for managing service lifecycle — checking health, updating properties, scaling instances, and performing advanced service operations. Pro Service handles day-to-day service administration beyond basic deployment.

When to Use This Skill

Choose Pro Service when:

  • Checking service health, status, and deployment history
  • Updating service properties (name, build settings, restart policies)
  • Scaling services with replica count adjustments
  • Managing service networking and port configuration

Consider alternatives when:

  • Creating a brand new service (use New System)
  • Managing database plugins specifically (use database tools)
  • Handling domain/SSL configuration (use domain management tools)

Quick Start

```shell
claude "Check the health of my Railway services and scale the API to 3 replicas"
```

```shell
# View service status, details, and deployments
railway status

# View service logs
railway logs --tail

# Scale service (via dashboard or railway.json)
# Update railway.json: { "deploy": { "numReplicas": 3 } }

# Then redeploy
railway up
```

Core Concepts

Service Properties

| Property | Description | Modifiable |
| --- | --- | --- |
| Name | Service display name | Yes |
| Source | GitHub repo or Docker image | Yes |
| Root Directory | Build context path | Yes |
| Build Command | Custom build instruction | Yes |
| Start Command | Runtime entrypoint | Yes |
| Replicas | Instance count | Yes |
| Restart Policy | Failure handling behavior | Yes |

Service Health States

Health Check Flow:

```
Healthy:   Service running → Health endpoint returns 200 → Traffic routed
Unhealthy: Service running → Health endpoint fails → Traffic stopped
           → Restart policy applied → Retry health check
Crashed:   Service exited → Restart policy evaluated
           → ON_FAILURE: Restart (up to maxRetries)
           → NEVER:      Stay stopped
           → ALWAYS:     Restart indefinitely
```

Service Scaling

railway.json (scaling configuration):

```json
{
  "deploy": {
    "numReplicas": 3,
    "startCommand": "node server.js",
    "healthcheckPath": "/health",
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 5
  }
}
```

Ensure your app is stateless before multi-replica scaling:

```javascript
// BAD: In-memory session storage
const sessions = new Map(); // Lost when instance restarts

// GOOD: External session storage
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);

app.use(session({ store: new RedisStore({ client: redis }) }));
```

Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| numReplicas | Number of service instances | 1 |
| restartPolicyType | ON_FAILURE, ALWAYS, or NEVER | ON_FAILURE |
| restartPolicyMaxRetries | Max restart attempts on failure | 3 |
| healthcheckPath | HTTP path for health probes | None |
| healthcheckTimeout | Health check timeout in seconds | 30 |

Best Practices

  1. Make services stateless before scaling. Multi-replica services must not rely on in-memory state (sessions, caches, uploaded files). Move state to external stores (Redis, PostgreSQL, S3) so any replica can handle any request.

  2. Set appropriate restart policies. Use ON_FAILURE with a retry limit for production services — this handles transient failures without masking persistent crashes. ALWAYS can hide bugs by endlessly restarting a broken service.

  3. Monitor all replicas, not just one. When running multiple replicas, check logs across all instances. A single unhealthy replica can cause intermittent errors that are hard to reproduce if you only watch one instance's logs.

  4. Use graceful shutdown handlers. When Railway stops a service (during redeploy or scale-down), it sends SIGTERM. Handle this signal to finish in-progress requests, close database connections, and flush logs before exiting.

  5. Test scaling on staging first. Scale your staging environment to match production replica count and run load tests. This catches stateful code, race conditions, and connection pool exhaustion before they affect real users.
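The graceful shutdown advice in practice 4 can be sketched in Node. Here `cleanup` is a hypothetical callback for your own teardown (close database pools, flush logs), and the injectable `exit` parameter exists only to make the handler testable; neither name comes from the skill itself:

```javascript
import http from 'node:http';

// Sketch of a graceful SIGTERM handler. `cleanup` is a hypothetical
// app-specific teardown callback; `exit` is injectable for testing.
function setupGracefulShutdown(server, cleanup, exit = process.exit) {
  process.on('SIGTERM', () => {
    console.log('SIGTERM received, draining connections...');
    // Stop accepting new connections; in-flight requests finish first
    server.close(async () => {
      await cleanup(); // e.g. close DB connections, flush logs
      exit(0);
    });
  });
}
```

Railway sends SIGTERM on redeploy and scale-down; without a handler like this, in-flight requests are dropped when the process is killed.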

Common Issues

Requests fail intermittently after scaling to multiple replicas. Your application likely has in-memory state that differs between replicas. Check for in-memory caches, session stores, or file uploads stored on disk. Move all shared state to Redis or a database.

Service restarts in a loop after deploying. Check railway logs for the crash reason. Common causes: missing environment variable (throws on startup), port conflict (another process using the port), and unhandled promise rejection in initialization code. Fix the root cause rather than increasing maxRetries.
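One way to make the missing-environment-variable case fail fast with a readable crash log is a startup check along these lines (the variable names are illustrative, not requirements of the skill):

```javascript
// Sketch: validate configuration at startup so the crash log names the
// missing variable instead of showing an opaque stack trace from init code.
function checkRequiredEnv(names, env = process.env) {
  return names.filter((name) => !env[name]);
}

// At the top of your entrypoint:
// const missing = checkRequiredEnv(['DATABASE_URL', 'REDIS_URL']);
// if (missing.length > 0) {
//   console.error(`Missing required environment variables: ${missing.join(', ')}`);
//   process.exit(1);
// }
```

With ON_FAILURE and a retry limit, a clear exit like this stops the restart loop quickly and the reason is the first thing in `railway logs`.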

Health check passes but service returns 502. The health check endpoint might be too basic — returning 200 without checking dependencies. Enhance your health check to verify database connectivity, Redis connection, and any critical external services. A health check that always returns 200 gives false confidence.
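A deep health check along these lines can be sketched as follows; `checks` is a hypothetical map of dependency probes (for example a database ping and a Redis ping) that you supply:

```javascript
// Sketch of a deep health check: report 200 only when every critical
// dependency responds, 503 otherwise. Probe names and bodies are examples.
async function runHealthCheck(checks) {
  const results = {};
  let healthy = true;
  for (const [name, probe] of Object.entries(checks)) {
    try {
      await probe(); // e.g. () => db.query('SELECT 1') or () => redis.ping()
      results[name] = 'ok';
    } catch (err) {
      results[name] = `failed: ${err.message}`;
      healthy = false;
    }
  }
  return { status: healthy ? 200 : 503, results };
}
```

Wired into the route named by `healthcheckPath`, e.g. an Express handler that calls `runHealthCheck` and responds with `res.status(result.status)`, this lets Railway stop routing traffic to a replica whose dependencies are down.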
