API Gateway + OpenFeign + Resilience4j

  1. What is an API Gateway? Purpose in microservices ?
  2. Spring Cloud Gateway vs Zuul (Zuul deprecated) ?
  3. Gateway predicates & filters — route matching, custom filters
  4. Load balancing in Gateway — with Eureka
  5. Rate limiting & circuit breaker in Gateway
  6. What is OpenFeign? Why use over RestTemplate?
  7. @FeignClient annotation
  8. Feign error handling
  9. What is Resilience4j? Main modules?
  10. Circuit Breaker pattern
  11. @CircuitBreaker annotation
  12. Retry annotation
  13. Rate limiter
  14. How Gateway + Feign + Resilience4j work together?
  15. Best practices for API Gateway + OpenFeign + Resilience4j ?

What is an API Gateway? Purpose in microservices ? 

=> An API Gateway is a single entry point (reverse proxy) that sits in front of your microservices and handles all incoming client requests.

=> It routes traffic to the appropriate backend services, applies cross-cutting concerns (security, rate limiting, logging, transformation), and returns responses to the client.

Main Responsibilities / Features of an API Gateway

Responsibility | Description | Why Important in Microservices
Routing | Directs incoming requests to the correct microservice based on path, headers, etc. | Hides internal service structure from clients
Load Balancing | Distributes traffic across multiple instances of the same service | Handles scaling & high availability
Authentication & Authorization | Validates JWT, OAuth tokens, API keys at the edge | Centralized security, no need to repeat in every service
Rate Limiting & Throttling | Prevents abuse (e.g., 1000 requests/min per user) | Protects backend from overload
Request/Response Transformation | Adds/removes headers, protocol translation (REST → gRPC), aggregation | Simplifies client integration
Caching | Caches responses (e.g., static data, frequent GETs) | Reduces backend load & latency
Logging & Monitoring | Centralized request logging, metrics, tracing | Easier debugging & observability
Circuit Breaking & Fault Tolerance | Stops routing to failing services (integrates with Resilience4j) | Prevents cascading failures
Aggregation / BFF | Combines data from multiple services into one response (Backend for Frontend) | Reduces client round-trips
 

Spring Cloud Gateway vs Zuul (Zuul deprecated) ?

=> Both Spring Cloud Gateway and Zuul are API gateways in the Spring Cloud ecosystem, but Spring Cloud Netflix Zuul was placed in maintenance mode in late 2018 (around the Greenwich release) and removed in Spring Cloud 2020.0. Spring Cloud Gateway is the current, recommended replacement.

=> Spring Cloud Gateway is built on Spring WebFlux and Netty (reactive, non-blocking), so a small number of threads can serve many concurrent connections.

=> Zuul 1 is built on the Servlet API (blocking, one thread per request), so throughput drops once the worker thread pool is saturated under high load.

Aspect | Spring Cloud Gateway (Current & Recommended) | Zuul (Deprecated) | Winner & Why
Status | Actively maintained | In maintenance mode since 2018 (Greenwich), removed in Spring Cloud 2020.0 | Gateway – Zuul is no longer maintained
Architecture | Built on Spring WebFlux (reactive, non-blocking) | Built on the Servlet API (blocking) | Gateway – much better performance & scalability
Performance | High throughput with few threads and low memory | Lower (blocking I/O, higher latency under load) | Gateway
Startup Time | Fast (WebFlux + Netty) | Slower (Servlet container) | Gateway
Routing | Predicate-based (Path, Header, Method, Query, etc.) | Filter-based (pre/post/route filters) | Both good, Gateway more flexible
Filters | Global and route-specific filters with explicit ordering | Pre, route, post filters (Zuul 1 style) | Gateway – more powerful & modern
Load Balancing | Built-in with Spring Cloud LoadBalancer | Ribbon (built-in, but deprecated) | Gateway (Ribbon has been removed)
WebSocket Support | Native (reactive streams) | Limited (Zuul 1 struggles) | Gateway
Resilience & Fault Tolerance | Integrates easily with Resilience4j | Hystrix (deprecated) | Gateway
Configuration | YAML routes, programmatic, dynamic via Discovery | YAML + Zuul filters | Gateway – more options
Community & Future | Active development, future-proof | No updates, migration recommended | Gateway
Migration Effort | Moderate (rewrite routes/filters) | N/A (already deprecated) | Gateway
Use Case | New projects, high-traffic, reactive apps | Legacy projects only | Gateway

Why Zuul is Deprecated & Gateway is the Future

=> Zuul 1 (original) was blocking (Servlet-based) → poor performance under high load
=> Zuul 2 (async/Netty-based) was never integrated into Spring Cloud.
=> Spring team shifted to Spring WebFlux (reactive).
=> All new Spring Cloud features (Resilience4j, LoadBalancer, WebClient) integrate better with Gateway.
=> When Spring announced maintenance mode for the Netflix modules, it named Spring Cloud Gateway as the replacement for Zuul 1.
=> Use Gateway for new projects, Zuul only for legacy.

What does 'reactive' mean in Spring Cloud Gateway?

=> In simple terms, it means the application can handle many requests at the same time without creating a thread for each request, making it much more efficient under high load

Why "Reactive" Matters (Plain Explanation)

In traditional Spring MVC (used in Zuul and older apps):

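=> Each incoming request occupies one thread from a bounded pool (Tomcat's default is 200 worker threads) for its whole duration.
=> If 1000 users hit the app at once → all pool threads are busy and the remaining requests wait in a queue.
=> Threads wait (block) while doing slow work (DB call, external API, I/O).
=> Once the pool is exhausted → latency climbs and requests time out or are rejected (thread starvation).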

In reactive (Spring WebFlux + Gateway):

=> Uses few threads (e.g., 4–8) with event loop.
=> Requests are treated as streams of events — no blocking.
=> When waiting for DB/API → thread is freed to handle other requests. 
=> One thread can handle thousands of concurrent connections.

Real-World Example in Gateway

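=> 10,000 users hit the gateway at once → Zuul 1 needs one blocked thread per in-flight request, so its bounded thread pool is exhausted and requests queue up, slow down, or get rejected.
=> Gateway (reactive) uses a handful of event-loop threads (roughly one per CPU core, e.g. ~8) → it can keep all 10,000 requests in flight because no thread blocks while waiting for backend services.
=> Reactive = non-blocking + high scalability.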
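
The idea can be illustrated without Spring at all. The following is a minimal sketch in plain Java, not how Gateway or Netty is implemented: NonBlockingSketch and its handle method are hypothetical names, and the simulated 50 ms "backend call" is a timer rather than real network I/O. It shows that a fixed pool of a few threads can complete many concurrent requests when waiting is expressed as a scheduled callback instead of a sleeping thread.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class NonBlockingSketch {

    /** How many requests completed and how many distinct threads did the work. */
    record Result(int completed, int threadsUsed) {}

    static Result handle(int requests, int eventLoopThreads) {
        ScheduledExecutorService loop = Executors.newScheduledThreadPool(eventLoopThreads);
        Set<String> threads = ConcurrentHashMap.newKeySet();
        try {
            CompletableFuture<?>[] pending = new CompletableFuture<?>[requests];
            for (int i = 0; i < requests; i++) {
                CompletableFuture<String> response = new CompletableFuture<>();
                // Simulated backend latency: a timer fires later, no thread sleeps meanwhile
                loop.schedule(() -> {
                    threads.add(Thread.currentThread().getName());
                    response.complete("ok");
                }, 50, TimeUnit.MILLISECONDS);
                pending[i] = response;
            }
            // Wait until every simulated request has received its response
            CompletableFuture.allOf(pending).join();
            int completed = (int) Arrays.stream(pending).filter(CompletableFuture::isDone).count();
            return new Result(completed, threads.size());
        } finally {
            loop.shutdown();
        }
    }
}
```

Calling NonBlockingSketch.handle(1000, 4) completes all 1000 simulated requests while never using more than 4 worker threads. A thread-per-request design would need one thread for each request that is waiting at the same time.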

Gateway predicates & filters — route matching, custom filters 

=> Spring Cloud Gateway predicates & filters are the core building blocks for defining how incoming requests are routed and processed in a microservices API Gateway.

=> They make the gateway extremely flexible and powerful.

1. Predicates – Route Matching 

=> Predicates are the conditions that decide whether a request matches a particular route.

=> If all predicates for a route evaluate to true, the request is routed to the corresponding backend service.

How Predicates Work

=> Each route has one or more predicates.
=> All predicates of a route are combined with AND. Routes are evaluated in order (lower order value first), and the first route whose predicates all match wins.
=> Predicates are defined in application.yml or programmatically

Common Built-in Predicates

Predicate Name | Example in YAML | What It Does / When It Matches
Path | Path=/api/employees/** | Matches if request path is /api/employees or anything below it
Method | Method=GET,POST | Matches if HTTP method is GET or POST
Header | Header=X-Request-Id, \d+ | Matches if header X-Request-Id exists and its value is numeric
Query | Query=page, \d+ | Matches if query param "page" exists and is numeric
Host | Host=**.example.com | Matches if host is a subdomain of example.com
Cookie | Cookie=sessionId, .+ | Matches if cookie sessionId exists with a non-empty value
RemoteAddr | RemoteAddr=192.168.1.0/24 | Matches client IP range
Weight | Weight=group1, 80 | Weighted routing (this route gets weight 80 relative to the other routes in group1)

Example in YAML (Route Definition)

spring:
  cloud:
    gateway:
      routes:
      - id: employee_route
        uri: lb://employee-service  # lb = load balanced via Eureka
        predicates:
          - Path=/api/employees/**
          - Method=GET,POST
          - Header=X-Auth-Token, Bearer .+
        filters:
          - StripPrefix=1
 

=> Matches only GET/POST to paths starting with /api/employees and with Bearer token header
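
To make the matching rule concrete, here is a small plain-Java sketch of "all predicates of a route must match, first matching route wins". It is an illustration only and does not use Gateway's own classes: RouteMatchingSketch, Request, Route, the helper methods, and the second route api_fallback_route are hypothetical names introduced for this example. Header names are compared case-sensitively here for brevity, unlike real HTTP.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class RouteMatchingSketch {

    record Request(String method, String path, Map<String, String> headers) {}

    record Route(String id, List<Predicate<Request>> predicates) {
        // A route matches only if every one of its predicates is true (logical AND)
        boolean matches(Request r) {
            return predicates.stream().allMatch(p -> p.test(r));
        }
    }

    // Rough equivalent of Path=<prefix>/**
    static Predicate<Request> pathPrefix(String prefix) {
        return r -> r.path().equals(prefix) || r.path().startsWith(prefix + "/");
    }

    // Rough equivalent of Method=GET,POST
    static Predicate<Request> methodIn(String... allowed) {
        return r -> List.of(allowed).contains(r.method());
    }

    // Rough equivalent of Header=<name>, <regex>
    static Predicate<Request> header(String name, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return r -> r.headers().containsKey(name) && pattern.matcher(r.headers().get(name)).matches();
    }

    // Routes in evaluation order: the first one that matches is used
    static final List<Route> ROUTES = List.of(
        new Route("employee_route", List.of(
            pathPrefix("/api/employees"),
            methodIn("GET", "POST"),
            header("X-Auth-Token", "Bearer .+"))),
        new Route("api_fallback_route", List.of(pathPrefix("/api"))));

    static Optional<String> route(Request r) {
        return ROUTES.stream().filter(rt -> rt.matches(r)).map(Route::id).findFirst();
    }
}
```

A GET to /api/employees/42 with a Bearer token resolves to employee_route. The same path with DELETE, or without the header, fails one predicate and falls through to api_fallback_route. A path outside /api matches no route.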

2. Filters – Processing Requests & Responses 

=> Filters are executed after predicates match and before/after forwarding the request to the backend service.

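Filters fall into these categories:

  1.     Pre-filters: logic that runs before the request is forwarded (modify request, add headers, rate limit, auth)
  2.     Post-filters: logic that runs after the backend responds (modify response, logging, metrics)
  3.     Global filters: GlobalFilter beans that apply to all routes (e.g., logging)

=> "Pre" and "post" are phases rather than separate classes: a single filter can do both, with code before chain.filter(exchange) running in the pre phase and code chained after it (e.g., in .then(...)) running in the post phase.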

Common Built-in Filters

Filter Name | Type | Example in YAML | What It Does
AddRequestHeader | Pre | AddRequestHeader=X-Request-Source, gateway | Adds a header to the request
StripPrefix | Pre | StripPrefix=1 | Removes the first path segment (e.g., /api/employees → /employees)
RewritePath | Pre | RewritePath=/api/(?<segment>.*), /$\{segment} | Rewrites the path with a regex
RequestRateLimiter | Pre | name: RequestRateLimiter (with redis-rate-limiter.* args) | Rate limits requests (the built-in limiter requires Redis)
CircuitBreaker | Pre | CircuitBreaker=myCircuitBreaker | Wraps the route in a Resilience4j circuit breaker
AddResponseHeader | Post | AddResponseHeader=X-Response-Source, gateway | Adds a header to the response
DedupeResponseHeader | Post | DedupeResponseHeader=Access-Control-Allow-Origin, RETAIN_FIRST | Removes duplicate header values
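
The two path filters are easy to misread, so here is a plain-Java sketch of what they do to a path. PathFilterSketch and its methods are hypothetical helpers written for this note, not Gateway classes. The rewrite uses String.replaceAll with a named group, which is the same regex mechanism the RewritePath example relies on (the `$\{segment}` form in YAML is only an escape for `${segment}`).

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class PathFilterSketch {

    /** Drops the first `parts` path segments, like StripPrefix=parts. */
    static String stripPrefix(String path, int parts) {
        String remaining = Arrays.stream(path.split("/"))
            .filter(segment -> !segment.isEmpty())   // ignore the empty piece before the leading slash
            .skip(parts)
            .collect(Collectors.joining("/"));
        return "/" + remaining;
    }

    /** Regex-based rewrite, like RewritePath=regexp, replacement. */
    static String rewritePath(String path, String regexp, String replacement) {
        return path.replaceAll(regexp, replacement);
    }
}
```

With these helpers, stripPrefix("/api/employees/42", 1) gives "/employees/42", and rewritePath("/api/employees/42", "/api/(?<segment>.*)", "/${segment}") gives the same result through the named group.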
 

3. Custom Filters (Most Powerful Part)

=> You can create your own filters for custom logic (e.g., custom auth, logging, header manipulation). 

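Example: Custom Filter (logs before and after the backend call)

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.gateway.filter.GatewayFilter;
import org.springframework.cloud.gateway.filter.factory.AbstractGatewayFilterFactory;
import org.springframework.stereotype.Component;
import reactor.core.publisher.Mono;

// The class name ends in GatewayFilterFactory, so YAML refers to it as "CustomLogging"
@Component
public class CustomLoggingGatewayFilterFactory
        extends AbstractGatewayFilterFactory<CustomLoggingGatewayFilterFactory.Config> {

    private static final Logger log = LoggerFactory.getLogger(CustomLoggingGatewayFilterFactory.class);

    public CustomLoggingGatewayFilterFactory() {
        super(Config.class);
    }

    @Override
    public GatewayFilter apply(Config config) {
        return (exchange, chain) -> {
            // Pre phase: runs before the request is forwarded
            log.info("Request received: {} {}", exchange.getRequest().getMethod(), exchange.getRequest().getURI());

            // Post phase: runs after the backend has responded
            return chain.filter(exchange).then(Mono.fromRunnable(() ->
                log.info("Response sent: {}", exchange.getResponse().getStatusCode())));
        };
    }

    public static class Config {
        // Add config properties if needed
    }
}

Usage in YAML:

routes:
  - id: employee_route
    uri: lb://employee-service
    predicates:
      - Path=/api/employees/**
    filters:
      - CustomLogging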

 

Key points to remember

=> Predicates are conditions to match incoming requests to a route (Path, Method, Header, Query, etc.).
=> Filters modify requests/responses (pre-filters: before backend call; post-filters: after).
=> Predicates for matching and Filters for processing.

Load balancing in Gateway — with Eureka

=> Load Balancing in Spring Cloud Gateway with Eureka refers to how the API Gateway distributes incoming client requests across multiple instances of the same microservice, using Eureka as the service registry for discovery.

=> This is client-side load balancing performed by the Gateway itself (not server-side like traditional load balancers such as Nginx or AWS ALB).

How It Works (Step-by-Step Elaboration)

1. Eureka Registry Role

=> All instances of a microservice (e.g., employee-service) register with Eureka Server on startup.
=> Eureka maintains a dynamic list of healthy instances (IP:port) for each service name (e.g., EMPLOYEE-SERVICE).

2. Gateway Route Configuration

In Gateway's application.yml, routes use lb:// prefix (lb = load balanced):

spring:
  cloud:
    gateway:
      routes:
      - id: employee_route
        uri: lb://employee-service   # ← lb:// means load-balanced via Eureka
        predicates:
          - Path=/api/employees/**
        filters:
          - StripPrefix=1

3. Request Flow

=> Client calls Gateway: http://gateway:8080/api/employees
=> Gateway matches the route (Path=/api/employees/**).
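=> Gateway looks up employee-service in its local copy of the Eureka registry (the Eureka client fetches and refreshes this list periodically, every 30 seconds by default).
=> The lookup yields a list of instances (e.g., instance1:8081, instance2:8082, instance3:8083).
=> Gateway uses Spring Cloud LoadBalancer (the default since Ribbon was removed in Spring Cloud 2020.0) to pick one instance (round-robin by default).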
=> Gateway forwards the request to the chosen instance.
=> Response flows back to client through Gateway.

4. Load Balancing Algorithm

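=> Default: Round-robin (cycles through instances one by one).
=> Can be changed per service by registering a different ReactorLoadBalancer bean (e.g., RandomLoadBalancer) through @LoadBalancerClient configuration.
=> If an instance stops sending heartbeats, Eureka evicts it after the lease expires (90 seconds by default) → Gateway stops routing to it once its cached registry refreshes.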
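
Round-robin selection itself is only a counter and a modulo. The sketch below is a simplified stand-in for what a client-side load balancer does, not the Spring Cloud LoadBalancer source: RoundRobinSketch is a hypothetical class, and it starts at position 0 for predictability, whereas a real implementation may start elsewhere.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinSketch {

    private final AtomicInteger position = new AtomicInteger();

    /**
     * Picks the next instance. The list is passed in on every call because the
     * set of registered instances can change between requests.
     */
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances available");
        }
        // Mask the sign bit so the index stays non-negative after integer overflow
        int pos = position.getAndIncrement() & Integer.MAX_VALUE;
        return instances.get(pos % instances.size());
    }
}
```

With three instances, four consecutive calls return instance1, instance2, instance3, instance1. If instance2 then disappears from the registry, later calls simply cycle over the two remaining entries.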

5. Health Checks

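=> Eureka Server does not poll instances. Each instance sends heartbeats, and with eureka.client.healthcheck.enabled=true (and Actuator on the classpath) it also reports its /actuator/health status to Eureka.
=> If an instance reports DOWN or its lease expires → it is removed from the load balancing pool.
=> Because registry data is cached, there is a short window in which Gateway can still route to an instance that just failed; retries or a circuit breaker cover that gap.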

Benefits of Load Balancing in Gateway with Eureka

=> No external load balancer needed — client-side LB is built-in.
=> Dynamic discovery — new instances auto-registered, old ones removed.
=> Fault tolerance — unhealthy instances automatically excluded.
=> Scalability — scale employee-service horizontally → Gateway distributes traffic automatically.
=> Centralized control — Gateway handles routing + LB in one place.

Rate limiting & circuit breaker in Gateway

=> Rate limiting and circuit breaker are two essential fault tolerance and protection mechanisms in Spring Cloud Gateway

=> They prevent overload, cascading failures in microservices architecture

1. Rate Limiting in Gateway

=> Purpose : Limit how many requests a client (or IP/user) can make in a given time window to protect backend services from being overwhelmed

How it works

=> Gateway tracks requests per client (usually by IP, header, or token).
=> If limit exceeded → returns 429 Too Many Requests (or custom status)
=> Common algorithms: Token Bucket or Sliding Window.

Spring Cloud Gateway Implementation

=> Use RequestRateLimiter filter
=> Add dependency spring-boot-starter-data-redis-reactive

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>

YAML Configuration Example

spring:
  cloud:
    gateway:
      routes:
      - id: employee_route
        uri: lb://employee-service
        predicates:
          - Path=/api/employees/**
        filters:
          - name: RequestRateLimiter
            args:
              redis-rate-limiter.replenishRate: 10   # 10 requests per second
              redis-rate-limiter.burstCapacity: 20   # Allow burst up to 20
              redis-rate-limiter.requestedTokens: 1  # Each request consumes 1 token
              key-resolver: '#{@ipKeyResolver}'      # Rate limit by IP

Custom Key Resolver (e.g., by IP)

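@Bean
public KeyResolver ipKeyResolver() {
    // getRemoteAddress() can be null, so fall back to a fixed key in that case
    return exchange -> Mono.just(
            exchange.getRequest().getRemoteAddress() != null
                    ? exchange.getRequest().getRemoteAddress().getAddress().getHostAddress()
                    : "unknown");
}

=> As per above configuration,
        Each IP gets a steady rate of 10 requests per second and may burst up to 20 requests in a single second
        Once the bucket is empty (e.g., the 21st request of a burst), Gateway returns the status 429 Too Many Requests
        Protects backend from overload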
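
How replenishRate and burstCapacity interact is easiest to see in a small token bucket. This is a single-process sketch in plain Java under simplifying assumptions: the real RequestRateLimiter keeps its state in Redis, and TokenBucketSketch with its explicit nowMillis parameter is a hypothetical class written so the behavior can be checked deterministically.

```java
final class TokenBucketSketch {

    private final double replenishRate;   // tokens added per second
    private final double burstCapacity;   // maximum tokens the bucket can hold
    private double tokens;
    private long lastRefillMillis;

    TokenBucketSketch(double replenishRate, double burstCapacity, long nowMillis) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;        // the bucket starts full
        this.lastRefillMillis = nowMillis;
    }

    /** Returns true if the request may pass, false if it should be answered with 429. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Refill in proportion to the elapsed time, never beyond the capacity
        double elapsedSeconds = (nowMillis - lastRefillMillis) / 1000.0;
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishRate);
        lastRefillMillis = nowMillis;

        if (tokens >= 1.0) {
            tokens -= 1.0;                  // each request consumes one token
            return true;
        }
        return false;
    }
}
```

With replenishRate 10 and burstCapacity 20, a client can send 20 requests at once and the 21st is rejected. One second later 10 tokens have been refilled, so 10 more requests pass and the 11th is rejected.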

2. Circuit Breaker in Gateway

=> Purpose : Prevent cascading failures. If a backend service is slow or failing, Gateway stops routing to it temporarily (OPEN state), returns fallback response, and lets it recover. 

How Circuit Breaker Works

=> CLOSED — normal, routes requests.
=> OPEN — after failure threshold → stops routing, returns fallback (e.g., cached data or error).
=> HALF_OPEN — after timeout → lets a few requests through to test recovery. If successful → back to CLOSED; else back to OPEN.

Spring Cloud Gateway Implementation

=> Integrates with Resilience4j through Spring Cloud CircuitBreaker (recommended, since Hystrix is deprecated).
=> Add the reactive starter (Gateway runs on WebFlux, so the reactor variant is required; it brings in Resilience4j transitively)
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-circuitbreaker-reactor-resilience4j</artifactId>
</dependency>

YAML Configuration

spring:
  cloud:
    gateway:
      routes:
      - id: employee_route
        uri: lb://employee-service
        predicates:
          - Path=/api/employees/**
        filters:
          - name: CircuitBreaker
            args:
              name: employeeCB
              fallbackUri: forward:/fallback/employee
              statusCodes: 500,502,503,504  # Trigger on these errors

Fallback Controller (simple example)

@RestController
public class FallbackController {

    // @RequestMapping (not @GetMapping) so forwarded POST/PUT requests also reach the fallback
    @RequestMapping("/fallback/employee")
    public ResponseEntity<String> employeeFallback() {
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
                .body("Employee service is temporarily unavailable. Try later.");
    }
}

=> If employee-service fails repeatedly → circuit opens.

=> Gateway returns fallback response → client gets graceful error instead of timeout.

Key points to remember 

=> Gateway uses RequestRateLimiter filter (with Redis backend) for rate limiting
=> If the request count exceeds the limit, Gateway returns the status 429 Too Many Requests
=> Circuit breaker prevents cascading failures. It integrates with Resilience4j via the CircuitBreaker filter
=> Both are configured per route in YAML

What is OpenFeign? Why use over RestTemplate?  

=> Spring Cloud OpenFeign is a declarative HTTP Client. 

=> It makes calling RESTful services (other microservices or external APIs) much simpler and more readable than traditional approaches like RestTemplate.

How to use OpenFeign ?

=> Instead of writing low-level HTTP code (creating RestTemplate, setting headers, handling JSON, error handling, etc.), you define an interface with annotations.

=> Spring Cloud generates the actual HTTP client code at runtime using Feign under the hood.     

Simple Example

@FeignClient(name = "payment-service", url = "${payment.service.url}")  // or use Eureka name
public interface PaymentClient {

    @PostMapping("/payments")
    PaymentResponse processPayment(@RequestBody PaymentRequest request);

    @GetMapping("/payments/{id}")
    PaymentResponse getPaymentStatus(@PathVariable("id") String id);
}

=> @FeignClient — defines the target service (can use Eureka name for discovery).
=> Method annotations (@PostMapping, @GetMapping, etc.) — look just like Spring MVC controllers.
=> Spring generates the implementation automatically.

Why Use OpenFeign over RestTemplate?

Aspect | RestTemplate (Traditional) | OpenFeign (Declarative) | Winner & Why
Code readability | Verbose (manual URL, headers, JSON conversion) | Clean, annotation-driven (looks like controller) | OpenFeign – much shorter & readable
Boilerplate | Lots (create template, set headers, handle response) | Minimal (just interface + annotations) | OpenFeign
Error handling | Manual try-catch or a custom ResponseErrorHandler | ErrorDecoder + fallback / fallback factory | OpenFeign
Integration with Eureka | Needs a @LoadBalanced RestTemplate bean | Native: use service name directly | OpenFeign
Load balancing | Needs @LoadBalanced RestTemplate | Automatic with Eureka + LoadBalancer | OpenFeign
Timeout/retry | Manual configuration | Configurable per client; integrates with Resilience4j | OpenFeign
Dynamic URLs | Manual string concatenation | Built-in placeholders + config | OpenFeign
Testing | Harder (mock RestTemplate) | Easier (mock Feign client interface) | OpenFeign
Performance | Good (blocking) | Good (also blocking; Spring Cloud OpenFeign has no reactive mode) | Tie
Maintenance | More code = more bugs | Less code = fewer bugs | OpenFeign

When to Choose Which?

=> Use RestTemplate (or WebClient) if: 

        Very simple app, no microservices
        Need full control over HTTP calls
        Legacy code

=> Use OpenFeign (recommended for microservices):

        Calling other microservices (especially with Eureka)
        Want clean, maintainable code
        Need easy integration with Resilience4j, load balancing, fallbacks

@FeignClient annotation 

=> @FeignClient is a declarative annotation in Spring Cloud OpenFeign that allows you to define a client interface for calling RESTful services (other microservices or external APIs). Spring automatically generates the implementation at runtime using Feign (a lightweight HTTP client) 

=> Instead of writing low-level HTTP code with RestTemplate or WebClient (manual URL construction, headers, JSON serialization, error handling), you define an interface with annotations — Spring creates the actual HTTP client for you.

Key Attributes of @FeignClient

Attribute | Description | Example
name (required) | Logical name of the client (used for load balancing, metrics, logging) | name = "employee-service"
url (optional) | Fixed base URL (use when no service discovery) | url = "http://localhost:8081"
path (optional) | Common path prefix for all methods | path = "/api/employees"
configuration | Custom config class (encoders, decoders, error handling, logging) | configuration = FeignConfig.class
fallback | Fallback class used when the call fails or the circuit is open (requires spring.cloud.openfeign.circuitbreaker.enabled=true) | fallback = EmployeeClientFallback.class
fallbackFactory | Factory for fallback with exception details | fallbackFactory = EmployeeFallbackFactory.class

Basic Example

import java.util.List;

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.*;

@FeignClient(name = "employee-service")  // Eureka service name (or url if fixed)
public interface EmployeeClient {

    @GetMapping("/api/employees/{id}")
    EmployeeResponseDto getEmployeeById(@PathVariable("id") Long id);

    @PostMapping("/api/employees")
    EmployeeResponseDto createEmployee(@RequestBody EmployeeRequestDto dto);

    @GetMapping("/api/employees")
    List<EmployeeResponseDto> getAllEmployees();
}

Usage in another service

@Service
public class PaymentService {
    private final EmployeeClient employeeClient;

    public PaymentService(EmployeeClient employeeClient) {
        this.employeeClient = employeeClient;
    }

    public void validateEmployee(Long employeeId) {
        EmployeeResponseDto emp = employeeClient.getEmployeeById(employeeId);
        // ...
    }
}

Advanced Features

1. Load Balancing (with Eureka)

=> Use name = "employee-service" → Feign + Spring Cloud LoadBalancer automatically discovers instances from Eureka and load balances (round-robin).

2. Fallback for Fault Tolerance

@FeignClient(name = "employee-service", fallback = EmployeeClientFallback.class)
public interface EmployeeClient { ... }

@Component
public class EmployeeClientFallback implements EmployeeClient {
    @Override
    public EmployeeResponseDto getEmployeeById(Long id) {
        return new EmployeeResponseDto(id, "Fallback Employee", 0, "N/A", "fallback@example.com");
    }
    // Implement other methods
}

3. Custom Configuration

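// No @Configuration here: if component scanning picked this class up,
// its beans would become the default for ALL Feign clients.
public class FeignConfig {
    @Bean
    public Logger.Level feignLoggerLevel() {  // feign.Logger
        // Detailed logging; output appears only when the client's logger is set to DEBUG
        return Logger.Level.FULL;
    }
}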

@FeignClient(name = "employee-service", configuration = FeignConfig.class)
public interface EmployeeClient { ... } 

Feign error handling

=> Feign error handling refers to how Spring Cloud OpenFeign (the declarative HTTP client) manages and handles errors when calling remote services — such as 4xx/5xx HTTP errors, network failures, timeouts, or deserialization issues.

=> Unlike RestTemplate where you handle everything manually (try-catch, custom error handlers), Feign provides built-in, configurable, and elegant error handling mechanisms.

Main Error Handling Mechanisms in OpenFeign

1. Default Behavior

=> If the remote service returns 4xx or 5xx, Feign throws FeignException (or subclass like FeignException.BadRequest, FeignException.NotFound, etc.).

=> Example 

@FeignClient(name = "employee-service")
public interface EmployeeClient {
    @GetMapping("/employees/{id}")
    EmployeeResponseDto getById(@PathVariable("id") Long id);
}

=> Call getById(999) → if 404 → throws FeignException.NotFound.

2. Custom Error Decoder (Most Flexible)

=> Implement feign.codec.ErrorDecoder to map HTTP errors to custom exceptions.

=> Very common in real projects.

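// ResourceNotFoundException, FeignClientException and FeignServerException
// are application-defined exception classes, not part of Feign.
public class FeignErrorDecoder implements ErrorDecoder {

    // Feign's built-in decoder, used for anything not handled below
    private final ErrorDecoder defaultDecoder = new ErrorDecoder.Default();

    @Override
    public Exception decode(String methodKey, feign.Response response) {
        if (response.status() == 404) {
            return new ResourceNotFoundException("Resource not found");
        }
        if (response.status() >= 400 && response.status() < 500) {
            return new FeignClientException("Client error: " + response.status());
        }
        if (response.status() >= 500) {
            return new FeignServerException("Server error: " + response.status());
        }
        return defaultDecoder.decode(methodKey, response); // fallback to default
    }
}

// Registers the decoder for the clients that reference this configuration
public class FeignErrorConfig {
    @Bean
    public ErrorDecoder errorDecoder() {
        return new FeignErrorDecoder();
    }
}

Usage:

@FeignClient(name = "employee-service", configuration = FeignErrorConfig.class)
public interface EmployeeClient { ... } 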

3. Fallback (Circuit Breaker Style)

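=> When the call fails or the circuit is open → return a fallback response.

=> Fallbacks only take effect when Feign's circuit breaker support is on: set spring.cloud.openfeign.circuitbreaker.enabled=true and put a Spring Cloud CircuitBreaker implementation (e.g., spring-cloud-starter-circuitbreaker-resilience4j) on the classpath.

=> Two ways: fallback (simple) or fallbackFactory (with exception details).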

Simple Fallback

@FeignClient(name = "employee-service", fallback = EmployeeFallback.class)
public interface EmployeeClient {
    @GetMapping("/employees/{id}")
    EmployeeResponseDto getById(@PathVariable("id") Long id);
}

@Component
public class EmployeeFallback implements EmployeeClient {
    @Override
    public EmployeeResponseDto getById(Long id) {
        return new EmployeeResponseDto(id, "Fallback Employee", 0, "N/A", "fallback@example.com");
    }
}

Fallback Factory (recommended — gets exception details)

@FeignClient(name = "employee-service", fallbackFactory = EmployeeFallbackFactory.class)
public interface EmployeeClient { ... }

@Component
public class EmployeeFallbackFactory implements FallbackFactory<EmployeeClient> {

    private static final Logger log = LoggerFactory.getLogger(EmployeeFallbackFactory.class);

    @Override
    public EmployeeClient create(Throwable cause) {
        // A new fallback instance is created per failure, so the cause is available to it
        return new EmployeeClient() {
            @Override
            public EmployeeResponseDto getById(Long id) {
                log.error("Fallback triggered for getById: {}", cause.getMessage());
                return new EmployeeResponseDto(id, "Fallback", 0, "N/A", "fallback@example.com");
            }
        };
    }
}

4. Retry Mechanism

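=> Feign has a built-in Retryer, but Spring Cloud OpenFeign disables it by default (Retryer.NEVER_RETRY), so retries must be switched on explicitly.

=> Configure in application.yml (Spring Cloud 2022.0+ uses the spring.cloud.openfeign prefix; older releases used feign.client.config):

spring:
  cloud:
    openfeign:
      client:
        config:
          default:
            connectTimeout: 5000
            readTimeout: 5000
            retryer: feign.Retryer.Default  # class name only: 100 ms initial interval, 1 s max interval, up to 5 attempts

=> The property takes a class name, not constructor arguments. To use other values (e.g., 3 attempts), declare a Retryer bean such as new Retryer.Default(100, 1000, 3) in the client's configuration class.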
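
The three numbers (initial interval, maximum interval, maximum attempts) describe a retry loop with a growing wait. The sketch below shows that loop in plain Java. It is modeled loosely on the default retryer's backoff (interval multiplied by 1.5 per attempt, capped at the maximum) but is not Feign's code: RetrySketch, callWithRetry, and the injected sleeper are hypothetical names, and the sleeper is a parameter only so the waits can be observed without actually sleeping.

```java
import java.util.concurrent.Callable;
import java.util.function.LongConsumer;

final class RetrySketch {

    /**
     * Calls the action up to maxAttempts times. After a failed attempt n it waits
     * period * 1.5^(n-1) milliseconds, capped at maxPeriod, before trying again.
     * The last failure is rethrown when all attempts are used up.
     */
    static <T> T callWithRetry(Callable<T> action, long period, long maxPeriod,
                               int maxAttempts, LongConsumer sleeper) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    long wait = (long) Math.min(maxPeriod, period * Math.pow(1.5, attempt - 1));
                    sleeper.accept(wait);   // in real code: Thread.sleep(wait)
                }
            }
        }
        throw last;
    }
}
```

With period 100, maxPeriod 1000 and 3 attempts, an action that fails twice and then succeeds is called three times with waits of 100 ms and 150 ms in between. Retrying only makes sense for idempotent calls and transient failures, which is why it is usually combined with a circuit breaker.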

5. Integration with Resilience4j (Modern & Recommended)

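=> Wrap the Feign call with a Resilience4j circuit breaker/retry, either through annotations on the calling service method or through Feign's fallback support.

=> Add dependency: resilience4j-spring-boot3 (plus spring-boot-starter-aop, which the annotations need)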

=> Configure in application.yml :

resilience4j:
  circuitbreaker:
    instances:
      employeeService:
        failureRateThreshold: 50
        waitDurationInOpenState: 10s
        permittedNumberOfCallsInHalfOpenState: 3

What is Resilience4j? Main modules? 

=> Resilience4j is a lightweight modern fault tolerance library for Java microservices

=> It helps make your services more resilient by handling failures gracefully, preventing cascading failures, and improving overall system stability

=> It is the recommended successor to Netflix Hystrix (which is deprecated) and is widely used in Spring Boot microservices (integrates seamlessly with Spring Cloud).

=> Resilience4j is not a single monolithic library — it’s a collection of independent modules. You only add the ones you need.

Module | Purpose | When to Use It | Key Features / Configuration
CircuitBreaker | Prevents cascading failures by stopping calls to a failing service | Calling external services or slow/unreliable services | CLOSED/OPEN/HALF_OPEN states, failure rate threshold, wait duration, sliding window size
Retry | Automatically retries failed calls with backoff | Transient failures (network glitches, temporary unavailability) | Max attempts, wait duration, exponential backoff, retry on specific exceptions
RateLimiter | Limits the rate of calls (e.g., 10 requests/sec) | Protect backend from overload, API rate limiting | limitForPeriod, limitRefreshPeriod, timeoutDuration
Bulkhead | Limits concurrent calls to a service (semaphore or thread pool isolation) | Prevent one slow service from starving others | Max concurrent calls, max wait duration (semaphore); pool and queue size (thread pool)
TimeLimiter | Enforces timeouts on calls (especially async) | Long-running calls (e.g., external API) | Timeout duration
Cache | Caches results through a JCache (JSR-107) provider | Cache frequent read operations | Size and TTL are configured in the cache provider
Metrics | Exposes metrics for monitoring (integrates with Micrometer/Prometheus) | Observability | All modules expose metrics
 

How to Use Resilience4j in Spring Boot (Quick Example)

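=> Add dependencies (for Spring Boot 3.x). The annotations are applied through Spring AOP, so the AOP starter is needed as well:

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>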

CircuitBreaker example (most common):

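@Service
public class PaymentService {

    private final PaymentClient paymentClient;  // e.g., the Feign client shown earlier

    public PaymentService(PaymentClient paymentClient) {
        this.paymentClient = paymentClient;
    }

    @CircuitBreaker(name = "paymentService", fallbackMethod = "fallbackPayment")
    public PaymentResponse processPayment(PaymentRequest request) {
        // Call external payment API
        return paymentClient.processPayment(request);
    }

    // Same parameters and return type as processPayment, plus the exception as last parameter
    public PaymentResponse fallbackPayment(PaymentRequest request, Throwable t) {
        // Fallback logic
        return new PaymentResponse(false, "Payment service unavailable");
    }
}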

application.yml (configure):

resilience4j:
  circuitbreaker:
    instances:
      paymentService:
        failureRateThreshold: 50
        waitDurationInOpenState: 10s
        permittedNumberOfCallsInHalfOpenState: 3
        slidingWindowSize: 100
        minimumNumberOfCalls: 10

Why Resilience4j over Hystrix?

=> Modern & lightweight — no extra dependencies like Netflix stack.
=> Modular — pick only what you need (Hystrix was all-or-nothing).
=> Better performance — non-blocking where possible.
=> Active maintenance — Hystrix is dead, Resilience4j is actively developed.
=> Spring Boot 3+ integration — native support via annotations + Actuator metrics.

Circuit Breaker pattern  

=> Circuit Breaker Pattern is a fault tolerance design pattern used in distributed systems (especially microservices) to prevent cascading failures when a service is slow, failing, or unavailable.

=> It acts like an electrical circuit breaker: when too many failures occur, it "opens" to stop further calls, allowing the failing service to recover, and then "half-opens" to test recovery.

Core Idea in Simple Terms

Imagine you're calling a remote service (e.g., payment service):

=> If it fails repeatedly (timeouts, errors), continuing to call it ties up your own threads and connections and makes the whole system slower.

=> Circuit Breaker says: "Stop calling it for a while, return a fallback response, and give it time to recover."

States of Circuit Breaker

1. CLOSED (Normal state)

=> All requests go through to the service.
=> Successes are counted.
=> Failures (timeouts, errors) are monitored.
=> If failure rate exceeds threshold (e.g., 50% in last 100 calls) → circuit opens.

2. OPEN (Tripped / Protected state)

=> No requests go to the failing service.
=> Immediate fallback response (cached data, default value, error message).
=> Waits for a configured timeout (e.g., 10 seconds).
=> Prevents overload on the failing service and cascading failures.

3. HALF_OPEN (Testing recovery)

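=> After the wait duration, allows a limited number of trial requests (e.g., 3–5) to test if the service has recovered.
=> If the failure rate of these trial calls stays below the threshold → circuit goes back to CLOSED.
=> If it reaches the threshold → circuit re-opens and waits again (the same wait duration by default in Resilience4j; a longer one only if backoff is configured).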
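
The three states and their transitions fit in a small state machine. The following plain-Java sketch is a simplified, count-based illustration and not Resilience4j's implementation: CircuitBreakerSketch and its methods are hypothetical, time is passed in explicitly so the behavior is deterministic, and it does not limit concurrent trial calls in HALF_OPEN the way a production breaker would.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class CircuitBreakerSketch {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;            // calls used to compute the failure rate in CLOSED
    private final double failureThreshold;   // percent, e.g. 50.0
    private final long waitMillis;           // how long to stay OPEN
    private final int halfOpenCalls;         // trial calls evaluated in HALF_OPEN

    private final Deque<Boolean> window = new ArrayDeque<>();  // true = failed call
    private State state = State.CLOSED;
    private long openedAt;

    CircuitBreakerSketch(int windowSize, double failureThreshold, long waitMillis, int halfOpenCalls) {
        this.windowSize = windowSize;
        this.failureThreshold = failureThreshold;
        this.waitMillis = waitMillis;
        this.halfOpenCalls = halfOpenCalls;
    }

    synchronized State state() { return state; }

    /** Returns false when the call must be short-circuited to the fallback. */
    synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= waitMillis) {
            state = State.HALF_OPEN;         // wait is over: let trial calls through
            window.clear();
        }
        return state != State.OPEN;
    }

    /** Records the outcome of a call that was allowed through. */
    synchronized void record(boolean failed, long nowMillis) {
        if (state == State.OPEN) return;
        int limit = (state == State.HALF_OPEN) ? halfOpenCalls : windowSize;
        window.addLast(failed);
        if (window.size() > limit) window.removeFirst();
        if (window.size() < limit) return;   // not enough calls to judge yet

        double failureRate = 100.0 * window.stream().filter(f -> f).count() / window.size();
        if (failureRate >= failureThreshold) {
            state = State.OPEN;              // trip (or re-trip) the breaker
            openedAt = nowMillis;
            window.clear();
        } else if (state == State.HALF_OPEN) {
            state = State.CLOSED;            // trial calls were healthy enough
            window.clear();
        }
    }
}
```

A caller checks allowRequest before each remote call, returns the fallback when it is false, and reports the outcome with record otherwise. With a 50% threshold and 3 trial calls, 2 successes out of 3 close the circuit again, while 3 failures re-open it for another wait period.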

Typical Configuration Parameters

Parameter | Purpose | Typical Value
Failure rate threshold | % of failures to open circuit | 50%
Sliding window size | Number of calls to evaluate failure rate | 100 calls
Wait duration in open state | How long to stay open before half-open | 10 seconds
Permitted calls in half-open | How many test calls in half-open state | 3–5
Slow call duration threshold | Calls slower than this count as slow; the circuit also opens when the slow-call rate exceeds its own threshold (slowCallRateThreshold) | 5 seconds

Real-World Example (E-commerce Order Service)

1. Order Service calls Payment Service.
2. Payment Service fails 60% of the time (timeout).
3. Circuit Breaker opens → Order Service returns "Payment temporarily unavailable, try later" (fallback).
4. After 10 seconds → half-open → 3 test calls.
5. If 2/3 succeed (33% failures, below the 50% threshold) → back to closed (normal).
6. If the failure rate of the test calls reaches 50% → open again for another wait period.

Benefits

=> Prevents cascading failures (one slow service doesn't kill the whole system).
=> Graceful degradation (fallback responses).
=> Gives failing service breathing room to recover.
=> Improves overall system resilience & availability.

Common Implementations in Spring Boot

=> Resilience4j (modern & recommended) — annotation-based (@CircuitBreaker), configurable via YAML.
=> Hystrix (deprecated, avoid in new projects).

@CircuitBreaker annotation

=> @CircuitBreaker is an annotation provided by Resilience4j (available in Spring Boot through the resilience4j-spring-boot3 starter and Spring AOP) that applies the Circuit Breaker pattern to a method.

=> It protects your application from cascading failures when calling external services, slow APIs, or unreliable dependencies.

What @CircuitBreaker Does (Core Purpose)

When you annotate a method with @CircuitBreaker, Resilience4j wraps the method call with a circuit breaker. It monitors successes/failures over time and automatically:

=> Opens the circuit (stops calls) if failure rate exceeds a threshold.
=> Returns a fallback response immediately instead of letting the call hang or fail hard.
=> Half-opens after a wait period to test recovery.
=> Closes again if recovery succeeds.

This prevents your app from being overwhelmed by a failing downstream service.
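
What "wrapping the method call" means can be sketched in plain Java. This is only an illustration of the idea: the Gate interface and the protect method are hypothetical names invented for this example, not Resilience4j types. The wrapper asks the breaker whether the call is allowed, reports the outcome, and switches to the fallback when the call is blocked or fails.

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class BreakerWrapperSketch {
    // Hypothetical minimal view of a breaker: may a call go through, and how did it end?
    interface Gate {
        boolean allowsCall();
        void onResult(boolean failed);
    }

    // Returns a decorated call that consults the gate and falls back on rejection or failure.
    static <T> Supplier<T> protect(Gate gate, Supplier<T> call, Function<Throwable, T> fallback) {
        return () -> {
            if (!gate.allowsCall()) {
                // Circuit is open: do not touch the downstream service at all
                return fallback.apply(new IllegalStateException("circuit open"));
            }
            try {
                T result = call.get();
                gate.onResult(false);   // success is reported to the breaker
                return result;
            } catch (RuntimeException e) {
                gate.onResult(true);    // failure is reported to the breaker
                return fallback.apply(e);
            }
        };
    }
}
```

With the annotation, this decoration is applied for you by an AOP proxy, which is why the annotated method has to be called through the Spring bean rather than via this from inside the same class.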

Key Attributes of @CircuitBreaker

Attribute | Description | Typical Value / Example
name (required) | Unique name of the circuit breaker instance (used in config) | name = "paymentService"
fallbackMethod | Method name to call when circuit is open or call fails | fallbackMethod = "fallbackPayment"

Related settings (configured per instance in application.yml, not on the annotation)

Property | Description | Example
ignoreExceptions | Exceptions that do not count as failure (e.g., business validation) | com.example.ValidationException
recordExceptions | Exceptions that do count as failure (when set, other exceptions count as success) | org.springframework.web.client.HttpServerErrorException

Full Example (Service)

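import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {

    private static final Logger log = LoggerFactory.getLogger(PaymentService.class);

    // Client for the external payment API (assumed here to be a Spring-managed bean, e.g., a Feign client)
    private final PaymentClient paymentClient;

    public PaymentService(PaymentClient paymentClient) {
        this.paymentClient = paymentClient;
    }

    @CircuitBreaker(name = "paymentService", fallbackMethod = "fallbackProcessPayment")
    public PaymentResponse processPayment(PaymentRequest request) {
        // Call external payment API (this can fail or timeout)
        return paymentClient.process(request);
    }

    // Fallback: same parameters and return type as the protected method, plus a trailing Throwable
    public PaymentResponse fallbackProcessPayment(PaymentRequest request, Throwable t) {
        log.error("Payment service failed: {}", t.getMessage());
        return new PaymentResponse(false, "Payment temporarily unavailable");
    }
}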

# application.yml - Configure the circuit breaker
resilience4j:
  circuitbreaker:
    instances:
      paymentService:
        failureRateThreshold: 50           # Open after 50% failures
        waitDurationInOpenState: 10s       # Stay open for 10 seconds
        permittedNumberOfCallsInHalfOpenState: 3  # Test 3 calls in half-open
        slidingWindowSize: 100             # Evaluate last 100 calls
        minimumNumberOfCalls: 10           # Need at least 10 calls to evaluate
        slowCallRateThreshold: 50          # Also open when 50% of calls are slow
        slowCallDurationThreshold: 5s      # Calls >5s are slow

Retry annotation

=> @Retry is a powerful annotation from Resilience4j (integrated with Spring Boot) that automatically retries a method call when it fails due to transient errors (e.g., network timeouts, temporary service unavailability, 5xx errors).

=> It makes your microservices more resilient by handling intermittent failures without manual retry logic in your code. 

How @Retry Works (Core Mechanism)

1. Annotate the method

@Service
public class PaymentService {

    @Retry(name = "paymentRetry", fallbackMethod = "fallbackPayment")
    public PaymentResponse processPayment(PaymentRequest request) {
        // Call external payment API – this can fail transiently
        return paymentClient.process(request);
    }

    // Fallback method (optional but highly recommended)
    public PaymentResponse fallbackPayment(PaymentRequest request, Throwable t) {
        log.warn("Payment retry failed after attempts: {}", t.getMessage());
        return new PaymentResponse(false, "Payment service unavailable – try later");
    }
}

2. Configuration in application.yml (or properties)

resilience4j:
  retry:
    instances:
      paymentRetry:
        maxAttempts: 3                     # 3 attempts in total (1 initial call + 2 retries)
        waitDuration: 500ms                # Base wait before the first retry
        enableExponentialBackoff: true     # Wait time grows each retry (500ms → 1s)
        exponentialBackoffMultiplier: 2    # Double the wait each time
        retryExceptions:                   # Only retry on these
          - org.springframework.web.client.HttpServerErrorException
          - java.net.ConnectException
        ignoreExceptions:                  # Never retry on these
          - com.example.ValidationException

3. Execution Flow

=> First call fails (e.g., timeout) → wait 500ms → retry 1 (attempt 2 of 3).
=> Fails again → wait 1s → retry 2 (attempt 3 of 3).
=> Still fails → call fallback method (if defined) → return fallback response; without a fallback, the last exception is thrown to the caller.
=> If any attempt succeeds → return successful response immediately.
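
The retry flow can be sketched in plain Java. This is a simplified illustration rather than Resilience4j's code: RetrySketch and its method names are made up for the example, every RuntimeException is treated as retryable, and maxAttempts is assumed to count the initial call (as the Resilience4j setting does), so 3 attempts means at most 2 waits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class RetrySketch {
    // Wait before each retry: base, base * multiplier, base * multiplier^2, ...
    static List<Long> backoffWaitsMillis(int maxAttempts, long baseWaitMillis, double multiplier) {
        List<Long> waits = new ArrayList<>();
        double wait = baseWaitMillis;
        for (int retry = 1; retry < maxAttempts; retry++) {
            waits.add((long) wait);
            wait *= multiplier;
        }
        return waits;
    }

    // Calls the supplier up to maxAttempts times, then gives up and uses the fallback.
    static <T> T callWithRetry(Supplier<T> call, Supplier<T> fallback,
                               int maxAttempts, long baseWaitMillis, double multiplier) {
        List<Long> waits = backoffWaitsMillis(maxAttempts, baseWaitMillis, multiplier);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();              // success: return immediately
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) {
                    break;                      // attempts exhausted
                }
                sleep(waits.get(attempt - 1));  // back off before the next attempt
            }
        }
        return fallback.get();
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }
}
```

For the configuration above, backoffWaitsMillis(3, 500, 2.0) yields waits of 500 ms and 1000 ms.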

Key Attributes & Configuration

Attribute | Set in | Description | Typical Value / Example
name (required) | Annotation | Unique name of the retry instance (links to YAML config) | paymentRetry
fallbackMethod | Annotation | Method name for fallback (same signature + Throwable optional) | fallbackPayment
maxAttempts | YAML | Total attempts (including original call) | 3–5
waitDuration | YAML | Base wait time between retries | 500ms–2s
enableExponentialBackoff | YAML | Increases wait time exponentially | true
exponentialBackoffMultiplier | YAML | How much to multiply wait time (e.g., 2 = double) | 2.0
retryExceptions | YAML | Exceptions that trigger retry | HttpServerErrorException, ConnectException
ignoreExceptions | YAML | Exceptions that do NOT trigger retry (e.g., business errors) | ValidationException

Best Practices

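=> Use with external calls (Feign, RestTemplate, database), not internal logic, and only for operations that are safe to repeat (idempotent); retrying a payment without an idempotency key can charge the customer twice.
=> Always provide fallback: prevents 500 errors to clients.
=> Tune per service: different services may need different retry counts/times.
=> Log fallback: track when fallback is used (important for monitoring).
=> Combine with CircuitBreaker: by default Resilience4j applies Retry outside CircuitBreaker, so every attempt is recorded by the breaker, and once the circuit opens the remaining attempts fail fast.
=> Monitor via Actuator: /actuator/retries lists the retry instances, /actuator/retryevents shows recent retry events, and counters are available under /actuator/metrics.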

Rate limiter 

=> Purpose : Limit how many requests a client (or IP/user) can make in a given time window, so that backend services are not overwhelmed.

How it works

=> Tracks requests per client (usually by IP, header, or token).
=> If limit exceeded → returns 429 Too Many Requests (or custom status)
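=> Common algorithms: Token Bucket, Fixed Window, Sliding Window.

The most common algorithm at the gateway is Token Bucket (used by Spring Cloud Gateway's RedisRateLimiter; Resilience4j's RateLimiter instead hands out a fixed number of permits per refresh period):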

=> Bucket holds a fixed number of tokens (capacity).
=> Tokens are refilled at a constant rate (e.g., 10 tokens per second).
=> Each request consumes 1 token (or more for expensive calls).
=> If bucket has tokens → request allowed, token removed.
=> If bucket empty → request rejected (usually 429 Too Many Requests).

Example:

Replenish rate: 10 requests/second
Burst capacity: 20 tokens

=> Allows bursts up to 20 requests instantly, then limits to 10/sec.
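
The token bucket rules above can be sketched in a few lines of Java. This is an in-memory illustration only (Spring Cloud Gateway keeps its bucket in Redis); the class name is made up, and the current time is passed in by the caller so the behaviour is easy to reason about.

```java
final class TokenBucketSketch {
    private final double replenishPerSecond; // refill rate, e.g. 10 tokens per second
    private final double capacity;           // burst capacity, e.g. 20 tokens
    private double tokens;
    private long lastRefillNanos;

    TokenBucketSketch(double replenishPerSecond, double capacity, long nowNanos) {
        this.replenishPerSecond = replenishPerSecond;
        this.capacity = capacity;
        this.tokens = capacity;              // the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    // Returns true if the request is allowed; false corresponds to 429 Too Many Requests.
    synchronized boolean tryConsume(int requestedTokens, long nowNanos) {
        // Refill according to the elapsed time, never above the capacity
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * replenishPerSecond);
        lastRefillNanos = nowNanos;

        if (tokens >= requestedTokens) {
            tokens -= requestedTokens;       // allowed: take the tokens
            return true;
        }
        return false;                        // bucket empty: reject
    }
}
```

With a replenish rate of 10 and a capacity of 20, the first 20 requests at the same instant pass and the 21st is rejected; half a second later 5 more tokens are available.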

Rate limiter can be used in the following places : 

  1.  Rate Limiting in Spring Cloud Gateway (Most Common) (API Gateway)
  2.  Rate Limiting with Resilience4j (Per-Service)

1. Rate Limiting in Spring Cloud Gateway (Most Common) (API Gateway)

=> Gateway is the ideal place to apply rate limiting (edge protection).

=> Use RequestRateLimiter filter

=> Add dependency spring-boot-starter-data-redis-reactive

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>

YAML Configuration Example

spring:
  cloud:
    gateway:
      routes:
      - id: employee_route
        uri: lb://employee-service
        predicates:
          - Path=/api/employees/**
        filters:
          - name: RequestRateLimiter
            args:
              redis-rate-limiter.replenishRate: 10   # 10 requests per second
              redis-rate-limiter.burstCapacity: 20   # Allow burst up to 20
              redis-rate-limiter.requestedTokens: 1  # Each request consumes 1 token
              key-resolver: '#{@ipKeyResolver}'      # Rate limit by IP

Custom Key Resolver (e.g., by IP)

@Bean
public KeyResolver ipKeyResolver() {
    // Rate-limit key is the client IP (behind a proxy, resolve it from X-Forwarded-For instead)
    return exchange -> Mono.just(
            exchange.getRequest().getRemoteAddress().getAddress().getHostAddress());
}

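=> As per above configuration,
        A burst of up to 20 requests is allowed; further requests get 429 Too Many Requests until tokens refill (10 per second)
        Sustained traffic is capped at 10 requests per second per IP, which protects the backend from overload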

2. Rate Limiting with Resilience4j (Per-Service) 

=> You can also apply rate limiting inside individual services (not just gateway).

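=> Dependency (the annotations also need spring-boot-starter-aop on the classpath; add a <version> if your build does not manage one) :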

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
</dependency>

Annotation Example:

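@RateLimiter(name = "employeeApiLimiter", fallbackMethod = "fallbackGetEmployees")
public List<Employee> getEmployees() {
    return repository.findAll();
}
// Called when the limit is exceeded (RequestNotPermitted) or the call itself fails
public List<Employee> fallbackGetEmployees(Throwable t) {
    return List.of();
}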

YAML Config:

resilience4j:
  ratelimiter:
    instances:
      employeeApiLimiter:
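        limitForPeriod: 50          # 50 calls allowed per refresh period
        limitRefreshPeriod: 1s      # Permits are reset every second
        timeoutDuration: 0          # Do not wait for a permit; reject immediately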
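
How these three settings interact can be sketched as follows. This is a simplified model, not Resilience4j's implementation: the class name is invented for the example, time is passed in by the caller, and only the timeoutDuration = 0 case (reject immediately instead of waiting) is modelled.

```java
final class PeriodRateLimiterSketch {
    private final int limitForPeriod;        // permits handed out per cycle
    private final long refreshPeriodNanos;   // length of one cycle
    private long currentCycle = -1;
    private int permitsLeft;

    PeriodRateLimiterSketch(int limitForPeriod, long refreshPeriodNanos) {
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodNanos = refreshPeriodNanos;
    }

    // Returns true if a permit was available in the cycle that contains nowNanos.
    synchronized boolean tryAcquire(long nowNanos) {
        long cycle = nowNanos / refreshPeriodNanos;
        if (cycle != currentCycle) {
            currentCycle = cycle;            // a new cycle started: reset the permits
            permitsLeft = limitForPeriod;
        }
        if (permitsLeft == 0) {
            return false;                    // timeoutDuration = 0: no waiting, reject now
        }
        permitsLeft--;
        return true;
    }
}
```

Unlike the token bucket, unused permits are not carried over here, so there is no separate burst capacity.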
 

How Gateway + Feign + Resilience4j work together?

=> In a Spring Boot microservices architecture, Spring Cloud Gateway, OpenFeign, and Resilience4j are often used together to create a robust, scalable, and fault-tolerant system.

=> Gateway acts as the edge/entry point

=> Feign handles internal service-to-service communication

=> Resilience4j adds resilience to both layers

Overall Architecture Flow

1. Client → Gateway (External request)

=> Mobile/web/external client sends request to Gateway (single entry point).
=> Gateway authenticates, rate-limits, routes, and forwards to the appropriate backend service.

2. Gateway → Backend Service (via Eureka + LoadBalancer)

=> Gateway uses Eureka for discovery → finds healthy instances of target service.
=> Forwards the request with its own non-blocking (Netty-based) HTTP client; Feign is not involved in routing.
=> The route can be protected by Resilience4j through the CircuitBreaker filter.

3. Backend Service → Another Service (internal call)

=> Backend service calls another microservice using Feign.
=> Feign call again protected by Resilience4j.

4. Resilience4j Protection (everywhere)

=> Applied to Feign clients (via @FeignClient + Resilience4j config).
=> Applied to Gateway routes (via filters).
=> Prevents cascading failures, retries transient errors, limits rate.

Detailed How They Work Together

1. Gateway (Edge Layer – Entry Point)

=> Receives all external requests.
=> Uses Eureka to discover backend services (lb://employee-service).
=> Applies Resilience4j via filters (e.g., CircuitBreaker filter on route).
=> Routes request → backend service over HTTP using its built-in non-blocking client (Feign is not used for routing).

Example Gateway route with Resilience4j:

spring:
  cloud:
    gateway:
      routes:
      - id: employee_route
        uri: lb://employee-service
        predicates:
          - Path=/api/employees/**
        filters:
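          - CircuitBreaker=myEmployeeCB  # Resilience4j protection (needs spring-cloud-starter-circuitbreaker-reactor-resilience4j)
          - name: RequestRateLimiter
            args:
              redis-rate-limiter.replenishRate: 10
              redis-rate-limiter.burstCapacity: 20
              key-resolver: '#{@ipKeyResolver}'  # Without a resolver the key is the authenticated principal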
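
Conceptually, a Path predicate plus an lb:// URI means "match the path, look up the service's instances, pick one". The sketch below models that with plain maps. It is an assumption-laden illustration: the class name, the in-memory registry standing in for Eureka, the simple prefix match, and the round-robin choice are all simplifications, not how Spring Cloud Gateway or Spring Cloud LoadBalancer are implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class LbRouteSketch {
    private final Map<String, String> prefixToService;      // e.g. "/api/employees/" -> "employee-service"
    private final Map<String, List<String>> registry;       // stand-in for the Eureka registry
    private final AtomicInteger counter = new AtomicInteger();

    LbRouteSketch(Map<String, String> prefixToService, Map<String, List<String>> registry) {
        this.prefixToService = prefixToService;
        this.registry = registry;
    }

    // Resolves a request path to the URL of one concrete service instance.
    String resolve(String path) {
        for (Map.Entry<String, String> route : prefixToService.entrySet()) {
            if (path.startsWith(route.getKey())) {           // simplified Path predicate
                List<String> instances = registry.getOrDefault(route.getValue(), List.of());
                if (instances.isEmpty()) {
                    throw new IllegalStateException("No instances for " + route.getValue());
                }
                // Round-robin over the known instances
                int index = Math.floorMod(counter.getAndIncrement(), instances.size());
                return instances.get(index) + path;
            }
        }
        throw new IllegalArgumentException("No route for " + path);
    }
}
```

With two registered instances of employee-service, consecutive requests to /api/employees/1 alternate between them.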

2. OpenFeign (Internal Communication)

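=> Used inside services for service-to-service calls (Gateway routing does not use it).
=> Declares HTTP clients with @FeignClient.
=> Integrates with Eureka for discovery & load balancing.
=> Calls are wrapped by a circuit breaker automatically only when spring.cloud.openfeign.circuitbreaker.enabled=true (feign.circuitbreaker.enabled in older releases); otherwise add the Resilience4j annotations yourself.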

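Example Feign client with Resilience4j (annotation on the bean that calls the client):

@FeignClient(name = "payment-service")
public interface PaymentClient {
    @PostMapping("/pay")
    PaymentResponse pay(@RequestBody PaymentRequest request);
}

@Service
public class PaymentFacade { // hypothetical wrapper bean, named for this example
    private final PaymentClient paymentClient;

    public PaymentFacade(PaymentClient paymentClient) { this.paymentClient = paymentClient; }

    // Annotating the calling bean (not the Feign interface) lets Spring AOP apply the breaker reliably
    @CircuitBreaker(name = "paymentCB", fallbackMethod = "fallback")
    public PaymentResponse pay(PaymentRequest request) { return paymentClient.pay(request); }

    public PaymentResponse fallback(PaymentRequest request, Throwable t) {
        return new PaymentResponse(false, "Payment unavailable");
    }
}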

resilience4j:
  circuitbreaker:
    instances:
      paymentCB:
        failureRateThreshold: 50
        waitDurationInOpenState: 10s 

3. Resilience4j (Fault Tolerance Layer)

=> Applied to both Gateway routes (filters) and Feign clients (annotations).
=> Protects external calls (from Gateway to service) and internal calls (service to service).
=> Ensures one failing service doesn't crash the entire system. 

Full Flow Example:

  1. Client → Gateway /api/employees
  2. Gateway matches route → applies the route filters (CircuitBreaker, RateLimiter) → discovers instance via Eureka → forwards the request with its own HTTP client.
  3. If backend slow → Resilience4j circuit opens → fallback response from Gateway.
  4. If internal call (service to service) → Feign + Resilience4j handles retry/circuit/fallback. 

Summary Table – How They Work Together

Component | Role | How It Uses Eureka | How Resilience4j Protects It | How It Uses Feign
Gateway | Edge entry point, routing | Discovers services for routing | CircuitBreaker & RateLimiter filters | Not used (routes with its own HTTP client)
OpenFeign | Inter-service HTTP calls | Uses Eureka for discovery & LB | @CircuitBreaker, @Retry around Feign calls | Core component
Resilience4j | Fault tolerance everywhere | N/A | Core protection layer | Wraps Feign calls

Best practices for API Gateway + OpenFeign + Resilience4j ?

1. API Gateway (Spring Cloud Gateway) Best Practices

=> Single entry point — All external traffic goes through Gateway (never expose internal services directly).
=> Use Eureka/LoadBalancer — Route with lb://service-name for dynamic discovery & client-side load balancing.
=> Path-based routing — Clean, readable paths (/api/employees/** → employee-service).
=> StripPrefix filter — Remove unnecessary prefixes (StripPrefix=1 for /api/**).
=> Global filters — For logging, correlation IDs, request tracing.
=> Security at edge — JWT validation, OAuth2, rate limiting in Gateway (not in individual services).
=> Rate limiting — Use Redis-based RequestRateLimiter (protect from abuse).
=> Circuit breaker — Apply at Gateway level for critical paths.
=> Caching — Cache frequent GET responses (e.g., static data).
=> Monitoring — Expose /actuator/gateway/routes and metrics.

2. OpenFeign Best Practices

=> Declarative clients: Prefer @FeignClient interfaces over hand-written RestTemplate calls in blocking services (Feign is blocking, so reactive services should use WebClient).
=> Eureka integration: Use name = "service-name" (no hardcoded URLs).
=> Configuration class: Centralize logging, encoders/decoders, error handling.
=> Logging: Set the Feign logger level to FULL in dev (via a Logger.Level bean or spring.cloud.openfeign.client.config.default.loggerLevel), BASIC or HEADERS in prod.
=> Error handling: Custom ErrorDecoder or fallback factory.
=> Timeouts: Configure connect/read timeouts (avoid hanging calls).
=> Compression: Enable request/response compression.
=> Load balancing: Automatic with Eureka + Spring Cloud LoadBalancer.
=> Testing: Mock Feign clients in tests (@MockBean, or @MockitoBean on Spring Boot 3.4+).

3. Resilience4j Best Practices (with Feign & Gateway)

=> Apply to all external calls — Every Feign client and critical Gateway route.
=> CircuitBreaker — On Feign methods + Gateway routes (prevent cascading failures).
=> Retry — For transient errors (network, 5xx) — exponential backoff.
=> RateLimiter — At Gateway (Redis-backed) — protect from overload.
=> Bulkhead — Limit concurrent calls (especially for slow services).
=> TimeLimiter — Enforce timeouts (especially async calls).
=> Fallbacks — Always provide meaningful fallback (cached data, default response).
=> Separate configs per service — Different thresholds for payment vs employee service.
=> Monitoring — Expose Resilience4j metrics via Actuator → monitor in Prometheus/Grafana.
=> Combine with Gateway — Gateway for edge protection (rate limit, circuit breaker), Feign for internal calls (retry, circuit breaker).

How They Work Together (Best Practice Flow)

1. Client → Gateway

=> Gateway: Rate limiting, authentication, circuit breaker (edge protection).
=> Routes to backend service via Eureka + LoadBalancer.

2. Gateway → Backend Service

=> Gateway forwards the request with its own non-blocking HTTP client (no Feign at this hop).
=> The route is wrapped with Resilience4j through the CircuitBreaker filter (circuit breaker, timeout).

3. Backend → Another Service

=> Again via Feign + Resilience4j.

Result : Full resilience stack — edge protection (Gateway) + internal fault tolerance (Feign + Resilience4j)