How I Reduced My Lambda First Request Latency from 6s to <100ms
Ever wondered why your AWS Lambda function sometimes takes forever to respond on that very first request? That’s the infamous cold start in action — and it’s not just your code’s fault. What you’re really seeing is the time AWS takes to prepare a fresh runtime environment before any of your code even runs. Here’s what happens step by step:
AWS obtains a new sandbox (secure, isolated container for your function).
Your code is downloaded onto the sandbox.
Your code is initialized (static blocks, class initializers, etc.).
Only after these are done does Lambda move on to actually execute your code. (This is a simplified version, and if you’re using VPC functions, there are some additional steps, but you get the general idea.)
Recently I was tasked with reducing the first-request latency of one of our services' Lambda functions from 6s to below 500ms (I went a lil extra and reduced it to under 100ms). Here are the steps I followed to achieve that.
Step 1: Provisioned Concurrency (going under 2s)
As a first step, I resorted to using Provisioned Concurrency. Provisioned concurrency keeps your Lambda functions "pre-initialized" and hyper-ready to serve requests. By pre-warming and holding execution environments open, AWS ensures your function doesn't pay the cost for steps 1, 2, and 3 above. The result: cold start spikes all but disappear from the client's perspective, and requests go straight to code execution instead of being held up by Lambda orchestration.
Result: Provisioned concurrency eliminates cold starts caused by AWS infrastructure. But if your own code does heavy initialization inside the handler, your users may still see an initial delay. In my case it reduced the first request latency from 6s to 1.8s.
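Provisioned concurrency is configured per function version or alias. A minimal sketch using the AWS CLI (the function name, alias, and count below are hypothetical placeholders, not my actual setup):

```shell
# Keep 5 execution environments pre-initialized for the "prod" alias.
# (my-function, prod, and 5 are all placeholder values.)
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod \
  --provisioned-concurrent-executions 5

# Verify that the environments have finished initializing.
aws lambda get-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod
```

Note that provisioned concurrency is billed for as long as it is configured, so size the count to your steady-state traffic rather than your peak.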
Step 2: Using Dagger and Static Initialization (going under 1s)
The Next Bottleneck: Even with provisioned concurrency, my function’s first request would still be slow, because expensive setup (e.g., Dagger injection, object creation) happened inside the handler method.
Old approach:
public APIGatewayProxyResponseEvent handleRequest(...) {
    initializeLambda(context);
    // ...
}
This meant that initialization always ran as part of the first invocation in any new sandbox.
The fix: Move all such initialization into static blocks or fields. Since static initializers run when the Lambda environment itself starts (i.e., well before a real user call), this eliminates the first-invoke delay.
New approach:
private static final MyService MY_SERVICE =
    DaggerAppComponent.builder()
        .appModule(new AppModule())
        .build()
        .myService();
Now, your handler uses pre-initialized dependencies without blocking user requests.
Result: After this change, my first-request latency dropped to 800ms, as Dagger and other dependencies were all ready before the first handler call.
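In miniature, the pattern looks like this (ExpensiveService here is a hypothetical stand-in for the Dagger-built dependency graph; a real handler would implement the RequestHandler interface from the Lambda Java runtime):

```java
// Minimal sketch of eager static initialization.
// ExpensiveService is a hypothetical stand-in for whatever
// Dagger would normally build during injection.
class ExpensiveService {
    ExpensiveService() {
        // Imagine Dagger component construction, config loading,
        // client creation, etc. happening here.
    }

    String process(String input) {
        return "handled: " + input;
    }
}

public class Handler {
    // Initialized once, during the Lambda init phase,
    // before any user request is routed to this sandbox.
    private static final ExpensiveService SERVICE = new ExpensiveService();

    // In real code this would implement
    // RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent>.
    public String handleRequest(String input) {
        // Per-request work only; no initialization on the hot path.
        return SERVICE.process(input);
    }
}
```

Because the static field is assigned at class-load time, the expensive constructor runs during the init phase that provisioned concurrency pre-warms, not during the first invocation.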
Step 3: Making a Describe Call to DynamoDB During Initialization (going below 100ms)
Even after static initialization, my Lambda’s first request would still take more than half a second. On further deep dive, I discovered that the first outbound HTTP call (to DynamoDB) was incurring significant setup costs (e.g., SSL handshake, new connection establishment).
The solution: Proactively make a dummy describeTable call to DynamoDB as part of static initialization, which primes the connection and completes the handshake before user traffic ever lands.
Example:
@Provides
DynamoDbClient provideDynamoDbClient() {
    final DynamoDbClient client = DynamoDbClient.builder()
        .httpClient(UrlConnectionHttpClient.builder().build())
        .region(region)
        .build();
    // Prime DDB connection
    client.describeTable(...);
    return client;
}
Result: After priming the connection up front, my Lambda served even the very first request in under 100ms — matching warm-invoke speeds.
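To see why this helps, here is a toy simulation with no AWS SDK involved: FakeDdbClient charges a one-time 300ms "handshake" on its first call, the way a real HTTP client pays for TLS setup and connection establishment, and a static initializer absorbs that cost before any request arrives.

```java
// Toy model of connection priming. The 300ms sleep is a stand-in
// for TLS handshake + connection establishment on the first
// outbound call; it is NOT how the real SDK behaves internally.
class FakeDdbClient {
    private boolean connected = false;

    long call() { // returns elapsed time in milliseconds
        long start = System.nanoTime();
        if (!connected) {
            try {
                Thread.sleep(300);   // simulated handshake cost
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            connected = true;        // later calls reuse the connection
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}

public class PrimingDemo {
    static final FakeDdbClient CLIENT = new FakeDdbClient();

    static {
        CLIENT.call(); // the "describeTable" priming call during init
    }

    static long firstRequest() {
        return CLIENT.call(); // already primed: no handshake cost
    }
}
```

An unprimed client pays the full handshake on its first call, while the primed one answers the first request at warm-invoke speed.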
Step 4: Key Lessons and Takeaways
Lambda cold start consists of both AWS’s environment setup and your initialization code. Provisioned Concurrency handles the former — your design fixes the latter.
Move all heavy initialization out of handlers and into static blocks or variables.
Prime connections to external services during cold start if those services are latency-sensitive.
It was a fun experience and I learned a lot. Hopefully you learned a thing or two as well. If you want to see more content like this, please subscribe to my newsletter.

