
In 2023, web application performance became a strategic competitive advantage for SaaS companies. Amid the global race for Core Web Vitals and rising UX expectations, Wrike implemented an edge-first caching architecture, reducing global response times to around 10 ms.
We spoke with Vadim Goncharov, Senior Web Developer, who designed this system, about how performance engineering directly impacts business metrics, SEO, and conversion in enterprise SaaS — turning technical expertise into measurable company value.
Vadim, this year, web application performance has become a critical competitive factor for SaaS companies. How do you see the role of website performance for the business, and why did global response time become a priority for Wrike at this particular moment?
Performance is a business metric. A SaaS marketing website is the first point of contact with a potential customer: if the page loads slowly, users will leave before they even see the product.
Google factors Core Web Vitals into rankings, and for Wrike, operating in a competitive space alongside products like Monday.com, Asana, and Jira, every position in search translates into tens of thousands of leads.
Our website was serving 6–10 million unique visitors per month across 13,000+ pages. A 25-person marketing department was constantly launching new landing pages and A/B tests — and repeatedly ran into the same issue: the site performed poorly for users in regions like Asia, Latin America, and Australia.
A slow website directly limited the scalability of paid channels: driving more traffic to a slow site meant increasing costs while conversion dropped.
Within the web team, I promoted the principle “Performance is a feature.” Marketing would come with requirements for new components and landing pages; I would run them through performance reviews and propose solutions that addressed business needs while also improving speed where it mattered.
Ultimately, this led to measurable conversion growth — at enterprise SaaS scale, that means real money.
What technical and business challenges did the team face, and why could the existing web infrastructure no longer deliver the required level of speed and reliability?
Caching was built on Nginx with Lua. It was fairly reliable but costly to maintain: a niche language, logic scattered across Lua scripts and Nginx configs, and a growing body of code that was hard to debug and evolve.
Cache invalidation at the scale of 13,000+ pages and 5–6 subprojects was fragile — when marketing updated content in the CMS, there was no reliable way to invalidate specific pages without risking others.
But the main issue was structural: the origin server was located in North America. Nginx cached pages locally, but on a cache miss the request had to travel to the data center — for users in Asia, Europe, and Latin America, that meant hundreds of milliseconds of network latency alone. Caching helped US users; for everyone else, it didn’t.
We were essentially hitting the limits of the traditional caching model, where even with a CDN, all logic and dependency on the origin remain centralized. Any cache miss meant an expensive round trip to the server.
At that point, it was clear we needed an architectural shift toward an edge-first approach, where request processing and caching happen as close to the user as possible, not around the origin.
From a business perspective, a slow site was hurting SEO rankings and reducing organic traffic. A 20-person web team — four of whom focused on infrastructure, performance, and dev tools — was spending resources maintaining the Lua solution instead of building new features.
How did you arrive at the decision to adopt an edge-first architecture, and why did it prove superior to traditional caching approaches?
The classic approach is caching in front of the server. But there’s a problem: “in front of the server” still means there’s a server. On a cache miss, the request goes to the origin — and for a user in Singapore, that’s an additional 200–300 ms just in RTT. At the scale of 13,000+ pages, cache misses are inevitable: pages get evicted, TTL expires, marketing updates content.
But if 85–90% of the content on a marketing site is static or semi-static, why go to the origin at all? Cloudflare Workers allow you to execute logic directly on edge nodes in 300+ locations worldwide. A Worker receives a request, checks the edge cache, and if the page is there — it’s delivered in milliseconds. The origin never even sees the request.
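To make the model concrete, here is a minimal sketch of such a Worker (a simplified illustration of the flow, not our production code, which added route classification and a fallback layer behind this):

```typescript
// Minimal edge-first Worker sketch: serve from the edge cache when possible,
// forward to the origin only on a miss. Simplified illustration.
export default {
  async fetch(request: Request, env: unknown, ctx: ExecutionContext): Promise<Response> {
    const cache = caches.default; // the cache of the edge location handling this request

    if (request.method === "GET") {
      // Edge hit: the response leaves in milliseconds and the origin
      // never sees the request.
      const hit = await cache.match(request);
      if (hit) return hit;
    }

    // Edge miss: forward to the origin (in Wrike's setup, the Nginx layer).
    const response = await fetch(request);

    // Store a copy at the edge without delaying the user's response.
    if (request.method === "GET" && response.ok) {
      ctx.waitUntil(cache.put(request, response.clone()));
    }
    return response;
  },
};
```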
The key difference from a “regular CDN” is that Workers handle routing logic at the edge. At Wrike, both the marketing website and the product application lived under the same domain. Workers determined what to cache and what to pass through to the origin — marketing pages were aggressively cached, while product routes and APIs were passed through. This also improved maintainability: instead of Lua scripts in Nginx configs, we had standard JavaScript with a modular structure that could be split into files, tested, and evolved.
At the same time, we immediately identified a potential risk: some API requests could be accidentally cached at the edge and return data to other users if authorization headers and cookies weren’t handled properly. This became one of the reasons we designed strict route classification and a two-level caching system — to protect user data while accelerating the site.
It also became clear that a single edge cache layer wouldn’t be enough: at scale and with active marketing operations, cache misses, warm-up, and invalidation are inevitable. That’s why we designed a two-level caching system — with edge as the primary layer and Nginx as fallback — along with a dedicated tool for cache management and warming, so marketing could control content freshness without full invalidation.
Could you elaborate on the design and implementation of the two-level cache and Cache Heater?
As I mentioned, Cloudflare edge is the primary layer, and Nginx is the fallback. A Worker at the edge processes the request: if the page exists in the edge cache, it’s returned instantly; if not, the request goes to Nginx, which checks its own cache. Only if both layers miss does the request reach the application. Under normal conditions, the origin receives minimal traffic.
The most complex part was route classification. On wrike.com, both public landing pages and an authenticated application were hosted together. The Worker had to know, for example, that /pricing is a static marketing page, while /app/* or any request carrying an authorization token is product traffic that must not be cached. This required rethinking the entire routing logic, not just doing a straightforward migration.
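In simplified form, the classification looks something like this (the prefix lists are illustrative examples, not the actual registry):

```typescript
// Illustrative route classification at the edge: marketing pages are cached
// aggressively, product and API traffic passes straight through to the origin.
const PASS_THROUGH_PREFIXES = ["/app/", "/api/"]; // hypothetical product prefixes

function classify(request: Request): "edge-cache" | "origin" {
  const { pathname } = new URL(request.url);

  if (request.method !== "GET") return "origin";
  if (PASS_THROUGH_PREFIXES.some((p) => pathname.startsWith(p))) return "origin";
  if (request.headers.has("Authorization")) return "origin"; // authenticated traffic

  // Everything else, e.g. static marketing pages like /pricing, is cacheable.
  return "edge-cache";
}
```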
The migration was gradual, since a site with a 99.9% SLA can’t just be switched over. Traffic was shifted step by step, with Nginx Lua running in parallel as fallback and metrics monitored at each stage. The entire project — from analysis to full deployment — took 3–4 months.
Separately, I developed Cache Heater — a service built with NestJS and an Angular UI. It pulls pages from the sitemap of all subprojects and allows granular cache warming and invalidation: a single page, a section, or the entire site. After static deployments, warming is triggered automatically. When marketing updates content in the CMS, they can click a button and warm the required section. This eliminated the common issue of “the page was updated, but users still see the old version” without requiring full cache invalidation.
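Conceptually, the warming loop at the heart of the service is simple. A stripped-down sketch follows; the sitemap URL and marker header are illustrative, and the real service adds queuing, retries, invalidation calls, and the Angular UI on top:

```typescript
// Simplified sketch of a Cache Heater warming pass: pull page URLs from a
// sitemap and re-request them so both cache layers are repopulated.
import { XMLParser } from "fast-xml-parser";

async function warmSection(sitemapUrl: string, sectionPrefix: string): Promise<void> {
  const xml = await (await fetch(sitemapUrl)).text();
  const parsed = new XMLParser().parse(xml);

  // A sitemap with a single entry parses to an object, not an array.
  const entries = Array.isArray(parsed.urlset.url) ? parsed.urlset.url : [parsed.urlset.url];
  const urls: string[] = entries.map((e: { loc: string }) => e.loc);

  // Warm only what was requested: a single page, a section, or "/" for everything.
  for (const url of urls.filter((u) => new URL(u).pathname.startsWith(sectionPrefix))) {
    await fetch(url, { headers: { "X-Cache-Warm": "1" } }); // a plain GET refills the caches
  }
}

// Example: after a CMS update, warm just the pricing section.
// await warmSection("https://www.wrike.com/sitemap.xml", "/pricing");
```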
What results did you achieve after implementing the new system? How much did global response time improve, and how did it impact UX, marketing metrics, and conversion?
Response time improved 2.5x: pages were delivered in around 10 ms worldwide, regardless of region. In parallel, I optimized the database — monitoring slow queries, adding indexes, and integrating checks into CI/CD — which added another 20% improvement.
For marketing, the results were tangible: Core Web Vitals moved into the green zone, and ad campaigns stopped losing conversion due to slow pages. Overall, we recorded a 5–7% increase in conversion — this was a combined effect of improved speed and new landing builder components that I developed in parallel.
The number of production incidents dropped by 15%. The edge architecture reduced load on the origin: less strain, fewer situations where the backend struggled under peak traffic. The site consistently maintained a 99.9% SLA with 6–10 million monthly visitors.
You mentioned that during migration there was an issue with API caching and user data exposure. How did you identify and resolve it?
This happened because of what I mentioned earlier: marketing and product coexisting under the same domain. Wrike.com had 3 subdomains and over 13,000 URIs: landing pages, product routes, APIs, subpages. Every URI had to be covered by a rule — cache or pass to origin.
The issue was discovered during release: on some endpoints that should not be cached, the Cache-Control: no-store, no-cache header was not explicitly set. Cloudflare’s default behavior is to cache responses for 120 minutes if no header is present.
During testing, we validated other endpoints where headers were correctly configured, so the issue went unnoticed. In production, we saw cf-cache-status: HIT on responses that should have been personalized — the Worker had cached them and could return them to another user. We fixed it quickly, and there was no impact.
The solution was systemic: I created a full registry of routes that must not be cached, grouped them by characteristics — authentication cookies, Authorization headers, product prefixes — and added a validation layer in Workers before any cache lookup. If a request carries signs of an authenticated session, it is guaranteed to go to the origin. Additionally, I ensured that all non-cacheable endpoints explicitly set Cache-Control headers. This check became a mandatory checkpoint for any changes in Worker logic.
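In code, the guard boils down to something like this (the cookie names are hypothetical, and the real registry grouped routes by several traits):

```typescript
// Sketch of the validation layer that runs before any cache lookup: requests
// that look authenticated are guaranteed to bypass the cache, and their
// responses are stamped with an explicit directive so a missing header can
// never fall back to cache-by-default behavior.
function looksAuthenticated(request: Request): boolean {
  const cookies = request.headers.get("Cookie") ?? "";
  return (
    request.headers.has("Authorization") ||
    /(?:^|;\s*)(session|auth_token)=/.test(cookies) // hypothetical cookie names
  );
}

async function handle(
  request: Request,
  serveCacheable: (r: Request) => Promise<Response>, // the cache path from the earlier sketch
): Promise<Response> {
  if (looksAuthenticated(request)) {
    const origin = await fetch(request); // straight to the origin, no cache lookup
    const headers = new Headers(origin.headers);
    headers.set("Cache-Control", "no-store, no-cache");
    return new Response(origin.body, {
      status: origin.status,
      statusText: origin.statusText,
      headers,
    });
  }
  return serveCacheable(request);
}
```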
More broadly, how do you determine which technical solutions will deliver the most business value and which might be unnecessary?
I always start with the question: which measurable metric will this move? Will conversion increase, costs decrease, or time-to-market improve? We need at least one of those.
At Wrike, I had a choice: incrementally improve Nginx Lua or redesign the caching architecture. The incremental path is easier. But even a perfectly tuned Nginx Lua wouldn’t solve global latency — the origin dependency is structural and can’t be fixed with tuning. Edge-first addressed the root cause, not the symptom.
There’s a simple filter: if a technical solution requires more than two sentences to explain its value to a director, you’re probably optimizing the wrong thing. “The site will load 2.5x faster worldwide” is clear. “We’re moving to edge computing with two-level cache invalidation” is much harder to sell. The best solutions in my experience are those whose impact can be expressed in money or time.
You mentioned participating in hackathons and mentoring junior developers. What key principles and skills do you try to pass on, especially in the context of high-load systems?
I would highlight three key principles.
First: measure first, then optimize. Junior developers often design “for scale” from day one — microservices, Kubernetes, queues — for a system with 50 users. At Wrike, with 13,000+ pages and millions of visitors, the architecture stayed relatively simple. Complexity should be added only when data proves it’s necessary.
Second: find the bottleneck — don’t guess it. In high-load systems, the problem is rarely where it seems. At Wrike, both the backend and network latency to the origin were bottlenecks. But latency was the most actionable: the cache already existed, it just needed to be delivered faster — so we moved it closer to the user. Without monitoring and profiling, architectural decisions become guesswork.
Third: working code matters more than perfect code. At a hackathon, you have 48 hours; in production, you have deadlines, budgets, and SLAs. Deliver first, then measure and improve. I’ve seen many projects fail to reach users because teams spent too long polishing the architecture.
Finally, what modern approaches and technologies in high-performance web solutions do you consider the most promising, and why?
Edge computing is no longer niche — it’s becoming the default approach for products with a global audience. Cloudflare Workers, Deno Deploy, Vercel Edge Functions — all major platforms are moving toward executing logic close to the user. I’ve seen it firsthand: moving to the edge resulted in around 10 ms response times globally, and users feel the difference.
Second: observability as part of the architecture from day one, not an afterthought. At Wrike, I integrated slow query monitoring into CI/CD — performance issues were caught before reaching production, not after user complaints.
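As an illustration, a CI gate of that kind can be as simple as running EXPLAIN ANALYZE on critical queries against a staging database and failing the build when a budget is exceeded. The sketch below assumes PostgreSQL and the node-postgres client; the queries, budget, and connection details are illustrative, not Wrike's actual setup:

```typescript
// Hedged sketch of a CI-stage slow-query gate.
import { Client } from "pg";

const BUDGET_MS = 50;
const CRITICAL_QUERIES = [
  "SELECT * FROM pages WHERE slug = 'pricing'", // illustrative query
];

async function main(): Promise<void> {
  const client = new Client({ connectionString: process.env.STAGING_DB_URL });
  await client.connect();
  try {
    for (const sql of CRITICAL_QUERIES) {
      const { rows } = await client.query(`EXPLAIN (ANALYZE, FORMAT JSON) ${sql}`);
      // FORMAT JSON returns one row whose "QUERY PLAN" column is the parsed plan.
      const executionMs: number = rows[0]["QUERY PLAN"][0]["Execution Time"];
      if (executionMs > BUDGET_MS) {
        throw new Error(`Query over ${BUDGET_MS} ms budget (${executionMs} ms): ${sql}`);
      }
    }
  } finally {
    await client.end();
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1); // non-zero exit fails the CI job
});
```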
Third: thoughtful cache management. TTL and manual invalidation work, but teams need tools that provide control without requiring code changes. Cache Heater, which I built for Wrike, was exactly about that — granular warming and invalidation through a UI aligned with the site structure. At the same time, I wouldn’t recommend full automation: if marketing updates a page 100 times per hour during editing, automatic invalidation will create more problems than it solves.