Performance, part 4

In Part 1 and Part 2 of our performance series, we detailed how we boosted system responsiveness by engineering smaller, parallel requests and initiating them early.

Another factor influences latency: the physical distance from the user to the servers.

Where should the servers be located?

Across the globe, round-trip delays typically range from 30 to 500 milliseconds. This is significant, especially in chains of sequential requests, where these delays accumulate and degrade the user experience.

Theoretically, the best way to minimize latency is to deploy servers across multiple continents, ensuring each user connects to the nearest instance.

The Cool Maze backend is deployed in a single region. This choice is less complex and less expensive than a multi-region setup. As a consequence, users farther from that region may experience higher latency on some HTTPS requests.

Surprisingly, geographic distance has had less impact on performance in practice than other optimizations, such as the smaller, parallel, early-initiated requests described in Parts 1 and 2.

As a result, by the time a user scans a QR code, many background requests have already completed — often including the upload itself. Only the very last requests contribute a small, residual delay to what the user perceives as system responsiveness.