The recent ICC Champions Trophy 2025 match between India and Pakistan was a thrilling spectacle, which not only showcased Virat Kohli’s stellar performance but also set unprecedented records in digital viewership. While the match was a treat for fans, it was a technical nightmare for the JioHotstar engineering team.
The streaming platform reported a staggering 60.2 crore (602 million) views during this high-stakes encounter. With the introduction of the free mobile subscription feature, the engineering team had to prepare for nearly 50 million simultaneous streams—a feat no streaming service had attempted before.
This required a fundamental rethinking of JioHotstar’s infrastructure, from API handling to network optimisation, to ensure a seamless experience for millions of cricket fans. To sum it up, the team did a God-level job.
CDNs are the Key
At the heart of JioHotstar’s live streaming architecture is a complex but efficient system that ensures users across mobile, web, and connected TVs get a smooth experience. When a viewer requests a live stream, the request first passes through content delivery networks (CDNs), which act as an external API gateway.
These CDNs are crucial not just for distributing content efficiently but also for handling security checks and routing traffic intelligently. From there, an internal API gateway, supported by Application Load Balancers, directs the request to the appropriate backend service, which fetches data from either a managed or self-hosted database.
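To make that flow concrete, here is a simplified, assumed view of what the internal gateway's routing layer might look like; the path prefixes, service names and datastore choices are purely illustrative, not JioHotstar's actual configuration.

```python
# Hypothetical sketch of an internal API gateway's routing table: each
# request prefix maps to a backend service and the kind of datastore it
# reads from. All names here are illustrative assumptions.
ROUTES = {
    "/v1/playback": {"service": "playback-api", "datastore": "managed"},
    "/v1/user":     {"service": "user-profile", "datastore": "managed"},
    "/v1/match":    {"service": "scorecard-api", "datastore": "self-hosted"},
}

def route(path: str) -> dict:
    """Pick the backend a request is forwarded to, longest prefix first."""
    for prefix, backend in sorted(ROUTES.items(), key=lambda kv: -len(kv[0])):
        if path.startswith(prefix):
            return backend
    raise LookupError(f"no backend registered for {path}")

print(route("/v1/match/12345/score"))  # -> scorecard-api, self-hosted store
```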
With a spike in traffic anticipated during the final overs of the match, the traditional workflow wasn't going to be enough. One of the biggest issues was handling API calls at scale. Analysing traffic patterns from previous events, the team realised that not all API requests needed the same level of processing power.
Some, like live score updates and key match moments, could be easily cached and served with minimal computation, while others, like user authentication and content personalisation, required direct database queries.
This led to the creation of a new CDN domain dedicated to cacheable requests, allowing JioHotstar to reduce compute load and significantly improve response times.
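The sketch below illustrates that split at the origin: cacheable endpoints return CDN-friendly cache headers so the new domain can absorb repeated requests, while personalised endpoints are explicitly marked as uncacheable. The framework (Flask), endpoint paths and TTL values are assumptions for illustration, not JioHotstar's actual API.

```python
# Minimal sketch of separating cacheable and non-cacheable endpoints.
# Endpoint paths, TTLs and the idea of a dedicated cache-friendly CDN
# domain in front of the first route are illustrative assumptions.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/v1/match/<match_id>/score")
def live_score(match_id):
    # Same response for every viewer: a short TTL lets the CDN serve
    # millions of identical requests without touching the backend.
    resp = jsonify({"matchId": match_id, "score": "287/4", "overs": 45.2})
    resp.headers["Cache-Control"] = "public, s-maxage=5, stale-while-revalidate=10"
    return resp

@app.route("/v1/user/<user_id>/home")
def personalised_home(user_id):
    # Personalised content must never be cached at the edge.
    resp = jsonify({"userId": user_id, "rails": ["continue-watching", "trending"]})
    resp.headers["Cache-Control"] = "private, no-store"
    return resp

if __name__ == "__main__":
    app.run(port=8080)
```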
The internal API gateway, which serves as the front door for all requests, was particularly resource-intensive. To mitigate this, JioHotstar deployed high-throughput nodes (over 10 Gbps) and enforced topology spread constraints, ensuring that no single node handled too many API requests at once.
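The snippet below sketches what such a spread rule can look like as a Kubernetes topologySpreadConstraints block, built as a Python dict and dumped to YAML; the labels and skew values are illustrative assumptions rather than JioHotstar's real manifests.

```python
# A minimal sketch of topologySpreadConstraints for a gateway Deployment,
# expressed as a Python dict and printed as YAML (requires PyYAML).
# The app label and skew values are illustrative assumptions.
import yaml

gateway_pod_spec = {
    "topologySpreadConstraints": [
        {
            # Keep the pod count per node within a skew of 1, so no single
            # high-throughput node absorbs a disproportionate share of traffic.
            "maxSkew": 1,
            "topologyKey": "kubernetes.io/hostname",
            "whenUnsatisfiable": "DoNotSchedule",
            "labelSelector": {"matchLabels": {"app": "internal-api-gateway"}},
        },
        {
            # Also spread pods evenly across availability zones.
            "maxSkew": 1,
            "topologyKey": "topology.kubernetes.io/zone",
            "whenUnsatisfiable": "ScheduleAnyway",
            "labelSelector": {"matchLabels": {"app": "internal-api-gateway"}},
        },
    ]
}

print(yaml.safe_dump(gateway_pod_spec, sort_keys=False))
```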
Self-managed Kubernetes to EKS
While optimising traffic handling was a major step, JioHotstar also had to rethink how its cloud-based infrastructure scaled.
Previously, the platform relied on self-managed Kubernetes clusters, but these systems were already nearing their limits. As a result, JioHotstar migrated to Amazon Elastic Kubernetes Service (EKS), which offloaded the burden of cluster management to AWS and allowed the team to focus on optimising workloads.
However, migrating to EKS introduced new challenges, particularly around network throughput. One of the most pressing issues was NAT Gateway congestion—a bottleneck that limited the speed at which data could flow.
In a typical cloud setup, a single NAT Gateway per availability zone (AZ) handles traffic for multiple services. With millions of users streaming simultaneously, that shared gateway quickly becomes overloaded. To solve this, the team shifted to a subnet-level NAT Gateway configuration, distributing traffic more evenly across the network and eliminating the bottleneck.
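A rough boto3 sketch of that shift is shown below: each private subnet gets its own NAT Gateway and default route instead of sharing one per AZ. The subnet and route-table IDs are placeholders, and whether the team scripted the change this way is an assumption.

```python
# Sketch: move from one NAT Gateway per AZ to one per subnet, so each
# private subnet's outbound traffic has its own egress path.
# IDs below are fake placeholders; this is not JioHotstar's real setup.
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

# Hypothetical pairing: each private subnet gets a NAT Gateway hosted in a
# public subnet in the same AZ, plus its own route table.
subnet_pairs = [
    {"private": "subnet-0aaa1111", "public": "subnet-0bbb2222", "route_table": "rtb-0ccc3333"},
    {"private": "subnet-0ddd4444", "public": "subnet-0eee5555", "route_table": "rtb-0fff6666"},
]

for pair in subnet_pairs:
    # One Elastic IP and one NAT Gateway per subnet, instead of one per AZ.
    eip = ec2.allocate_address(Domain="vpc")
    nat = ec2.create_nat_gateway(SubnetId=pair["public"], AllocationId=eip["AllocationId"])
    nat_id = nat["NatGateway"]["NatGatewayId"]
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # Point this private subnet's default route at its own NAT Gateway,
    # so outbound traffic no longer funnels through a shared one.
    ec2.create_route(
        RouteTableId=pair["route_table"],
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )
```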
Even within Kubernetes, scaling wasn’t as simple as adding more nodes. During peak load testing, the engineering team discovered that several backend services were consuming up to 9 Gbps of bandwidth per node, creating uneven traffic distribution across clusters.
While infrastructure optimisations played a crucial role in enabling scale, network constraints nearly derailed the effort. During internal load tests, the team encountered a critical IP address shortage in its Kubernetes clusters.
Despite configuring private subnets across multiple AZs, JioHotstar found that it was unable to scale beyond 350 nodes—far below the 400+ required to support peak traffic. The culprit? Over-provisioned IP address allocations.
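The back-of-the-envelope sketch below shows how a CNI's warm-pool IP over-provisioning can cap node count well before the subnets look full; the instance ENI limits, subnet sizes and warm-pool settings are illustrative assumptions, not JioHotstar's actual numbers.

```python
# Back-of-the-envelope sketch of how over-provisioned pod IPs cap the node
# count. Instance ENI limits, subnet sizes and CNI warm-pool behaviour are
# illustrative assumptions, not JioHotstar's actual figures.

def max_nodes(usable_ips: int, ips_held_per_node: int) -> int:
    """How many nodes fit before the private subnets run out of IPs."""
    return usable_ips // ips_held_per_node

# Three /20 private subnets, minus the five addresses AWS reserves in each.
usable_ips = 3 * (4096 - 5)

# Default-ish behaviour: the CNI keeps a full spare ENI of warm IPs, so a
# node holds ~2 ENIs x 30 IPs even when far fewer pods are running.
print(max_nodes(usable_ips, 2 * 30))   # ~204 nodes before IPs run dry

# Tuned behaviour: cap the warm pool (e.g. via WARM_IP_TARGET in the AWS VPC
# CNI) so each node reserves roughly what its pods actually use, say ~20 IPs.
print(max_nodes(usable_ips, 20))       # ~613 nodes on the same subnets
```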
One of the final hurdles came from Kubernetes service discovery. While scaling beyond 1,000 pods, JioHotstar discovered a hard limit in Kubernetes’ endpoints API, which tracks network locations for services.
Once the limit was exceeded, Kubernetes truncated endpoint data, creating unpredictable traffic distribution issues. Though modern EndpointSlices offer a solution, JioHotstar’s API Gateway didn’t support them, forcing the team to vertically scale services to stay below the 1,000-pod threshold.
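That truncation is detectable: recent Kubernetes versions mark an over-capacity Endpoints object with the endpoints.kubernetes.io/over-capacity annotation, which a quick script against the API can surface. The service and namespace names below are placeholders.

```python
# A minimal check for the ~1,000-address Endpoints ceiling: count the
# addresses visible on a service's Endpoints object and look for the
# over-capacity annotation. Service/namespace names are placeholders.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

ep = v1.read_namespaced_endpoints(name="playback-api", namespace="prod")

addresses = sum(len(s.addresses or []) for s in (ep.subsets or []))
annotations = ep.metadata.annotations or {}
truncated = annotations.get("endpoints.kubernetes.io/over-capacity") == "truncated"

print(f"endpoint addresses visible to Endpoints consumers: {addresses}")
if truncated:
    print("WARNING: Endpoints truncated - gateways that don't read "
          "EndpointSlices will only see a subset of pods")
```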
Autoscaling Wasn’t Enough
Autoscaling alone struggles to handle sudden traffic surges. During major cricket matches, JioHotstar sees spikes of nearly 1 million users joining per minute, and if a star batsman gets out, traffic can drop by millions within the same minute. These swings put immense strain on backend services.
An unusual challenge here is that when users hit the back button instead of closing the browser, they are redirected to the homepage. If the homepage isn't designed to absorb that sudden surge of traffic, the redirect itself can trigger system failures.
There are additional concerns. What if AWS lacks the capacity in a specific AZ to provision servers? In such cases, autoscaling becomes ineffective. Even step-wise scaling, where 10 servers are added at a time with a target of scaling from 100 to 800, may be too slow to respond to real-time demand.
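The rough arithmetic below shows why such a step policy lags behind a match-day surge; the step size, cooldown and per-server capacity are assumed figures for illustration, not JioHotstar's real parameters.

```python
# Rough sketch of why reactive step scaling lags a 1-million-users-per-minute
# surge. Step size, cooldown and per-server capacity are assumed figures.

current_servers, target_servers = 100, 800
step_size = 10                 # servers added per scaling action
cooldown_seconds = 60          # wait between scaling actions
users_per_server = 10_000      # assumed concurrent users one server can hold
surge_per_minute = 1_000_000   # users joining per minute during a big match

steps_needed = (target_servers - current_servers) / step_size       # 70 steps
time_to_scale_min = steps_needed * cooldown_seconds / 60             # ~70 minutes
servers_needed_per_min = surge_per_minute / users_per_server         # 100 servers/min

print(f"Time to reach {target_servers} servers: ~{time_to_scale_min:.0f} min")
print(f"Servers needed per minute just to absorb the surge: {servers_needed_per_min:.0f}")
# The surge outpaces the step policy by an order of magnitude, which is why
# capacity has to be provisioned ahead of the match rather than reactively.
```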
With around 1 million requests per second and video bandwidth consumption of roughly 10 terabits per second, amounting to about 75% of India's total internet bandwidth, the scale of operations is staggering. Notably, even internet service providers (ISPs) struggle to deliver such massive traffic loads.
Streaming platforms frequently encounter these challenges during high-profile events like IPL finals, royal weddings, or political broadcasts. To prepare for such spikes, JioHotstar conducts extensive load testing using 3,000 machines, each equipped with 36 CPUs and 72GB of RAM, across multiple regions.
This rigorous testing process, known as ‘tsunami testing’, helps determine the breaking point of each service. The results are then used to plan infrastructure scaling effectively.
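For a flavour of what each machine in such a fleet might run, here is a minimal asyncio-based load generator sketch; the target URL, concurrency and duration are illustrative assumptions, not JioHotstar's actual tooling.

```python
# Minimal sketch of a per-machine load generator for a "tsunami"-style test:
# hold a fixed number of in-flight requests against one endpoint and report
# throughput. Target URL, concurrency and duration are assumptions.
import asyncio
import time

import aiohttp

TARGET_URL = "https://staging.example.com/v1/match/live/score"
CONCURRENCY = 500          # simultaneous in-flight requests per machine
DURATION_SECONDS = 60

async def worker(session: aiohttp.ClientSession, stop_at: float, counter: list) -> None:
    # Keep issuing requests until the test window closes.
    while time.monotonic() < stop_at:
        async with session.get(TARGET_URL) as resp:
            await resp.read()
            counter[0] += 1

async def main() -> None:
    stop_at = time.monotonic() + DURATION_SECONDS
    counter = [0]
    connector = aiohttp.TCPConnector(limit=CONCURRENCY)
    async with aiohttp.ClientSession(connector=connector) as session:
        await asyncio.gather(*(worker(session, stop_at, counter) for _ in range(CONCURRENCY)))
    print(f"requests completed in {DURATION_SECONDS}s: {counter[0]}")

if __name__ == "__main__":
    asyncio.run(main())
```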