Introduction
A back of the envelope calculation is commonly used to evaluate different architectural designs. Although the optimal design depends on the system requirements, a back of the envelope calculation makes it possible to compare designs without prototyping them. It is also known as ‘capacity planning.’ In general, measuring system performance and performing back of the envelope calculations is important in any deployment that must satisfy a service level agreement (SLA). The calculation helps estimate request and response sizes, database size, cache size, and the number of microservices, load balancers, and so on. It is always a good idea to estimate the scale of the system before starting any high-level or low-level design. The following are some common reasons to perform back of the envelope calculations:
To decide whether the database should be partitioned.
To evaluate the feasibility of a proposed architectural design.
To identify potential bottlenecks in the system.
How is a back of the envelope calculation done?
A back of the envelope calculation supports efficient resource management and helps meet scalability requirements. It is important to note that neither the steps nor the results are 100% accurate. The purpose is to reach a ballpark figure more quickly than a formal calculation would. The different approaches to scaling a web service are:
Vertical Scaling
Horizontal Scaling
Diagonal Scaling
Vertical Scaling
Vertical Scaling is achieved by adding CPU cores, memory, and disk storage to a single server. However, the server eventually reaches a saturation point where adding more resources no longer increases throughput.
Horizontal Scaling
Horizontal Scaling is a method of adding more machines to scale the web service. Its drawback is increased operational complexity, such as synchronization challenges like maintaining consistent state across the replicated servers.
Diagonal Scaling
Diagonal Scaling is a combination of both vertical and horizontal scaling. It brings the best of both worlds by increasing or upgrading the components on a single server and replicating the server.
High Availability
High Availability is the ability of a service to remain reachable and not lose data when a failure occurs. It is typically measured as a percentage of uptime over a given period, where uptime is the amount of time that a system or service is available and operational.
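To make the uptime percentage concrete, the sketch below converts an availability target into a yearly downtime budget. The function name `allowed_downtime_seconds` and the sample targets are illustrative assumptions, not figures from a particular SLA.

```python
def allowed_downtime_seconds(availability_percent: float, period_days: float = 365) -> float:
    """Return the downtime budget, in seconds, for a given uptime percentage."""
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * (1 - availability_percent / 100)


# Sample targets (assumed for illustration only).
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_seconds(target) / 60
    print(f"{target}% uptime -> about {minutes:.0f} minutes of downtime per year")
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is why higher availability targets usually call for more replication.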
Real world back of the envelope calculation
The performance of a system is not directly proportional to the system’s capacity. In layman’s terms, the performance of the system depends on the asymptotic complexity of its implementation. Obviously, that statement is oversimplified, and I am aware that real-world systems have a wide range of facets. While asymptotic complexity measures algorithmic efficiency, practical performance also hinges on factors like latency, hardware constraints, I/O operations, and concurrency overheads. When distributed nodes lose connectivity, the trade-off between consistency and availability (CAP theorem) affects uptime. Strategies like gossip protocols or eventual consistency help.
What to calculate?
Number of servers
RAM
Storage capacity
Formula
$$x \hspace{1mm} million \hspace{1mm} users \hspace{1mm} * \hspace{1mm} y \hspace{1mm} MB \hspace{1mm} = xy \hspace{1mm} TB$$
This works because million and MB contribute 6 zeroes each, giving 12 zeroes in total, which is a TB. Similarly, 5 million users * 2 KB = 10 GB (million has 6 zeroes and KB has 3 zeroes, giving 9 zeroes, which is a GB).
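The zero-counting rule can be checked with a short sketch. It assumes decimal units (1 KB = 10^3 bytes, and so on), and the helper name `total_size` is hypothetical.

```python
# Decimal (power-of-ten) unit exponents, e.g. 1 KB = 10**3 bytes.
UNIT_EXPONENT = {"B": 0, "KB": 3, "MB": 6, "GB": 9, "TB": 12, "PB": 15}


def total_size(count: int, size: float, unit: str) -> str:
    """Multiply an item count by a per-item size and pick a readable unit."""
    total_bytes = count * size * 10 ** UNIT_EXPONENT[unit]
    # Try the largest unit first and stop at the first one that fits.
    for name in ("PB", "TB", "GB", "MB", "KB"):
        if total_bytes >= 10 ** UNIT_EXPONENT[name]:
            return f"{total_bytes / 10 ** UNIT_EXPONENT[name]:g} {name}"
    return f"{total_bytes:g} B"


print(total_size(5_000_000, 2, "KB"))  # 5 million users * 2 KB
print(total_size(3_000_000, 4, "MB"))  # 3 million users * 4 MB
```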
Example: Designing a ride-sharing service like Uber
Estimate Reasonable Values
Number of daily rides: Let's assume a major city like New York has around 5 million daily rides.
Average ride duration: 20 minutes
Number of drivers: 1 million drivers
Average driver availability: 8 hours per day
Peak hour requests: Let's assume 10,000 requests per second during peak hours.
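As a sanity check on these estimates, the sketch below derives two averages from the daily ride count and ride duration. It assumes demand is spread evenly over the day, which real traffic is not, so the results are only a baseline to compare against the assumed peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

daily_rides = 5_000_000   # from the estimates above
avg_ride_minutes = 20     # from the estimates above

# Average rate at which new rides start, if demand were spread evenly.
avg_rides_per_second = daily_rides / SECONDS_PER_DAY

# Average number of rides in progress at once (arrival rate x duration).
avg_concurrent_rides = daily_rides * avg_ride_minutes / MINUTES_PER_DAY

print(f"~{avg_rides_per_second:.0f} new rides per second on average")
print(f"~{avg_concurrent_rides:,.0f} rides in progress on average")
```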
Data Storage Requirements
User Profiles
Assume 5 million users, matching the daily ride count, and 100 KB per user (name, contact, payment, ride history, etc.)
$$5 \hspace{1mm}million \hspace{1mm} users * 100 \hspace{1mm} KB/user = 500 \hspace{1mm} GB$$
Driver Profiles
Assume 50KB per driver (similar data to user profiles)
$$1 \hspace{1mm} million \hspace{1mm} drivers * 50 \hspace{1mm} KB/driver = 50 \hspace{1mm} GB$$
Ride History
Assume 1KB per ride (basic trip details)
$$5 \hspace{1mm} million \hspace{1mm} rides/day * 1 \hspace{1mm} KB/ride = 5 \hspace{1mm} GB/day$$
For a month (30 days):
$$5 \hspace{1mm} GB/day * 30 \hspace{1mm} days = 150 \hspace{1mm} GB$$
Driver Locations
Assume 100 bytes per driver location update (latitude/longitude)
If every driver updates location every minute, around the clock (an upper bound, since drivers are assumed to be available only 8 hours per day):
$$100 \hspace{1mm} bytes/update * \hspace{1mm} 1 \hspace{1mm} million \hspace{1mm} drivers * \hspace{1mm} 60 \hspace{1mm} updates/hour * 24 \hspace{1mm} hours/day = 144 \hspace{1mm} GB/day$$
For a month (30 days):
$$144 \hspace{1mm} GB/day * 30 \hspace{1mm} days = 4320 \hspace{1mm} GB = 4.32 \hspace{1mm} TB$$
Calculate Approximate Storage Needs
Total Storage: User Profiles + Driver Profiles + Ride History + Driver Locations
~ 500GB (users) + 50GB (drivers) + 150GB (ride history) + 4.32TB (driver locations)
~ 5TB (approximate storage needs)
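The arithmetic above can be reproduced with a short sketch. It assumes decimal units and uses the same figures as the text. The variable names are illustrative.

```python
KB, GB, TB = 10**3, 10**9, 10**12  # decimal units, as used in the text

users, drivers, rides_per_day = 5_000_000, 1_000_000, 5_000_000
days = 30

storage_bytes = {
    "user profiles": users * 100 * KB,
    "driver profiles": drivers * 50 * KB,
    "ride history (30 days)": rides_per_day * 1 * KB * days,
    # 100 bytes per update, one update per minute, around the clock
    "driver locations (30 days)": drivers * 100 * 60 * 24 * days,
}

for name, size in storage_bytes.items():
    print(f"{name:<28}{size / GB:>8,.0f} GB")

total_bytes = sum(storage_bytes.values())
print(f"{'total':<28}{total_bytes / TB:>8.2f} TB")
```

Driver locations dominate the total, so the update interval and retention period are the assumptions most worth revisiting.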
Server and RAM Considerations (High-Level)
Servers: The number of servers depends on factors like traffic load, data replication, and fault tolerance. A distributed system with multiple servers would be required.
RAM: Each server needs sufficient RAM to handle incoming requests, process data, and store frequently accessed data in memory (caching).
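A rough server count can be sketched from the peak request rate. The per-server capacity and the redundancy factor below are hypothetical placeholders, not measurements. In practice they would come from load testing.

```python
import math


def servers_needed(peak_rps: int, rps_per_server: int, redundancy: float = 1.0) -> int:
    """Servers required to absorb peak load, with optional spare capacity."""
    return math.ceil(peak_rps / rps_per_server * redundancy)


peak_rps = 10_000      # peak load from the estimates above
rps_per_server = 500   # ASSUMPTION: hypothetical per-server capacity
redundancy = 1.5       # ASSUMPTION: 50% headroom for failures and spikes

print(servers_needed(peak_rps, rps_per_server, redundancy))
```

Rounding up with `math.ceil` keeps the estimate on the safe side, since a fractional server cannot be provisioned.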
Disclaimer: This is a simplified back-of-the-envelope calculation. For a real-world system design, you would need to conduct more in-depth research and analysis.
References
Back of the envelope (2023), systemdesign.one