Monday, December 2, 2019

Performance Parameters for a System

Performance is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.

Depending on the context, this may involve achieving one or more of the following:
  • Short response time/low latency for a given piece of work
  • High throughput (rate of processing work)
  • Low utilization of computing resource(s)

Response Time / Latency

The time between a client sending a request and receiving the response. The response time is what the client sees: it includes the actual service time of the request as well as network and queuing delays.

Even if you make the same request again and again, you will see a slightly different response time on every try. In practice, for a service or application handling a variety of requests, the response time can vary a lot. One obvious reason is that a request from a user with a lot of data will be slower than one from a user with very little data. Other causes introduce random additional latency: the loss of a network packet and TCP retransmission, a garbage collection pause, a page fault forcing a read from disk, or other mechanical and network faults. That's why we need to think of response time not as a single number but as a distribution of values.

If the 95th percentile (p95) response time is 1.5 seconds, that means 95 out of 100 requests take less than 1.5 seconds, and 5 out of 100 requests take 1.5 seconds or more.
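As a rough sketch of what this looks like in practice, the Python snippet below times a number of identical requests and reports a few percentiles of the resulting distribution. The URL and sample count are placeholders, and a proper load-testing tool would be more appropriate for serious measurements.

    import time
    import urllib.request

    # Hypothetical endpoint; replace with the service you want to measure.
    URL = "http://localhost:8080/health"

    def measure_response_time(url: str) -> float:
        """Time one request as the client sees it: network + queuing + service time."""
        start = time.perf_counter()
        urllib.request.urlopen(url).read()
        return time.perf_counter() - start

    # Collect a distribution of response times rather than a single number.
    samples = sorted(measure_response_time(URL) for _ in range(100))

    def percentile(sorted_samples, p):
        """Nearest-rank percentile: the value below which p% of the samples fall."""
        index = min(len(sorted_samples) - 1, int(len(sorted_samples) * p / 100))
        return sorted_samples[index]

    for p in (50, 95, 99):
        print(f"p{p}: {percentile(samples, p) * 1000:.1f} ms")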

Low latency - achieving a short response time - is the most interesting aspect of performance, because it has a strong connection with physical (rather than financial) limitations.

In a distributed system, there is a minimum latency that cannot be overcome: the speed of light limits how fast information can travel, and hardware components have a minimum latency cost incurred per operation (think RAM and hard drives but also CPUs).


Throughput

The number of requests or records which can be processed per second, or the total time it takes to run a job on a dataset of a certain size.

There are tradeoffs involved in optimizing for any of these outcomes. For example, a system may achieve higher throughput by processing larger batches of work, thereby reducing per-request overhead. The tradeoff is longer response times for individual pieces of work, because each piece has to wait for its batch to be processed.
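To make the tradeoff concrete, here is a small simulation; the fixed overhead and per-item cost are made-up numbers, not measurements of any real system. Larger batches amortize the fixed overhead across more items, which raises throughput, but every item now waits for its whole batch.

    import time

    FIXED_OVERHEAD = 0.005   # assumed per-call overhead (e.g. a commit or network round trip)
    PER_ITEM_COST = 0.001    # assumed cost to process one item

    def process(batch):
        """Simulate handling one batch: a fixed overhead plus a per-item cost."""
        time.sleep(FIXED_OVERHEAD + PER_ITEM_COST * len(batch))

    def run(total_items: int, batch_size: int):
        start = time.perf_counter()
        for i in range(0, total_items, batch_size):
            process(range(i, min(i + batch_size, total_items)))
        elapsed = time.perf_counter() - start
        throughput = total_items / elapsed
        # An item's result is only available once its whole batch has been processed.
        per_item_latency = FIXED_OVERHEAD + PER_ITEM_COST * batch_size
        print(f"batch={batch_size:4d}  throughput={throughput:7.0f} items/s  "
              f"per-item latency~{per_item_latency * 1000:.1f} ms")

    for size in (1, 10, 100):
        run(1000, size)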

Resource Utilization

We want to make optimal use of hardware resources such as CPU, RAM, and network bandwidth; in other words, do more work with fewer resources. Keeping utilization in check helps when scaling the system.
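As a quick sketch of how you might observe utilization on a single host (this assumes the third-party psutil package is installed; a real deployment would export such numbers to a monitoring system instead):

    import psutil

    cpu = psutil.cpu_percent(interval=1)   # percent of CPU busy over a 1-second window
    mem = psutil.virtual_memory().percent  # percent of RAM in use
    net = psutil.net_io_counters()         # cumulative network bytes sent/received

    print(f"CPU: {cpu:.1f}%  RAM: {mem:.1f}%")
    print(f"Network: {net.bytes_sent} bytes sent, {net.bytes_recv} bytes received")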
