Queuing Theory Essentials

Fundamental concepts, definitions, and mathematical models for system optimization.

Fundamental Parameters

To understand a queue, we need to know how fast people arrive (λ), how fast they are served (μ), and how many people can do the work (c).
  • λ (Lambda) Arrival Rate: The average number of customers arriving per unit of time (e.g., 10 per hour).
  • μ (Mu) Service Rate: The average number of customers a server can handle per unit of time.
  • c (Servers) Server Count: The number of parallel service channels available in the system.
  • ρ (Rho) Utilization Factor: The fraction of time a server is busy. For stability in infinite systems, ρ < 1.
  • K (Capacity) System Capacity: The maximum number of customers allowed in the system (queue + service).

Kendall's Notation

This is a shorthand code used to describe any queue quickly, like a recipe name for a specific type of problem.

Standardized shorthand for describing queuing models in the format: A / B / c / K / m / Z

Symbol Meaning Common Values
A Arrival Distribution M, D, G
B Service Distribution M, D, G
c Number of Servers 1, 2, 3, ... (Parallel channels)
K System Capacity ∞ (Infinite), or finite integer (e.g., 10)
m Source Population ∞ (Infinite), or finite pool sizes

Distribution Basics

Poisson Arrival

Random Arrivals

When arrivals are independent and occur randomly at a constant average rate. The number of arrivals in a time interval follows a Poisson Distribution.

Exponential Service

Markovian Property

The service time distribution is "memoryless," meaning the time remaining to complete a task does not depend on how much time has already elapsed.

Deterministic

Constant Time

Service or Arrivals occur at perfectly fixed intervals. Variance is zero (σ2 = 0). Common in automated machinery.

Queue Discipline & Priority

FIFO

Standard Queuing

First-In-First-Out: Customers are served in the exact order they arrive. This is the most common and "fair" discipline for general service systems.

Non-Preemptive (NP)

Queue Jumping

Priority over line: High-priority customers move to the front of the queue but do not interrupt someone currently being served.

Preemptive (PR)

Immediate Service

Service Interruption: High-priority customers can stop the current service session to be handled immediately. Common in emergency medical triage.

Performance Metrics

Average Queue Length (Lq)

Expected number of customers waiting in the queue.

Average System Length (Ls)

Expected number of customers in the entire system.

Average Waiting Time (Wq)

Expected time a customer spends waiting in the queue.

Average Turnaround Time (Ws)

Total time spent in the system (Wait + Service).

Core Performance Formulas

M/M/1

Single Server Markovian

Lq = ρ21 - ρ
Real-World Use Case Bank teller with a single window, small coffee shops, or a single repair technician.
M/M/c

Multi-Server Markovian

P0 = [ ∑(λ/μ)nn! + (λ/μ)cc!(1-ρ) ]-1
Real-World Use Case Call centers with multiple agents, supermarket checkout lines, or hospital emergency rooms.
M/M/1/K

Finite Capacity Single Server

P0 = 1 - ρ1 - ρK+1 PK = ρK P0
Real-World Use Case Single repair shop with limited parking space for cars waiting for service.
M/M/c/K

Finite Multi-Server Markovian

Pf = PK = (λ/μ)Kc!(cK-c) P0
Real-World Use Case Parking lots with fixed spaces, buffer zones in assembly lines, or server clusters.
M/D/1

Deterministic Service

Lq = ρ22(1 - ρ)
Real-World Use Case Automated car wash, robotic assembly station, or periodic scheduled maintenance.
G/G/c

General Multi-Server (Approx)

WqP(L≥c)cμ(1-ρ) · Ca2 + Cs22
Real-World Use Case Manufacturing systems with non-standard arrival and service distributions.
M/G/1

General Service

Lq = λ2σ2 + ρ22(1 - ρ)
Real-World Use Case Post offices where different tasks (mailing, passport, etc.) take highly variable times.

Operational Efficiency & Costs

The Stability Rule

For any system with infinite capacity, the Stability Condition must be met: λ < cμ (Arrival rate must be less than total service capacity). If this is violated, the queue grows to infinity!

Wait Cost vs. Service Cost

Management must balance two opposing costs:

  • Service Cost: Cost of providing service (paying servers, electricity). Increases as c increases.
  • Waiting Cost: Cost of customer dissatisfaction or lost productivity. Decreases as c increases.

Total Cost = (c × Cserver) + (Ls × Cwait)

System Optimization

The "Sweet Spot" is the number of servers where the Total Cost is at its minimum value. This simulator helps you find that exact point by modeling different configurations.

System Failure Analysis (Pf)

Blocking Probability (Pf)

In finite capacity systems (K), arriving customers are blocked (dropped) if the system is full. This is often referred to as "System Failure" in certain engineering contexts.

  • Definition Pf is the probability that an arriving customer enters a system that already contains K customers.
  • Effective Rate λeff = λ(1 - Pf) is the actual rate at which customers enter and are served.

Operational Laws

Little's Law is a simple rule that says the number of people in a shop depends on how fast they enter and how long they stay.

Little's Law

The long-term average number of customers in a stationary system L is equal to the long-term average effective arrival rate λ multiplied by the average time a customer spends in the system W.