← Back to CSIR NET

Reliability & Queuing Theory

From failure rate analysis and life testing to Markovian queuing models and inventory optimization — the mathematics of system performance and operations research.

Reliability Concepts Life Testing System Reliability Elementary Queues Markovian Models Inventory Models
0 of 6 units completed0%
01
Reliability Concepts

Fundamental reliability measures: reliability function, hazard rate, failure rate classes, and mean time to failure.

🔧 Reliability & Hazard Functions
Definition — Reliability Function For a non-negative random variable \(T\) representing lifetime, the reliability function (survival function) is: \[R(t) = P(T > t) = 1 - F(t) = \int_t^\infty f(u)\,du\] where \(F(t)\) is the CDF and \(f(t)\) is the PDF. Properties: \(R(0) = 1\), \(R(\infty) = 0\), \(R\) is non-increasing.
Definition — Hazard Function (Failure Rate) The hazard function is the instantaneous failure rate given survival to time \(t\): \[h(t) = \frac{f(t)}{R(t)} = -\frac{d}{dt}\ln R(t)\] This gives the fundamental relationship: \[R(t) = \exp\!\left(-\int_0^t h(u)\,du\right) = e^{-H(t)}\] where \(H(t) = \int_0^t h(u)\,du\) is the cumulative hazard function.
Definition — Failure Rate Classes
  • IFR (Increasing Failure Rate): \(h(t)\) is non-decreasing — aging components.
  • DFR (Decreasing Failure Rate): \(h(t)\) is non-increasing — burn-in improvement.
  • Bathtub curve: DFR in early life (infant mortality), constant in useful life, IFR in wear-out phase.
  • Constant failure rate: \(h(t) = \lambda\) implies \(T \sim \text{Exp}(\lambda)\) — the memoryless property.
Theorem — Mean Time to Failure & Mean Residual Life The MTTF is: \[\text{MTTF} = E[T] = \int_0^\infty R(t)\,dt\] The mean residual life (MRL) at age \(t\) is: \[m(t) = E[T - t \mid T > t] = \frac{1}{R(t)}\int_t^\infty R(u)\,du\] For the exponential distribution, \(m(t) = 1/\lambda\) for all \(t\) (memoryless).
Example
A component has reliability \(R(t) = e^{-0.002t^2}\) for \(t \ge 0\). Find the hazard function and classify the failure rate.
\(h(t) = -\frac{d}{dt}\ln R(t) = -\frac{d}{dt}(-0.002t^2) = 0.004t\). Since \(h(t) = 0.004t\) is strictly increasing for \(t > 0\), this is an IFR distribution (Rayleigh-type). MTTF \(= \int_0^\infty e^{-0.002t^2}dt = \frac{1}{2}\sqrt{\pi/0.002} \approx 22.36\).
02
Life Testing & Censoring

Censoring schemes, Kaplan-Meier estimation, and parametric life models.

📉 Censoring & Estimation
Definition — Types of Censoring
  • Type I censoring: The test ends at a pre-specified time \(T_0\). Units surviving beyond \(T_0\) are censored. The number of failures \(r\) is random.
  • Type II censoring: The test ends when a pre-specified number \(r\) of failures occurs. The test duration \(T_{(r)}\) is random.
  • Random censoring: Each unit \(i\) has a potential censoring time \(C_i\) independent of \(T_i\). We observe \(\min(T_i, C_i)\) and an indicator \(\delta_i = I(T_i \le C_i)\).
Theorem — Kaplan-Meier (Product-Limit) Estimator Let \(t_1 < t_2 < \cdots < t_k\) be the distinct observed failure times. Let \(d_j\) = number of failures at \(t_j\) and \(n_j\) = number at risk just before \(t_j\). The Kaplan-Meier estimator of \(R(t)\) is: \[\hat{R}(t) = \prod_{j: t_j \le t}\left(1 - \frac{d_j}{n_j}\right)\] This is a non-parametric, consistent estimator. Its variance is estimated by Greenwood's formula: \[\hat{V}[\hat{R}(t)] = [\hat{R}(t)]^2 \sum_{j: t_j \le t}\frac{d_j}{n_j(n_j - d_j)}\]
Definition — Exponential Life Model Under Censoring For \(T_i \sim \text{Exp}(\lambda)\) with Type II censoring at \(r\) failures, the MLE of \(\lambda\) is: \[\hat{\lambda} = \frac{r}{\sum_{i=1}^r T_{(i)} + (n-r)T_{(r)}}\] The denominator is the total time on test (TTT). The quantity \(2\lambda \cdot \text{TTT}\) has a \(\chi^2_{2r}\) distribution, enabling exact confidence intervals.
Example
In a Type II censored test with \(n = 10\) items, 4 failures occur at times 100, 200, 350, 500. Estimate the mean life assuming an exponential model.
TTT \(= 100 + 200 + 350 + 500 + (10-4) \times 500 = 1150 + 3000 = 4150\). \(\hat{\lambda} = 4/4150 = 0.000964\). Mean life \(= 1/\hat{\lambda} = 4150/4 = 1037.5\).
03
System Reliability

Series, parallel, and k-out-of-n systems; coherent systems and structure functions.

🔗 System Configurations
Definition — Series & Parallel Systems For \(n\) independent components with reliabilities \(R_1, R_2, \ldots, R_n\):
  • Series system (fails if any component fails): \(R_s = \prod_{i=1}^n R_i\)
  • Parallel system (fails only if all fail): \(R_p = 1 - \prod_{i=1}^n (1 - R_i)\)
For identical components with reliability \(p\): \(R_s = p^n\), \(R_p = 1 - (1-p)^n\).
Definition — k-out-of-n System A k-out-of-n:G system works if at least \(k\) of \(n\) components function. For i.i.d. components with reliability \(p\): \[R_{k/n} = \sum_{j=k}^n \binom{n}{j}p^j(1-p)^{n-j}\] Special cases: \(n\)-out-of-\(n\) = series; \(1\)-out-of-\(n\) = parallel.
Theorem — Coherent Systems & Structure Functions A structure function \(\phi(\mathbf{x})\) maps component state vector \(\mathbf{x} = (x_1, \ldots, x_n) \in \{0,1\}^n\) to system state \(\{0,1\}\). A system is coherent if:
  • \(\phi\) is non-decreasing in each \(x_i\) (improving a component cannot harm the system)
  • Each component is relevant (there exists \(\mathbf{x}\) such that \(\phi(1_i, \mathbf{x}) \neq \phi(0_i, \mathbf{x})\))
Every coherent system can be expressed via minimal path sets \(P_j\) and minimal cut sets \(K_j\): \[\phi(\mathbf{x}) = \max_j \prod_{i \in P_j} x_i = \min_j \left[1 - \prod_{i \in K_j}(1-x_i)\right]\]
Definition — Standby Redundancy In standby redundancy, a backup component activates only when the primary fails. For two independent components with lifetimes \(T_1 \sim \text{Exp}(\lambda)\) and \(T_2 \sim \text{Exp}(\lambda)\) with perfect switching: \[R_{standby}(t) = P(T_1 + T_2 > t) = e^{-\lambda t}(1 + \lambda t)\] Standby redundancy is more reliable than active (parallel) redundancy when components are exponential.
Example
A bridge system has 5 components: 1-3 in series in the upper path, 2-4 in series in the lower path, and component 5 connecting nodes between 1&2 and 3&4. Each has reliability \(p = 0.9\). Find the system reliability.
The minimal path sets are \(\{1,3\}\), \(\{2,4\}\), \(\{1,5,4\}\), \(\{2,5,3\}\). Using inclusion-exclusion on the minimal paths (or equivalently computing from the structure function): \[R = 2p^2 + 2p^3 - 5p^4 + 2p^5\] At \(p = 0.9\): \(R = 2(0.81) + 2(0.729) - 5(0.6561) + 2(0.59049) = 1.62 + 1.458 - 3.2805 + 1.18098 = 0.97848\).
04
Elementary Queuing Models

Birth-death processes, Kendall's notation, and the foundations of queuing theory.

🚦 Queuing Foundations
Definition — Kendall's Notation A queuing system is described as \(A/B/c/K/N/D\) where:
  • \(A\) = inter-arrival time distribution (M = Markovian/exponential, D = deterministic, G = general)
  • \(B\) = service time distribution
  • \(c\) = number of servers
  • \(K\) = system capacity (default: \(\infty\))
  • \(N\) = calling population size (default: \(\infty\))
  • \(D\) = service discipline (default: FCFS)
Definition — Birth-Death Process A continuous-time Markov chain \(\{N(t)\}\) on \(\{0, 1, 2, \ldots\}\) with transition rates:
  • Birth rates: \(\lambda_n = \) rate of transition from state \(n\) to \(n+1\)
  • Death rates: \(\mu_n = \) rate of transition from state \(n\) to \(n-1\)
The steady-state probabilities satisfy the balance equations: \[\lambda_n p_n = \mu_{n+1} p_{n+1}, \quad n = 0, 1, 2, \ldots\] giving \(p_n = p_0 \prod_{i=0}^{n-1}\frac{\lambda_i}{\mu_{i+1}}\) with \(\sum p_n = 1\).
Theorem — Little's Law For any stable queuing system in steady state: \[L = \lambda W, \quad L_q = \lambda W_q, \quad W = W_q + \frac{1}{\mu}\] where \(L\) = expected number in system, \(L_q\) = expected number in queue, \(W\) = expected time in system, \(W_q\) = expected waiting time in queue, \(\lambda\) = effective arrival rate, \(\mu\) = service rate. This holds regardless of distributions or discipline.
Example
Customers arrive at a rate of 20 per hour and spend an average of 15 minutes in the system. Find \(L\), \(L_q\), and \(W_q\) if the service rate is 30 per hour.
\(W = 15/60 = 0.25\) hours. By Little's law: \(L = \lambda W = 20 \times 0.25 = 5\). \(W_q = W - 1/\mu = 0.25 - 1/30 = 0.25 - 0.0333 = 0.2167\) hours \(\approx 13\) minutes. \(L_q = \lambda W_q = 20 \times 0.2167 = 4.33\).
05
Markovian Queuing Models

M/M/1, M/M/c, M/M/1/K, and M/G/1 queues with performance measures.

📊 M/M/1 and Extensions
Theorem — M/M/1 Queue Poisson arrivals (rate \(\lambda\)), exponential service (rate \(\mu\)), single server. For \(\rho = \lambda/\mu < 1\) (stability): \[p_n = (1-\rho)\rho^n, \quad L = \frac{\rho}{1-\rho}, \quad L_q = \frac{\rho^2}{1-\rho}\] \[W = \frac{1}{\mu - \lambda}, \quad W_q = \frac{\rho}{\mu - \lambda} = \frac{\lambda}{\mu(\mu - \lambda)}\] The probability that the system is busy: \(P(\text{busy}) = \rho\).
Theorem — M/M/1/K Queue (Finite Capacity) With system capacity \(K\), steady-state exists for all \(\rho\): \[p_n = \begin{cases}\frac{(1-\rho)\rho^n}{1-\rho^{K+1}} & \rho \neq 1 \\ \frac{1}{K+1} & \rho = 1\end{cases}, \quad n = 0, 1, \ldots, K\] The effective arrival rate is \(\lambda_{\text{eff}} = \lambda(1 - p_K)\) since arrivals when the system is full are lost.
Theorem — M/M/c Queue (Multiple Servers) With \(c\) servers, \(\rho = \lambda/(c\mu) < 1\): \[p_0 = \left[\sum_{n=0}^{c-1}\frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!(1-\rho)}\right]^{-1}\] The Erlang-C formula gives the probability of waiting: \[C(c, c\rho) = P(\text{wait}) = \frac{(c\rho)^c}{c!(1-\rho)} \cdot p_0\] \[W_q = \frac{C(c,c\rho)}{c\mu(1-\rho)}, \quad L_q = \lambda W_q\]
Theorem — M/G/1: Pollaczek-Khinchine Formula For M/G/1 (general service distribution with mean \(1/\mu\) and variance \(\sigma_s^2\)): \[L_q = \frac{\rho^2(1 + \mu^2\sigma_s^2)}{2(1-\rho)} = \frac{\lambda^2 E[S^2]}{2(1-\rho)}\] where \(E[S^2] = \sigma_s^2 + 1/\mu^2\) is the second moment of service time. This reduces to the M/M/1 result when \(\sigma_s^2 = 1/\mu^2\).
Example
An M/M/1 queue has arrival rate \(\lambda = 4\)/hr and service rate \(\mu = 6\)/hr. Find the probability that there are more than 3 customers in the system.
\(\rho = 4/6 = 2/3\). \(P(N > 3) = \rho^4 = (2/3)^4 = 16/81 \approx 0.1975\).
This follows because \(P(N \ge n) = \rho^n\) in the M/M/1 queue, so \(P(N > 3) = P(N \ge 4) = \rho^4\).
06
Inventory Models

Deterministic and stochastic inventory models: EOQ, newsvendor, and reorder point policies.

📦 Deterministic Inventory Models
Theorem — Economic Order Quantity (Wilson Formula) For constant demand rate \(D\), ordering cost \(K\) per order, and holding cost \(h\) per unit per time: \[Q^* = \sqrt{\frac{2KD}{h}}, \quad TC(Q^*) = \sqrt{2KDh}\] The optimal order interval is \(T^* = Q^*/D\) and the number of orders per unit time is \(D/Q^*\). At the optimum, ordering cost equals holding cost.
Definition — EOQ with Shortages When backorders are allowed with shortage cost \(p\) per unit per time: \[Q^* = \sqrt{\frac{2KD}{h}} \cdot \sqrt{\frac{h+p}{p}}, \quad S^* = Q^* \cdot \frac{h}{h+p}\] where \(S^*\) is the maximum shortage level. As \(p \to \infty\), this reduces to the basic EOQ.
🎲 Stochastic Inventory Models
Theorem — Newsvendor Problem A single-period model with random demand \(D\), unit cost \(c\), selling price \(p\), and salvage value \(v\) (\(v < c < p\)). The optimal order quantity \(Q^*\) satisfies: \[F(Q^*) = \frac{p - c}{p - v} = \frac{c_u}{c_u + c_o}\] where \(c_u = p - c\) is the underage cost and \(c_o = c - v\) is the overage cost. This is the critical ratio solution.
Definition — (s, S) Policy & Reorder Point In continuous-review inventory with stochastic demand:
  • Reorder point: \(s = \mu_L + z_\alpha \sigma_L\) where \(\mu_L\) = mean demand during lead time, \(\sigma_L\) = std dev of demand during lead time, \(z_\alpha\) = safety factor.
  • Safety stock: \(SS = z_\alpha \sigma_L\), providing protection against demand uncertainty during lead time.
  • (s, S) policy: When inventory drops to or below \(s\), order up to level \(S\). This is optimal under certain cost structures (fixed + variable ordering cost).
Example
A newsvendor buys newspapers at $0.50 and sells at $1.00. Unsold papers are worthless. Daily demand is \(N(100, 20^2)\). Find the optimal order quantity.
Critical ratio: \(\frac{p - c}{p - v} = \frac{1.00 - 0.50}{1.00 - 0} = 0.5\). \(F(Q^*) = 0.5\), so \(Q^* = \mu = 100\) (the median of the normal distribution). Since the normal is symmetric, the newsvendor orders exactly the mean demand when the critical ratio equals 0.5.
Key Takeaways
Practice Problems
Problem 1
A system has Weibull lifetime with \(R(t) = e^{-(t/100)^{2.5}}\). Find the hazard function, MTTF, and classify the failure rate.
Show Solution ▼
\(h(t) = \frac{2.5}{100}\left(\frac{t}{100}\right)^{1.5} = \frac{2.5 t^{1.5}}{100^{2.5}}\). Since the exponent \(\beta = 2.5 > 1\), this is IFR. MTTF \(= 100 \cdot \Gamma(1 + 1/2.5) = 100 \cdot \Gamma(1.4) = 100 \times 0.8873 = 88.73\).
Problem 2
Five items are tested; failures at 50, 120, 200 and two items censored at 150, 180. Compute the Kaplan-Meier estimate \(\hat{R}(160)\).
Show Solution ▼
At \(t = 50\): \(n_1 = 5, d_1 = 1\), \(\hat{R} = 4/5\). At \(t = 120\): \(n_2 = 4, d_2 = 1\), \(\hat{R} = (4/5)(3/4) = 3/5\). Censored at 150, 180 (after 120, before next failure at 200). At \(t = 160\): \(\hat{R}(160) = 3/5 = 0.6\).
Problem 3
A 2-out-of-3:G system has component reliabilities \(R_1 = 0.9, R_2 = 0.8, R_3 = 0.7\). Compute the system reliability.
Show Solution ▼
\(R_{sys} = R_1 R_2 + R_1 R_3 + R_2 R_3 - 2R_1 R_2 R_3\) \(= 0.72 + 0.63 + 0.56 - 2(0.504) = 1.91 - 1.008 = 0.902\). This uses inclusion-exclusion on the event that at least 2 of 3 components work.
Problem 4
An M/M/2 queue has \(\lambda = 6\)/hr and \(\mu = 4\)/hr per server. Find \(p_0\) and the expected number in queue \(L_q\).
Show Solution ▼
\(\rho = \lambda/(c\mu) = 6/8 = 0.75\). \(c\rho = 1.5\). \(p_0 = \left[1 + 1.5 + \frac{1.5^2}{2(1-0.75)}\right]^{-1} = \left[1 + 1.5 + 4.5\right]^{-1} = 1/7 \approx 0.143\). \(C(2, 1.5) = \frac{1.5^2}{2 \times 0.25} \times p_0 = 4.5/7 \approx 0.643\). \(L_q = \frac{C(2,1.5) \times \rho}{1-\rho} = \frac{0.643 \times 0.75}{0.25} = 1.929\).
Problem 5
Annual demand is 10,000 units, ordering cost $100 per order, holding cost $2 per unit per year. Find the EOQ, total cost, and optimal number of orders per year.
Show Solution ▼
\(Q^* = \sqrt{2 \times 100 \times 10000 / 2} = \sqrt{1000000} = 1000\) units. \(TC = \sqrt{2 \times 100 \times 10000 \times 2} = \sqrt{4000000} = 2000\) per year. Orders per year \(= D/Q^* = 10000/1000 = 10\). Order interval \(= 365/10 = 36.5\) days.
Self-Assessment Quiz
1. The memoryless property \(P(T > s+t \mid T > s) = P(T > t)\) characterizes which distribution?
A Weibull
B Exponential
C Normal
D Lognormal
2. In Type II censoring with \(n\) items, the test ends when:
A A pre-specified time is reached
B A pre-specified number of failures occurs
C All items have failed
D Items are randomly withdrawn
3. For an M/M/1 queue with \(\rho = 0.8\), the expected number in the system is:
A 3.2
B 4
C 0.8
D 5
4. Little's Law \(L = \lambda W\) requires:
A Poisson arrivals
B Single server and FCFS discipline
C Only that the system is in steady state
D Exponential service times
5. In the EOQ model, at the optimal order quantity:
A Annual ordering cost equals annual holding cost
B Total cost is zero
C Holding cost dominates ordering cost
D Average inventory equals demand
🔝