CHAPTER 01

Introduction to Performance Testing

Understanding the fundamentals of performance testing and why it's critical for modern QA and SDET roles.

What is Performance Testing?

Performance testing is a type of non-functional testing that evaluates how a system performs under various workload conditions. Unlike functional testing that verifies "what" the system does, performance testing validates "how well" it does it.

For QA and SDET professionals, performance testing ensures applications can handle expected user loads, respond quickly, and remain stable under stress. It's essential for preventing production failures that could impact user experience and business revenue.

💡 Why Performance Testing Matters

A one-second delay in page load time can result in a 7% reduction in conversions. Amazon found that every 100ms delay costs them 1% in sales. Performance testing helps prevent these costly issues before they reach production.

Types of Performance Testing

📊
Load Testing
Tests system behavior under expected user load. Validates response times and throughput meet requirements with normal traffic patterns.
💪
Stress Testing
Pushes system beyond normal capacity to find breaking points. Helps identify maximum operating capacity and failure modes.
⚡
Spike Testing
Validates system handles sudden, dramatic increases in load. Common for flash sales, breaking news, or viral content scenarios.
⏱️
Endurance Testing
Runs tests for extended periods (hours/days) to detect memory leaks, resource exhaustion, and degradation over time.

Key Performance Metrics

Response Time

Total time from sending a request to receiving the complete response. Critical for user experience. Measured in milliseconds or seconds.

Example Response Time Breakdown
// Typical API response time components:
DNS Lookup:          20ms
Connection:          30ms
SSL Handshake:       40ms
Server Processing:  150ms
Data Transfer:       60ms
─────────────────────────
Total Response Time: 300ms

Throughput

Amount of data or number of requests processed in a given time period. Measured in requests/second, transactions/second, or MB/second.

Concurrent Users

Number of virtual users simultaneously interacting with the system. Not the same as total users - focuses on those actively making requests.

Error Rate

Percentage of failed requests during testing. Acceptable rates vary by application but typically should stay below 1% under normal load.

Calculating Error Rate
// Error Rate Formula
Error Rate = (Failed Requests / Total Requests) × 100

// Example:
Total Requests: 10,000
Failed Requests: 75
Error Rate: (75 / 10000) × 100 = 0.75%

Hands-On Exercise

Scenario: You're testing an e-commerce checkout API that should support 500 concurrent users during peak hours.

Tasks:

  1. Define what performance metrics you would measure
  2. Determine acceptable thresholds for each metric
  3. Identify which type of performance test you would run first
  4. Plan what data you would include in your test report

✅ Sample Solution

Metrics: Response time (avg, 95th percentile), throughput (TPS), error rate, concurrent users

Thresholds: Avg response < 1.5s, 95th percentile < 2s, error rate < 0.5%, support 500 concurrent users

Test Type: Start with load testing to establish baseline, then stress testing to find limits

Interview Prep Questions

What's the difference between load testing and stress testing?
Answer: Load testing validates system performance under expected user loads to ensure it meets requirements. Stress testing pushes the system beyond normal capacity to identify breaking points and maximum limits. Load testing confirms "it works as expected," while stress testing finds "where it fails."
How do you determine the right number of concurrent users for load testing?
Answer: Analyze production analytics to find peak concurrent users. Consider business requirements (expected growth, marketing campaigns, seasonal peaks). Use Little's Law: Concurrent Users = Throughput × Response Time. For example, if you expect 100 requests/second with 3-second response time: 100 × 3 = 300 concurrent users.
CHAPTER 02

Performance Testing Lifecycle

A systematic approach to planning, executing, and analyzing performance tests in agile and DevOps environments.

The Performance Testing Process

Performance Testing Lifecycle Phases
1. Requirements Gathering
    Identify performance goals and SLAs
   
2. Test Planning
    Define scenarios, metrics, and success criteria
   
3. Test Environment Setup
    Configure infrastructure matching production
   
4. Script Development
    Create test scripts and scenarios
   
5. Test Execution
    Run tests and monitor system behavior
   
6. Analysis & Reporting
    Identify bottlenecks and provide recommendations
   
7. Optimization & Retest
    Validate improvements and iterate

Phase 1: Requirements Gathering

The foundation of successful performance testing starts with clear, measurable requirements.

📋 Sample SLA Document

performance-sla.txt
E-Commerce Application Performance SLA

// Critical User Journeys
Homepage Load:          < 1.5s (95th percentile)
Product Search:         < 2.0s
Add to Cart:            < 1.0s
Checkout Process:       < 3.0s

// Load Requirements
Concurrent Users:       1000 (normal), 3000 (peak)
Throughput:             100 TPS minimum
Error Rate:             < 0.5%
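Targets like these are easiest to enforce when they live as data rather than prose. A minimal JavaScript sketch of the idea — the `checkSla` helper and its input shape are illustrative, not part of any tool:

```javascript
// Encode the SLA targets from the sample document as data.
const sla = {
  'Homepage Load':    { p95Ms: 1500 },
  'Product Search':   { p95Ms: 2000 },
  'Add to Cart':      { p95Ms: 1000 },
  'Checkout Process': { p95Ms: 3000 },
};

// Compare measured p95 values (in ms) against the SLA; return violations.
function checkSla(measuredP95) {
  const violations = [];
  for (const [journey, { p95Ms }] of Object.entries(sla)) {
    const measured = measuredP95[journey];
    if (measured !== undefined && measured > p95Ms) {
      violations.push(`${journey}: p95 ${measured}ms exceeds target ${p95Ms}ms`);
    }
  }
  return violations;
}
```

A script like this can run after every test and fail the build when the list of violations is non-empty — the same idea the CI/CD section below applies with JMeter.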

Real-World CI/CD Integration

Jenkins Pipeline with Performance Gates
pipeline {
    agent any
    
    stages {
        stage('Build') {
            steps {
                sh 'mvn clean package'
            }
        }
        
        stage('Smoke Performance Test') {
            steps {
                // Quick 5-minute test with 50 users
                sh 'jmeter -n -t smoke-test.jmx -l results.jtl'
                
                script {
                    // readJMeterResults is a custom shared-library step,
                    // not a built-in Jenkins function
                    def avgResponseTime = readJMeterResults('results.jtl')
                    if (avgResponseTime > 2000) {
                        error "Performance degradation detected!"
                    }
                }
            }
        }
    }
}

Hands-On Exercise

Scenario: Create a performance test plan for a REST API that processes loan applications.

Requirements:

  • API must handle 200 applications per minute
  • 95% of requests complete within 3 seconds
  • System handles up to 500 concurrent users
  • Error rate below 0.1%

Interview Prep Questions

How do you integrate performance testing into CI/CD without slowing releases?
Answer: Use tiered approach: (1) Quick smoke tests (5-10 min) on every commit, (2) Medium load tests (20-30 min) on main branch merges, (3) Full stress tests nightly. Set performance budgets as quality gates.
What's the difference between think time and pacing?
Answer: Think time is the pause between individual user actions (simulates user behavior). Pacing controls how long to wait before starting the next iteration. Think time is about realism; pacing controls throughput.
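The distinction can be made concrete in a few lines of tool-agnostic JavaScript — `pacingWaitMs` and `thinkTimeMs` are illustrative helpers, not a real load-tool API:

```javascript
// Think time: a random pause between individual actions inside one
// iteration, simulating a user reading or deciding.
function thinkTimeMs(minMs, maxMs) {
  return minMs + Math.random() * (maxMs - minMs);
}

// Pacing: a fixed iteration length. After the work finishes, wait out
// whatever time is left so iterations start at a constant rate — which
// is what gives you a controlled, predictable throughput.
function pacingWaitMs(iterationDurationMs, pacingMs) {
  return Math.max(0, pacingMs - iterationDurationMs);
}

// A 12s iteration under 15s pacing waits 3s before the next iteration;
// a 20s iteration waits 0s — pacing caps the start rate, never speeds it up.
```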
CHAPTER 03

Apache JMeter

Master the industry-standard open-source tool for performance testing web applications and APIs.

What is Apache JMeter?

Apache JMeter is a powerful, Java-based open-source tool designed for load testing and performance measurement. Originally created for testing web applications, it has evolved to support HTTP, HTTPS, SOAP, REST APIs, databases, and more.

💡 Why JMeter is Popular in QA

  • Industry Standard: Widely recognized in QA job requirements
  • GUI & CLI Support: Visual test building + command-line for CI/CD
  • Extensive Protocol Support: HTTP, JDBC, LDAP, SOAP, JMS
  • Free & Open Source: No licensing costs

Installation & Setup

Windows Installation
# 1. Download from https://jmeter.apache.org
# 2. Extract to C:\jmeter
# 3. Test installation

cd C:\jmeter\bin
jmeter -v

# Launch GUI
jmeter.bat
macOS/Linux Installation
# Using Homebrew (macOS)
brew install jmeter

# Manual installation
wget https://dlcdn.apache.org/jmeter/binaries/apache-jmeter-5.6.tgz
tar -xzf apache-jmeter-5.6.tgz
./bin/jmeter -v

JMeter Core Concepts

Test Plan Structure
Test Plan
 └── Thread Group (Virtual Users)
     ├── Samplers (Requests)
     │   ├── HTTP Request
     │   └── JDBC Request
     ├── Config Elements
     │   ├── CSV Data Set
     │   └── HTTP Cookie Manager
     ├── Timers (Think Time)
     ├── Assertions (Validations)
     └── Listeners (Reports)

Creating Your First Test

Thread Group Configuration
// Right-click Test Plan → Add → Threads → Thread Group

Number of Threads: 100
Ramp-Up Period: 60 seconds
Loop Count: 10

// This creates 100 users over 60 seconds
// Each user executes test 10 times
// Total: 1,000 iterations (1,000 requests if the plan has one sampler)
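The arithmetic behind that configuration can be sanity-checked in a few lines of JavaScript — a sketch of JMeter's scheduling model, not JMeter itself:

```javascript
// Ramp-up spreads thread starts evenly across the ramp-up period.
function threadStartIntervalSec(rampUpSec, threads) {
  return rampUpSec / threads;
}

// Total requests = threads × loops × samplers executed per loop.
function totalRequests(threads, loops, samplersPerLoop = 1) {
  return threads * loops * samplersPerLoop;
}

// 100 threads over 60s → a new thread starts every 0.6s;
// 100 threads × 10 loops × 1 sampler = 1,000 requests.
```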
HTTP Request Example
Protocol: https
Server: api.example.com
Method: GET
Path: /users/${__Random(1,100)}

// ${__Random(1,100)} generates random IDs 1-100

Correlation & Parameterization

JSON Extractor for API Token
// Extract auth token from login response

Variable name: auth_token
JSON Path: $.token
Match No: 0
Default: TOKEN_ERROR

// Use in next request header:
Authorization: Bearer ${auth_token}
CSV Data Set Config
// users.csv file:
username,password,email
testuser1,pass123,user1@test.com
testuser2,pass456,user2@test.com

// Config:
Filename: users.csv
Variable Names: username,password,email
Recycle on EOF: True

// Use in request:
{
  "username": "${username}",
  "password": "${password}"
}

Running Tests from CLI

Non-GUI Execution
# Basic execution
jmeter -n -t test.jmx -l results.jtl

# With HTML report
jmeter -n -t test.jmx -l results.jtl -e -o html-report

# Override parameters (the plan must read them via ${__P(threads,100)})
jmeter -n -t test.jmx -l results.jtl -Jthreads=200

Hands-On Exercise

Task: Create a JMeter test for an e-commerce cart workflow.

  1. Thread Group: 50 users, 30s ramp-up
  2. Add requests: GET /products, GET /products/${id}, POST /cart/add
  3. Extract productId using JSON Extractor
  4. Add 2-5 second think times
  5. Add HTTP 200 assertions
  6. Run in non-GUI mode

Interview Prep Questions

Why should you remove listeners before running production-scale tests?
Answer: Listeners consume significant memory and CPU storing request/response data. For large tests, this causes OutOfMemoryErrors and skews results. Use -l flag in CLI instead.
How do you handle dynamic authentication tokens in JMeter?
Answer: Use Setup Thread Group to login once, extract token with JSON/RegEx Extractor, store in variable, then use in HTTP Header Manager for main Thread Group.
CHAPTER 04

Gatling

Learn the developer-friendly, code-based performance testing framework powered by Scala.

What is Gatling?

Gatling is a powerful open-source load testing tool designed for continuous integration. Unlike GUI-based tools, Gatling uses code (Scala DSL) to define test scenarios, making it version-control friendly and developer-centric.

📝
Code-First
Scenarios written in Scala DSL - easy to version control and maintain like application code.
🚀
High Performance
Built on Akka actors - simulates thousands of users from a single machine.
📊
Beautiful Reports
Generates stunning HTML reports with interactive charts automatically.
🔗
CI/CD Ready
Integrates seamlessly with Maven, Gradle, Jenkins, and GitLab CI.

Installation

Download & Setup
# Download bundle
wget https://repo1.maven.org/maven2/io/gatling/highcharts/\
gatling-charts-highcharts-bundle/3.9.5/\
gatling-charts-highcharts-bundle-3.9.5-bundle.zip

unzip gatling-charts-highcharts-bundle-3.9.5-bundle.zip
cd gatling-charts-highcharts-bundle-3.9.5

# Test installation
./bin/gatling.sh -v

Creating Your First Simulation

BasicSimulation.scala
package simulations

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class BasicSimulation extends Simulation {

  // HTTP configuration
  val httpProtocol = http
    .baseUrl("https://api.example.com")
    .acceptHeader("application/json")
    .userAgentHeader("Gatling Load Test")

  // Scenario definition
  val scn = scenario("Basic API Test")
    .exec(
      http("Get Users")
        .get("/users")
        .check(status.is(200))
    )
    .pause(2) // Think time

  // Load injection
  setUp(
    scn.inject(
      rampUsers(100) during (60 seconds)
    )
  ).protocols(httpProtocol)
}

Advanced Scenarios

E-Commerce User Journey
val browseProducts = scenario("Browse & Purchase")
  .exec(http("Homepage").get("/"))
  .pause(2, 5)
  
  .exec(
    http("Search Products")
      .get("/api/products?q=laptop")
      .check(jsonPath("$.products[0].id").saveAs("productId"))
  )
  .pause(3)
  
  .exec(
    http("View Product")
      .get("/api/products/#{productId}") // Gatling 3.7+ EL syntax
  )
  .pause(5, 10)
  
  .exec(
    http("Add to Cart")
      .post("/api/cart")
      .body(StringBody("""{"productId":"#{productId}","qty":1}"""))
      .asJson
  )

Load Injection Profiles

Different Load Patterns
// Constant load
constantUsersPerSec(10) during (5 minutes)

// Ramp up
rampUsers(1000) during (10 minutes)

// Steps
incrementUsersPerSec(5)
  .times(10)
  .eachLevelLasting(30 seconds)

// Spike
atOnceUsers(5000)

Running Tests

Execute Simulation
# Interactive mode
./bin/gatling.sh

# Non-interactive (CI/CD)
./bin/gatling.sh -sf simulations -s BasicSimulation

# Results in: results/basicsimulation-[timestamp]/index.html

Hands-On Exercise

Task: Create a Gatling simulation for testing a login flow.

  1. Configure HTTP protocol with base URL
  2. Create scenario: POST /login with credentials
  3. Extract auth token from response
  4. Use token in subsequent GET /profile request
  5. Inject 50 users ramped over 30 seconds
  6. Add assertions for 95th percentile < 2s

Interview Prep Questions

What are the advantages of Gatling over JMeter?
Answer: Code-based (version control friendly), higher performance (Akka actors), better reports, easier CI/CD integration, and more maintainable for developer teams. However, JMeter has broader protocol support and larger community.
How do you correlate dynamic values in Gatling?
Answer: Use .check() with jsonPath() or regex() to extract values and .saveAs() to store them in session variables. Then reference them with #{variableName} (Gatling 3.7+; ${variableName} in older versions) in subsequent requests.
CHAPTER 05

k6 by Grafana

Modern, developer-friendly performance testing with JavaScript.

What is k6?

k6 is a modern load testing tool built for developers and DevOps teams. Written in Go with JavaScript scripting, it combines ease of use with powerful performance testing capabilities.

💡 Why Choose k6?

  • JavaScript Syntax: Familiar language for web developers
  • CLI-First: Designed for automation and CI/CD from day one
  • Cloud Integration: k6 Cloud for distributed testing
  • Modern Stack: Perfect for testing APIs and microservices

Installation

Installation Options
# macOS
brew install k6

# Windows (Chocolatey)
choco install k6

# Linux
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] \
https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6

# Verify
k6 version

Your First k6 Script

simple-test.js
import http from 'k6/http';
import { sleep, check } from 'k6';

// Test configuration
export const options = {
  vus: 10,           // Virtual users
  duration: '30s',   // Test duration
};

// Main test function
export default function() {
  const res = http.get('https://test-api.k6.io');
  
  // Assertions
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  
  sleep(1);
}
Run the Test
k6 run simple-test.js

Advanced k6 Features

Load Stages & Thresholds
export const options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp up
    { duration: '5m', target: 100 },  // Stay at 100
    { duration: '2m', target: 200 },  // Ramp to 200
    { duration: '2m', target: 0 },    // Ramp down
  ],
  
  thresholds: {
    'http_req_duration': ['p(95)<2000'],  // 95% < 2s
    'http_req_failed': ['rate<0.01'],     // Error < 1%
  },
};
API Testing with Authentication
import http from 'k6/http';
import { check } from 'k6';

export default function() {
  // Login and extract token
  const loginRes = http.post('https://api.example.com/login', 
    JSON.stringify({
      username: 'testuser',
      password: 'pass123'
    }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  
  const token = JSON.parse(loginRes.body).token;
  
  // Use token in authenticated request
  const params = {
    headers: {
      'Authorization': `Bearer ${token}`,
    },
  };
  
  const res = http.get('https://api.example.com/profile', params);
  
  check(res, {
    'profile loaded': (r) => r.status === 200
  });
}

CI/CD Integration

GitHub Actions Workflow
name: Performance Test

on: [push]

jobs:
  k6-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Run k6 test
        uses: grafana/k6-action@v0.3.0
        with:
          filename: tests/load-test.js
          
      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: k6-results
          path: summary.json

Hands-On Exercise

Task: Create a k6 test for a product API with multiple scenarios.

  1. Configure stages: ramp to 50 users in 1min, sustain 5min, ramp down
  2. Add thresholds: p95 < 1.5s, error rate < 1%
  3. Create GET /products request
  4. Extract random productId from response
  5. Add POST /cart with productId
  6. Add checks for status codes and response times

Interview Prep Questions

Compare k6 vs JMeter - when would you choose k6?
Answer: Choose k6 for: modern API testing, JavaScript teams, cloud-native apps, CI/CD automation, and better developer experience. JMeter for: GUI-based test creation, broader protocol support, larger community, and existing JMeter infrastructure.
How do you set performance thresholds in k6?
Answer: Define thresholds in options object using metric names and conditions. Example: 'http_req_duration': ['p(95)<2000'] means 95th percentile must be under 2 seconds. Tests fail if thresholds aren't met, perfect for CI/CD gates.
CHAPTER 06

Performance Metrics Analysis & Optimization

Learn to read reports, identify bottlenecks, and make data-driven optimization recommendations.

Understanding Performance Metrics

Metric                Description                        Good Value          Red Flag
Response Time (Avg)   Mean time for requests             < 1s for APIs       > 3s
95th Percentile       95% of requests complete within    < 2s                > 5s
Throughput (TPS)      Transactions per second            Meets SLA target    Below expected
Error Rate            Percentage of failed requests      < 0.5%              > 2%
Concurrency           Active users at same time          Handles peak load   System crashes

Reading Performance Reports

🔍 Key Things to Look For

  • Response Time Trends: Does performance degrade over time? (memory leaks)
  • Error Patterns: Do errors spike at certain load levels? (breaking point)
  • Percentile Gaps: Large gap between avg and 95th percentile? (inconsistent performance)
  • Throughput Plateau: Throughput stops increasing? (saturation)
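The percentile-gap check above is easy to automate. A JavaScript sketch using the nearest-rank percentile method — the 3× avg-to-p95 ratio is an illustrative cutoff, not a standard:

```javascript
// Nearest-rank percentile: the value below which p% of samples fall.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Compute average and p95 from raw durations and flag a large gap,
// which usually means a subset of requests takes a much slower path.
function analyse(durationsMs) {
  const avg = durationsMs.reduce((s, d) => s + d, 0) / durationsMs.length;
  const p95 = percentile(durationsMs, 95);
  return { avg, p95, inconsistent: p95 > 3 * avg };
}
```

Feeding it 94 fast requests (100ms) and 6 slow ones (5s) flags the run as inconsistent, while a uniform run does not — exactly the pattern described above.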
Sample Test Results Analysis
// JMeter Aggregate Report Excerpt

Label              Samples  Average  Median  90%    95%    99%    Error%  Throughput
────────────────────────────────────────────────────────────────────────────────────
GET /api/users     10000    245ms    198ms   421ms  534ms  892ms  0.23%   125.4/sec
POST /api/cart     8500     312ms    287ms   512ms  645ms  1.1s   0.41%   106.8/sec
GET /api/checkout  7200     1.2s     985ms   2.1s   2.8s   4.5s   1.87%   90.2/sec

// Analysis:
⚠️ ISSUES DETECTED:
1. Checkout endpoint: high average response time (1.2s)
2. Large percentile gap (median 985ms vs 99th 4.5s) → inconsistent performance
3. Error rate 1.87% exceeds the 1% threshold
4. Lower throughput suggests a bottleneck

Common Performance Bottlenecks

🗄️
Database Issues
Slow queries, missing indexes, connection pool exhaustion, N+1 queries, table locks
🌐
API/Network
Third-party API timeouts, no connection pooling, DNS lookups, SSL handshakes
💾
Memory Leaks
Objects not garbage collected, caching without limits, session data accumulation
⚙️
Server Resources
CPU throttling, insufficient memory, disk I/O bottlenecks, thread pool saturation

Identifying Bottlenecks

Bottleneck Investigation Checklist
// Step-by-Step Analysis Process

1. Identify Slow Endpoints
   • Look at response time by URL/transaction
   • Find which requests exceed SLA

2. Check Server Resources
   • CPU utilization (high = computation bottleneck)
   • Memory usage (growing = memory leak)
   • Disk I/O (high = database/file system)
   • Network bandwidth (saturated = data transfer issue)

3. Database Performance
   • Slow query log
   • Connection pool metrics
   • Query execution plans
   • Index usage statistics

4. Application Profiling
   • Method execution times
   • External API call duration
   • Cache hit/miss rates
   • Thread dumps (deadlocks)

5. Error Analysis
   • HTTP error code distribution
   • Exception log correlation
   • Error rate vs load relationship

Optimization Recommendations

Common Optimization Strategies
Database Optimization:
 • Add indexes on frequently queried columns
 • Optimize slow queries (use EXPLAIN)
 • Implement connection pooling
 • Use read replicas for queries
 • Add database caching (Redis, Memcached)

Application Layer:
 • Enable HTTP caching (ETag, Cache-Control)
 • Implement application-level caching
 • Use async processing for heavy tasks
 • Optimize serialization/deserialization
 • Reduce payload size (compression, pagination)

Infrastructure:
 • Horizontal scaling (add more servers)
 • Vertical scaling (increase server resources)
 • Configure auto-scaling
 • Use CDN for static assets
 • Optimize load balancer configuration

Code Level:
 • Fix N+1 query problems
 • Batch API calls
 • Implement lazy loading
 • Remove unnecessary computations
 • Use efficient algorithms/data structures

Hands-On Exercise

Scenario: Analyze this performance test result and make recommendations.

Test Results to Analyze
Test: E-commerce checkout flow
Duration: 30 minutes
Users: Ramped from 0 to 500 over 10 min, sustained 500 for 15 min

Results:
- Average response time: Started at 800ms, ended at 2.5s
- Error rate: Started at 0.1%, spiked to 8% at 500 users
- Throughput: Plateaued at 85 TPS (target was 150 TPS)
- Memory: Grew from 2GB to 7.5GB (max 8GB)
- CPU: Averaged 45% throughout test
- Database connections: 480/500 in use (pool nearly exhausted)

Tasks:

  1. Identify the primary bottleneck(s)
  2. Explain why response time degrades over time
  3. Propose 3 specific optimization actions
  4. Prioritize recommendations (quick wins vs long-term)

✅ Sample Analysis

Primary Issues:

  • Memory leak (2GB → 7.5GB growth)
  • Database connection pool near exhaustion (480/500 in use)
  • Response time degradation suggests resource exhaustion

Recommendations:

  1. Immediate: Increase DB connection pool to 1000, monitor for leaks
  2. Short-term: Profile application to find memory leak source
  3. Long-term: Implement connection pooling best practices, add database read replicas

Interview Prep Questions

How do you differentiate between front-end and back-end performance issues?
Answer: Use browser dev tools vs API testing tools separately. Front-end issues show in browser metrics (render time, DOM manipulation) but API tests show good response times. Back-end issues appear in both. Also check: TTFB (high = back-end), client-side JS execution time, network latency, asset load times.
What's the significance of the 95th percentile vs average response time?
Answer: Average can be misleading - a few slow requests don't significantly impact it. 95th percentile shows what 95% of users experience, better representing real user experience. Large gap between avg and p95 indicates inconsistent performance. SLAs should use percentiles, not averages.
How would you identify a memory leak during performance testing?
Answer: Monitor memory usage over time during sustained load. Memory leak shows as: continuous memory growth without leveling off, heap size increasing even during idle periods, eventual OutOfMemoryErrors. Use heap dumps, profiling tools, and garbage collection logs to pinpoint leak source.
CHAPTER 07

Real-World Projects & Scenarios

Apply your knowledge with practical mini-projects and interview scenario walkthroughs.

Project 1: E-Commerce Load Test

📦 Project Requirements

Create a comprehensive load test for an e-commerce platform covering the complete user journey from browsing to checkout.

Business Requirements:

  • Support 2000 concurrent users during Black Friday
  • Checkout must complete in under 5 seconds (95th percentile)
  • Error rate below 0.1% (payment criticality)
  • Product search under 1.5 seconds
JMeter Test Plan Structure
Thread Group: "Black Friday Shoppers"
  Threads: 2000
  Ramp-up: 600s (10 minutes)
  Duration: 1800s (30 minutes)

  HTTP Request Defaults:
    Server: shop.example.com
    Protocol: https

  CSV Data Set: "users.csv"
    Variables: username,password,cardNumber

  Transaction Controller: "Browse Products"
     GET / (Homepage)
     Uniform Random Timer (3000-7000ms)
     GET /search?q=${searchTerm}  (terms fed from a CSV Data Set —
       __Random only generates numbers, so it can't pick from a word list)
     JSON Extractor (extract productId)
     Response Assertion (status 200)

  Transaction Controller: "Add to Cart"
     GET /product/${productId}
     Uniform Random Timer (10000-20000ms)
     POST /cart/add
     Duration Assertion (< 1000ms)

  Transaction Controller: "Checkout"
     GET /cart
     POST /checkout/initiate
     POST /payment/process
     Duration Assertion (< 5000ms)
     Response Assertion (contains "success")
k6 Alternative Implementation
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '10m', target: 2000 },
    { duration: '30m', target: 2000 },
  ],
  thresholds: {
    'http_req_duration{name:checkout}': ['p(95)<5000'],
    'http_req_duration{name:search}': ['p(95)<1500'],
    'http_req_failed': ['rate<0.001'],
  },
};

export default function() {
  const baseUrl = 'https://shop.example.com';
  
  // Browse products
  http.get(`${baseUrl}/`);
  sleep(Math.random() * 4 + 3);
  
  const searchRes = http.get(
    `${baseUrl}/search?q=laptop`,
    { tags: { name: 'search' } }
  );
  
  const products = JSON.parse(searchRes.body).products;
  const productId = products[0].id;
  
  // Checkout
  http.post(
    `${baseUrl}/checkout/process`,
    JSON.stringify({ productId }),
    { tags: { name: 'checkout' } }
  );
}

Project 2: API Microservices Test

⚡ Scenario: Spike Load on Auth Service

Your microservices architecture has an authentication service that all other services depend on. Marketing is planning a campaign that could cause 10x normal traffic spike.

Your Task: Design a spike test to validate the auth service and downstream dependencies can handle the load.

Gatling Spike Test
val authSpike = scenario("Auth Spike Test")
  .exec(
    http("Login")
      .post("/api/auth/login")
      .body(StringBody("""{"user":"#{user}","pass":"#{pass}"}""")) // values from a feeder (not shown)
      .check(status.is(200))
      .check(jsonPath("$.token").saveAs("token"))
  )

setUp(
  authSpike.inject(
    nothingFor(10 seconds),
    atOnceUsers(5000),       // Sudden spike
    nothingFor(2 minutes),
    constantUsersPerSec(100) during(5 minutes) // Normal load
  )
).assertions(
  global.responseTime.max.lt(10000),
  global.successfulRequests.percent.gt(99)
)

Interview Scenario Walkthrough

Common Interview Scenarios

Scenario: Response times spike during peak hours. How would you investigate and resolve?
Investigation Steps:
  1. Gather Data: Check monitoring dashboards for CPU, memory, DB connections during spike times
  2. Identify Pattern: Does it correlate with user count? Specific features? Batch jobs?
  3. Reproduce: Run load test simulating peak hour conditions
  4. Profile: Use APM tools to identify slow code paths
  5. Database: Check slow query logs, connection pool usage
  6. Test Fix: Implement optimization, re-run load test to validate
Common Causes: Unoptimized queries, connection pool exhaustion, inefficient caching, memory leaks, external API timeouts
How would you analyze a spike in response times at the 95th percentile?
Analysis Approach:

A spike in p95 (while average remains stable) indicates a subset of requests are very slow. This could be:

  • Cache Misses: Some requests hit slow path without cache
  • Geographic Issues: Users far from data center experience higher latency
  • Specific Endpoints: One endpoint is slow, others fast
  • Large Data Sets: Some users have large profiles/carts causing slow queries

Investigation: Break down metrics by endpoint, user segment, and geographic region. Look at p99 and max values. Profile slowest 5% of requests.

Design a performance testing strategy for a new microservice being added to production.
Testing Strategy:
  1. Unit Performance Tests: Test critical functions for performance regressions
  2. Component Load Test: Test service in isolation with expected load
  3. Integration Test: Test with dependent services, verify graceful degradation
  4. Spike Test: Validate auto-scaling and circuit breakers
  5. Endurance Test: 24-hour test to catch memory leaks
  6. Chaos Testing: Kill instances, test recovery
Success Criteria: Define SLAs (p95 response time, throughput, error rate), set up monitoring/alerts, create performance baseline
CHAPTER 08

Interview Preparation Guide

Comprehensive quick reference and must-know Q&A for QA and SDET performance testing interviews.

Performance Testing Quick Reference

Tool      Best For                        Pros                                                               Cons
JMeter    Industry standard, GUI-based    Mature, widely used, extensive protocols, large community          XML-based tests, resource-intensive GUI, complex correlation
Gatling   Developer teams, code-based     Beautiful reports, high performance, Scala DSL, version control    Steeper learning curve, smaller community
k6        Modern APIs, JavaScript teams   Easy syntax, CLI-first, cloud integration, modern                  Limited protocol support vs JMeter, newer tool

Key Metrics Cheat Sheet

Essential Performance Metrics
Response Time Metrics:
 • Average: Mean of all requests
 • Median (50th): Middle value
 • 90th Percentile: 90% of requests faster than this
 • 95th Percentile: 95% of requests faster (common SLA)
 • 99th Percentile: 99% of requests faster
 • Max: Slowest request

Throughput Metrics:
 • TPS: Transactions Per Second
 • RPS: Requests Per Second
 • Hits/sec: Total HTTP requests per second

Error Metrics:
 • Error Rate: Percentage of failed requests
 • Error Types: HTTP 4xx, 5xx, timeouts, connection errors

Resource Metrics:
 • CPU: Processor utilization %
 • Memory: RAM usage and growth
 • Disk I/O: Read/write operations
 • Network: Bandwidth utilization

Must-Know Formulas

Little's Law & Key Calculations
// Little's Law
Concurrent Users = Throughput × Response Time

// Example:
Throughput: 100 requests/second
Response Time: 3 seconds
Concurrent Users: 100 × 3 = 300 users

// Error Rate
Error Rate = (Failed Requests / Total Requests) × 100

// Throughput from Duration
TPS = Total Transactions / Test Duration (seconds)

// Ramp-up Calculation
Thread Start Interval = Ramp-up Period / Number of Threads
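The same formulas translate directly into code. A small JavaScript helper, reusing the numbers from the examples above:

```javascript
// Little's Law: concurrency implied by throughput and response time.
const concurrentUsers = (tpsRate, responseTimeSec) => tpsRate * responseTimeSec;

// Error rate as a percentage of total requests.
const errorRatePct = (failed, total) => (failed / total) * 100;

// Throughput from a completed run.
const tps = (totalTransactions, durationSec) => totalTransactions / durationSec;

// 100 req/s × 3s response time → 300 concurrent users;
// 75 failures out of 10,000 requests → 0.75% error rate.
```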

Top 10 Interview Questions

1. What's the difference between load, stress, and spike testing?
Load: Expected user load to validate SLAs | Stress: Push beyond limits to find breaking point | Spike: Sudden dramatic increase to test auto-scaling
2. How do you determine baseline performance?
Run tests under normal/expected load conditions with production-like data and environment. Measure key metrics (response time, throughput, error rate) to establish acceptable ranges for comparison.
3. What causes response time to increase as load increases?
Resource contention (CPU, memory, DB connections), queueing delays, context switching, network congestion, database locking, garbage collection pauses, insufficient server capacity.
4. How would you test APIs that require authentication?
Use Setup Thread Group to login once and extract token, then reference in main tests. Or login per virtual user and cache token. Handle token expiration with conditional logic to refresh.
5. Explain the concept of think time and why it's important.
Think time simulates realistic user behavior: the pause between actions (e.g., reading search results before clicking). Without it, tests hammer the system continuously and do not reflect real usage patterns.
6. What's the difference between client-side and server-side performance issues?
Server-side: high TTFB, slow database queries, slow API responses; visible in API-level tests. Client-side: rendering, JavaScript execution, asset loading; visible in browser-based tests but not in API tests.
7. How do you identify memory leaks during performance testing?
Monitor heap memory growth over sustained load. Memory leak shows continuous growth without plateau. Use heap dumps, profilers, GC logs. Test should run long enough (hours) to detect gradual leaks.
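The "continuous growth without plateau" signal can be checked programmatically. A hedged sketch that fits a least-squares trend line to heap samples collected during a soak test (the sample data and the 1 MB/sample threshold are invented):

```python
# Flag a possible leak when heap usage trends steadily upward across a soak test.
def leak_suspected(heap_samples_mb, min_slope_mb_per_sample=1.0):
    """Fit a least-squares line to the samples; a sustained positive slope suggests a leak."""
    n = len(heap_samples_mb)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(heap_samples_mb) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, heap_samples_mb))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope >= min_slope_mb_per_sample

# Healthy run: heap oscillates around a plateau as GC reclaims memory.
healthy = [510, 530, 515, 525, 512, 528, 514, 526]
# Leaky run: heap keeps climbing with no plateau.
leaky = [500, 540, 590, 630, 680, 720, 770, 810]

print(leak_suspected(healthy))  # False
print(leak_suspected(leaky))    # True
```

In practice the samples would come from GC logs or a monitoring agent, and heap dumps/profilers would then pinpoint the leaking objects.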
8. What metrics should be included in a performance test report?
Response times (avg, p95, p99), throughput (TPS), error rate/types, concurrent users, server resources (CPU, memory), database metrics, test configuration, bottlenecks identified, recommendations.
9. How would you test a microservices architecture?
Test each service independently (component testing), then integration tests across services. Test service-to-service communication, circuit breakers, fallback mechanisms, distributed tracing, and overall system under realistic traffic patterns.
10. What's the 95th percentile and why is it more important than average?
95th percentile means 95% of users experience response time at or below this value. More representative of user experience than average, which can hide outliers. SLAs should use percentiles.
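A quick numeric illustration of how the average hides outliers (invented data): nine fast requests plus one very slow one.

```python
import math

# Nine 200 ms requests plus one 5-second outlier (hypothetical data).
times_ms = [200] * 9 + [5000]

average = sum(times_ms) / len(times_ms)
# Nearest-rank 95th percentile: smallest value covering at least 95% of samples.
p95 = sorted(times_ms)[math.ceil(0.95 * len(times_ms)) - 1]

print(f"Average: {average} ms")          # 680 ms - looks acceptable
print(f"95th percentile: {p95} ms")      # 5000 ms - exposes what the slowest users hit
```

An SLA written against the average would pass here even though one in ten users waits five seconds, which is exactly why SLAs should use percentiles.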

Final Tips for Success

📚
Know Your Tools
Be ready to discuss pros/cons of JMeter, Gatling, k6. Mention hands-on experience with specific features.
💬
Real Examples
Prepare 2-3 stories about performance issues you found and how you resolved them.
🎯
Metrics First
Always frame answers with metrics: response time, throughput, error rate. Show data-driven thinking.
🔄
CI/CD Integration
Emphasize automation experience: Jenkins, GitHub Actions, performance gates, trend analysis.

🎓 You're Ready!

You've completed the Performance Testing Mastery course! You now have:

  • ✅ Strong foundation in performance testing concepts
  • ✅ Hands-on knowledge of JMeter, Gatling, and k6
  • ✅ Skills to analyze reports and identify bottlenecks
  • ✅ Interview preparation for QA and SDET roles
  • ✅ Real-world project experience

Next Steps: Practice with your own projects, contribute to open-source load testing, and keep learning!

CHAPTER 08

Performance Testing Glossary

Essential terms and definitions for performance testing professionals. Quick reference guide for common terminology.

Key Terms & Definitions

📊
TPS (Transactions Per Second)
Rate of completed transactions per second. Key metric for measuring system throughput and capacity.
⏱️
Response Time
Total time from sending a request to receiving the complete response. Critical for user experience.
🚀
Throughput
Number of requests or transactions processed per time unit (requests/sec, MB/sec).
Latency
Time from request sent to first byte received. Measures network and server processing delay.
👥
Concurrent Users
Number of virtual users simultaneously interacting with the system at any given moment.
📈
Ramp-Up Period
Time taken to gradually start all threads/users. Prevents sudden load spikes on the system.
🤔
Think Time
Pause between user actions to simulate real user behavior and reading/processing time.
📉
Error Rate
Percentage of failed requests during testing. Typically should stay below 1% under normal load.
🎯
Percentile (95th)
95% of requests completed within this time. Better metric than average for understanding user experience.
🔄
Load Testing
Testing system behavior under expected user load to validate it meets performance requirements.
💪
Stress Testing
Pushing system beyond normal capacity to identify breaking points and maximum operating limits.
⚡
Spike Testing
Validating system handles sudden, dramatic increases in load (flash sales, viral content).
Endurance Testing
Running tests for extended periods to detect memory leaks and performance degradation over time.
🎲
Baseline
Initial performance metrics captured under normal conditions for comparison with future tests.
🔍
Bottleneck
System component that limits overall performance (CPU, memory, database, network).
📊
SLA (Service Level Agreement)
Contractual commitment defining expected performance levels (response time, uptime, throughput).

🎓 Course Complete!

You've mastered all 8 chapters of Performance Testing! You now have comprehensive knowledge of:

  • ✅ Performance testing fundamentals and types
  • ✅ Industry-standard tools (JMeter, Gatling, k6)
  • ✅ Metrics analysis and bottleneck identification
  • ✅ Real-world project implementation
  • ✅ Interview preparation and best practices

Keep practicing and stay updated with the latest performance testing trends!

🎯 Ready to Test Your Knowledge?

Take our comprehensive 50-question quiz to validate your learning!

Take the Quiz 🚀