Reliability and Stability Testing

Mar 28, 2019

Reliability and stability testing models how a system behaves in regular and irregular situations: shutdowns or restarts of individual system components, and prolonged loads on the system.

Problems it will solve

  • Minimizing the risk that business processes or system components become inoperable after the failure of several system components, by promptly discovering problems during the reliability and fail-safety tests and providing recommendations for resolving them
  • Minimizing the risk of system performance degradation under load after restoration, by comparing the system performance indicators recorded during the reliability and stability tests
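
The comparison of performance indicators mentioned above can be sketched roughly as follows. This is a minimal illustration, not the method used in the service: the function names, the choice of the 95th percentile as the indicator, and the 10% tolerance are all assumptions made for the example.

```python
from statistics import quantiles


def p95(samples_ms):
    """Return the 95th percentile of response times given in milliseconds."""
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    return quantiles(samples_ms, n=20, method="inclusive")[-1]


def compare_after_restore(baseline_ms, restored_ms, tolerance=0.10):
    """Compare p95 response time before a failure and after restoration.

    tolerance is a hypothetical acceptance threshold: the relative growth
    of p95 that is still treated as "no degradation".
    """
    base = p95(baseline_ms)
    post = p95(restored_ms)
    change = (post - base) / base
    return {
        "baseline_p95_ms": base,
        "restored_p95_ms": post,
        "relative_change": change,
        "degraded": change > tolerance,
    }
```

In practice the two sample sets would be taken under the same load profile, one before the component failure and one after the system reports itself restored, so that the difference can be attributed to the restoration rather than to the load.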

Deliverables

  • 01
    The report on the stability testing includes
  • Information about the number of defects discovered in the operation of different business processes and system components after the failure of a certain component, as well as their severity
  • A list of defects with a description of each problem and a method for resolving it
  • Information about the restoration time needed for the system component and business processes, and also about the necessary conditions
  • Information about the changes in the IT system performance after operability is restored, including precise response-speed parameters: response times of user operations (under different loads) and server resource utilization (CPU, memory, I/O)
  • Recommendations for the system architecture and infrastructure improvements
  • Description of load profiles (MS Word)
  • 02
    Test data (in the format of the test system used)
  • 03
    Load-testing scripts
  • 04
    External system emulators
  • 05
    Load scenarios
  • 06
    Scripts for generation/depersonalization of the DB
  • 07
    Data pools
  • 08
    Manual for conducting the tests
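
The restoration time reported in the deliverables can be derived from periodic health probes collected during a test. The sketch below is one possible way to do that, under assumptions that are not part of the service description: the probe format `(timestamp_seconds, is_healthy)`, the function name, and the rule that the system counts as restored only after three consecutive healthy probes.

```python
def restoration_time(probes, injected_at, stable_count=3):
    """Estimate how long the system took to recover after a fault.

    probes       -- list of (timestamp_seconds, is_healthy), sorted by time
    injected_at  -- timestamp at which the component was shut down or restarted
    stable_count -- hypothetical rule: consecutive healthy probes required
                    before the system is considered restored

    Returns seconds from fault injection to the start of the first stable
    healthy streak, or None if a failure was never observed or the system
    never stabilised within the probe window.
    """
    seen_failure = False
    streak = 0
    streak_start = None
    for timestamp, healthy in probes:
        if timestamp < injected_at:
            continue  # ignore probes taken before the fault was injected
        if not healthy:
            seen_failure = True
            streak = 0
            streak_start = None
        elif seen_failure:
            if streak == 0:
                streak_start = timestamp
            streak += 1
            if streak >= stable_count:
                return streak_start - injected_at
    return None
```

Requiring a streak of healthy probes rather than a single one avoids reporting a recovery when the component briefly answers and then fails again.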

Scope of work

  • 01
    Creation of load testing methodology
  • Collection and analysis of production environment statistics
  • Coordination of the performance requirements
  • Determination of business processes and load scenarios for reliability testing
  • Determination of components for reliability testing
  • Description of the interactions with the external systems
  • Calculation of the intensity and determination of the load profiles for reliability testing
  • Description of the requirements for the DB volumes
  • Creation of a test plan
  • 02
    Creation of a test model
  • Development of load scripts
  • Development of external system emulators
  • Creation of load scenarios
  • Creation of scripts for generation/depersonalization of the DB
  • Creation of data pools
  • Creation of a manual for conducting the tests
  • 03
    Test preparation
  • Checking the operability of the test environment
  • Installing the testing tool on the load stations
  • Tuning the monitoring tools
  • Conducting trial tests
  • 04
    Conducting stability tests
  • Launching tests to check the system reliability in accordance with the load scenarios
  • Shutdown/restart of the chosen system components
  • Launching tests to check the fail-safety of the system
  • Results analysis
  • 05
    System analysis
  • Analysis of the bottlenecks in the system performance
  • Analysis of the influence of a shutdown/restart of the chosen components on the business processes
  • Analysis of the system restoration time after a shutdown/restart of the chosen components
  • Preparation of the recommendations for changes in the system architecture and infrastructure or the development of relevant regulations
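
The "calculation of the intensity and determination of the load profiles" step in the methodology stage can be illustrated with a small worked calculation. The sketch below is an assumption about how such a calculation might look, not the procedure used in the service: it converts a peak-hour operation count from production statistics into a target throughput and uses Little's law (users = throughput × (response time + think time)) to size the number of virtual users. All names and input figures are hypothetical.

```python
import math


def load_profile(ops_per_hour, response_time_s, think_time_s):
    """Derive load-profile parameters for one business process.

    ops_per_hour    -- peak-hour operation count taken from production statistics
    response_time_s -- expected response time of one operation, in seconds
    think_time_s    -- pause a simulated user makes between operations, in seconds
    """
    if ops_per_hour <= 0:
        raise ValueError("ops_per_hour must be positive")
    throughput = ops_per_hour / 3600.0  # operations per second
    # Little's law: concurrent users = throughput * time spent per iteration.
    users = math.ceil(throughput * (response_time_s + think_time_s))
    # Pacing: interval between iteration starts of a single virtual user.
    pacing_s = users / throughput
    return {"throughput_per_s": throughput, "virtual_users": users, "pacing_s": pacing_s}


# Example with made-up figures: 7200 operations in the peak hour,
# 2 s response time, 8 s think time.
profile = load_profile(7200, response_time_s=2, think_time_s=8)
```

With these example inputs the target throughput is 2 operations per second, which 20 virtual users can sustain if each starts a new iteration every 10 seconds.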

Service Limitations

Stability testing is not functional testing and is not intended to discover functional bugs. However, any functional defects discovered will be noted and reported to the customer.

Tools and licences

  • LoadRunner
  • Apache JMeter
  • MS Visual Studio
  • IBM Rational Performance Tester
  • Silk Performer