Project maintained by r9y-dev. Hosted on GitHub Pages.

Failure Testing in Prod

Failure testing in production (also known as chaos engineering) is the practice of deliberately introducing failures into a system in order to test its resilience and identify potential points of failure. This is done in a controlled manner, with the goal of improving the system’s overall reliability and stability.
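As a minimal sketch of what "deliberately introducing failures" can look like in code, the Python example below wraps a call so that it fails with a configurable probability, and shows a caller that degrades gracefully when the failure is injected. The names `with_fault_injection`, `InjectedFailure`, `fetch_profile`, and `fetch_profile_with_fallback` are hypothetical and chosen only for illustration; they do not come from any particular chaos engineering tool.

```python
import functools
import random


class InjectedFailure(Exception):
    """Raised instead of making the real call, to simulate a dependency failure."""


def with_fault_injection(func, failure_rate, rng=None, enabled=True):
    """Wrap func so that a fraction of calls raise InjectedFailure.

    failure_rate is a probability between 0.0 and 1.0. The enabled flag
    acts as a simple off switch. All names here are illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Fail before calling the real function, as if the dependency were down.
        if enabled and rng.random() < failure_rate:
            raise InjectedFailure(f"injected failure in {func.__name__}")
        return func(*args, **kwargs)

    return wrapper


def fetch_profile(user_id):
    # Hypothetical stand-in for a call to a downstream service.
    return {"id": user_id, "name": "example"}


def fetch_profile_with_fallback(fetch, user_id):
    """Caller under test: returns a degraded response if the dependency fails."""
    try:
        return fetch(user_id)
    except InjectedFailure:
        return {"id": user_id, "name": None, "degraded": True}
```

With `failure_rate=0.0` or `enabled=False` the wrapper passes every call through unchanged, which gives the experiment a simple way to be switched off.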

There are a number of different ways to perform failure testing in production, including:

Failure testing in production can be a valuable tool for improving the reliability and stability of a system. However, it should be done carefully and with a clear understanding of the risks involved.
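One way to keep an experiment controlled is to guarantee that the injected failure is reverted even if something goes wrong while it is active. The sketch below uses a Python context manager for this. The `inject` and `revert` callables are assumptions standing in for whatever mechanism actually applies and removes the failure in a given environment.

```python
from contextlib import contextmanager


@contextmanager
def controlled_failure(inject, revert):
    """Apply a failure for the duration of a with-block, then always revert it.

    inject and revert are hypothetical callables supplied by the caller.
    If inject itself raises, revert is not called because nothing was applied.
    """
    inject()
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions raised inside the with-block.
        revert()
```

A typical use would be `with controlled_failure(stop_service, start_service): run_checks()`, where `stop_service`, `start_service`, and `run_checks` are placeholders for environment-specific code.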

Examples of Failure Testing in Production:


Here are some tools and products that can help with failure testing in production:

Chaos Monkey:


Chaos Toolkit:


Resilience Platform:

These are just a few examples of the many tools and products that can be used for failure testing in production. The best choice depends on the organization’s specific needs and requirements.

Here are some terms related to failure testing in production:

These terms are all related to the concept of ensuring the reliability and resilience of systems and applications.

In addition to the above, here are some other related terms that you may find interesting:

These terms are all related to the broader goal of improving the reliability, efficiency, and security of software systems.


Before you can do failure testing in production, you need to have a number of things in place, including:

In addition to the above, you may also need to put in place the following:

Once you have all of these things in place, you can start to develop and execute failure tests in production. It is important to start small and gradually increase the scope and complexity of your tests over time.
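The advice to start small and widen the scope gradually can be sketched as a staged run that stops as soon as the system looks unhealthy. This is an illustrative outline only: `run_stage` and `is_healthy` are hypothetical callables, and the fractions are arbitrary example values for the share of hosts or traffic affected.

```python
def run_staged_experiment(stages, run_stage, is_healthy):
    """Run failure-test stages from smallest to largest scope.

    stages: fractions of hosts or traffic to target, e.g. [0.01, 0.05, 0.25]
    run_stage: hypothetical callable that applies the failure to a fraction
    is_healthy: hypothetical callable returning True while the system is OK
    Returns the stages that completed with the system still healthy.
    """
    completed = []
    for fraction in sorted(stages):
        run_stage(fraction)
        if not is_healthy():
            # Stop here; larger stages are never attempted.
            break
        completed.append(fraction)
    return completed
```

The returned list shows how far the experiment got, which can inform where to focus resilience work before the next attempt.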

What’s next?

After you have failure testing in production in place, the next steps typically involve:

In addition to the above, you may also want to consider the following:

Failure testing in production is an iterative process. The goal is to continuously improve the resilience of the system and to reduce the likelihood and impact of failures.