Let's try to * in your environment? Here you can find resources ranging from chaos engineering to performance engineering, for everyone from sysadmins to developers.
Most of them come with examples, sources, and books or videos. Over time, I'll write examples for each of them, publish them in this repository, and link them to their topics.
Let's contribute :D!
In this section we will follow the Principles of Chaos Engineering. Basically, we will try to break things and find ways to affect availability in your environment.
Some tools that we can use or test:
- Kube Monkey - Brings the chaos to Kubernetes. Written in Go and easy to use or customize.
- Example of use - A usage example
- Janitor Monkey - Cleans up unused resources on AWS.
- Chaos Monkey - Brings the chaos to AWS instances. It randomly terminates instances on AWS.
- Chaos Toolkit - Brings the chaos to anything you want. Written in Python, easy to customize, and already has a lot of extensions.
- Examples of extensions: chaostoolkit-aws and chaostoolkit-azure.
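To make the "terminate instances at random" idea concrete, here is a minimal sketch in plain Python. It is not how Chaos Monkey or Kube Monkey are implemented; the function name `pick_victims`, the group/instance names, and the per-group probability are all assumptions made for illustration. It only selects victims and prints them (a dry run), so it is safe to try anywhere.

```python
import random
from typing import Dict, List, Optional


def pick_victims(
    groups: Dict[str, List[str]],
    probability: float,
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Pick at most one random instance per group to terminate.

    Each group is hit independently with the given probability, so on most
    runs only some groups lose an instance.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    rng = rng or random.Random()
    victims: Dict[str, str] = {}
    for group, instances in groups.items():
        # Skip empty groups, and skip this group if the random draw says so.
        if instances and rng.random() < probability:
            victims[group] = rng.choice(instances)
    return victims


if __name__ == "__main__":
    # Hypothetical inventory; a real run would list instances from your platform.
    inventory = {
        "web": ["web-1", "web-2", "web-3"],
        "worker": ["worker-1", "worker-2"],
    }
    for group, instance in pick_victims(inventory, probability=0.5).items():
        # Dry run: only report what would be terminated.
        print(f"would terminate {instance} from group {group}")
```

Passing a seeded `random.Random` makes a run repeatable, which helps when you want to replay an experiment that exposed a problem.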
This section covers theory: best practices, examples, feedback from other companies, roadmaps, etc.
Here we will look at error handling and how your system behaves when faults are injected.
- Istio - We can use Istio to inject many types of faults, such as network delays, HTTP error responses, etc.
- tc - With it, we can emulate many network conditions, such as packet loss and delay (in several ways, and you can combine them).
- Limiting Resources (CPU, Mem, etc) - How will our systems behave if we reduce their resources below what they currently have or were defined to have?
- Fault Injection Techniques and Tools
- Failure Injection and Chaos Engineering
- Lineage Driven Failure Injection (LDFI)
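As an illustration of combining delay and packet loss with tc, the sketch below builds a `tc ... netem` command line in Python without running it. The helper names `netem_command` and `netem_cleanup` and the device `eth0` are assumptions; applying the command needs root and a Linux host with the netem qdisc available, so treat this as a starting point rather than a finished tool.

```python
from typing import List, Optional


def netem_command(
    device: str,
    delay_ms: Optional[int] = None,
    jitter_ms: Optional[int] = None,
    loss_percent: Optional[float] = None,
) -> List[str]:
    """Build a `tc qdisc add ... netem` command that can combine delay and loss."""
    if delay_ms is None and loss_percent is None:
        raise ValueError("set at least one of delay_ms or loss_percent")
    if jitter_ms is not None and delay_ms is None:
        raise ValueError("jitter_ms needs delay_ms")
    cmd = ["tc", "qdisc", "add", "dev", device, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            # An optional second value after the delay is the jitter.
            cmd.append(f"{jitter_ms}ms")
    if loss_percent is not None:
        cmd += ["loss", f"{loss_percent}%"]
    return cmd


def netem_cleanup(device: str) -> List[str]:
    """Build the command that removes the root qdisc and ends the experiment."""
    return ["tc", "qdisc", "del", "dev", device, "root"]


if __name__ == "__main__":
    # Print the commands instead of executing them (they require root).
    print(" ".join(netem_command("eth0", delay_ms=100, jitter_ms=10, loss_percent=1)))
    print(" ".join(netem_cleanup("eth0")))
```

Returning an argument list rather than a shell string means the result can be handed to `subprocess.run` directly once you are ready to apply it, and the cleanup command should always be run afterwards to restore the interface.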
Looking for bottlenecks in the network, OS, kernel, JVM, etc.
- AQM Algorithms
- Understanding more about Networking
- Queueing in the Linux network stack
- AQM - Controlling Queue Delay
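The readings above deal with queues building up in network buffers and with AQM keeping the resulting delay under control. As a rough back-of-the-envelope sketch, the time a new packet waits behind an existing backlog is the backlog size divided by the link rate. The function name `queue_delay_seconds` is hypothetical, and the model deliberately ignores everything except drain time.

```python
def queue_delay_seconds(backlog_bytes: int, link_bits_per_second: float) -> float:
    """Time a newly arrived packet waits behind the bytes already queued."""
    if link_bits_per_second <= 0:
        raise ValueError("link rate must be positive")
    # Convert the backlog to bits, then divide by how fast the link drains it.
    return backlog_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    backlog = 125_000  # bytes already sitting in the buffer
    for rate in (1_000_000, 100_000_000):  # 1 Mbit/s and 100 Mbit/s links
        delay = queue_delay_seconds(backlog, rate)
        print(f"{rate} bit/s link: {delay:.3f} s of queueing delay")
```

The same backlog costs a full second on the slow link and only ten milliseconds on the fast one, which is why a buffer size that looks harmless on one path can become a latency bottleneck on another.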
TBD
What is the difference between {load,performance,stress} tests?
TBD
TBD
TBD