Chaos engineering and Ethereum client testing

This article provides a working knowledge of the principles of chaos engineering, discusses its use in software development, and explores how its use may be extended to blockchain development.

The tutorial portion of this article demonstrates how to use the ChaosETH framework to leverage chaos engineering for the testing of Ethereum clients. This strategy can be helpful for identifying flaws (sometimes referred to as “dark debts”) in smart contracts before the contract is widely adopted by members of the network.

What is chaos engineering?
Why is chaos engineering useful in blockchain development?
Why Ethereum clients?
Implementing chaos testing for a full Ethereum client
Tutorial: Chaos engineering experiment with a Go-Ethereum client

What is chaos engineering?

Chaos engineering is the practice of performing experiments on a distributed system in order to make it more resilient and fault tolerant under the turbulent conditions that may occur in a production environment. The concept traces back to Netflix, where a team led by Casey Rosenthal was placed in charge of testing software availability and system resilience.

Chaos engineering has five advanced principles to guide chaos engineers. Follow these principles to ensure you are practicing chaos engineering properly:

Set a hypothesis that describes the steady-state behavior of the target system
Consider real-world situations and events
Execute experiments in the production environment to build confidence in that environment
Automate experiments to run continuously because distributed systems are complex
Minimize the blast radius to prevent experiments from affecting customers

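These principles differ from traditional testing techniques, which typically verify known inputs against expected outputs in a controlled environment rather than experimenting on a live system.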

Why is chaos engineering useful in blockchain development?

Blockchain technology is a subset of distributed ledger technology and is used to build distributed decentralized applications. This distributed status is achieved by creating a peer-to-peer network of nodes, which are actually computers. As a system becomes more widely adopted and connected to more computers, its complexity increases.

Cardano Quick News on Twitter: “If your email service has a bug, people will just complain, nothing serious. But if a blockchain has a bug, people may lose money. Now that’s serious. Safety and stability are THE most important things to a blockchain. And they are at the heart of Cardano. #Cardano $ada #ada”

Faults and weaknesses can occur on a blockchain via the clients as a result of overloaded operating systems, memory management errors, or network partitions. An Ethereum client relies on its operating system for essential resources, so problems at the operating system level can affect the client.

Given that chaos engineering is well suited to distributed systems, it can be useful in ensuring the resilience of each participating client on a blockchain, such as the Ethereum network.

Here are some points to keep in mind when designing experiments that apply chaos engineering principles to a blockchain:

Chaos engineering experiments should focus on the consensus mechanism, the network, storage layers, identification and authorization of participating nodes, smart contracts, on-chain interaction, and governance
Experiments can start on development networks and testnets, but they must eventually be conducted in production
Minimizing the blast radius is important when experiments are conducted in production, as these applications will involve money
Knowledge of similar architectures and known vulnerabilities is helpful for causing chaos in a client application

Why Ethereum clients?

This article specifically covers incorporating chaos engineering into Ethereum client applications. However, it’s important to note that the concept of injecting chaos in Web3 applies to all decentralized applications of all blockchains.

Ethereum has become the operational backbone of major decentralized platforms and has:

Higher adoption than other blockchains
Very active developer communities
An easily accessible production environment
Greater simplicity compared to other blockchains

Implementing chaos testing for a full Ethereum client

Proper planning for chaos testing on a live Ethereum client should include the following:

A thorough understanding of the architecture of the Ethereum client that will be tested
Planning the system model to adopt
Handling the following based on the adopted Ethereum client:
- Calls based on improper fallback settings from the client
- Incorrectly set timeouts
- Dependencies that are not resilient enough or that are deprecated
- Single points of failure
- Cascading failures
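
Two of the items above, timeouts and fallbacks, can be made concrete with a small sketch. The Python below is a hypothetical illustration and is not taken from any Ethereum client or from ChaosETH: call_with_fallback, the endpoint names, and the fake transport are all assumptions introduced for this example. It shows the kind of behavior a chaos experiment would try to confirm, namely that a request to a failing dependency gives up after a bounded time and moves on to an alternative.

```python
from typing import Callable, Sequence


class AllEndpointsFailed(Exception):
    """Raised when no endpoint answered within its timeout."""


def call_with_fallback(
    endpoints: Sequence[str],
    send: Callable[[str, float], str],
    timeout_s: float = 2.0,
) -> str:
    """Try each endpoint in order, giving each one `timeout_s` seconds.

    `send(endpoint, timeout_s)` is a caller-supplied transport that returns
    a response or raises TimeoutError / ConnectionError.
    """
    errors = []
    for endpoint in endpoints:
        try:
            return send(endpoint, timeout_s)
        except (TimeoutError, ConnectionError) as exc:
            # Record the failure and move on to the next endpoint.
            errors.append(f"{endpoint}: {exc!r}")
    raise AllEndpointsFailed("; ".join(errors))


def fake_send(endpoint: str, timeout_s: float) -> str:
    """Stand-in transport: the primary always times out (hypothetical)."""
    if endpoint == "primary":
        raise TimeoutError(f"no reply within {timeout_s}s")
    return f"ok from {endpoint}"


if __name__ == "__main__":
    print(call_with_fallback(["primary", "backup"], fake_send))
```

Passing the transport in as a function keeps the sketch independent of any particular network library and makes it easy to substitute a faulty transport during an experiment.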

Tutorial: Chaos engineering experiment with a Go-Ethereum client

In this tutorial, we’ll demonstrate how to use ChaosETH, a new framework that measures how resilient an Ethereum client is in production, to execute chaos engineering experiments on a Go-Ethereum (Geth) client.

ChaosETH

ChaosETH was created by Long Zhang and colleagues at KTH Royal Institute of Technology in Sweden. ChaosETH was designed to assess the resilience of Ethereum clients and thereby make the Ethereum blockchain more reliable. By way of operation, ChaosETH:

Monitors Ethereum clients to determine their steady-state behavior
Actively injects system call invocation errors in the clients
Monitors the resulting behavior of the error injection
Compares the resulting behavior to the steady-state behavior
Produces a resilience report directly from production

Let’s get started!

Step 1: Create the development environment

Select a cloud service provider where you will host a virtual machine, or install and configure Docker. Create a virtual machine instance running Ubuntu as the OS and open port 30303. This is the default port that the Ethereum client listens on.

Step 2: Build and run the target Ethereum client

Next, grab the latest stable version of the Ethereum client. Let’s go with the Geth client.

Build the client by following the installation steps in its documentation. Chaos engineering requires observability, so you’ll need to add the options that activate Geth’s monitoring features; these are described in the metrics section of Geth’s documentation.

There are many ways to install the Geth client, depending on your operating system or tooling. In this article, we’ll use Docker, and we’ll run the command on a shell:

docker pull ethereum/client-go
# then run it with:
docker run -it -p 30303:30303 ethereum/client-go

Step 3: Create a Docker container for observability

We’ll use InfluxDB alongside the Geth client to enable monitoring functionalities. Use the following command:

docker run -p 8086:8086 -d --name influxdb -v influxdb:/var/lib/influxdb influxdb:1.8

Now, configure the InfluxDB container by executing the following commands:

docker exec -it influxdb bash

Run this command inside the container:

influx

Next, execute these commands in the InfluxDB shell:

CREATE DATABASE chaoseth
CREATE RETENTION POLICY "rp_chaoseth" ON "chaoseth" DURATION 999d REPLICATION 1 DEFAULT
CREATE USER geth WITH PASSWORD 'xxx' WITH ALL PRIVILEGES

Now the container is ready. You can proceed to run the Geth client along with the observability metrics and other options. Geth provides more than 500 different metrics from which we can choose.

The client must be run as root, including when it is restarted after previous experiments. The syscall monitor and the error injector need sudo as well.

The data directory must be specified as an option in the command so that chain data is written to the instance’s extra disk space. If this is not done, the data will be persisted to the OS drive of the instance instead.

Consistent configurations are required from a client’s peers, so we’ll specify a target number of peers; we’ll use 50 since that is the default maximum number of peers for the Geth client.

The metrics options are included to enable application-level monitoring.

Finally, you can make the Geth client run in the background to free up the terminal, and you can redirect the output to anywhere you like.

The resulting command will look like this:

sudo nohup ./geth --datadir=/data/eth-data \
  --maxpeers 50 \
  --metrics --metrics.expensive \
  --metrics.influxdb --metrics.influxdb.database DB_NAME --metrics.influxdb.username geth --metrics.influxdb.password DB_PASS \
  >> geth.log 2>&1 &
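
If you restart the client between experiments, it can help to keep its options in one place. The following Python sketch assembles the same argument list as the command above; build_geth_args and its parameters are hypothetical helpers introduced for this example, and only the flags already shown in the command are used.

```python
import shlex
from typing import List


def build_geth_args(
    datadir: str,
    db_name: str,
    db_user: str,
    db_password: str,
    max_peers: int = 50,
    geth_path: str = "./geth",
) -> List[str]:
    """Assemble the Geth argument list used in this tutorial."""
    return [
        geth_path,
        f"--datadir={datadir}",
        "--maxpeers", str(max_peers),
        "--metrics", "--metrics.expensive",
        "--metrics.influxdb",
        "--metrics.influxdb.database", db_name,
        "--metrics.influxdb.username", db_user,
        "--metrics.influxdb.password", db_password,
    ]


if __name__ == "__main__":
    args = build_geth_args("/data/eth-data", "DB_NAME", "geth", "DB_PASS")
    # shlex.join quotes each argument so the line is safe to paste in a shell.
    print(shlex.join(args))
```

The sketch only builds and prints the command. Running it under sudo, sending it to the background, and redirecting its output are left to the shell, as in the command above.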

Step 4: Sync the client and observe the metrics

The entire synchronization process takes around three days and the status can be monitored on https://ethernodes.org/.

There is a client_monitor.py script that, when deployed, observes the steady-state behavioral metrics of the client after the sync is completed. The following command attaches the client monitor to the client process and exposes the metric data as a Prometheus endpoint on port 8000:

nohup sudo ./client_monitor.py -p CLIENT_PID -m -i 15 --data-dir=CLIENT_DATA_DIR >/dev/null 2>&1 &

To have Prometheus scrape the metrics data, include the following job in your Prometheus config file:

scrape_configs:
  - job_name: 'client_monitoring'
    static_configs:
      - targets: ['172.17.0.1:8000']
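
Before relying on the scrape job, you may want to check what the endpoint returns. Prometheus scrapes a plain-text format in which each sample is a line of the form name value or name{labels} value, and comment lines start with #. The sketch below parses text in that format. The metric names in the sample payload are invented for illustration and are not the names exported by client_monitor.py.

```python
from typing import Dict


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Handles `name value` and `name{labels} value`; ignores comments,
    blank lines, and optional trailing timestamps.
    """
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Label values may contain spaces, so split after the closing brace.
            series, rest = line.rsplit("}", 1)
            series += "}"
        else:
            series, rest = line.split(None, 1)
        samples[series] = float(rest.split()[0])
    return samples


# Invented sample payload; real metric names will differ.
SAMPLE = """\
# HELP example_syscalls_total Hypothetical counter.
# TYPE example_syscalls_total counter
example_syscalls_total{syscall="read"} 1520
example_syscalls_total{syscall="write"} 870
example_memory_bytes 1.5e+09
"""

if __name__ == "__main__":
    for series, value in parse_prometheus_text(SAMPLE).items():
        print(series, value)
```

To use it against the running monitor, you would fetch the text from the monitor’s endpoint on port 8000 and pass the response body to parse_prometheus_text.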

Alternatively, you can visualize the data by creating a Grafana dashboard from the ./visualization/Grafana - Syscall Monitoring.json file.

The steady-state analysis in the original experiment shows the metrics of data captured during two different monitoring sessions.

Data Captured Chart

Conclusion

Chaos engineering and blockchain technology are both relatively new, but their importance has been proven and validated by wide adoption.

In this article, we provided an overview of chaos engineering principles, introduced the ChaosETH framework, and showed how to use it for resilience testing of a Geth client.

Implementing chaos engineering on Ethereum clients is critical for identifying potential faults that may occur during the lifecycle of a DApp or smart contract.
