Paul Mikulskis Developer advocate and systems engineer with a passion for making decentralized ecosystems approachable.

Leverage Ethereum blockchain data with JSON-RPC

Ethereum Logo Over a Brick Background

The blockchain, a concept constructed with modern-day protocols, networking, and cryptography, provides a variety of computational building blocks that open a whole new world of human interaction.

In this article, we’ll discuss how to use a remote procedure call (RPC) interface to access and interact with the vast amount of data available on the Ethereum blockchain.

Although the examples in this article are based on the Ethereum blockchain, an overwhelming number of blockchains, such as Avalanche, Polygon, BSC, and Harmony, use the same underlying state machine to keep track of their database. In other words, the concepts introduced in this article may also be applied to other blockchains that are based on the Ethereum Virtual Machine (EVM).

Blockchain database architecture

Despite any grandiose claims from any particular blockchain platform, all blockchains are fundamentally the same from an architecture perspective. Any given blockchain is physically constructed of the same types of resources, consisting of a set of independently operated nodes talking to each other via an agreed-upon protocol to expose an append-only database.

A blockchain database is not organized by tables. Instead, it is organized by blocks that are linked cryptographically.

Blockchain Basics

In order to use the data within this database of transactions to build an app or take other actions, we need the ability to look at and query this data. Additionally, and most importantly, we need a channel by which we can write to the database. Most blockchain networks accomplish these basic goals by specifying their own RPC interface.

RPC in blockchain

A remote procedure call is an interface through which one system invokes a function on another system. There are multiple types of RPC: gRPC (developed at Google), JSON-RPC, and XML-RPC, to name a few.

The RPC interface is usually implemented over HTTP and can be used to call a variety of functions. For the sake of clarity, in this article, we’ll specifically consider one type of RPC and one system. We’ll be looking at how to use JSON-RPC with the Ethereum blockchain.
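To make this concrete, here is a minimal Python sketch of what a JSON-RPC 2.0 request body looks like before it is sent over HTTP. The helper name build_request is made up for illustration and is not part of any Ethereum library; the envelope fields (jsonrpc, method, params, id) come from the JSON-RPC 2.0 specification.

```python
import itertools
import json

# Each request carries an id so the caller can match the response to it.
_request_ids = itertools.count(1)


def build_request(method, params=None):
    """Return a JSON-RPC 2.0 request object as a plain dict (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",          # protocol version, always the string "2.0"
        "method": method,          # name of the remote function to invoke
        "params": params if params is not None else [],
        "id": next(_request_ids),  # echoed back by the server in its response
    }


payload = build_request("eth_blockNumber")
print(json.dumps(payload))
```

The resulting JSON string is what would travel in the body of the HTTP POST request.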

To start, let’s look at how RPC will come into play from the perspective of the user, and how it engages the blockchain at hand (in this case, Ethereum).

Let’s say I want to send you, dear reader, 10 Finney for taking the time to soak up some blockchain knowledge.

N.B., a Finney is a sub-denomination of ether, Ethereum’s native currency: 1 Finney equals 0.001 ETH.
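Because RPC calls express amounts in wei (the smallest unit of ether) encoded as hexadecimal strings, the 10 Finney would have to be converted before it could appear in a request. A small Python sketch of that arithmetic follows; the function names are illustrative only.

```python
WEI_PER_FINNEY = 10**15   # 1 finney = 0.001 ether
WEI_PER_ETHER = 10**18    # 1 ether = 10^18 wei


def finney_to_wei(amount_finney):
    """Convert a whole number of finney to wei."""
    return amount_finney * WEI_PER_FINNEY


def to_hex_quantity(value):
    """Encode a non-negative integer the way JSON-RPC expects: 0x-prefixed hex."""
    return hex(value)


value_wei = finney_to_wei(10)
print(value_wei)                   # 10000000000000000
print(to_hex_quantity(value_wei))  # 0x2386f26fc10000
```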

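I would probably hop on my computer or my iPhone and use MetaMask to send this sum. MetaMask would use my private key (stored on the local device’s disk) to sign a transaction and then submit it via an HTTP request with the correct fields to a full node run by the MetaMask company. Because the transaction is already signed on my device, the function invoked would be eth_sendRawTransaction; its sibling, eth_sendTransaction, asks the node itself to sign with an account that the node manages.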

In this case, the full node is one of several nodes in the blockchain network running the Ethereum software. It is connected to other nodes in the network as peers, and it has the ability to validate blocks. Therefore, it can validate my transaction of 10 Finney to you.

Building upon this idea, how would I find out how much money I have? Without enough funds to cover both the 10 Finney and the gas fee, the transaction would be rejected, so it’s worth checking my balance first.

MetaMask, or any client, can make an HTTP request to an Ethereum node with an available RPC interface, invoking the eth_getBalance function with my address as a parameter.

The request would look something like this:

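# "latest" selects the most recent block; "id" lets the client match the response to this request
curl --location --request POST 'some.ethereum.full.node/' \
--header 'Content-Type: application/json' \
--data-raw '{
    "jsonrpc":"2.0",
    "method":"eth_getBalance",
    "params":[
        "0x407d73d8a49eeb85d32cf465507dd71d507100c1",
        "latest"
    ],
    "id":1
}'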
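The same call can be sketched in Python using only the standard library. This is an illustration under a few assumptions: the helper names are invented, the URL would be whatever node or provider you have access to, and the sample response below is hand-written to show the shape of a reply rather than taken from a real node. The balance comes back as a hexadecimal string denominated in wei.

```python
import json
import urllib.request

WEI_PER_ETHER = 10**18


def balance_request(address, block="latest", request_id=1):
    """Build the JSON-RPC body for eth_getBalance."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, block],
        "id": request_id,
    }


def parse_balance(response):
    """Decode the hex-encoded wei balance into (wei, ether)."""
    wei = int(response["result"], 16)
    return wei, wei / WEI_PER_ETHER


def post_json(url, payload):
    """POST a JSON body to an RPC endpoint and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


payload = balance_request("0x407d73d8a49eeb85d32cf465507dd71d507100c1")

# Hand-written sample reply (not real chain data), so the sketch runs offline.
sample_response = {"jsonrpc": "2.0", "id": 1, "result": "0x2386f26fc10000"}
wei, ether = parse_balance(sample_response)
print(wei, ether)
```

Against a live endpoint, post_json(node_url, payload) would take the place of the sample response.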

RPC can be thought of as a gateway protocol for engaging with (reading or writing to) the Ethereum network:

Gateway Network Diagram

Ethereum nodes and RPC functionality

In order to sign transactions and make requests, we’ll need access to a blockchain network server (or node) running the Ethereum software. There are three different types of nodes: full node, archive node, and light node. Each of these node types has different requirements to operate and provides different RPC functionality:

  • A light node downloads only block headers rather than the full chain and requests everything else on demand. It can hold and retrieve information about a single account. It can also conduct RPC transactions on behalf of that account with the help of a full node.
  • A full node has enough data synchronized with other nodes on the network to validate blocks and contribute to securing the database that we call the blockchain. A full node implements most of the RPC functions, such as getting raw block data for a given block number, submitting transactions, and retrieving account balances.
  • An archive node works hard to ensure it stores the entire history of the blockchain. It may or may not take part in validating blocks like full nodes. An archive node can expose richer RPC endpoints that return more detailed (or denser) data. For example, an archive node could expose a trace, which can be thought of as a transaction call stack for a given transaction in a given block.

Depending on your use case, you might be just fine operating a simple light node on a tiny computer under your desk. However, if you want to work with blockchain data, run your own indexers, or make an application that allows users to scrub through their historical balances in a 3D visualization powered by RPC blockchain data, you’ll need a full node or an archive node.

Implementing JSON-RPC with the Ethereum blockchain

There are many providers that expose JSON-RPC interfaces, such as DataHub Figment, Infura, and Moralis to name just a few. However, before you enlist these services, be aware that scaling can come at quite a cost.

You may also opt to host your own archive node using one of the available Ethereum client implementations. Many open source clients come with a built-in RPC service that can read the data the client builds up on your server, making setup a breeze.

Erigon is currently one of the most popular clients for synchronizing with the Ethereum blockchain and exposing a data-rich RPC interface. Its popularity is due to its speed and the optimizations it offers for Ethereum-specific data.

Erigon has many interesting properties that I hope to cover in more depth soon, but for now, we can consider it to be one of the best available Ethereum-specific implementations of an archive node. Running Erigon enables you to query an RPC interface as quickly as your hardware can keep up. For example, you may want to use this method to practice indexing blockchain data for lightning-fast query time on The Graph Network.

Of course, there are other more established, or more popular, Ethereum client implementations that can provide a JSON-RPC interface. One example is Geth, the aptly named Go implementation of Ethereum.

Running these sorts of nodes does require a punchy set of hardware. Just for starting figures, Erigon should be given at least 8GiB of RAM and an i7 CPU to synchronize the Ethereum blockchain in about one week. At the end of the week, you should expect 2TB of disk to be used. By comparison, Geth will take about a month to synchronize and will eat up about 10TB of disk. Yikes!

The ramifications of your node going down could be catastrophic if your business depends on such data. In this type of scenario, waiting a month to re-synchronize would be ridiculous. Running multiple nodes becomes an engineering challenge in itself. This option would be costly not only in time but also in raw hardware since these clients strongly prefer to be run on SSDs (with Erigon requiring this to hold true). Now you understand why I might want to connect with a provider in order to send over my 10 Finney!

Whether you choose to use a provider or run your own client and node, the goal is the same: connect to the decentralized database, and interact with it.

Now, let’s take a look at what some of Ethereum’s raw block data might look like from an RPC call!

Examining Ethereum block data from a JSON-RPC call

Let’s consider the following HTTP request to a server offering an Ethereum JSON-RPC API:

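# The second parameter (true) asks for full transaction objects instead of just transaction hashes
curl --location --request POST 'some.ethereum.node' \
--header 'Content-Type: application/json' \
--data-raw '{
    "jsonrpc":"2.0",
    "method":"eth_getBlockByNumber",
    "params":[
        "0x4E4ee",
        true
    ],
    "id":1
}'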

Here, we’re calling the eth_getBlockByNumber function with the parameter 0x4E4ee, which is 320750 in decimal. If we were to shoot this request out to an RPC node, we could expect the body of the returned data to look something like this:

Returned Data Diagram

The above diagram illustrates how the raw block data might look. As indicated in the diagram’s legend, the strings are designated with yellow, and the numbers are designated with green. The values shown in blue are technically strings, but to be precise, they are numbers encoded as hexadecimal values.

Ethereum lives and breathes hex when it comes to storing its data in its state via a Merkle Patricia Trie. The RPC interface wants to return that truly raw data, and nothing less, thus warranting a hexadecimal expression for a variety of values.
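In practice, a client usually converts those hexadecimal quantities back into ordinary integers before working with them. The Python sketch below shows one way to do that; the helper name and the list of fields to decode are choices made for this illustration, and the sample block contains hand-written values rather than data from a real block.

```python
# Block fields that hold numbers encoded as 0x-prefixed hex strings.
QUANTITY_FIELDS = ("number", "timestamp", "gasUsed", "gasLimit", "size")


def decode_quantities(block, fields=QUANTITY_FIELDS):
    """Return a copy of the block with the chosen hex quantities as ints."""
    decoded = dict(block)
    for field in fields:
        value = decoded.get(field)
        if isinstance(value, str):
            decoded[field] = int(value, 16)
    return decoded


# Hand-written illustrative values; this is not a real block.
sample_block = {
    "number": "0x4e4ee",
    "timestamp": "0x5d866fea",
    "gasUsed": "0x5208",
    "miner": "0x0000000000000000000000000000000000000000",
}

decoded = decode_quantities(sample_block)
print(decoded["number"], decoded["timestamp"], decoded["gasUsed"])
```

Fields such as miner or hash are left untouched because they are identifiers rather than quantities.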

By calling the eth_getBlockByNumber function in the raw body of our POST request to this RPC endpoint, we get information about that block (such as you’d expect to see in an Ethereum block header), along with a list of all the transactions in that block.
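As a rough Python sketch, the request body can be built from a decimal block number, and the reply reduced to a short summary. The function names are invented for this example, and the sample response is a hand-written stand-in for what a node might return, trimmed to a few fields with placeholder transaction hashes.

```python
def block_by_number_request(block_number, full_transactions=True, request_id=1):
    """Build the JSON-RPC body for eth_getBlockByNumber from a decimal number."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBlockByNumber",
        # Block numbers travel as hex strings; the boolean selects full
        # transaction objects (True) or only transaction hashes (False).
        "params": [hex(block_number), full_transactions],
        "id": request_id,
    }


def summarize_block(response):
    """Pull a few decoded header values and the transaction count from a reply."""
    block = response["result"]
    return {
        "number": int(block["number"], 16),
        "timestamp": int(block["timestamp"], 16),
        "transaction_count": len(block["transactions"]),
    }


payload = block_by_number_request(320750)
print(payload["params"])  # ['0x4e4ee', True]

# Hand-written sample reply with placeholder hashes (not real chain data).
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "number": "0x4e4ee",
        "timestamp": "0x5d866fea",
        "transactions": [{"hash": "0xaaa"}, {"hash": "0xbbb"}],
    },
}
print(summarize_block(sample_response))
```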

To find transactions that you sent on September 21, 2019, for example, you could query blocks until the timestamp field falls on the query date, expressed as a Unix timestamp such as 1569091562 (returned by the node in hexadecimal form). From there, you would check each block around that particular timestamp to detect all the transactions you sent.
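Since block timestamps increase with block number, one way to cut down the number of requests is a binary search over block numbers instead of a linear scan. The Python sketch below assumes a caller-supplied get_timestamp function (hypothetical; in practice it would wrap an eth_getBlockByNumber call and decode the timestamp field) and is exercised here against a small fake chain so that it runs offline.

```python
def first_block_at_or_after(target_timestamp, latest_block, get_timestamp):
    """Binary-search for the lowest block whose timestamp is >= the target.

    get_timestamp(block_number) must return that block's Unix timestamp.
    Returns None if even the latest block is older than the target.
    """
    if get_timestamp(latest_block) < target_timestamp:
        return None
    low, high = 0, latest_block
    while low < high:
        middle = (low + high) // 2
        if get_timestamp(middle) < target_timestamp:
            low = middle + 1   # target lies strictly after the middle block
        else:
            high = middle      # middle block qualifies; look for an earlier one
    return low


# Fake chain for demonstration: 100 blocks, one every 15 seconds.
fake_timestamps = [1000 + 15 * number for number in range(100)]
lookups = []


def fake_get_timestamp(block_number):
    lookups.append(block_number)  # count how many "requests" were needed
    return fake_timestamps[block_number]


found = first_block_at_or_after(1601, 99, fake_get_timestamp)
print(found, len(lookups))
```

On the fake chain, the search settles on block 41 after a handful of lookups, where a linear scan from the start would have needed dozens.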

If this sounds like a lot of HTTP requests to you, you’re correct! Here’s the problem – the data available via the RPC interfaces is not very rich and has limited indexes built up. This data is not easily searchable, and a simple query like the one above could quickly turn into a “needle in a haystack” endeavor.

To address this issue, some will use an indexing service to index their own data, siphoned out from an RPC node via its raw data as exposed above. Another option is to work with a software solution, such as The Graph, and pay other Indexers to do the heavy lifting via exposing a custom set of indexed fields from this raw gobbledygook of unlimited block data from the RPC interface.
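To give a feel for what an indexer does, here is a deliberately tiny Python sketch that builds an in-memory index from sender address to transaction hashes. It assumes the blocks were already fetched with full transaction objects; the function name is invented, the addresses and hashes are short placeholders, and a real indexing service would persist the index in a database rather than a dict.

```python
from collections import defaultdict


def index_by_sender(blocks):
    """Map each sender address (lowercased) to the hashes of its transactions."""
    index = defaultdict(list)
    for block in blocks:
        for transaction in block.get("transactions", []):
            # Hash-only transaction lists (plain strings) carry no sender.
            if isinstance(transaction, dict):
                index[transaction["from"].lower()].append(transaction["hash"])
    return dict(index)


# Placeholder blocks for demonstration; addresses and hashes are not real.
sample_blocks = [
    {"number": "0x1", "transactions": [
        {"hash": "0xt1", "from": "0xAlice"},
        {"hash": "0xt2", "from": "0xbob"},
    ]},
    {"number": "0x2", "transactions": [
        {"hash": "0xt3", "from": "0xalice"},
        "0xt4",
    ]},
]

index = index_by_sender(sample_blocks)
print(index["0xalice"])  # ['0xt1', '0xt3']
```

Once such an index exists, answering "which transactions did this address send?" becomes a single lookup instead of a scan over every block.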

Either way, it’s clear that the most fundamental way that clients communicate with blockchain nodes is over RPC. This raw-level interface is well suited for keeping the network operating smoothly via its actionable write operations, such as sending a transaction, as well as letting people read data from this globally decentralized database, as shown above when grabbing a piece of block-level data.

We previously stated that RPC has a limited amount of read functionality because indexing the raw data in detail can pose an extremely challenging task for the RPC node at hand, and consequently the network. Indexing is a different problem for network users to solve than network-level communication.

Conclusion

In this article, we explored JSON-RPC and blockchain data, including how to access and interact with the raw data. If you want to get your feet wet with some real-world blockchain RPC calls and data without spinning up your own infrastructure, head over to one of the hosted RPC providers mentioned earlier. Open a free account, and try it out!

Here are some interesting Ethereum JSON-RPC calls to get you started. Check out the official Postman documentation for a full list.

  • net_version: Gets the current network ID
  • net_peerCount: Gets the number of active peers to which the RPC blockchain node is connected
  • eth_gasPrice: Gets the current price of gas, in wei
  • eth_blockNumber: Returns the number (height) of the most recent block
  • eth_getBalance: Returns the balance of the account of a given address
  • eth_sendTransaction: Creates a new message call transaction or a contract creation, if the data field contains code
  • eth_getBlockByNumber: Returns information about a block, by block number
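
Several of the calls above take no parameters, so they are easy to try together. The Python sketch below builds them as a JSON-RPC 2.0 batch (an array of requests) and pairs each reply with its method using the id field, since the specification allows batch replies to arrive in any order. Whether a given node or provider accepts batches is an assumption to verify; the helper names are invented and the sample replies are hand-written.

```python
NO_PARAMETER_METHODS = ["net_version", "net_peerCount", "eth_gasPrice", "eth_blockNumber"]


def build_batch(methods):
    """Build one JSON-RPC request per method, with ids 1, 2, 3, ..."""
    return [
        {"jsonrpc": "2.0", "method": method, "params": [], "id": request_id}
        for request_id, method in enumerate(methods, start=1)
    ]


def match_responses(batch, responses):
    """Pair each method name with its result by matching on the id field."""
    results_by_id = {reply["id"]: reply.get("result") for reply in responses}
    return {request["method"]: results_by_id.get(request["id"]) for request in batch}


batch = build_batch(NO_PARAMETER_METHODS)

# Hand-written sample replies, deliberately out of order (not real chain data).
sample_responses = [
    {"jsonrpc": "2.0", "id": 3, "result": "0x3b9aca00"},
    {"jsonrpc": "2.0", "id": 1, "result": "1"},
    {"jsonrpc": "2.0", "id": 4, "result": "0x4e4ee"},
    {"jsonrpc": "2.0", "id": 2, "result": "0x19"},
]

results = match_responses(batch, sample_responses)
print(int(results["eth_gasPrice"], 16))     # gas price in wei
print(int(results["eth_blockNumber"], 16))  # latest block height
```

Note that net_version replies with a decimal string, while the other three reply with hexadecimal quantities.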