Why I built OnchainProbe

TL;DR: I built OnchainProbe after years of maintaining go-ethereum forks and repeatedly seeing RPC behavior change after upgrades. I needed behavioral regression testing for RPC endpoints that could compare real responses before and after merges or forks. Nobody had built that tool, so I did.

I’ve spent most of my career maintaining go-ethereum forks. Ethereum Classic, CoreDAO, Ebakus. Different chains, different protocol modifications, but always the same core challenge: keeping a customized client alive and correct while upstream moves fast. You’ll notice geth comes up a lot in our blog. That’s simply because my experience is rooted in go-ethereum. The problems and solutions apply equally to any EVM client.

The work is some of the most demanding in blockchain engineering. You’re carrying local protocol changes like gas schedule tweaks, custom opcodes, and consensus modifications while periodically merging months of upstream geth development. Every merge is a high-wire act. Every hard fork activation is a moment where you hold your breath and watch the network.

I’ve done this enough times to know where the real risk lives. And it’s not where most people think.

The testing gap nobody talks about

Early in my career, I learned a principle that stuck: never trust local tests alone.

A blockchain client behaves differently when interacting with other nodes, validators, contracts, and real network traffic. Unit tests verify isolated logic. They don’t tell you whether your fork’s modifications interact correctly with the latest upstream changes under real conditions.

Over the years, I developed a layered validation approach:

1. Unit and regression tests for isolated logic
2. Sync verification. Syncing four nodes after every merge: testnet fast, testnet full archive, mainnet fast, mainnet full archive. If any of them fail or stall, something is wrong.
3. Devnet testing. Deploying a private network to create targeted scenarios and sharpen your testing tools against the specific changes you brought in.
4. Scenario-based testing. What happens 10 blocks before a fork, at the fork boundary, and 10 blocks after.
5. Performance and stress testing. Heavy transaction loads, large blocks, edge cases.
6. Full monitoring and alerts once deployed
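
To make the scenario-based layer concrete, here is a minimal sketch of how the three probe points around a fork could be turned into JSON-RPC requests. The function name, the example fork height, and the choice of eth_getBlockByNumber as the probe are illustrative assumptions, not a description of any particular tool.

```python
import json


def fork_boundary_requests(fork_block: int, window: int = 10) -> list[dict]:
    """Build eth_getBlockByNumber requests before, at, and after a fork height.

    Hypothetical helper: `fork_block` and `window` are illustrative parameters.
    """
    if fork_block < window:
        raise ValueError("fork_block must be at least `window` blocks from genesis")
    heights = [fork_block - window, fork_block, fork_block + window]
    return [
        {
            "jsonrpc": "2.0",
            "id": i,
            "method": "eth_getBlockByNumber",
            # Block numbers are hex-encoded quantities; False means "no full transactions".
            "params": [hex(height), False],
        }
        for i, height in enumerate(heights, start=1)
    ]


if __name__ == "__main__":
    # 1_000_000 is a made-up fork height used only for demonstration.
    for request in fork_boundary_requests(1_000_000):
        print(json.dumps(request))
```

Sending the same three requests to a node before and after the upgrade gives you comparable snapshots on both sides of the boundary.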

I’ve done this across more than 15 upstream merges over my career, including several that spanned months of upstream changes. When you break the work into manageable pieces and validate at every layer, you can bring even a significantly outdated client up to date without network instability.

But there was always a gap in this stack.

The missing layer

Steps 1 and 2 have good tooling. Step 6 has mature monitoring solutions. But the middle layers, the ones that verify actual RPC behavior after an upgrade, were always manual. Custom scripts. Curl commands saved in someone’s notes. Ad-hoc comparisons that someone ran once and never maintained.

Every time I merged a large upstream batch, the process looked like this:

1. Merge the code. Fix conflicts. Run the test suite. Green.
2. Sync four nodes. Wait. They all finish. Good.
3. Deploy a devnet. Manually run RPC calls. Looks fine.
4. Deploy to mainnet. Watch logs. Hope nothing breaks.
5. Two days later, discover that eth_estimateGas returns different values, or a receipt field changed format, or a trace output was restructured.

The unit tests passed because they tested internal logic. But the external behavior, the RPC interface that every application, wallet, and service depends on, was never systematically verified.

I kept thinking: someone must have built a tool for this. A way to baseline RPC responses before an upgrade, run the same calls after, and get a clear diff of what changed. Field by field. Method by method.

Nobody had.

Building what I needed

OnchainProbe started as the tool I wished existed during every upstream merge and hard fork activation I’d ever done.

The core idea: behavioral testing for RPC endpoints. Not testing code paths. Testing outputs. You point OnchainProbe at your endpoints, it runs structured test suites against them, and you get field-level diffs showing exactly where behavior diverges from the spec or from your reference endpoint.
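
As an illustration of what a field-level diff means here, the following is a minimal sketch that walks two decoded JSON responses and reports every path where they differ. It is not OnchainProbe's implementation; the names diff_fields and MISSING and the sample receipts are hypothetical.

```python
class _Missing:
    """Sentinel for a field that exists on one side only."""

    def __repr__(self) -> str:
        return "<missing>"


MISSING = _Missing()


def diff_fields(baseline, candidate, path="$"):
    """Return (path, baseline_value, candidate_value) for every differing field."""
    if isinstance(baseline, dict) and isinstance(candidate, dict):
        diffs = []
        for key in sorted(baseline.keys() | candidate.keys()):
            diffs += diff_fields(
                baseline.get(key, MISSING),
                candidate.get(key, MISSING),
                f"{path}.{key}",
            )
        return diffs
    if (
        isinstance(baseline, list)
        and isinstance(candidate, list)
        and len(baseline) == len(candidate)
    ):
        diffs = []
        for index, (b, c) in enumerate(zip(baseline, candidate)):
            diffs += diff_fields(b, c, f"{path}[{index}]")
        return diffs
    # Leaves, type mismatches, and lists of different length are compared whole.
    return [] if baseline == candidate else [(path, baseline, candidate)]


if __name__ == "__main__":
    before = {"status": "0x1", "gasUsed": "0x5208", "logs": []}
    after = {"status": "0x1", "gasUsed": "0x5300", "logs": [], "type": "0x2"}
    for path, old, new in diff_fields(before, after):
        print(f"{path}: {old!r} -> {new!r}")
```

Reporting a path for each difference, rather than a single "responses differ" flag, is what lets a changed receipt field stand out from an unchanged one in the same response.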

It goes beyond one-off comparisons. You can run multiple endpoints side by side in the same test run, so you see exactly where your fork disagrees with upstream geth, or where provider A disagrees with provider B. It validates EIP compliance, tests transaction handling, checks error response formats, and runs on a schedule or from CI so regressions surface before they reach production.
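
A side-by-side comparison can be sketched as grouping endpoints by the response they returned for the same call. This is a simplified illustration under assumed inputs; the endpoint labels and the group_by_response helper are hypothetical.

```python
import json
from collections import defaultdict


def group_by_response(results: dict) -> list[list[str]]:
    """Group endpoint names by identical result, largest group first.

    `results` maps an endpoint label to the decoded JSON result it returned.
    """
    groups = defaultdict(list)
    for endpoint, result in results.items():
        # Canonical JSON so that key order inside objects does not matter.
        groups[json.dumps(result, sort_keys=True)].append(endpoint)
    return sorted(groups.values(), key=len, reverse=True)


if __name__ == "__main__":
    # Made-up eth_estimateGas results for the same call on three endpoints.
    results = {"fork": "0x5300", "upstream": "0x5208", "provider-a": "0x5208"}
    for group in group_by_response(results):
        print(group)
```

When every endpoint agrees there is a single group; more than one group means at least one endpoint diverges and is worth a field-level look.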

A gas estimate shifted by 15%? A new field appeared in block responses? An error response violated the JSON-RPC 2.0 spec? You see it immediately, with exact diffs and clear verdicts.
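
The gas-estimate case comes down to simple arithmetic on two hex quantities. Here is a minimal sketch, assuming a 10% tolerance chosen purely for illustration; the function names and the pass/fail wording are hypothetical.

```python
def gas_drift(baseline_hex: str, candidate_hex: str) -> float:
    """Relative change of a gas estimate, e.g. 0.15 for a 15% increase."""
    baseline = int(baseline_hex, 16)
    candidate = int(candidate_hex, 16)
    if baseline == 0:
        raise ValueError("baseline estimate is zero; relative drift is undefined")
    return (candidate - baseline) / baseline


def gas_verdict(baseline_hex: str, candidate_hex: str, tolerance: float = 0.10) -> str:
    """Return 'pass' when the estimate moved by no more than `tolerance`."""
    drift = gas_drift(baseline_hex, candidate_hex)
    return "pass" if abs(drift) <= tolerance else "fail"


if __name__ == "__main__":
    # 0x186a0 is 100000 and 0x1c138 is 115000: a 15% increase.
    print(f"{gas_drift('0x186a0', '0x1c138'):.2%}")
    print(gas_verdict("0x186a0", "0x1c138"))
```

The right tolerance depends on how much headroom your applications add on top of eth_estimateGas, so treat the default above as a placeholder.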

I built it because I knew exactly what the workflow needed to be. I’d been doing it manually for years. The questions were always the same:

Did this merge change any RPC responses?
Do my two endpoints still agree on the same calls?
Is the behavior at the fork boundary what I expect?
Are gas estimates still within the range my applications assume?

OnchainProbe answers all of these automatically.

These aren’t hypothetical problems

In our own testing, we’ve found issues that would have been invisible without structured comparison:

A major L1 testnet where eth_estimateGas underestimates by 43% on contract interactions. The chain’s own documentation recommends a 20-30% buffer for state-dependent inaccuracy, but this gap is deterministic and undocumented.
An RPC proxy returning every error response as a plain string instead of the spec-required JSON-RPC 2.0 error object, silently breaking standard client libraries.
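
The second finding is easy to check mechanically, because JSON-RPC 2.0 requires the error member to be an object with an integer code and a string message. Below is a minimal sketch of such a check; the function name and the sample payloads are illustrative and are not the actual responses from that proxy.

```python
def check_error_response(response) -> list[str]:
    """Return a list of JSON-RPC 2.0 violations found in an error response."""
    if not isinstance(response, dict):
        return ["response is not a JSON object"]
    problems = []
    if response.get("jsonrpc") != "2.0":
        problems.append("'jsonrpc' must be the string \"2.0\"")
    if "result" in response:
        problems.append("an error response must not contain 'result'")
    error = response.get("error")
    if not isinstance(error, dict):
        problems.append(f"'error' must be an object, got {type(error).__name__}")
        return problems
    code = error.get("code")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(code, bool) or not isinstance(code, int):
        problems.append("'error.code' must be an integer")
    if not isinstance(error.get("message"), str):
        problems.append("'error.message' must be a string")
    return problems


if __name__ == "__main__":
    # Illustrative payloads: a plain-string error and a well-formed one.
    bad = {"jsonrpc": "2.0", "id": 1, "error": "execution reverted"}
    good = {"jsonrpc": "2.0", "id": 1, "error": {"code": -32000, "message": "execution reverted"}}
    print(check_error_response(bad))
    print(check_error_response(good))
```

Client libraries that read the code and message members of the error object will fail on the first payload, which is why this kind of deviation breaks them silently.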

These are real findings from real endpoints. The kind of issues that live quietly in production until someone’s transaction fails or their monitoring breaks.

Who this is for

OnchainProbe is for anyone responsible for RPC endpoint correctness:

Fork maintainers merging upstream changes and activating hard forks
L2/L3 teams validating their RPC layer against EVM standards
RPC providers ensuring multi-client consistency across their fleet
Protocol teams verifying behavior before and after upgrades
QA and DevOps teams monitoring endpoint health continuously

If you’ve ever written a curl script to spot-check an endpoint after a deploy, you know the workflow. OnchainProbe is what that workflow looks like when it’s systematic.

Start testing your endpoints today.