Avalanche Unpatched Zero Day Leaves Protocol in Danger

Late in the evening on February 9th, 2022, I received an alert on Telegram from a 'network security' focused channel that read, "Avalanche Blockchain Vulnerable to 0day DoS".

Considering that Avalanche is trading with a total market cap in excess of $5.5 billion at the time of writing, news of this nature immediately struck me as worthy of investigation.

After I did a cursory search on Google, Reddit, Telegram, Twitter and other social media to see if any other outlets or entities had mentioned the situation, I was a bit confused to see there was absolutely no discussion on the matter.

Thus, I took it upon myself to consult the OG source of this information.

As this report will show, despite the widespread lack of coverage and attention paid to this issue, the individual who originally made these findings stumbled upon a bona fide denial of service bug capable of knocking the entire 'P chain' offline.

Consulting the OG Intel Report

Before moving forward, all credit for this discovery should be given to the author of the original report, whom we're about to familiarize ourselves with. If there are any bounties, rewards, etc., that are to be paid out in pursuance of this piece, I kindly request that you forward them to the aforementioned researcher ahead of me. Thank you.

Following the link from the Telegram channel that initially alerted me about this discovery, I was able to find the original author of this report.

https://twitter.com/123456; curiously, they're followed by Huobi and Brock Pierce, among others, so this clearly isn't a "random" of some sort

The specific tweet in question that notes the zero day is linked below:

https://twitter.com/123456/status/1623855584469561344?s=20&t=uQJLfRNQxqzP50db8maEuQ

As we can see above, the author didn't mince any words in their description of the alleged issues on Avalanche's blockchain. They also included a link to a longer form piece they published that details their findings in greater depth.

We're going to go take a peek at that report now.

Digging into the Vuln Analysis Report

This report is admittedly a much more interesting read than most proofs of concept you'll ever find.

The report starts with the following:

"Avalanche just fucked me out of a sizable bug bounty — so I immediately found another bug to disclose to the public. This is a remote API DoS/crash that should OOM chain P and render a vulnerable node mostly or entirely useless."

Sticking to the facts here, the claims made with respect to the identified vulnerability were:

The issue leaves Avalanche's API endpoint open to being taken offline by a denial of service attack that exhausts the endpoint's resources.
Exhausting those resources by exploiting this vulnerability could "OOM chain P".

"OOM Chain P"?

To understand this, we're going to need to (lightly) cover how Avalanche works for a second.

The primary difference between Avalanche's design as a 'blockchain' versus others in this space that we've grown familiar with over time is that Avalanche has three different chains that users in their ecosystem can interact with.

From Avalanche's documentation: "Avalanche features 3 built-in blockchains: Exchange Chain (X-Chain), Platform Chain (P-Chain), and Contract Chain (C-Chain). All 3 blockchains are validated and secured by all Avalanche validators which is also referred as the Primary Network. The Primary Network is a special Subnet, and all members of all custom Subnets must also be a member of the Primary Network by staking at least 2,000 AVAX."

That description was also supplemented with an accompanying graphic (re-published below for convenience):

The following screenshot from Avalanche's docs provides a bit more insight into each individual chain's role in the ecosystem.

Quick Takeaways on Avalanche Chain Structure (Context)

This isn't a tangent. Taking the time to understand the role of the 'P Chain' within this ecosystem will help inform our perspective later, when we assess the magnitude of the issue.

Based on the brief overview from above, we now know that:

All 3 chains that run on Avalanche are secured by the validator set that the 'P Chain' keeps track of. Remember, the documentation we just read stated, "All 3 blockchains are validated and secured by all Avalanche validators which is also referred as the Primary Network" (strictly speaking, the Primary Network is that validator set, and the 'P Chain' is the chain that coordinates it)
The existence of each 'subnet' on the protocol depends on the continued functioning of the 'P Chain'
Validator staking and membership, which consensus on every chain relies on, are coordinated on the 'P Chain'.

With all of the above in consideration, it's clear that any takedown of the 'P Chain' would effectively render the entire blockchain itself inoperable.

What Does 'OOM' Mean?

OOM = Out of Memory

In the next part of this case study, we will soon see how relevant this is to the overall attack vector.

Analyzing the Actual Attack

In the blog post, the author put up a link to a YouTube video showing the specific issue at hand.

https://www.youtube.com/watch?v=Pg9w_g4Cv9A

What the video is showing is:

Using a software tool called 'Burp Suite' to craft a very specific API request directed at an endpoint.
The API command is a simple 'POST' request, which asks the endpoint to return a list of validators.
The endpoint does so, but the response appears to be massive.

The issue is evident from the video alone (if you don't get it, we'll break down why further along). Also, based on what we can see in the video, the mitigations that Avalanche allegedly suggested to the author of this research will do nothing to mitigate this issue.

Below is a semantic explanation from the author of what occurred in the video we just watched above:

The author is right in their assessment of the situation.

We'll also find that there are a few other issues with the current situation on Avalanche, and see why some of the mitigations they've already put in place could be bypassed with relative ease.

Running Our Own Tests on the Avalanche Main API Endpoint

No malicious or nefarious actions were taken by the author of this study. This exercise did not result in actually disabling the Avalanche API endpoint (as that would just be an outright attack on the network at that point). This testing also did not require crafting any purposefully malicious inputs or behaving in a way outside of what has been explicitly specified in Avalanche's documentation.

With that disclaimer out of the way, let's get started.

While the research proffered by the OG intelligence report appeared more than credible, it's good practice to roll your sleeves up and get your 'hands dirty' as well if you have the chance.

Doing so allows you to independently verify that what's been alleged is actually the case and may also enable you to discover a few supplementary facts or developed perspectives not present in the initial report.

Avalanche API Documentation

From here, our first trip is to Avalanche's API documentation.

The documentation starts off by letting us know that, "There is a public API server that allows developers to access the Avalanche network without having to run a node themselves. The public API server is actually several AvalancheGo nodes behind a load balancer to ensure high availability and high request throughput."

As they note, their public API server is hosted at api.avax.network for the Avalanche mainnet. To make a call that draws data from one of the 3 chains, one has to append the path for that chain to the base API URL.

Thus, there are three potential Avalanche API endpoint URLs (which one you call depends on the chain you want to interact with), and they are:

C-Chain = https://api.avax.network/ext/bc/C/rpc
X-Chain = https://api.avax.network/ext/bc/X
P-Chain = https://api.avax.network/ext/bc/P
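To make the URL scheme concrete, here is a minimal sketch that maps a chain letter to its endpoint. The host and paths are copied from the list above; the helper name `endpoint_for` is a hypothetical of mine, not something from Avalanche's tooling.

```python
# Host and chain paths are taken from the list above.
BASE_URL = "https://api.avax.network"

CHAIN_PATHS = {
    "C": "/ext/bc/C/rpc",
    "X": "/ext/bc/X",
    "P": "/ext/bc/P",
}


def endpoint_for(chain: str) -> str:
    """Return the public API URL for a chain letter (C, X or P)."""
    try:
        return BASE_URL + CHAIN_PATHS[chain.upper()]
    except KeyError:
        raise ValueError(f"unknown chain: {chain!r}") from None


print(endpoint_for("P"))  # https://api.avax.network/ext/bc/P
```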

Obviously, we're most interested in the third option.

Finding the API Documentation Containing Methods for Calling the 'P Chain'

Fortunately, this is all on the same site hosting this documentation right here. More specifically, the API call that we're interested in looking a bit closer at is the platform.getCurrentValidators method.

This method is defined as being used to "list the current validators of the given subnet". However, what we'll see is that this API request actually works much differently in practice (which is a flaw in Avalanche's API design).

Below is the 'signature' (basically a list of all the different parameters & subparams that can be extracted using this call on the endpoint):

platform.getCurrentValidators({
    subnetID: string, // optional
    nodeIDs: string[], // optional
}) -> {
    validators: []{
        txID: string,
        startTime: string,
        endTime: string,
        stakeAmount: string,
        nodeID: string,
        weight: string,
        validationRewardOwner: {
            locktime: string,
            threshold: string,
            addresses: string[]
        },
        delegationRewardOwner: {
            locktime: string,
            threshold: string,
            addresses: string[]
        },
        potentialReward: string,
        delegationFee: string,
        uptime: string,
        connected: bool,
        signer: {
            publicKey: string,
            proofOfPosession: string
        },
        delegators: []{
            txID: string,
            startTime: string,
            endTime: string,
            stakeAmount: string,
            nodeID: string,
            rewardOwner: {
                locktime: string,
                threshold: string,
                addresses: string[]
            },
            potentialReward: string,
        }
    }
}

As noted above, this method is defined as being purposed for returning back the "current validators of the given Subnet".

But if we look at how the various parameters are defined and their usage within the context of this API call, the cause of the issue becomes readily apparent.

Specifically, one would imagine that the subnetID param would be required for this call to evaluate successfully on the endpoint, but it is not. The docs define subnetID as "the subnet whose current validators are returned", but go on to state that "if omitted, returns the current validators of the Primary Network" (the validator set tracked on the P Chain).

Similarly, the 'nodeIDs' param is defined as a "list of the NodeIDs of the current validators to request", but "if omitted, all current validators are returned".
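A small sketch of what those optional parameters mean for the request body. The function `build_request` and the NodeID value are hypothetical placeholders of mine; only the method name and the two parameter names come from the documented signature.

```python
import json


def build_request(subnet_id=None, node_ids=None, request_id=1):
    """Build a JSON-RPC 2.0 body for platform.getCurrentValidators.

    Both parameters are optional, mirroring the documented signature.
    """
    params = {}
    if subnet_id is not None:
        params["subnetID"] = subnet_id
    if node_ids:
        params["nodeIDs"] = list(node_ids)
    return {
        "jsonrpc": "2.0",
        "method": "platform.getCurrentValidators",
        "params": params,
        "id": request_id,
    }


# No filters: per the docs, this asks for every current validator
# of the Primary Network.
unfiltered = build_request()

# Narrowed to a single validator. "NodeID-EXAMPLE" is a made-up placeholder.
filtered = build_request(node_ids=["NodeID-EXAMPLE"])

print(json.dumps(unfiltered))
print(json.dumps(filtered))
```

The point of the comparison is that the smallest possible request (empty params) is also the one that asks for the most data.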

Below is an excerpt from the 'signature' for this API method that defines all of the information that accompanies each validator.

validators: []{
    txID: string,
    startTime: string,
    endTime: string,
    stakeAmount: string,
    nodeID: string,
    weight: string,
    validationRewardOwner: {
        locktime: string,
        threshold: string,
        addresses: string[]
    },
    delegationRewardOwner: {
        locktime: string,
        threshold: string,
        addresses: string[]
    },
    potentialReward: string,
    delegationFee: string,
    uptime: string,
    connected: bool,
    signer: {
        publicKey: string,
        proofOfPosession: string
    },
    delegators: []{
        txID: string,
        startTime: string,
        endTime: string,
        stakeAmount: string,
        nodeID: string,
        rewardOwner: {
            locktime: string,
            threshold: string,
            addresses: string[]
        },
        potentialReward: string,
    }
}
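To get a feel for how quickly a structure like this grows, the sketch below builds synthetic records shaped like the signature and measures their serialized size. Every value in it (field contents, string lengths, validator and delegator counts) is an arbitrary placeholder I picked for illustration, not data from the Avalanche network, so the absolute numbers it prints mean nothing. Only the growth pattern is of interest.

```python
import json


def fake_owner():
    # Placeholder values; real addresses and lock times will differ.
    return {"locktime": "0", "threshold": "1", "addresses": ["P-avax1" + "x" * 38]}


def fake_delegator():
    return {
        "txID": "t" * 49,
        "startTime": "1676000000",
        "endTime": "1678000000",
        "stakeAmount": "25000000000",
        "nodeID": "NodeID-" + "n" * 33,
        "rewardOwner": fake_owner(),
        "potentialReward": "1000000",
    }


def fake_validator(delegator_count):
    return {
        "txID": "t" * 49,
        "startTime": "1676000000",
        "endTime": "1678000000",
        "stakeAmount": "2000000000000",
        "nodeID": "NodeID-" + "n" * 33,
        "weight": "2000000000000",
        "validationRewardOwner": fake_owner(),
        "delegationRewardOwner": fake_owner(),
        "potentialReward": "1000000",
        "delegationFee": "2.0000",
        "uptime": "1.0000",
        "connected": True,
        # Key spelled as it appears in the signature above.
        "signer": {"publicKey": "0x" + "a" * 96, "proofOfPosession": "0x" + "b" * 192},
        "delegators": [fake_delegator() for _ in range(delegator_count)],
    }


def response_size(validators, delegators_per_validator):
    """Serialized size in bytes of a synthetic getCurrentValidators response."""
    body = {
        "jsonrpc": "2.0",
        "result": {
            "validators": [
                fake_validator(delegators_per_validator) for _ in range(validators)
            ]
        },
        "id": 1,
    }
    return len(json.dumps(body).encode("utf-8"))


# Arbitrary counts, chosen only to show the trend.
for v, d in [(10, 0), (10, 50), (1000, 50)]:
    print(f"{v} validators x {d} delegators -> {response_size(v, d)} bytes")
```

Because every validator carries its own nested delegators list, the size scales with validators times delegators rather than with validators alone.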

Executing This API Call on the Avalanche Endpoint

For the sake of exploration, I decided to go ahead and execute this call on the provided Avalanche endpoint.

The sample curl command given by Avalanche for this method was:

curl -X POST --data '{
    "jsonrpc": "2.0",
    "method": "platform.getCurrentValidators",
    "params": {},
    "id": 1
}' -H 'content-type:application/json;' 127.0.0.1:9650/ext/bc/P

We're going to change this command slightly so that:

The Avalanche endpoint is inserted in the place of the IP address
The curl command is formatted a bit more coherently for the sake of portability and readability.

Without wasting time detailing those efforts, I have that curl command published below.

curl --location --request POST 'https://api.avax.network/ext/bc/P' \
--header 'Content-Type: application/json' \
--data-raw '{
  "id":2,
  "method":"platform.getCurrentValidators",
  "jsonrpc":"2.0"
}'

For starters, I (foolishly) ran this in a naked terminal. Just for kicks, to see what would happen.

As we can see from the GIF above, the response that's generated from the endpoint is absolutely massive.

Granted, a 1015 error is returned every time I attempt to run the command again on the same device (likely due to strong rate limiting to try to mitigate the situation), but this is something that can be bypassed with ease as we'll see later.

Running the Command in Postman

After seeing the type of output that was generated by the response when I pinged the endpoint, I decided to try the curl command in Postman to see if I could get a gauge of how large the response was.

The result was actually pretty nuts.

For those curious about what the crafted request looks like in Postman, here it is:

Running this command in Postman causes it to crash every single time it's run (I tried it on different computers, operating systems & browsers). But I could see that the size of the response fed back from the server was over 20MB, which is actually preposterous for this type of data (JSON).

At 20+ MB, the size of the response is far in excess of what anyone would want their API endpoint to be returning for any request when the data is in this format (also not chunked at all).
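For anyone who wants to gauge a response's size without their client falling over the way Postman did, here is a minimal sketch that counts bytes from a stream one chunk at a time and stops at a cap. The function name `measure_stream` and the cap values are my own choices, and the example runs against an in-memory stand-in rather than the live endpoint.

```python
import io


def measure_stream(stream, cap_bytes, chunk_size=64 * 1024):
    """Count bytes read from a file-like object, stopping once cap_bytes is reached.

    Returns (bytes_counted, hit_cap). Only one chunk is held in memory at a time.
    """
    total = 0
    while total < cap_bytes:
        chunk = stream.read(min(chunk_size, cap_bytes - total))
        if not chunk:
            # End of stream before the cap was reached.
            return total, False
        total += len(chunk)
    return total, True


# Stand-in for a response body: 3 MiB of zero bytes.
fake_body = io.BytesIO(b"\0" * (3 * 1024 * 1024))
print(measure_stream(fake_body, cap_bytes=1024 * 1024))  # (1048576, True)
```

The same function should accept any object with a `read(n)` method, such as the response object returned by `urllib.request.urlopen`, so a single capped request is enough to confirm that a body exceeds a given size.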

Avalanche's Entire Network Could be Knocked Offline With Ease

None of the supposed mitigations that Avalanche has proposed will remedy this situation in the least.

I am aware that there is a section of the API documentation that notes that the API requests made on the main public API endpoint hosted by Avalanche are supposed to be load balanced.

Going back to the panel on Postman that shows the cookies embedded by the API endpoint reveals what mechanism Avalanche is using for their load balancing.

AWS ALB Cookies

These cookies are deployed to create "sticky sessions" for one's load balancer implementation provisioned with AWS.

These cookies essentially are attached to various users making requests on a load-balanced endpoint to ensure that the user interacts with the same internal endpoint among the load-balanced services.
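As a rough illustration of how one might spot these cookies in a response, the sketch below scans Set-Cookie header values for names beginning with `AWSALB`. That prefix is based on my understanding of AWS's load balancer cookie naming (`AWSALB`, `AWSALBCORS`) and is an assumption here, as the cookie names are not spelled out above. The header values are made up.

```python
def sticky_cookie_names(set_cookie_headers):
    """Return names of cookies that look like AWS ALB stickiness cookies."""
    names = []
    for header in set_cookie_headers:
        # The name=value pair is everything before the first ';'.
        pair = header.split(";", 1)[0]
        name = pair.split("=", 1)[0].strip()
        if name.startswith("AWSALB"):
            names.append(name)
    return names


# Made-up header values for illustration; real ones carry long opaque tokens.
sample = [
    "AWSALB=abc123; Expires=Thu, 16 Feb 2023 00:00:00 GMT; Path=/",
    "AWSALBCORS=abc123; Expires=Thu, 16 Feb 2023 00:00:00 GMT; Path=/; SameSite=None; Secure",
    "session=xyz; Path=/",
]
print(sticky_cookie_names(sample))  # ['AWSALB', 'AWSALBCORS']
```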

How This Can Be Trivially Bypassed

If we were to append 8KB+ of data to the end of our request, then the servers will no longer be able to tag us as an entity, effectively eroding whatever protections were granted by the WAF (web application firewall) & its security assurances on the API endpoint as well.

Other Factors Making This Vulnerability a Done Deal

Since the size of the request is so small on our side (as the sender), it would be virtually impossible for Avalanche to scan the traffic headed to its WAF setup in order to distinguish between "legitimate" and non-legit activity.

This is especially true when considering the fact that this payload is generated by virtue of submitting a legitimate request to the server. We not only followed the API's documented methods & parameters, but we also explored this attack vector by using the 'example' request provided on their site for this method (seriously).

Hopefully Avalanche Pays the Diligent Researcher That Uncovered These Issues

He did his job and found a legitimate denial of service vulnerability.

It would be trivially easy to exhaust the main public API endpoint for Avalanche with the information we've gathered here today.

If you've read up to this point and you're at somewhat of a loss because you were expecting to see a 'walkthrough' or guide on how to do so - sorry. You won't find one here and that's only due to a personal code of ethics & just general decorum.

For those that are familiar with the topics discussed in this vulnerability report, I'd imagine that the attack vector became glaringly obvious for you the second you saw the YouTube video referenced earlier in this piece.

Hopefully publishing this report prompts the Avalanche team to take the safety and security of their $5.6 billion platform a little more seriously.

Librechain