What makes a relay trustworthy?

guayabyte · August 23, 2022, 7:44pm

This is a brainstorming thread to collect more specific details about what it means for a relay to be trustworthy. Please reply with your requirements. This topic will be updated to reflect the main findings.

API

Operation

Operated by a reputable organization well-known in the Ethereum community.
Open source code.
Transparent deployment process.
Data available to verify source code behavior.
Clear statement of any filtering or censoring.
Waits for 16 epochs after the merge before proposing its first blocks.
Treats all builders equally.
Takes down the server when a high-severity bug is found, so validators switch to local building and don’t miss slots.
Publishes a post-mortem after every incident.
In case of missing blocks, does not retroactively pay the affected validators.

Performance

Always replies to requests from the current slot proposer.
Replies to a header request in less than 1 second.
Replies to a block request in less than 1 second.
Always serves valid blocks.

Testing

More than 90% unit test code coverage.
Test relay running in Goerli.
Passes a shadow fork test.

Security

Supports the relay monitor.
A published independent security audit.
A bug bounty.
Aware of the risks of vertical integration, and willing to step down if their role becomes risky.

With the current design of proposer/builder separation that will go live at the merge, the relay is still a trusted mediator between proposers and builders.

In particular:

it can steal MEV opportunities from builders
it can lie about the amount to be paid to the proposer
it can withhold the block body after it has been signed by the proposer
it can deliver an invalid block to the proposer
it can filter out any transactions it dislikes

So a relay has to promise to make its best effort to avoid those situations; in other words, it has to promise to be trustworthy.

For builders, see How do I choose a good block builder?

ralexstokes · August 23, 2022, 8:05pm

Reply with a valid block in less than 2 seconds.

this could be broken down into the first round (getting bids) and the second round (getting the full blocks)

just want to point out there is a 1 second timeout at the moment for mev-boost to provide any available bids before clients will fall back to their local builder
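To illustrate the timeout behaviour described above, here is a minimal Python sketch, not mev-boost's actual code. It queries several relays in parallel and keeps only the bids that arrive before the cutoff. The function name `best_bid_or_none` and the idea of passing each relay as a callable that returns a bid value are assumptions made for this example; a return value of `None` stands for "fall back to the local builder".

```python
import concurrent.futures as cf
from typing import Callable, List, Optional

BID_TIMEOUT_SECONDS = 1.0  # the 1-second cutoff mentioned in the post above


def best_bid_or_none(
    fetchers: List[Callable[[], Optional[int]]],
    timeout: float = BID_TIMEOUT_SECONDS,
) -> Optional[int]:
    """Return the highest bid received before the timeout, or None.

    Each fetcher is a hypothetical stand-in for "ask one relay for its bid"
    and returns the bid value, or None if the relay has no bid.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(fetchers)))
    futures = [pool.submit(fetch) for fetch in fetchers]
    # Wait until every relay has answered or the timeout expires.
    done, _not_done = cf.wait(futures, timeout=timeout)
    # Do not block on relays that are still running past the deadline.
    pool.shutdown(wait=False)

    bids = []
    for fut in done:
        # A relay that raised an error is treated as having no bid.
        if fut.exception() is None and fut.result() is not None:
            bids.append(fut.result())
    return max(bids) if bids else None
```

A slow relay is simply ignored here rather than cancelled, which keeps the sketch short; a real client would also want to abort the outstanding requests.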

guayabyte · September 6, 2022, 2:16pm

I’ve updated the scoring of the proposer payment to block-level scoring, as discussed in Block scoring for mev-boost relays.

This now seems the easier, cheaper, and more forward-looking option. I’m aware this brings challenges to Lido and maybe some other node operators, so we have to continue building on top of it to come up with a solution that is satisfactory for everybody.

cc @TheDZhon.

tjd.im · October 31, 2022, 11:00pm

@guayabyte Could you elaborate here on what a transparent deployment process would look like? Open sourcing terraform code?

guayabyte · November 1, 2022, 4:11pm

That would be perfect.

At the very least, it has to be documented. From there, the fancier the better. I imagine a system where creating a tag publishes the source and the binaries, which triggers integration tests in staging, then a canary deployment in production, and then the full release. Everything is automated, with manual gates to move to the next stage.

This is, of course, hard. We have been improving our infrastructure, we are hiring a second devops engineer, and plan to share more about the way we run our relay. We have been supporting other teams that run relays, and I would be very happy to collaborate with them on improving our release processes.
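The staged release described above (publish, staging tests, canary, full release, with manual gates in between) could be modelled roughly as follows. This is only an illustrative sketch: the stage names, `run_release`, and the `run_stage` and `approve` callbacks are all hypothetical and do not describe any relay's actual pipeline.

```python
from typing import Callable, List

# Hypothetical stage names, taken from the order described in the post.
STAGES = [
    "publish_artifacts",
    "staging_integration_tests",
    "production_canary",
    "full_release",
]


def run_release(
    tag: str,
    run_stage: Callable[[str, str], bool],
    approve: Callable[[str, str], bool],
) -> List[str]:
    """Run the stages in order and return the ones that completed.

    run_stage(tag, stage) performs the automated work and returns True on success.
    approve(tag, stage) is the manual gate asked before every stage after the first.
    """
    completed: List[str] = []
    for index, stage in enumerate(STAGES):
        # Creating the tag starts the first stage; later stages need sign-off.
        if index > 0 and not approve(tag, stage):
            break
        if not run_stage(tag, stage):
            break
        completed.append(stage)
    return completed
```

Returning the list of completed stages makes it easy to publish a record of how far each release got, which fits the transparency goal.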

tjd.im · November 4, 2022, 9:33pm

Amazing. Looking forward to seeing how Flashbots illuminates their deploy/release processes.

tjd.im · November 4, 2022, 9:43pm

When it comes to performance, I think we should add uptime here. However, I do think it is important that the /eth/v1/builder/status endpoint be dynamic and return the last block number/header or a timestamp. Otherwise there’s nothing preventing a relay from setting its /eth/v1/builder/status endpoint to a statically served page to create artificial uptime. Curious to hear thoughts on this.
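As a rough sketch of why a dynamic value helps, the Python below checks whether the head reported by a relay keeps advancing between probes. It assumes a status response that carries a block number or slot, which is the proposal in this post and not something the endpoint is stated to return today; `StatusSample`, `reported_head`, `looks_live`, and the `max_lag_slots` tolerance are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List

SECONDS_PER_SLOT = 12  # Ethereum slot duration


@dataclass
class StatusSample:
    observed_at: float  # unix time at which the probe was made
    reported_head: int  # hypothetical field: block number or slot in the response


def looks_live(samples: List[StatusSample], max_lag_slots: int = 2) -> bool:
    """True if the reported head advanced roughly in step with wall-clock time.

    A statically served page always reports the same head, so it fails this
    check even though every probe gets a successful response.
    """
    if len(samples) < 2:
        return False  # a single probe cannot show progress
    first, last = samples[0], samples[-1]
    elapsed_slots = int((last.observed_at - first.observed_at) // SECONDS_PER_SLOT)
    advanced = last.reported_head - first.reported_head
    return advanced > 0 and advanced >= elapsed_slots - max_lag_slots
```

The tolerance exists because missed slots and probe jitter mean the head will not advance by exactly one per 12 seconds.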

metachris · November 5, 2022, 11:34am

Agree, returning the highest slot would make sense (a timestamp is too easy to game with a simple nginx Lua script). To push this forward, the next step would be creating an issue in the ethereum/builder-specs repository on GitHub (the specification for external block builders), which is where this payload would be defined.

On the other hand, why not just look at the delay on the getHeader call? You can do that on every slot.

tjd.im · November 7, 2022, 7:33pm

Yeah that’s a good idea. Would you look at the latest slot or call some past slot though? And what would the delay threshold be?

If we looked at the latest slot we’d need to update these values every new slot: /eth/v1/builder/header/{slot}/{parent_hash}/{pubkey}
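For reference, filling in that path each slot is a small amount of work. The helper below is a sketch, with `get_header_path` being a hypothetical name; the length checks assume a 32-byte parent hash and a 48-byte BLS public key, both hex-encoded with a `0x` prefix.

```python
import re

_HASH_RE = re.compile(r"^0x[0-9a-fA-F]{64}$")    # 32-byte block hash
_PUBKEY_RE = re.compile(r"^0x[0-9a-fA-F]{96}$")  # 48-byte BLS public key


def get_header_path(slot: int, parent_hash: str, pubkey: str) -> str:
    """Build the getHeader path for one slot, rejecting malformed arguments."""
    if slot < 0:
        raise ValueError("slot must be non-negative")
    if not _HASH_RE.match(parent_hash):
        raise ValueError("parent_hash must be 0x followed by 64 hex characters")
    if not _PUBKEY_RE.match(pubkey):
        raise ValueError("pubkey must be 0x followed by 96 hex characters")
    return f"/eth/v1/builder/header/{slot}/{parent_hash}/{pubkey}"
```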

metachris · November 8, 2022, 8:15am

Only the latest slot is reliable; many relays don’t provide bids for past slots. You ask for the bid 12 seconds after the head-slot event.

You can get all the required URL args from the beacon node. See collector.go at commit 587a4e3ccefacc89a688155b6da6031c847a8971 in the ralexstokes/relay-monitor repository on GitHub for details.
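The delay measurement discussed in this thread could look something like the Python sketch below, which is not taken from relay-monitor. The helper names (`http_status`, `timed`, `within_budget`) are hypothetical, and the 1-second budget is an assumption borrowed from the performance list at the top of the thread, since no threshold was agreed here.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple

DELAY_BUDGET_SECONDS = 1.0  # assumed threshold, see the note above


def http_status(url: str, timeout: float = DELAY_BUDGET_SECONDS) -> Optional[int]:
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the relay answered, but with an error status
    except OSError:
        return None  # timeout, DNS failure, refused connection, ...


def timed(fetch: Callable[[], Optional[int]]) -> Tuple[Optional[int], float]:
    """Run fetch() and return its result with the elapsed wall-clock seconds."""
    start = time.monotonic()
    result = fetch()
    return result, time.monotonic() - start


def within_budget(elapsed: float, budget: float = DELAY_BUDGET_SECONDS) -> bool:
    """True if the call finished inside the delay budget."""
    return elapsed <= budget
```

A probe for one slot would then be `timed(lambda: http_status(relay_url + path))`, where `relay_url` and `path` are placeholders for the relay's base URL and the getHeader path built from the beacon node's data.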