Cells ADR 001: Routing Technology using Cloudflare Workers
Context
In https://gitlab.com/groups/gitlab-org/-/epics/11002 we first brainstormed multiple options and then investigated our top two technologies, Cloudflare Workers and Istio.
We favored the Cloudflare Worker PoC and extended it with the Cell 1.0 proposal to support multiple routing rules. These PoCs helped validate the routing service blueprint, which was accepted in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/142397, and led us to reject the request buffering and routes learning approaches.
Decision
Use Cloudflare Workers written in JavaScript/TypeScript to route the request to the right cell, following the accepted routing service blueprint.
Cloudflare Workers meets all our requirements apart from self-managed support, which is a low-priority requirement.
For a detailed analysis of Cloudflare Workers, see https://gitlab.com/gitlab-org/gitlab/-/issues/433471#results
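To make "route the request to the right cell" more concrete, here is a minimal TypeScript sketch of rule-based cell selection. It is illustrative only and not taken from the PoC or the blueprint: the `RoutingRule` shape, the `IncomingRequest` type, the `selectCell` function, and the idea of matching on a path prefix or a header value are all assumptions made for this example.

```typescript
// Hypothetical rule shapes; the ADR itself does not define a rule format.
type RoutingRule =
  | { kind: "pathPrefix"; prefix: string; cell: string }
  | { kind: "header"; name: string; value: string; cell: string };

// Simplified view of a request: just the path and lower-cased header names.
interface IncomingRequest {
  path: string;
  headers: Record<string, string>;
}

// Return the cell of the first matching rule, or the default cell if none match.
function selectCell(
  req: IncomingRequest,
  rules: RoutingRule[],
  defaultCell: string
): string {
  for (const rule of rules) {
    if (rule.kind === "pathPrefix") {
      // Compare the leading part of the path with the configured prefix.
      if (req.path.slice(0, rule.prefix.length) === rule.prefix) {
        return rule.cell;
      }
    } else {
      // Header rule: exact match on the header value.
      if (req.headers[rule.name.toLowerCase()] === rule.value) {
        return rule.cell;
      }
    }
  }
  return defaultCell;
}
```

In a Worker, a function like this would presumably be called from the request handler, which would then forward the request to the selected cell. Keeping the selection logic a pure function makes it easy to test outside of Cloudflare.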
Consequences
- We will be choosing a technology stack knowing that it will not support all self-managed customers.
- We will increase our vendor lock-in with Cloudflare, but we are already heavily dependent on them.
- We will run compute on a new platform, outside of GCP; however, we already use Cloudflare.
- We anticipate that we might need to rewrite the Routing Service if the decision changes. We don't expect this to be a big risk, since we expect the Routing Service to be very small and simple (up to 1000 lines of code).
Alternatives
- We considered Istio but concluded that it's not the right fit.
- We considered Request Buffering but rejected it.
- We considered Routes Learning but rejected it.
- We considered using WASM for Cloudflare Workers but concluded that it is the wrong choice: https://blog.cloudflare.com/webassembly-on-cloudflare-workers#whentousewebassembly