Today, we’re excited to launch SYNTHETIC-2, our next-generation, open-source reasoning dataset and planetary-scale, pipeline-parallel decentralized inference run.
Built on our peer-to-peer inference stack and powered by the new DeepSeek-R1-0528 model, SYNTHETIC-2 generates verified reasoning traces spanning the most comprehensive set of complex reinforcement-learning tasks and verifiers released to date.
The run supports heterogeneous compute—anything from a single consumer GPU to a hyperscale NVIDIA or AMD cluster can contribute meaningful work towards frontier-level AGI research.
With TOPLOC v2 verifiable proofs, we ensure honest computation. There is no waitlist, no permission required, and no cap on how much compute you can bring. Just spin up your GPUs and start helping us advance towards open-source superintelligence.
A few weeks ago we previewed our peer-to-peer decentralized inference stack. Today that stack moves into production, fully integrated with pipeline-parallel inference, TOPLOC v2 verification, and our updated protocol infrastructure.
Frontier models such as DeepSeek-R1, with hundreds of billions of parameters, do not fit into the memory of a single GPU. With pipeline parallelism, instead of keeping the entire model on every GPU, we divide it into sequential stages. Each device—whether an H100 in a data center or a consumer RTX 4090—stores only its stage, processes its slice of the forward pass, and streams the activations to the next peer. This enables us to run large models on consumer devices.
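The idea can be sketched in a few lines of plain Python, with simple callables standing in for transformer layers (the actual peer-to-peer streaming stack, peer discovery, and networking are not shown here):

```python
# Minimal sketch of pipeline-parallel inference. Each "peer" holds only a
# contiguous slice of the model's layers and passes activations onward.

def split_into_stages(layers, num_stages):
    """Divide a flat list of layers into contiguous pipeline stages."""
    per_stage = -(-len(layers) // num_stages)  # ceiling division
    return [layers[i:i + per_stage] for i in range(0, len(layers), per_stage)]

def run_stage(stage, activation):
    """One peer applies only the layers of its own stage."""
    for layer in stage:
        activation = layer(activation)
    return activation

def pipeline_forward(stages, activation):
    """Activations stream stage to stage; in the real system each hop
    crosses the network to a different peer."""
    for stage in stages:
        activation = run_stage(stage, activation)
    return activation

# Toy example: 8 "layers" (here, simple arithmetic) split across 4 peers.
layers = [lambda x, k=k: x + k for k in range(8)]
stages = split_into_stages(layers, num_stages=4)
print(pipeline_forward(stages, 0))  # 0 + 1 + ... + 7 = 28
```

No single device ever materializes the full model; memory per peer scales with the size of its stage rather than with the whole network.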
Our protocol infrastructure dynamically assigns heterogeneous GPUs in the network to groups that work together, in a pipeline-parallel fashion, on the highest-throughput tasks.
To trust results from thousands of permissionless nodes, we must be able to verify their generations cheaply. Our TOPLOC verifiable-inference work employs a compact locality-sensitive hashing scheme over intermediate activations that can detect unauthorized modifications to models, prompts, or precision.
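To make the flavor of this concrete, here is a toy illustration of an activation commitment—not the actual TOPLOC construction, just the underlying idea: the prover commits to the indices and rounded values of its largest activation entries, and a verifier that recomputes the forward pass accepts only if its own activations agree at those positions within a tolerance. Benign numerical noise passes; a different model, prompt, or tampered computation does not.

```python
# Toy activation-commitment sketch (illustrative only; the real TOPLOC
# scheme is more sophisticated and far more compact).

def commit(activations, k=4, decimals=2):
    """Prover side: top-k (index, rounded value) pairs form the commitment."""
    top = sorted(range(len(activations)), key=lambda i: -abs(activations[i]))[:k]
    return [(i, round(activations[i], decimals)) for i in sorted(top)]

def verify(activations, commitment, tol=0.05):
    """Verifier side: recomputed activations must match at committed indices."""
    return all(abs(activations[i] - v) <= tol for i, v in commitment)

honest = [0.11, -1.52, 0.03, 2.41, -0.77, 0.64]
noisy  = [a + 1e-3 for a in honest]               # e.g. different GPU/kernels
forged = [0.11, -1.52, 0.03, 0.41, -0.77, 0.64]   # tampered computation

c = commit(honest)
print(verify(noisy, c), verify(forged, c))  # True False
```

The tolerance is what makes the check locality-sensitive: nearby activations (from honest hardware variation) hash to an accepting result, while distant ones are rejected.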
TOPLOC v2 extends this scheme to the pipeline-parallel inference setting.
Additionally, our original TOPLOC approach was able to verify that the computation up to the last hidden state was performed correctly, but it could not yet detect changes in sampling behavior, such as speculative decoding or injecting arbitrary token sequences during the forward pass.
TOPLOC v2 introduces a novel approach that addresses this last problem, enabling fully verifiable inference. It uses reproducible Gumbel noise for categorical sampling, allowing verifiers to estimate the original token sampling in parallel, significantly faster than the original inference, with a quantifiable margin of error. The approach is robust across diverse model-parallel configurations, GPU types, and kernel implementations, providing flexibility in hardware selection and deployment for both inference and verification.
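The key primitive here is the Gumbel-max trick: sampling `argmax(logits + g)` with `g ~ Gumbel(0,1)` is equivalent to sampling from `softmax(logits)`. A hedged sketch of why this makes sampling reproducible (the names and interface below are illustrative assumptions, not the TOPLOC v2 specification): if prover and verifier derive the same Gumbel noise from a shared seed, the verifier can recompute the argmax from its own logits—computed with teacher forcing, in parallel over the whole sequence—and check that it reproduces the sampled tokens, even when its logits differ slightly due to different hardware or kernels.

```python
# Reproducible Gumbel-max sampling sketch (illustrative, not the actual
# TOPLOC v2 implementation).
import math
import random

def gumbel_noise(n, seed):
    """Deterministic Gumbel(0,1) samples: g = -log(-log(u)), u ~ U(0,1)."""
    rng = random.Random(seed)
    return [-math.log(-math.log(rng.random())) for _ in range(n)]

def sample_token(logits, seed):
    """Gumbel-max trick: argmax over noise-perturbed logits is a
    categorical sample from softmax(logits)."""
    g = gumbel_noise(len(logits), seed)
    return max(range(len(logits)), key=lambda i: logits[i] + g[i])

prover_logits   = [1.0, 3.0, 0.5, 2.0]
verifier_logits = [1.001, 2.999, 0.5, 2.001]  # tiny numerical drift

token = sample_token(prover_logits, seed=42)
print(token == sample_token(verifier_logits, seed=42))
```

Because the argmax only flips when numerical drift exceeds the gap between the top two perturbed logits, the verifier's margin of error is quantifiable rather than all-or-nothing.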
The full arXiv paper on TOPLOC v2's sampling-proof approach will be released in the coming weeks.
We’ve updated our protocol codebase with the following feature and scalability improvements over the past couple of months:
SYNTHETIC-2 consists of a large set of verifiable reasoning tasks as well as reasoning traces obtained from multiple models. This design serves two purposes:
Beyond traditional mathematics and coding problems, SYNTHETIC-2 covers a highly diverse set of tasks designed to teach models reasoning skills that generalize more broadly. By aggregating data from publicly available datasets, drawing on existing research repositories, and designing several reasoning tasks of our own that can be generated programmatically, we collect more than 20 difficult reasoning tasks and implement verifiers for them inside our framework, prime-rl. These tasks range from games such as puzzles from reasoning-gym, through kernel engineering, to precise JSON-format adherence.
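As a flavor of what a rule-based verifier looks like, here is a hedged sketch for a JSON-format-adherence task (the function name and reward interface are illustrative assumptions, not the actual prime-rl API): the model's answer is scored entirely programmatically by parsing it and checking it against a required schema.

```python
# Illustrative rule-based verifier for a JSON-format-adherence task.
import json

def verify_json_response(response: str, required_keys: set) -> float:
    """Reward 1.0 iff the response is a valid JSON object containing
    exactly the required top-level keys, else 0.0."""
    try:
        obj = json.loads(response)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(obj, dict):
        return 0.0
    return 1.0 if set(obj.keys()) == required_keys else 0.0

keys = {"name", "score"}
print(verify_json_response('{"name": "a", "score": 3}', keys))  # 1.0
print(verify_json_response('{"name": "a"}', keys))              # 0.0
print(verify_json_response('not json', keys))                   # 0.0
```

Because the reward is computed by code rather than a judge model, such tasks can be verified cheaply and unambiguously at scale.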
Apart from verifiable tasks, we collect tasks whose responses are not meant to be verifiable in a rule-based manner, but only through reward models. These prompts are specifically meant to generate diverse SFT data to avoid training on a task distribution that is too narrow. These non-verifiable tasks include problems such as critique fine-tuning or questions from public forums such as Reddit or Stack Exchange.
We generate reasoning traces for all of our tasks using DeepSeek-R1-0528. Additionally, for all tasks that are verifiable programmatically, we generate reasoning data from the following models to annotate our dataset for difficulty:
Our full SYNTHETIC-2 tasks dataset is available on HuggingFace.
SYNTHETIC-2 supports two methods of contributing compute:
You can contribute using either method by clicking “Contribute Compute” on the SYNTHETIC-2 dashboard.
Once your node joins the network, it will automatically start working in a group with other nodes on the highest-throughput model, and it will also show up on the map and the leaderboard, where your contributions will be tracked.
Building on the launch of SYNTHETIC-2, our next step is to leverage the SYNTHETIC-2 tasks dataset as the foundation for our next distributed RL run. INTELLECT-2 has already shown that globally distributed reinforcement learning works—now it’s time to demonstrate its promise as a novel scaling paradigm, unlocking even more compute and achieving state-of-the-art model performance. Since the INTELLECT-2 release, we’ve made significant improvements to the stability of asynchronous RL at large scale and are confident these improvements will lead to state-of-the-art reasoning models trained in a decentralized fashion.
To expand the diversity of our RL environment ecosystem, we will integrate the verifiers repository as our core library for crowdsourcing complex RL environments from the open-source community. More details on this soon!
Our goal is to introduce additional multi-turn and tool-use environments—especially for coding and autonomous research tasks—to unlock SOTA coding-agent capabilities with our INTELLECT-3 model.