Today, we’re excited to launch SYNTHETIC-2, our next-generation, open-source reasoning dataset and planetary-scale, pipeline-parallel decentralized inference run.
Built on our peer-to-peer inference stack and powered by the new DeepSeek-R1-0528 model, SYNTHETIC-2 generates verified reasoning traces spanning the most comprehensive set of complex reinforcement-learning tasks and verifiers released to date.
The run supports heterogeneous compute—anything from a single consumer GPU to a hyperscale NVIDIA or AMD cluster can contribute meaningful work towards frontier-level AGI research.
With TOPLOC v2 verifiable proofs, we ensure honest computation. There is no waitlist, no permission required, and no cap on how much compute you can bring. Just spin up your GPUs and start helping us advance towards open-source superintelligence.
A few weeks ago we previewed our peer-to-peer decentralized inference stack. Today that stack moves into production, fully integrated with pipeline-parallel inference, TOPLOC v2 verification, and our updated protocol infrastructure.
Frontier models such as DeepSeek-R1, with hundreds of billions of parameters, do not fit into the memory of a single GPU. With pipeline parallelism, instead of keeping the entire model on every GPU, we divide it into sequential stages. Each device—whether an H100 in a data center or a consumer RTX 4090—stores only its stage, processes its slice of the forward pass, and streams the activations to the next peer. This enables us to run large models on consumer devices.
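The idea can be sketched in a few lines of plain Python, with simple callables standing in for transformer layers (the actual peer-to-peer streaming stack, peer discovery, and networking are not shown here):

```python
# Minimal sketch of pipeline-parallel inference. Each "peer" holds only a
# contiguous slice of the model's layers and passes activations onward.

def split_into_stages(layers, num_stages):
    """Divide a flat list of layers into contiguous pipeline stages."""
    per_stage = -(-len(layers) // num_stages)  # ceiling division
    return [layers[i:i + per_stage] for i in range(0, len(layers), per_stage)]

def run_stage(stage, activation):
    """One peer applies only the layers of its own stage."""
    for layer in stage:
        activation = layer(activation)
    return activation

def pipeline_forward(stages, activation):
    """Activations stream stage to stage; in the real system each hop
    crosses the network to a different peer."""
    for stage in stages:
        activation = run_stage(stage, activation)
    return activation

# Toy example: 8 "layers" (here, simple arithmetic) split across 4 peers.
layers = [lambda x, k=k: x + k for k in range(8)]
stages = split_into_stages(layers, num_stages=4)
print(pipeline_forward(stages, 0))  # 0 + 1 + ... + 7 = 28
```

No single device ever materializes the full model; memory per peer scales with the size of its stage rather than with the whole network.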
Our protocol infrastructure dynamically assigns heterogeneous GPUs in the network to groups that work together, in a pipeline-parallel fashion, on the highest-throughput tasks.
To trust results from thousands of permissionless nodes, we must be able to verify their generations cheaply. Our TOPLOC verifiable-inference work employs a compact locality-sensitive hashing scheme over intermediate activations that can detect unauthorized modifications to models, prompts, or precision.
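To make the flavor of this concrete, here is a toy illustration of an activation commitment—not the actual TOPLOC construction, just the underlying idea: the prover commits to the indices and rounded values of its largest activation entries, and a verifier that recomputes the forward pass accepts only if its own activations agree at those positions within a tolerance. Benign numerical noise passes; a different model, prompt, or tampered computation does not.

```python
# Toy activation-commitment sketch (illustrative only; the real TOPLOC
# scheme is more sophisticated and far more compact).

def commit(activations, k=4, decimals=2):
    """Prover side: top-k (index, rounded value) pairs form the commitment."""
    top = sorted(range(len(activations)), key=lambda i: -abs(activations[i]))[:k]
    return [(i, round(activations[i], decimals)) for i in sorted(top)]

def verify(activations, commitment, tol=0.05):
    """Verifier side: recomputed activations must match at committed indices."""
    return all(abs(activations[i] - v) <= tol for i, v in commitment)

honest = [0.11, -1.52, 0.03, 2.41, -0.77, 0.64]
noisy  = [a + 1e-3 for a in honest]               # e.g. different GPU/kernels
forged = [0.11, -1.52, 0.03, 0.41, -0.77, 0.64]   # tampered computation

c = commit(honest)
print(verify(noisy, c), verify(forged, c))  # True False
```

The tolerance is what makes the check locality-sensitive: nearby activations (from honest hardware variation) hash to an accepting result, while distant ones are rejected.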
TOPLOC v2 extends this scheme to the pipeline-parallel inference setting.
Additionally, our original TOPLOC approach was able to verify that the computation up to the last hidden state was performed correctly, but it could not yet detect changes in sampling behavior, such as speculative decoding or injecting arbitrary token sequences during the forward pass.
TOPLOC v2 introduces a novel approach that addresses this last problem, enabling fully verifiable inference. It uses reproducible Gumbel noise for categorical sampling, allowing verifiers to estimate the original token sampling in parallel, significantly faster than the original inference, with a quantifiable margin of error. The approach is robust across diverse model-parallel configurations, GPU types, and kernel implementations, providing flexibility in hardware selection and deployment for both inference and verification.
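The key primitive here is the Gumbel-max trick: sampling `argmax(logits + g)` with `g ~ Gumbel(0,1)` is equivalent to sampling from `softmax(logits)`. A hedged sketch of why this makes sampling reproducible (the names and interface below are illustrative assumptions, not the TOPLOC v2 specification): if prover and verifier derive the same Gumbel noise from a shared seed, the verifier can recompute the argmax from its own logits—computed with teacher forcing, in parallel over the whole sequence—and check that it reproduces the sampled tokens, even when its logits differ slightly due to different hardware or kernels.

```python
# Reproducible Gumbel-max sampling sketch (illustrative, not the actual
# TOPLOC v2 implementation).
import math
import random

def gumbel_noise(n, seed):
    """Deterministic Gumbel(0,1) samples: g = -log(-log(u)), u ~ U(0,1)."""
    rng = random.Random(seed)
    return [-math.log(-math.log(rng.random())) for _ in range(n)]

def sample_token(logits, seed):
    """Gumbel-max trick: argmax over noise-perturbed logits is a
    categorical sample from softmax(logits)."""
    g = gumbel_noise(len(logits), seed)
    return max(range(len(logits)), key=lambda i: logits[i] + g[i])

prover_logits   = [1.0, 3.0, 0.5, 2.0]
verifier_logits = [1.001, 2.999, 0.5, 2.001]  # tiny numerical drift

token = sample_token(prover_logits, seed=42)
print(token == sample_token(verifier_logits, seed=42))
```

Because the argmax only flips when numerical drift exceeds the gap between the top two perturbed logits, the verifier's margin of error is quantifiable rather than all-or-nothing.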
The full arXiv paper on TOPLOC v2's sampling-proof approach will be released in the coming weeks.
We’ve updated our protocol codebase with the following feature and scalability improvements over the past couple of months:
SYNTHETIC-2 consists of a large set of verifiable reasoning tasks as well as reasoning traces obtained from multiple models. This design serves two purposes:
Beyond traditional mathematics and coding problems, SYNTHETIC-2 covers a highly diverse set of tasks designed to teach models reasoning skills that generalize more broadly. By aggregating data from publicly available datasets, drawing on existing research repositories, and designing several reasoning tasks of our own that can be generated programmatically, we collect more than 20 difficult reasoning tasks and implement verifiers for them inside our framework, prime-rl. These tasks range from games such as puzzles from reasoning-gym, through kernel engineering, to precise JSON-format adherence.
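As a flavor of what a rule-based verifier looks like, here is a hedged sketch for a JSON-format-adherence task (the function name and reward interface are illustrative assumptions, not the actual prime-rl API): the model's answer is scored entirely programmatically by parsing it and checking it against a required schema.

```python
# Illustrative rule-based verifier for a JSON-format-adherence task.
import json

def verify_json_response(response: str, required_keys: set) -> float:
    """Reward 1.0 iff the response is a valid JSON object containing
    exactly the required top-level keys, else 0.0."""
    try:
        obj = json.loads(response)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(obj, dict):
        return 0.0
    return 1.0 if set(obj.keys()) == required_keys else 0.0

keys = {"name", "score"}
print(verify_json_response('{"name": "a", "score": 3}', keys))  # 1.0
print(verify_json_response('{"name": "a"}', keys))              # 0.0
print(verify_json_response('not json', keys))                   # 0.0
```

Because the reward is computed by code rather than a judge model, such tasks can be verified cheaply and unambiguously at scale.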
Apart from verifiable tasks, we collect tasks whose responses are not meant to be verifiable in a rule-based manner, but only through reward models. These prompts are specifically meant to generate diverse SFT data to avoid training on a task distribution that is too narrow. These non-verifiable tasks include problems such as critique fine-tuning or questions from public forums such as Reddit or Stack Exchange.
We generate reasoning traces for all of our tasks using DeepSeek-R1-0528. Additionally, for all tasks that are verifiable programmatically, we generate reasoning data from the following models to annotate our dataset for difficulty:
Our full SYNTHETIC-2 tasks dataset is available on HuggingFace.
SYNTHETIC-2 supports two methods of contributing compute:
You can contribute using either method by clicking “Contribute Compute” on the SYNTHETIC-2 dashboard.
Once your node joins the network, it will automatically start working in a group with other nodes on the highest-throughput model, and it will also show up on the map and the leaderboard, where your contributions will be tracked.
Building on the launch of SYNTHETIC-2, our next step is to leverage the SYNTHETIC-2 tasks dataset as the foundation for our next distributed RL run. INTELLECT-2 has already shown that globally distributed reinforcement learning works—now it’s time to demonstrate its promise as a novel scaling paradigm, unlocking even more compute and achieving state-of-the-art model performance. Since the INTELLECT-2 release, we’ve made significant improvements to the stability of asynchronous RL at large scale and are confident these improvements will lead to state-of-the-art reasoning models trained in a decentralized fashion.
To expand the diversity of our RL environment ecosystem, we will integrate the verifiers repository as our core library for crowdsourcing complex RL environments from the open-source community. More details on this soon!
Our goal is to introduce additional multi-turn and tool-use environments—especially for coding and autonomous research tasks—to unlock SOTA coding-agent capabilities with our INTELLECT-3 model.