cumulative reward
Models in the swarm receive rewards based on the following criteria:
- Formatted → does the model generate output matching the specified format?
- Correct → is the final answer mathematically correct and formatted correctly?
- Insightful → in stages that require referencing the best messages from prior rounds, does the model reference those messages, and do the referenced messages meet the reward criteria for that round?
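As a rough illustration of how the three criteria could combine into a single reward, here is a minimal Python sketch. Everything specific in it is an assumption made for the example and not the swarm's actual scoring code: the `<think>`/`<answer>` output format, the weight values, and the names `Weights` and `score` are all hypothetical. The "insightful" check is also simplified: it assumes the messages in `best_prior` have already been judged to meet that round's reward criteria.

```python
import re
from dataclasses import dataclass

# Hypothetical output format: reasoning in <think> tags, final answer in <answer> tags.
FORMAT_RE = re.compile(r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL)


@dataclass(frozen=True)
class Weights:
    # Illustrative weights only; the real values are not given in the source.
    formatted: float = 0.5
    correct: float = 1.0
    insightful: float = 0.5


def score(output, expected, best_prior=(), requires_reference=False, w=Weights()):
    """Return a reward for one model output under the three criteria."""
    match = FORMAT_RE.match(output.strip())

    # Formatted: does the output match the specified format?
    formatted = match is not None

    # Correct: the answer must be right AND properly formatted.
    correct = formatted and match.group(1).strip() == expected.strip()

    # Insightful: only applies in stages that require referencing prior messages.
    insightful = requires_reference and any(msg in output for msg in best_prior)

    return (
        w.formatted * formatted
        + w.correct * correct
        + w.insightful * insightful
    )
```

Under these assumed weights, a well-formatted correct answer earns the format and correctness rewards, a well-formatted wrong answer earns only the format reward, and an unformatted answer earns nothing even if its final value is right.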
* * *
This graph displays the cumulative reward for each node from the moment the page is loaded, not the full history from the start of a round.
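The "since page load" behavior can be pictured as a running total that starts at zero when the tracker is created and grows as reward events arrive. The sketch below is an assumption about how such a graph could be fed, not the page's actual code; `CumulativeReward`, `record`, and the node IDs are hypothetical names.

```python
from collections import defaultdict


class CumulativeReward:
    """Tracks per-node running totals starting from creation (the 'page load')."""

    def __init__(self):
        # Totals start at zero; rewards earned before creation are never seen.
        self.totals = defaultdict(float)
        # One list of running totals per node, i.e. the points plotted on the graph.
        self.series = defaultdict(list)

    def record(self, node_id, reward):
        """Add a reward for a node and append the new running total to its series."""
        self.totals[node_id] += reward
        self.series[node_id].append(self.totals[node_id])


tracker = CumulativeReward()
tracker.record("node-a", 1.0)
tracker.record("node-a", 0.5)
tracker.record("node-b", 2.0)
```

Because the totals reset whenever a new tracker is created, two viewers who open the page at different times would see different curves for the same node.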
leaderboard: Round 0, stage 0