Gensyn: RL Swarm Client Interface


A peer-to-peer system for collaborative reinforcement learning over the internet, running on consumer hardware.


cumulative reward

Models in the swarm receive rewards based on the following criteria:

  • Formatted → does the model generate output matching the specified format?
  • Correct → is the final answer mathematically correct and properly formatted?
  • Insightful → in stages requiring reference to best messages from prior rounds, does the model reference those messages, and do they meet the reward criteria for that round?
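
As a rough illustration, the three criteria above could be scored along the following lines. This is only a sketch under stated assumptions, not the swarm's actual reward code: the `<think>`/`<answer>` output format, the `[msg:ID]` citation syntax, the equal 1.0 weights, and all function names are hypothetical. It also covers only the first half of the "Insightful" check (whether a best message is referenced), not whether that message meets the round's criteria.

```python
import re
from typing import Iterable

# Hypothetical output format: reasoning in <think> tags, final answer in <answer> tags.
FORMAT_RE = re.compile(r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL)
# Hypothetical citation syntax for referencing a prior-round message, e.g. "[msg:a1]".
CITATION_RE = re.compile(r"\[msg:(\w+)\]")


def formatted_reward(output: str) -> float:
    """1.0 if the output matches the assumed format, else 0.0."""
    return 1.0 if FORMAT_RE.match(output.strip()) else 0.0


def correct_reward(output: str, expected: str) -> float:
    """1.0 if the output is well formed and its answer equals the expected answer."""
    match = FORMAT_RE.match(output.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == expected.strip() else 0.0


def insightful_reward(output: str, best_message_ids: Iterable[str]) -> float:
    """1.0 if the output cites at least one of the best prior-round messages."""
    cited = set(CITATION_RE.findall(output))
    return 1.0 if cited & set(best_message_ids) else 0.0


def total_reward(output: str, expected: str, best_message_ids: Iterable[str]) -> float:
    """Sum of the three criteria, with equal weights chosen purely for illustration."""
    return (
        formatted_reward(output)
        + correct_reward(output, expected)
        + insightful_reward(output, best_message_ids)
    )
```

For example, under these assumptions `total_reward("<think>2+2 [msg:a1]</think><answer>4</answer>", "4", ["a1"])` satisfies all three checks, while an output without the tags scores zero on both format and correctness.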

* * *

This graph displays the cumulative reward for each node from the moment the page is loaded, not the full history from the start of a round.
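
The idea of accumulating only from page load can be sketched as a per-node running sum over reward events. The event shape `(timestamp, node_id, reward)` and the names `cumulative_since` and `loaded_at` are assumptions made for illustration; the dashboard's real data feed is not described here.

```python
from collections import defaultdict
from itertools import accumulate
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (timestamp, node_id, reward), given in arrival order.
Event = Tuple[float, str, float]


def cumulative_since(events: Iterable[Event], loaded_at: float) -> Dict[str, List[float]]:
    """Return each node's running reward total, ignoring events before loaded_at."""
    per_node: Dict[str, List[float]] = defaultdict(list)
    for timestamp, node_id, reward in events:
        if timestamp >= loaded_at:  # rewards earned before page load are not counted
            per_node[node_id].append(reward)
    # accumulate() turns each node's list of rewards into a running sum
    return {node: list(accumulate(rewards)) for node, rewards in per_node.items()}


events = [(1, "a", 1.0), (5, "a", 2.0), (6, "b", 0.5), (7, "a", 1.0)]
series = cumulative_since(events, loaded_at=5)
# Node "a" starts from 2.0 because its reward at timestamp 1 predates the page load.
```

One consequence of this design is that two viewers who open the page at different times will see different curves for the same node.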

leaderboard : Round 0, stage 0
    gossip