Benchmark POC

Last updated 10 months ago

Benchmark POC: Benchmarking LLMs on AO

This proof of concept (POC) aims to benchmark large language models (LLMs) on AO. Funders can create a funding pool with a set of questions, and models compete to answer them. Participants can train and submit their models, which are evaluated and ranked on a daily-updated leaderboard. At the end of the funding period, winners are determined based on the leaderboard, and rewards are distributed.

Prerequisites

  1. Familiarity with AO, AOS, and ArDrive.

  2. AOS and ArDrive installed on your system.

Processes

  1. Pool Creation

  2. Model Upload

  3. Model Evaluation

  4. Scoring and Leaderboard

Detailed Process

1. Pool Creation

Upload Dataset

Upload your chosen benchmarking dataset (e.g., SIQA) to the Arweave blockchain using the ArDrive application or CLI.

Create pool

Create a pool by sending the following message through AOS:

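-- Fee must hold the pool creation fee before this is sent
-- (AO token transfers expect Quantity as a string).
ao.send({
   Target = 'xU9zFkq3X2ZQ6olwNVvr1vUWIjc3kXTWr7xKQD6dh10',
   Action = 'Transfer',
   Recipient = 'DLJoP8Xtdat6SKz3kqYGZPaa7DJBG6etF1jRLQCwquo', -- pool process
   Quantity = Fee,
   -- Replace the placeholder with your dataset process ID.
   ['X-Dataset'] = '<your dataset process id>',
   ['X-Allocation'] = 'ArithmeticDecrease'
})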

For details on creating a dataset process ID, refer to the tutorial on our GitHub.
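
The pool-creation message can also be wrapped in a small helper so that the fee and dataset ID are checked before anything is sent. This is a minimal sketch rather than part of the Apus tooling: the name `buildCreatePoolMessage` is hypothetical, and it assumes the fee is supplied as a string of digits, which is how AO token processes conventionally expect `Quantity`. The two process IDs are the ones used in the message above.

```lua
-- Process IDs copied from the pool-creation message in this guide.
local TRANSFER_TARGET = 'xU9zFkq3X2ZQ6olwNVvr1vUWIjc3kXTWr7xKQD6dh10'
local POOL_PROCESS = 'DLJoP8Xtdat6SKz3kqYGZPaa7DJBG6etF1jRLQCwquo'

-- Hypothetical helper: builds the pool-creation message without sending it.
-- Declared as a global so it stays available in later AOS evaluations.
function buildCreatePoolMessage(fee, datasetId)
  -- Assumption: the fee is a string of digits.
  assert(type(fee) == 'string' and fee:match('^%d+$'),
    'fee must be a string of digits')
  assert(type(datasetId) == 'string' and #datasetId > 0,
    'datasetId must be a non-empty string')
  return {
    Target = TRANSFER_TARGET,
    Action = 'Transfer',
    Recipient = POOL_PROCESS,
    Quantity = fee,
    ['X-Dataset'] = datasetId,
    ['X-Allocation'] = 'ArithmeticDecrease',
  }
end

-- Usage inside AOS, once Fee and the dataset process ID are known:
-- ao.send(buildCreatePoolMessage(Fee, '<your dataset process id>'))
```

Building the table separately from `ao.send` makes it possible to inspect the message in the AOS prompt before any tokens are transferred.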

2. Model Upload

Upload fine-tuned models

Upload two fine-tuned models (e.g., llama3-8B fine-tuned on the alpaca and samsum datasets) to the Arweave blockchain via the ArDrive application or CLI.

After uploading a model, you receive its data transaction ID, for example: ISrbGzQot05rs_HKC08O_SmkipYQnqgB1yC3mjZZeEo
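
Arweave transaction IDs are 43 characters long and use the base64url alphabet (letters, digits, `-`, and `_`), so a copied ID can be sanity-checked before it is used in a message. The sketch below is an illustration only: `looksLikeArweaveId` is a hypothetical helper, and it checks the format, not whether the transaction actually exists.

```lua
-- Hypothetical helper: format check for an Arweave transaction ID.
function looksLikeArweaveId(id)
  if type(id) ~= 'string' or #id ~= 43 then
    return false
  end
  -- %w matches letters and digits; _ and - complete the base64url alphabet.
  return id:match('^[%w_%-]+$') ~= nil
end

print(looksLikeArweaveId('ISrbGzQot05rs_HKC08O_SmkipYQnqgB1yC3mjZZeEo')) -- true
print(looksLikeArweaveId('ISrbGzQot05rs'))                               -- false
```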

3. Model Evaluation

Register models

Join a pool by sending a message to the pool process with the following payload:

-- Replace both placeholders before sending. The Data string must stay on one line.
ao.send({
   Target = 'DLJoP8Xtdat6SKz3kqYGZPaa7DJBG6etF1jRLQCwquo',
   Action = 'Join-Pool',
   Data = '{"dataset": <the pool id you want to join>, "model": <your model id>}'
})

Once you join a pool, your model is evaluated against the pool's dataset and its score is sent back to the pool process.
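
Writing the JSON payload by hand is easy to get wrong, so it can help to build the `Data` string from variables. The following is a minimal sketch, assuming both IDs are sent as JSON strings and contain no characters that need escaping (which holds for Arweave-style IDs). `buildJoinPoolMessage` is a hypothetical name, not part of the pool's API. Inside AOS, the `json` module could be used to encode the payload instead of `string.format`.

```lua
-- Pool process ID copied from the Join-Pool message in this guide.
local POOL_PROCESS = 'DLJoP8Xtdat6SKz3kqYGZPaa7DJBG6etF1jRLQCwquo'

-- Hypothetical helper: builds the Join-Pool message without sending it.
function buildJoinPoolMessage(poolId, modelId)
  assert(type(poolId) == 'string' and type(modelId) == 'string',
    'both IDs must be strings')
  -- Quotes or backslashes would break the hand-built JSON, so reject them.
  assert(not (poolId .. modelId):find('["\\]'),
    'IDs must not contain quotes or backslashes')
  return {
    Target = POOL_PROCESS,
    Action = 'Join-Pool',
    -- Assumption: the pool expects both values as JSON strings.
    Data = string.format('{"dataset": "%s", "model": "%s"}', poolId, modelId),
  }
end

-- Usage inside AOS:
-- ao.send(buildJoinPoolMessage('<the pool id you want to join>', '<your model id>'))
```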

4. Scoring and Leaderboard

Retrieve Leaderboard Results

Retrieve the leaderboard results by sending a message to the pool process with the following payload:

ao.send({
   Target = 'DLJoP8Xtdat6SKz3kqYGZPaa7DJBG6etF1jRLQCwquo',
   Action = 'Leaderboard',
   Data = '<pool id>' -- replace with the ID of the pool to query
})

Sending this message displays the model leaderboard within AOS. The leaderboard is updated every 24 hours.
