Skip to main content

Sampling

Sampling lets your server request LLM completions from the client. This enables agentic behaviors without needing your own API keys - the client controls model access.

See MCP sampling for the full concept.

Requesting a Completion

Use the context's sample method:

import cats.effect.IO
import mcp.protocol.{Content, CreateMessageResult, Role, SamplingMessage}
import mcp.server.{SampleResult, ToolContext}

def analyze(ctx: ToolContext[IO]): IO[List[Content]] = {
val messages = List(
SamplingMessage(Role.user, List(Content.Text("Summarize this data: ...")))
)

ctx.sample(messages, maxTokens = 500).map {
case SampleResult.Success(result) =>
val text = result.content.collect { case Content.Text(t, _, _) => t }.mkString
List(Content.Text(s"Analysis: $text"))
case SampleResult.Failed(reason) =>
List(Content.Text(s"Sampling failed: $reason"))
case SampleResult.NotSupported =>
List(Content.Text("Client doesn't support sampling"))
}
}

Model Preferences

Don't hardcode model names. Instead, express preferences and let the client choose:

import mcp.protocol.{ModelHint, ModelPreferences, SamplingMessage}
import mcp.server.ToolContext
import cats.effect.IO

def sampleWithPreferences(ctx: ToolContext[IO], messages: List[SamplingMessage]): Unit = {
ctx.sample(
messages = messages,
maxTokens = 100,
modelPreferences = Some(ModelPreferences(
hints = Some(List(ModelHint(name = Some("claude-3-sonnet")), ModelHint(name = Some("claude")))),
intelligencePriority = Some(0.8),
speedPriority = Some(0.5),
costPriority = Some(0.3)
))
)
}

Priority Scales (0.0 to 1.0)

PriorityHigh value means...
intelligencePriorityPrefer more capable models
speedPriorityPrefer faster models
costPriorityPrefer cheaper models

Hints

Hints are advisory - the client may map them to equivalent models:

import mcp.protocol.ModelHint

ModelHint(name = Some("claude-3-sonnet")) // Prefer Sonnet-class
ModelHint(name = Some("claude")) // Fall back to any Claude

System Prompt

Provide context for the model:

import mcp.protocol.SamplingMessage
import mcp.server.ToolContext
import cats.effect.IO

def sampleWithSystem(ctx: ToolContext[IO], messages: List[SamplingMessage]): Unit = {
ctx.sample(
messages = messages,
maxTokens = 1000,
systemPrompt = Some("You are a senior code reviewer. Be constructive.")
)
}

Checking Client Support

Sampling requires client capability. Check before using:

import cats.effect.IO
import mcp.protocol.Content
import mcp.server.{SamplingCapability, ToolContext}

def checkSampling(ctx: ToolContext[IO]): IO[List[Content]] = {
ctx.samplingCapability match {
case SamplingCapability.Supported =>
// Safe to call ctx.sample(...)
IO.pure(List(Content.Text("Sampling supported")))
case SamplingCapability.NotSupported =>
IO.pure(List(Content.Text("Client doesn't support sampling")))
}
}

Use Cases

  • Agentic tools - Let tools make decisions using LLM reasoning
  • Content generation - Generate text, summaries, translations
  • Analysis - Have the LLM interpret data your server provides