# Agents

URL: https://developers.cloudflare.com/workers-ai/agents/

import { LinkButton } from "~/components";
Build AI assistants that can perform complex tasks on behalf of your users using Cloudflare Workers AI and Agents.
---

# JSON Mode

URL: https://developers.cloudflare.com/workers-ai/features/json-mode/
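JSON Mode lets you ask a text generation model to respond with structured JSON that conforms to a schema you provide. In outline, the request carries a `response_format` field of the following shape (a sketch; the inner schema is elided here):

```json
{
  "response_format": {
    "type": "json_schema",
    "json_schema": { ... }
  }
}
```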
Where `json_schema` must be a valid [JSON Schema](https://json-schema.org/) declaration.
## JSON Mode example
When using JSON Mode, pass the schema as part of the request you send to the LLM, as in the example below.
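Here, the request asks the model to extract facts about a country (the messages and schema are illustrative):

```json
{
  "messages": [
    { "role": "system", "content": "Extract data about a country." },
    { "role": "user", "content": "Tell me about India." }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "capital": { "type": "string" },
        "languages": {
          "type": "array",
          "items": { "type": "string" }
        }
      },
      "required": ["name", "capital", "languages"]
    }
  }
}
```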
The LLM will follow the schema and return a response such as the one below:
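An illustrative response for the request above:

```json
{
  "name": "India",
  "capital": "New Delhi",
  "languages": ["Hindi", "English"]
}
```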
As you can see, the model complies with the JSON Schema definition in the request and responds with a validated JSON object.
## Supported Models
The following models currently support JSON Mode:
- [@cf/meta/llama-3.1-8b-instruct-fast](/workers-ai/models/llama-3.1-8b-instruct-fast/)
- [@cf/meta/llama-3.1-70b-instruct](/workers-ai/models/llama-3.1-70b-instruct/)
- [@cf/meta/llama-3.3-70b-instruct-fp8-fast](/workers-ai/models/llama-3.3-70b-instruct-fp8-fast/)
- [@cf/meta/llama-3-8b-instruct](/workers-ai/models/llama-3-8b-instruct/)
- [@cf/meta/llama-3.1-8b-instruct](/workers-ai/models/llama-3.1-8b-instruct/)
- [@cf/meta/llama-3.2-11b-vision-instruct](/workers-ai/models/llama-3.2-11b-vision-instruct/)
- [@hf/nousresearch/hermes-2-pro-mistral-7b](/workers-ai/models/hermes-2-pro-mistral-7b/)
- [@hf/thebloke/deepseek-coder-6.7b-instruct-awq](/workers-ai/models/deepseek-coder-6.7b-instruct-awq/)
- [@cf/deepseek-ai/deepseek-r1-distill-qwen-32b](/workers-ai/models/deepseek-r1-distill-qwen-32b/)
We will continue extending this list to keep up with new and requested models.
Note that Workers AI cannot guarantee that the model's response conforms to the requested JSON Schema. Depending on the complexity of the task and the adequacy of the JSON Schema, the model may not be able to satisfy the request in extreme situations. In that case, the error `JSON Mode couldn't be met` is returned and must be handled.
JSON Mode currently doesn't support streaming.
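Because schema compliance is not guaranteed, it is worth wrapping JSON Mode calls in error handling. A minimal sketch from a Worker using the AI binding (the model, schema, and response handling are illustrative):

```ts
type Env = {
  AI: Ai;
};

export default {
  async fetch(_: Request, env: Env) {
    try {
      // Request a schema-constrained response via JSON Mode
      const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
        messages: [{ role: 'user', content: 'Tell me about India.' }],
        response_format: {
          type: 'json_schema',
          json_schema: {
            type: 'object',
            properties: {
              name: { type: 'string' },
              capital: { type: 'string' },
            },
            required: ['name', 'capital'],
          },
        },
      });
      return Response.json(result);
    } catch (err) {
      // Surfaces failures such as the `JSON Mode couldn't be met` error
      return new Response(String(err), { status: 500 });
    }
  },
};
```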
---
# Prompting
URL: https://developers.cloudflare.com/workers-ai/features/prompting/
import { Code } from "~/components";
export const scopedExampleOne = `{
messages: [
{ role: "system", content: "you are a very funny comedian and you like emojis" },
{ role: "user", content: "tell me a joke about cloudflare" },
],
};`;
export const scopedExampleTwo = `{
messages: [
{ role: "system", content: "you are a professional computer science assistant" },
{ role: "user", content: "what is WASM?" },
{ role: "assistant", content: "WASM (WebAssembly) is a binary instruction format that is designed to be a platform-agnostic" },
{ role: "user", content: "does Python compile to WASM?" },
{ role: "assistant", content: "No, Python does not directly compile to WebAssembly" },
{ role: "user", content: "what about Rust?" },
],
};`;
export const unscopedExampleOne = `{
prompt: "tell me a joke about cloudflare";
}`;
export const unscopedExampleTwo = `{
prompt: "<s>[INST]comedian[/INST]</s>\\n[INST]tell me a joke about cloudflare[/INST]",
raw: true
};`;
Here's a better example of a chat session using multiple iterations between the user and the assistant.

<Code code={scopedExampleTwo} lang="js" />
Note that different LLMs are trained with different templates for different use cases. While Workers AI tries its best to abstract the specifics of each LLM template from the developer through a unified API, you should always refer to the model documentation for details (we provide links in the table above). For example, instruct models like Codellama are fine-tuned to respond to a user-provided instruction, while chat models expect fragments of dialogs as input.
### Unscoped Prompts
You can use unscoped prompts to send a single question to the model without worrying about providing any context. Workers AI will automatically convert your `prompt` input to a reasonable default scoped prompt internally so that you get the best possible prediction.

<Code code={unscopedExampleOne} lang="js" />
You can also use unscoped prompts to construct the model chat template manually. In this case, you can use the `raw` parameter. Here's an input example of a [Mistral](https://docs.mistral.ai/models/#chat-template) chat template prompt:

<Code code={unscopedExampleTwo} lang="js" />
---
# Vercel AI SDK
URL: https://developers.cloudflare.com/workers-ai/configuration/ai-sdk/
Workers AI can be used with the [Vercel AI SDK](https://sdk.vercel.ai/) for JavaScript and TypeScript codebases.
## Setup
Install the [`workers-ai-provider` provider](https://sdk.vercel.ai/providers/community-providers/cloudflare-workers-ai):
```bash
npm install workers-ai-provider
```
Then, add an AI binding to your Workers project's Wrangler file:
```toml
[ai]
binding = "AI"
```
## Models
The AI SDK can be configured to work with [any AI model](/workers-ai/models/).
```js
import { createWorkersAI } from 'workers-ai-provider';
const workersai = createWorkersAI({ binding: env.AI });
// Choose any model: https://developers.cloudflare.com/workers-ai/models/
const model = workersai('@cf/meta/llama-3.1-8b-instruct', {});
```
## Generate Text
Once you have selected your model, you can generate text from a given prompt.
```ts
import { createWorkersAI } from 'workers-ai-provider';
import { generateText } from 'ai';
type Env = {
AI: Ai;
};
export default {
async fetch(_: Request, env: Env) {
const workersai = createWorkersAI({ binding: env.AI });
const result = await generateText({
model: workersai('@cf/meta/llama-2-7b-chat-int8'),
prompt: 'Write a 50-word essay about hello world.',
});
return new Response(result.text);
},
};
```
## Stream Text
For longer responses, consider streaming the response to the client as the generation completes.
```ts
import { createWorkersAI } from 'workers-ai-provider';
import { streamText } from 'ai';
type Env = {
AI: Ai;
};
export default {
async fetch(_: Request, env: Env) {
const workersai = createWorkersAI({ binding: env.AI });
const result = streamText({
model: workersai('@cf/meta/llama-2-7b-chat-int8'),
prompt: 'Write a 50-word essay about hello world.',
});
return result.toTextStreamResponse({
headers: {
// add these headers to ensure that the
// response is chunked and streamed
'Content-Type': 'text/x-unknown',
'content-encoding': 'identity',
'transfer-encoding': 'chunked',
},
});
},
};
```
## Generate Structured Objects
You can provide a Zod schema to generate a structured JSON response.
```ts
import { createWorkersAI } from 'workers-ai-provider';
import { generateObject } from 'ai';
import { z } from 'zod';
type Env = {
AI: Ai;
};
export default {
async fetch(_: Request, env: Env) {
const workersai = createWorkersAI({ binding: env.AI });
const result = await generateObject({
model: workersai('@cf/meta/llama-3.1-8b-instruct'),
prompt: 'Generate a Lasagna recipe',
schema: z.object({
recipe: z.object({
ingredients: z.array(z.string()),
description: z.string(),
}),
}),
});
return Response.json(result.object);
},
};
```
---
# Workers Bindings
URL: https://developers.cloudflare.com/workers-ai/configuration/bindings/
import { Type, MetaInfo, WranglerConfig } from "~/components";
## Workers
[Workers](/workers/) provides a serverless execution environment that allows you to create new applications or augment existing ones.
To use Workers AI with Workers, you must create a Workers AI [binding](/workers/runtime-apis/bindings/). Bindings allow your Workers to interact with resources, like Workers AI, on the Cloudflare Developer Platform. You create bindings on the Cloudflare dashboard or by updating your [Wrangler file](/workers/wrangler/configuration/).
To bind Workers AI to your Worker, add the following to the end of your Wrangler file:
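The binding is the same `[ai]` entry used in the AI SDK setup above:

<WranglerConfig>

```toml
[ai]
binding = "AI"
```

</WranglerConfig>

Once deployed, the binding is available as `env.AI` in your Worker. A minimal usage sketch (the model and prompt are illustrative):

```ts
type Env = {
  AI: Ai;
};

export default {
  async fetch(_: Request, env: Env) {
    // Run any Workers AI model through the binding
    const response = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
      prompt: 'What is the origin of the phrase "Hello, World"?',
    });
    return Response.json(response);
  },
};
```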