Intermediate

Section 17: Writing a Custom TypeScript Extension

A Pi extension is a TypeScript module that registers additional tools, slash commands, or TUI components into the current session. Writing one requires understanding three things: the extension anatomy (how files are laid out), the register() call (how capabilities are declared to Pi), and the package.json shape (what tells Pi this package is an extension).

Extensions are the mechanism Pi uses to grow beyond its four primitives without changing the core. Every capability beyond Read, Write, Edit, and Bash comes from an extension — whether installed from the community or written by you. The extension API is intentionally simple: a single register() function call that takes an object describing your tool, command, or component.

%% Color Palette: Blue #0173B2, Orange #DE8F05, Teal #029E73, Purple #CC78BC, Brown #CA9161
%% All colors are color-blind friendly and meet WCAG AA contrast standards
 
graph TD
    PKG["package.json<br/>pi-extension field declares entry point"]:::brown
    ENTRY["index.ts<br/>Extension entry point"]:::blue
    REG["register() call<br/>Declares tools / commands / components"]:::orange
    PI["Pi session<br/>Loads extension at startup"]:::teal
    LLM["LLM tool schema<br/>Sees your tool in function-calling"]:::purple
 
    PKG -->|"points to"| ENTRY
    ENTRY -->|"calls"| REG
    REG -->|"injects into"| PI
    PI -->|"exposes to"| LLM
 
    classDef brown fill:#CA9161,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef blue fill:#0173B2,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef orange fill:#DE8F05,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef teal fill:#029E73,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef purple fill:#CC78BC,stroke:#000000,color:#FFFFFF,stroke-width:2px

A minimal extension needs two files, package.json and index.ts, plus an optional SKILL.md that gives the LLM natural-language guidance on when to use the extension's tools.

// package.json — the pi-extension field is what Pi looks for when scanning
// the extensions directory. Without it, Pi ignores the package.
{
  "name": "pi-ext-grep-symbols",
  "version": "1.0.0",
  "description": "Search for symbols across a codebase",
  "main": "dist/index.js", // => Compiled output — Pi loads this file
  "pi-extension": {
    "entry": "dist/index.js", // => Pi reads this field to find the entry point
    "version": "1" // => Extension API version (currently "1")
  },
  "scripts": {
    "build": "tsc", // => Compile TypeScript to dist/
    "watch": "tsc --watch" // => Watch mode for hot-reload during development
  },
  "dependencies": {},
  "devDependencies": {
    "typescript": "^5.4.0",
    "@earendil-works/pi-coding-agent": "^0.75.0" // => Types for Tool, register(), etc.
  }
}
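
The discovery rule described in the comment above (no pi-extension field, no load) can be pictured as a small predicate. This is only a sketch of the idea: `PackageManifest`, `extensionEntry`, and the exact checks are assumptions made for illustration, not Pi's actual loader.

```typescript
// Hypothetical shape of the manifest fields this sketch cares about.
interface PackageManifest {
  name?: string;
  "pi-extension"?: { entry?: string; version?: string };
}

// Returns the entry path if the manifest declares an extension, else null.
// Assumption: a package counts as an extension only when the
// "pi-extension" field exists and names a non-empty entry file.
function extensionEntry(manifest: PackageManifest): string | null {
  const ext = manifest["pi-extension"];
  if (!ext || typeof ext.entry !== "string" || ext.entry.length === 0) {
    return null;
  }
  return ext.entry;
}

const withField: PackageManifest = {
  name: "pi-ext-grep-symbols",
  "pi-extension": { entry: "dist/index.js", version: "1" },
};
const plainLibrary: PackageManifest = { name: "some-plain-library" };
```

Under these assumptions, `extensionEntry(withField)` yields the entry path and `extensionEntry(plainLibrary)` yields `null`, which is the "Pi ignores the package" case.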

// index.ts — the extension entry point
// Pi imports this module and calls the default export with the Pi API object
 
import { PiExtensionAPI, Tool } from "@earendil-works/pi-coding-agent";
// => Import types from the Pi package
 
// Define the tool your extension provides
const grepSymbolsTool: Tool = {
  name: "grep_symbols", // => Name the LLM uses in function-calling
  description:
    "Search for a TypeScript or JavaScript symbol across all source files. " +
    "Use this to find where a function, class, or variable is defined or used.",
  // => LLM reads this description to decide when to call
  parameters: {
    type: "object",
    properties: {
      symbol: {
        type: "string", // => The symbol name to search for
        description: "Symbol name to search (function, class, variable, type)",
      },
      path: {
        type: "string", // => Directory to search in
        description: "Directory to search in (default: current directory)",
      },
    },
    required: ["symbol"], // => path is optional; symbol is required
  },
  execute: async ({ symbol, path = "." }, { bash }) => {
    // => execute receives tool arguments and Pi's built-in tool helpers
    // => bash() is Pi's Bash tool, available to extensions for shell execution
 
    const result = await bash(`grep -rn --include='*.ts' --include='*.js' -- "${symbol}" "${path}"`); // => recursive grep with TypeScript/JS file filter and line numbers; path is quoted so directories with spaces work
 
    if (!result.stdout.trim()) {
      return `No occurrences of '${symbol}' found in ${path}`;
      // => Return a string — Pi sends this as tool_result
    }
 
    return result.stdout; // => Return grep output; LLM analyzes the matches
  },
};
 
// Extension entry point — Pi calls this function when loading the extension
export default function setup(api: PiExtensionAPI): void {
  api.register(grepSymbolsTool); // => Register the tool with Pi's session
  // => Tool is now available to the LLM immediately
}
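
The execute function above interpolates LLM-supplied arguments into a shell string, so a symbol containing quotes or shell metacharacters could change the command. One way to reduce that risk is to single-quote every argument. The helpers below are a minimal sketch assuming a POSIX shell; `shellQuote` and `buildGrepCommand` are hypothetical names, and using `-F` (fixed-string matching) is a design choice of this sketch rather than something the original tool does.

```typescript
// Hypothetical helper: wrap a value in single quotes for a POSIX shell,
// escaping any embedded single quote as '\'' (close, escaped quote, reopen).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the grep command from untrusted arguments.
// -F treats the symbol as a fixed string; "--" ends option parsing so a
// symbol starting with "-" is not read as a flag.
function buildGrepCommand(symbol: string, path: string): string {
  return [
    "grep -rnF --include='*.ts' --include='*.js' --",
    shellQuote(symbol),
    shellQuote(path),
  ].join(" ");
}
```

With this in place, the tool body would pass `buildGrepCommand(symbol, path)` to `bash()` instead of a hand-built template string.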

Build and install the extension:

# Compile the TypeScript extension
npm run build
# => tsc -p tsconfig.json
# => (Output written to dist/index.js)
 
# Install the extension into Pi's extensions directory
npm install --prefix ~/.pi/extensions .
# => (Links or copies the built package into ~/.pi/extensions/node_modules/pi-ext-grep-symbols/)
 
# Start Pi — extension loads automatically
pi
# => Pi v0.75.4
# => Extensions loaded: pi-ext-grep-symbols
# => Provider: anthropic · Model: claude-sonnet-4-5 · Context: 0 tokens

Key Takeaway: A Pi extension is a TypeScript module with a pi-extension field in package.json and a default export that calls api.register() with your tool definition.

Why It Matters: Writing your own extensions means the agent gains exactly the capability your project needs — a tool that knows your database schema, your test framework's output format, or your CI system's API — without those capabilities being part of Pi's core or anyone else's session.

Section 18: Registering Custom Tools

Registering a tool is the act of adding a capability to Pi's active session. Once registered, the tool appears in the LLM's function-calling schema, which means the LLM can choose to call it on any subsequent turn. The Tool interface defines what the LLM knows about the tool and what Pi does when the LLM calls it.

The three fields that matter most are name, description, and parameters. The LLM uses these fields — not your TypeScript implementation — to decide when and how to call the tool. A poorly written description means the LLM will either overuse the tool (calling it when it is not appropriate) or underuse it (not calling it when it would help). A poorly written parameter schema means the LLM will pass the wrong arguments and your execute function will receive unexpected input.

import { Tool, PiExtensionAPI } from "@earendil-works/pi-coding-agent";
 
// A tool that runs the project's test suite and returns structured results
const runTestsTool: Tool = {
  name: "run_tests",
 
  // Good description: specific about what it does, when to use it, what it returns
  description:
    "Run the project's Vitest test suite and return a structured summary of " +
    "results including pass/fail counts and failure details. Use this after " +
    "making code changes to verify the change does not break existing tests.",
  // => LLM uses this paragraph to decide: "should I call run_tests now?"
  // => The "Use this after making code changes" phrase guides the LLM's timing
 
  parameters: {
    type: "object",
    properties: {
      filter: {
        type: "string",
        description:
          "Optional test name filter — only run tests matching this string. " +
          "Example: 'auth' runs only tests with 'auth' in their name.",
        // => Description guides LLM on when to set this field
      },
      bail: {
        type: "boolean",
        description:
          "If true, stop after the first test failure. Useful for fast feedback " +
          "when debugging a specific failure. Default: false.",
      },
    },
    required: [], // => All parameters are optional — LLM can call with {}
  },
 
  execute: async ({ filter, bail = false }, { bash }) => {
    // Build the vitest command from parameters
    const filterFlag = filter ? `-t "${filter}"` : "";
    // => -t is Vitest's test name filter flag; no second reporter here, so stdout stays parseable JSON
 
    const bailFlag = bail ? "--bail=1" : ""; // => --bail takes a failure count; 1 stops after the first failure
 
    const result = await bash(`npx vitest run ${filterFlag} ${bailFlag} --reporter=json 2>&1`); // => --reporter=json gives structured output
    // => 2>&1 captures stderr too (Vitest logs there)
 
    // Parse Vitest JSON output into a readable summary
    try {
      const report = JSON.parse(result.stdout);
      // => Vitest JSON output has numPassedTests, numFailedTests, etc.
      const passed = report.numPassedTests; // => Count of passing tests
      const failed = report.numFailedTests; // => Count of failing tests
      const failures = report.testResults // => Array of per-file results
        .flatMap((f: any) => f.assertionResults) // => Flatten to individual test results (Jest-compatible field name)
        .filter((t: any) => t.status === "failed")
        .map((t: any) => `  FAIL: ${t.fullName}\n    ${t.failureMessages[0]}`)
        .join("\n"); // => Format failures for LLM readability
 
      return [`Tests: ${passed} passed, ${failed} failed`, failures ? `\nFailures:\n${failures}` : ""].join("");
      // => Returns: "Tests: 47 passed, 2 failed\nFailures:\n  FAIL: auth > validates JWT..."
    } catch {
      return result.stdout; // => Fallback: return raw output if JSON parse fails
    }
  },
};
 
export default function setup(api: PiExtensionAPI): void {
  api.register(runTestsTool); // => Adds run_tests to the LLM's tool schema
  // => From this point forward, the LLM can call run_tests on any turn
}
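
Even with a careful schema, the LLM can still send arguments that do not match it. A tool can check its input before doing any work and report the mismatch back as text. The following is a minimal sketch covering only the small subset of JSON Schema used in this section; `ParamSchema` and `validateArgs` are hypothetical names, not part of Pi's API.

```typescript
// Minimal subset of JSON Schema used by the tool definitions in this section.
type ParamType = "string" | "boolean" | "number";
interface ParamSchema {
  type: "object";
  properties: Record<string, { type: ParamType; description?: string }>;
  required: string[];
}

// Returns a list of human-readable problems; an empty list means the
// arguments match the schema. Hypothetical helper, not part of Pi.
function validateArgs(schema: ParamSchema, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required) {
    if (!(key in args)) problems.push(`missing required parameter '${key}'`);
  }
  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (!spec) {
      problems.push(`unknown parameter '${key}'`);
    } else if (typeof value !== spec.type) {
      problems.push(`'${key}' should be ${spec.type}, got ${typeof value}`);
    }
  }
  return problems;
}

const runTestsSchema: ParamSchema = {
  type: "object",
  properties: { filter: { type: "string" }, bail: { type: "boolean" } },
  required: [],
};
```

Returning the problem list as the tool result gives the LLM a concrete hint (for example, a misspelled parameter name) that it can correct on the next call.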

Error handling in the execute function matters. When execute throws an exception, Pi catches it and returns the error message to the LLM as a tool_result with an error flag. The LLM can then decide whether to retry, ask you for help, or take a different approach. Return a descriptive error string from execute rather than throwing when the error is recoverable — this gives the LLM more context for its decision.

// Error handling pattern for tool execute functions
execute: async ({ symbol, path = "." }, { bash }) => {
  try {
    const result = await bash(`grep -r "${symbol}" ${path}`);
 
    if (result.exitCode !== 0 && !result.stdout) {
      // => grep exits with status 1 when nothing matched; 2 signals an error, which this branch also reports as "no matches"
      return `No matches for '${symbol}' in ${path}`;
                                          // => Descriptive return — LLM understands and adapts
    }
 
    return result.stdout;
  } catch (error) {
    // => Catch unexpected errors (permission denied, path not found, etc.)
    return `grep_symbols failed: ${(error as Error).message}`;
                                          // => Return error as string — LLM sees this as tool_result
                                          // => LLM can then decide to try a different path or approach
  }
},
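
The try/catch pattern above can be factored into a reusable wrapper so that each tool does not repeat it. This is a sketch under the assumption that an execute function takes its arguments and resolves to a string; `Execute`, `describeError`, and `withErrorAsString` are hypothetical names introduced here.

```typescript
type Execute<A> = (args: A) => Promise<string>;

// Turn any thrown value into a short message the LLM can act on.
function describeError(toolName: string, error: unknown): string {
  const message = error instanceof Error ? error.message : String(error);
  return `${toolName} failed: ${message}`;
}

// Wrap an execute function so recoverable failures come back as strings
// instead of exceptions. Hypothetical helper, not part of Pi's API.
function withErrorAsString<A>(toolName: string, execute: Execute<A>): Execute<A> {
  return async (args: A) => {
    try {
      return await execute(args);
    } catch (error) {
      return describeError(toolName, error);
    }
  };
}
```

Errors that should stop the turn outright (for example, a misconfigured extension) can still be thrown by leaving that tool unwrapped.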

Key Takeaway: The description and parameters fields in a Tool definition are what the LLM reads to decide when and how to call your tool — write them from the LLM's perspective, not the implementor's.

Why It Matters: A well-designed tool description is the difference between a tool that gets used at the right moments and a tool that gets called constantly on every turn or ignored entirely. The quality of your tool's description directly determines the quality of the agent's behavior.

Section 19: Skills System

A skill in Pi is a SKILL.md markdown file that provides the LLM with natural-language instructions, examples, and guidance for a specific domain or task. Skills differ from tools in a fundamental way: tools give the LLM a new capability it can execute; skills give the LLM knowledge about how to use existing capabilities more effectively.

A skill file can describe: best practices for working with a specific framework, the conventions of a specific codebase, the expected format for a specific output, or guidance for navigating a complex domain. The LLM reads the skill content and incorporates it into its reasoning, without you having to repeat the guidance in every message.

<!-- Example skill file: .pi/skills/vitest-conventions/SKILL.md -->
<!-- This skill teaches the agent how the project uses Vitest -->
 
# Vitest Test Conventions for This Project
 
## Test File Location
 
Tests live next to the source files they test, with a `.test.ts` suffix:
 
- `src/auth/jwt.ts` → `src/auth/jwt.test.ts`
- `src/router/tasks.ts` → `src/router/tasks.test.ts`
 
Never place tests in a separate `tests/` or `__tests__/` directory.
 
## Test Structure
 
Each test file follows this structure:
 
```typescript
import { describe, it, expect, beforeEach, vi } from "vitest";
 
describe("ModuleName", () => {
  beforeEach(() => {
    vi.clearAllMocks(); // Always clear mocks between tests
  });
 
  describe("functionName", () => {
    it("does the expected behavior when given valid input", () => {
      // Arrange
      // Act
      // Assert with expect()
    });
 
    it("throws when given invalid input", () => {
      expect(() => functionName(badInput)).toThrow("Expected error message");
    });
  });
});
```

## Mocking Rules

- Use `vi.mock()` at the top of the file for module mocks
- Use `vi.spyOn()` for function-level mocks within tests
- Never mock `src/lib/logger.ts`; let tests exercise real logging behavior
- Database calls must always be mocked; never hit a real DB in unit tests

## Running Tests

- Single file: `npx vitest run src/auth/jwt.test.ts`
- Watch mode: `npx vitest src/auth/jwt.test.ts`
- Coverage: `npx vitest run --coverage`


Skills are stored in a `skills/` directory under `.pi/` in your home directory or your
project directory. Pi scans both locations. You can also specify a skills directory in
`AGENTS.md`. Skill selection per turn is covered in Section 20.

```bash
# Create a skills directory for your project
mkdir -p .pi/skills/vitest-conventions

# Create the skill file
# (Write the SKILL.md content as shown above)
vim .pi/skills/vitest-conventions/SKILL.md

# Verify Pi discovers the skill at startup
pi
# => Pi v0.75.4
# => Skills discovered: vitest-conventions (local)
# => Provider: anthropic · Model: claude-sonnet-4-5 · Context: 0 tokens

# The LLM now has access to vitest conventions when relevant turns come up
# You do not need to tell Pi to use the skill; selection is automatic (Section 20)
```

Skills and tools compose well. An extension that registers a run_tests tool (Section 18) pairs naturally with a vitest-conventions skill that tells the LLM when to run tests, how to interpret the results, and which test file to look at when a test fails. The tool provides the execution capability; the skill provides the reasoning guidance.

Key Takeaway: A skill is a SKILL.md file that gives the LLM natural-language guidance for a domain — not a new execution capability, but knowledge that improves how the LLM uses existing capabilities.

Why It Matters: Skills allow you to encode team conventions, framework expertise, and project-specific knowledge in a form the LLM can actually use. Instead of repeating "our tests live next to source files" in every session, you write it once in a skill file and the LLM applies it consistently.

Section 20: Dynamic Skill Injection

Pi does not inject every discovered skill into every turn's context. Instead, it uses a relevance scoring algorithm to select which skills are relevant to the current turn and injects only those. This keeps the context window from growing linearly with the number of skills in your system.

Relevance scoring uses the content of the current user message and the recent conversation history as a query. Each skill's title, description, and first 200 characters of content are compared against the query. Skills above the relevance threshold are injected into the context for that turn; skills below the threshold are not. The threshold is configurable.

%% Color Palette: Blue #0173B2, Orange #DE8F05, Teal #029E73, Purple #CC78BC, Brown #CA9161
%% All colors are color-blind friendly and meet WCAG AA contrast standards
 
graph LR
    MSG["Current user message<br/>+ recent history"]:::teal
    SCORE["Relevance scorer<br/>(embedding similarity)"]:::blue
    SKILLS["All discovered skills<br/>(titles + excerpts)"]:::brown
    THRESH["Threshold filter<br/>(configurable)"]:::orange
    INJECT["Injected skills<br/>(added to context this turn)"]:::purple
    SKIP["Skipped skills<br/>(not in context this turn)"]:::orange
 
    MSG -->|"query"| SCORE
    SKILLS -->|"candidates"| SCORE
    SCORE -->|"scores"| THRESH
    THRESH -->|"above threshold"| INJECT
    THRESH -->|"below threshold"| SKIP
    INJECT -->|"prepended to"| MSG
 
    classDef teal fill:#029E73,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef blue fill:#0173B2,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef brown fill:#CA9161,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef orange fill:#DE8F05,stroke:#000000,color:#FFFFFF,stroke-width:2px
    classDef purple fill:#CC78BC,stroke:#000000,color:#FFFFFF,stroke-width:2px

The scoring mechanism is embedding-based similarity by default. Pi computes an embedding vector for the query (user message + recent history) and compares it against pre-computed embeddings for each skill's summary. Skills are re-embedded when their SKILL.md file changes. Embeddings are cached in .pi/skill-embeddings.json.
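
The text does not say which similarity measure Pi applies to the embeddings. Cosine similarity is a common choice for comparing embedding vectors, so the sketch below assumes it; the three-dimensional vectors are toy stand-ins for real embeddings, and `cosineSimilarity` is a name chosen for this example.

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const queryVec = [1, 0, 1];    // toy stand-in for the query embedding
const testingVec = [2, 0, 2];  // same direction as the query, different length
const databaseVec = [0, 1, 0]; // orthogonal to the query

const testingScore = cosineSimilarity(queryVec, testingVec);   // close to 1
const databaseScore = cosineSimilarity(queryVec, databaseVec); // 0
```

Because the measure depends only on direction, the longer `testingVec` scores the same as a unit-length vector pointing the same way, which is why it suits comparing texts of different lengths.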

You can configure injection behavior in AGENTS.md:

# AGENTS.md excerpt — skill injection configuration
 
## Pi Skills Configuration
 
skills:
  threshold: 0.72 # Relevance cutoff (0.0–1.0; default: 0.70)
  max-injected: 3 # Maximum skills injected per turn (default: 5)
  always-inject: # Skills always injected regardless of score
    - project-conventions
  never-inject: # Skills never injected (useful for debugging)
    - legacy-api-reference

// You can also control skill injection programmatically in an extension
 
import { PiExtensionAPI, Skill } from "@earendil-works/pi-coding-agent";
 
const databaseSkill: Skill = {
  name: "database-conventions",
  content: `
# Database Access Conventions
 
Always use Prisma client for database access. Never write raw SQL.
Connection string is in DATABASE_URL environment variable.
Migrations live in prisma/migrations/ — never edit them directly.
  `,
  // => Inline skill defined in TypeScript — useful for extension-specific guidance
  alwaysInject: false, // => Let relevance scoring decide injection
  priority: 1.0, // => Higher priority = preferred over same-score skills
};
 
export default function setup(api: PiExtensionAPI): void {
  api.registerSkill(databaseSkill); // => Register inline skill alongside SKILL.md files
  // => Both types participate in relevance scoring
}

Skill injection can be inspected directly. When Pi is started with --verbose, it logs which skills were scored, which scored above the threshold, and which were injected on each turn. This lets you tune the threshold and the skill content until injection happens at the right times.

# Run Pi in verbose mode to see skill injection decisions
pi --verbose
# => [skills] Scoring 4 skills for turn 1
# => [skills] vitest-conventions: 0.84 (INJECT — above threshold 0.70)
# => [skills] database-conventions: 0.31 (skip — below threshold)
# => [skills] git-workflow: 0.22 (skip — below threshold)
# => [skills] project-conventions: always-inject (INJECT)
# => [skills] Injecting 2 skills (total: 847 tokens)
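
The decisions in the log above can be reproduced with a small selection function. This is a sketch of one plausible policy, not Pi's implementation: it assumes pinned (always-inject) skills come first and count toward the per-turn maximum, and that a score equal to the threshold is accepted. `ScoredSkill`, `InjectionConfig`, and `selectSkills` are hypothetical names.

```typescript
interface ScoredSkill { name: string; score: number }
interface InjectionConfig {
  threshold: number;
  maxInjected: number;
  alwaysInject: string[];
  neverInject: string[];
}

// Assumed policy: pinned skills first, then the highest-scoring skills at or
// above the threshold, until maxInjected is reached.
function selectSkills(scored: ScoredSkill[], config: InjectionConfig): string[] {
  const blocked = new Set(config.neverInject);
  const pinned = config.alwaysInject.filter((name) => !blocked.has(name));
  const ranked = scored
    .filter((s) => !blocked.has(s.name) && !pinned.includes(s.name))
    .filter((s) => s.score >= config.threshold)
    .sort((a, b) => b.score - a.score)
    .map((s) => s.name);
  return [...pinned, ...ranked].slice(0, config.maxInjected);
}

const selected = selectSkills(
  [
    { name: "vitest-conventions", score: 0.84 },
    { name: "database-conventions", score: 0.31 },
    { name: "git-workflow", score: 0.22 },
  ],
  { threshold: 0.7, maxInjected: 5, alwaysInject: ["project-conventions"], neverInject: [] },
);
// selected: ["project-conventions", "vitest-conventions"]
```

The result contains the same two skills the verbose log reports as injected.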

Key Takeaway: Pi injects skills selectively per turn using embedding-based relevance scoring — only skills above the configured threshold appear in context, keeping token usage proportional to relevance.

Why It Matters: Selective injection is what makes a large skill library practical. If all skills were injected on every turn, the context would fill with irrelevant guidance. Relevance scoring ensures the LLM gets the right knowledge at the right time, at a token cost proportional to what it actually needs.

Section 21: Context Window Management

The context window is the LLM's working memory for a session. Every token in the context window is sent to the LLM on every turn, which means the context window directly determines both the cost and the quality of each LLM call. Pi provides three mechanisms for managing the context window: auto-compaction, manual compaction, and branching.

Auto-compaction triggers when the context reaches approximately 80% of the model's context limit. Pi sends a compaction prompt to the LLM asking it to summarize the older parts of the conversation, then replaces those turns in the history with the summary. The most recent turns (configurable, default: last 8) are always kept verbatim to preserve immediate context.

# Monitor context growth during a session
/stats
# => Session: 2026-05-21T10-15-00__implement-search
# => Turns: 42
# => Context tokens: 156,832 / 200,000 (78.4%)
# => Auto-compaction will trigger at ~160,000 tokens
# => Estimated cost so far: $0.31
 
# Trigger manual compaction before the limit
/compact
# => Compacting 34 older turns...
# => Summary generated (1,247 tokens)
# => Before: 156,832 tokens (42 turns)
# => After: 24,391 tokens (8 turns verbatim + summary)
# => Saved: 132,441 tokens
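
The numbers in this output follow from two simple rules: compare usage against a fraction of the model limit, and keep a fixed number of recent turns verbatim. The sketch below works through the arithmetic; `shouldCompact` and `splitForCompaction` are hypothetical helper names, and the defaults are the ones stated in this section.

```typescript
// True once context usage reaches the configured fraction of the model limit.
function shouldCompact(contextTokens: number, modelLimit: number, threshold = 0.8): boolean {
  return contextTokens / modelLimit >= threshold;
}

// Split turns into the older part (to be summarized) and the verbatim tail.
function splitForCompaction<T>(turns: T[], verbatimTail = 8): { older: T[]; tail: T[] } {
  const cut = Math.max(0, turns.length - verbatimTail);
  return { older: turns.slice(0, cut), tail: turns.slice(cut) };
}

const turnNumbers = Array.from({ length: 42 }, (_, i) => i + 1); // turns 1..42
const { older, tail } = splitForCompaction(turnNumbers);
// older.length === 34 and tail.length === 8, matching "Compacting 34 older turns"

const autoTriggers = shouldCompact(156832, 200000); // 0.784 < 0.80, so false
const tokensSaved = 156832 - 24391;                 // 132441, the "Saved" figure
```

At 78.4% the session is just under the automatic trigger, which is why the example runs /compact by hand.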

The compaction summary is generated by the LLM itself. Pi sends the older portion of the conversation and asks the LLM to produce a dense summary of: what was accomplished, what decisions were made, what files were changed, and what the current state is. This summary replaces the raw turns in the context. The quality of auto-compaction depends on the model — stronger models produce more accurate and useful summaries.

Branching (Section 9) is an alternative to compaction when you are switching to a genuinely different task. Compaction keeps you in the same session with a summary of what happened. Branching creates a new session that inherits only the context up to the branch point. Use compaction when you want to continue the same task with a lighter context; use branching when you want to pivot to a different task.

// Configure compaction behavior in an extension or AGENTS.md
 
// In AGENTS.md:
// pi-config:
//   context:
//     compaction-threshold: 0.75    # Trigger at 75% of model limit (default: 0.80)
//     verbatim-tail: 12             # Keep last 12 turns verbatim (default: 8)
//     compaction-model: claude-haiku-4-5  # Use a cheaper model for compaction itself
 
// In an extension — programmatically control compaction:
import { PiExtensionAPI } from "@earendil-works/pi-coding-agent";
 
export default function setup(api: PiExtensionAPI): void {
  // Register a slash command that triggers compaction + reports savings
  api.registerCommand({
    name: "compact-report", // => /compact-report in the TUI
    description: "Compact context and show token savings report",
    execute: async (_, { session, compact }) => {
      const before = session.contextTokens; // => Token count before compaction
      await compact(); // => Trigger compaction
      const after = session.contextTokens; // => Token count after
      const saved = before - after; // => Tokens saved
 
      return [
        `Compaction complete`,
        `Before: ${before.toLocaleString()} tokens`,
        `After:  ${after.toLocaleString()} tokens`,
        `Saved:  ${saved.toLocaleString()} tokens (${Math.round((saved / before) * 100)}%)`,
      ].join("\n");
    },
  });
}

Understanding what the compaction algorithm preserves is important for working with it effectively. Verbatim preservation at the tail means the LLM always has precise context for the most recent actions. The summary covers older material, but summaries lose detail — specific variable names, exact error messages, and precise file contents from early in the session may be paraphrased. If accuracy for older material is critical, use /branch to start a fresh session and explicitly re-read the relevant files.

Key Takeaway: Pi auto-compacts the context when it reaches 80% of the model's limit by summarizing older turns; use /compact to trigger this manually, and prefer /branch when switching to a genuinely different task.

Why It Matters: Context management is the operational skill that separates developers who can run Pi sessions for hours on large codebases from those who hit context limits and have to restart. Understanding when to compact, when to branch, and what the compaction algorithm preserves gives you control over session quality across long work sessions.

Section 22: Branching Sessions for Code Review

Branching for code review is a concrete workflow pattern that demonstrates why tree-structured sessions matter. When you branch a session at the point where a feature is complete, the review branch inherits the full context of the feature development — the files changed, the decisions made, the tests written — without the review messages mixing into the development session's history.

The review branch can explore the code independently: read files, run analysis tools, compare against conventions, check for security issues. When the review is complete, the findings live in the review branch. The development session remains clean. You merge the useful findings back manually — Pi does not auto-merge branches.

# Development session: you've just completed implementing a JWT auth module
# Session: 2026-05-21T09-00-00__implement-jwt-auth (22 turns, 34,000 tokens)
 
# Branch from the current point for review
/branch review-jwt-security
# => Branched from: implement-jwt-auth (turn 22)
# => New session: 2026-05-21T11-30-00__review-jwt-security
# => Context: 34,000 tokens (22 turns — identical to branch point)
# => (You are now in the review branch)
 
# In the review branch, ask for a security-focused review
# "Review the JWT authentication implementation for security issues.
#  Check token expiry validation, algorithm validation, and secret key handling."
 
# The agent reads the relevant files and produces findings:
# => Tool call: Read("src/auth/jwt.ts")
# => Tool call: Read("src/auth/middleware.ts")
# => Tool call: Bash("grep -n 'secret\\|key\\|algorithm' src/auth/jwt.ts")
# => Response: "I found 3 issues: [1] The algorithm is not validated... [2] ..."
 
# Share the review session for the PR
/share --title "JWT auth security review"
# => Session shared: https://gist.github.com/yourusername/abc123...
 
# Return to the development session (it is unchanged)
/load implement-jwt-auth
# => Resuming session: implement-jwt-auth (22 turns, 34,000 tokens)
# => (Development session is exactly as you left it)

Code review branches work best when they start from a clean branch point — after a feature is complete, not in the middle of development. A branch point mid-development inherits incomplete code, which makes the review findings harder to act on.

Multiple review branches from the same point are also valid. Branch once for security review, once for performance review, once for API contract review. Each branch has focused findings without cross-contaminating the others.
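
The isolation between branches can be pictured with a tiny data model. This is a sketch of the idea only, assuming a branch starts with a copy of its parent's turns; `Session` and `branch` are hypothetical names, and Pi's real session storage may differ.

```typescript
interface Session {
  name: string;
  parent: string | null; // name of the session this one branched from
  branchTurn: number;    // number of inherited turns at the branch point
  turns: string[];
}

// Create a branch that starts with a copy of the parent's turns.
// Copying (rather than sharing the array) is what keeps the parent unchanged.
function branch(from: Session, name: string): Session {
  return { name, parent: from.name, branchTurn: from.turns.length, turns: [...from.turns] };
}

const dev: Session = {
  name: "implement-jwt-auth",
  parent: null,
  branchTurn: 0,
  turns: Array.from({ length: 22 }, (_, i) => `turn ${i + 1}`),
};

const securityReview = branch(dev, "review-jwt-security");
const perfReview = branch(dev, "review-jwt-performance");
securityReview.turns.push("Review the JWT implementation for security issues.");
```

After the push, only `securityReview` has 23 turns; `dev` and `perfReview` still have the 22 turns they shared at the branch point.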

Key Takeaway: Branch from a completed feature session to create a clean review context that inherits all the development context without mixing review messages into the development history.

Why It Matters: Code review sessions generate their own conversation context — the agent reads files, asks clarifying questions, proposes fixes. Keeping this separate from the development session means the development session stays coherent as a record of what was built and why, while the review session stays coherent as an independent audit.

Section 23: RPC Protocol Mode

Pi's RPC protocol mode exposes the agent over JSON-RPC, allowing another process or application to drive Pi programmatically. In RPC mode, Pi reads JSON-RPC 2.0 requests from stdin and writes JSON-RPC 2.0 responses to stdout. A caller sends a run request with a message, Pi processes the turn (LLM call + tool execution), and sends back the response.

RPC mode is how you embed Pi inside another tool without using the agent-core SDK directly. It lets you keep Pi's session management, extension loading, and context engineering features while driving it from a different language (Python, Go, Rust) or a different process.

%% Color Palette: Blue #0173B2, Orange #DE8F05, Teal #029E73, Purple #CC78BC, Brown #CA9161
%% All colors are color-blind friendly and meet WCAG AA contrast standards
 
sequenceDiagram
    participant CALLER as Caller Process<br/>(Python / Go / shell)
    participant PI as Pi RPC Process<br/>(stdin/stdout)
    participant LLM as LLM Provider
 
    CALLER->>PI: JSON-RPC: { method: "run", params: { message: "..." } }
    PI->>LLM: LLM API call with tools
    LLM-->>PI: Tool call: Bash("ls src/")
    PI->>PI: Execute Bash tool
    PI->>LLM: Tool result: "auth.ts\nrouter.ts\n..."
    LLM-->>PI: Final text response
    PI-->>CALLER: JSON-RPC response: { result: { response: "...", tool_calls: [...] } }

# Start Pi in RPC mode — reads from stdin, writes to stdout
pi --rpc
# => (Pi is now listening for JSON-RPC messages on stdin)
# => (No TUI rendered — pure stdin/stdout protocol)
 
# One-shot usage: pipe a single run request into a new pi --rpc process
echo '{"jsonrpc":"2.0","id":1,"method":"run","params":{"message":"List TypeScript files in src/"}}' | pi --rpc
# => {"jsonrpc":"2.0","id":1,"result":{
# =>   "response":"Here are the TypeScript files in src/:\n- auth.ts\n- router.ts\n- index.ts",
# =>   "tool_calls":[
# =>     {"name":"Bash","input":{"command":"find src/ -name '*.ts'"},"result":"src/auth.ts\nsrc/router.ts\nsrc/index.ts\n"}
# =>   ],
# =>   "tokens":{"input":342,"output":67}
# => }}

# Calling Pi's RPC mode from Python
import subprocess
import json
from typing import Optional

class PiRPCClient:
    def __init__(self, cwd: str, model: Optional[str] = None):
        # Start Pi in RPC mode as a subprocess
        cmd = ["pi", "--rpc", "--cwd", cwd]
        if model:
            cmd.extend(["--model", model])      # => Optional model override
 
        self.process = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,              # => We write JSON-RPC to stdin
            stdout=subprocess.PIPE,             # => We read JSON-RPC from stdout
            text=True,
        )
        self._request_id = 0
 
    def run(self, message: str) -> dict:
        self._request_id += 1
        request = {
            "jsonrpc": "2.0",
            "id": self._request_id,             # => Unique ID per request for correlation
            "method": "run",
            "params": {"message": message},
        }
 
        # Send request to Pi's stdin
        self.process.stdin.write(json.dumps(request) + "\n")
        self.process.stdin.flush()              # => Flush required — Pi reads line by line
 
        # Read response from Pi's stdout
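        line = self.process.stdout.readline()  # => Blocks until Pi finishes the turn
        if not line:                           # => Empty read means Pi exited or closed stdout
            raise RuntimeError("Pi RPC process closed stdout before responding")
        return json.loads(line)                # => Parse JSON-RPC response
 
    def close(self):
        self.process.stdin.close()              # => Send EOF so Pi can exit on its own
        self.process.terminate()                # => Ask the process to stop if it has not already
        self.process.wait()                     # => Reap the child so it does not linger as a zombie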
 
# Usage
client = PiRPCClient(cwd="/path/to/project")
 
result = client.run("What is the main entry point of this application?")
# => result["result"]["response"] = "The main entry point is src/index.ts..."
# => result["result"]["tool_calls"] = [{"name": "Bash", "input": {...}, "result": "..."}]
 
client.close()

RPC mode supports session management through additional JSON-RPC methods: branch to create a branch, load to load an existing session, stats to query token usage. The full RPC schema is documented at pi.dev/docs/rpc.
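
Whatever language the caller is written in, it has to do the same two things: frame each request as a JSON-RPC 2.0 message and match each response to the request that produced it. The sketch below shows that framing in TypeScript. Newline-delimited messages are an assumption carried over from the Python client above, and `encodeRequest` and `decodeResponse` are hypothetical helper names; only the `run` method and its `message` parameter come from this section.

```typescript
interface RpcRequest { jsonrpc: "2.0"; id: number; method: string; params?: unknown }
interface RpcResponse {
  jsonrpc: "2.0";
  id: number | null;
  result?: unknown;
  error?: { code: number; message: string };
}

let nextId = 0;

// Serialize one request as a single line of JSON followed by a newline.
function encodeRequest(method: string, params?: unknown): { id: number; line: string } {
  const request: RpcRequest = { jsonrpc: "2.0", id: ++nextId, method, params };
  return { id: request.id, line: JSON.stringify(request) + "\n" };
}

// Parse one response line and check that it answers the expected request.
function decodeResponse(line: string, expectedId: number): unknown {
  const response = JSON.parse(line) as RpcResponse;
  if (response.jsonrpc !== "2.0") throw new Error("not a JSON-RPC 2.0 message");
  if (response.id !== expectedId) throw new Error(`unexpected id ${response.id}`);
  if (response.error) throw new Error(`RPC error ${response.error.code}: ${response.error.message}`);
  return response.result;
}
```

Checking the `id` matters once a caller sends more than one request, and surfacing the `error` member as an exception keeps failed calls from being mistaken for empty results.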

Key Takeaway: pi --rpc exposes the agent over JSON-RPC on stdin/stdout, letting any language or process drive Pi programmatically while keeping Pi's session and extension features intact.

Why It Matters: RPC mode makes Pi composable at the process level. You can integrate Pi's reasoning into scripts, automate multi-step workflows, and embed Pi in existing tools that were not written in TypeScript — without rewriting the agent logic yourself.

Section 24: SDK Embedding: pi-agent-core

@earendil-works/pi-agent-core is the agent runtime package underneath the Pi CLI. Using it as a library gives you the complete agentic loop — LLM call, tool execution, result feeding, loop termination — as a TypeScript API you instantiate and control. This is the right approach when you want to build your own CLI, web server, or application that embeds an agent with Pi's behavior.

The SDK gives you more control than RPC mode (you write TypeScript, not JSON-RPC), but requires more setup — you instantiate the agent, configure the provider, register tools, and run the loop yourself.

import { Agent, AnthropicProvider, Tool } from "@earendil-works/pi-agent-core";
 
// Define the tools your embedded agent will use
const readTool: Tool = {
  name: "read_file",
  description: "Read the contents of a file",
  parameters: {
    type: "object",
    properties: {
      path: { type: "string", description: "File path to read" },
    },
    required: ["path"],
  },
  execute: async ({ path }) => {
    const { readFile } = await import("fs/promises");
    return readFile(path, "utf-8"); // => Reads file, returns content as string
  },
};
 
const bashTool: Tool = {
  name: "bash",
  description: "Execute a shell command",
  parameters: {
    type: "object",
    properties: {
      command: { type: "string" },
    },
    required: ["command"],
  },
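  execute: async ({ command }) => {
    const { exec } = await import("child_process");
    const { promisify } = await import("util");
    const execAsync = promisify(exec);
 
    try {
      const { stdout, stderr } = await execAsync(command);
      // => Executes command, captures stdout + stderr
      return stdout || stderr; // => Return whichever has content
    } catch (error) {
      // => exec rejects on a non-zero exit code; return the captured output
      // => so the LLM can see why the command failed
      const { stdout = "", stderr = "" } = error as { stdout?: string; stderr?: string };
      return stdout || stderr || String(error);
    }
  },
};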
 
// Create and configure the agent
const agent = new Agent({
  provider: new AnthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY!, // => API key from environment
    model: "claude-sonnet-4-5", // => Model to use for this agent
  }),
  systemPrompt: `You are a code review agent. Read source files and identify
issues. Write findings to review.md. Do not modify source files.`,
  // => Custom system prompt — replaces Pi's default
  tools: [readTool, bashTool], // => Register tools for this agent instance
  maxTurns: 20, // => Safety limit — stop after 20 tool call turns
});
 
// Run the agent on a task and get the result
async function reviewFile(filePath: string): Promise<string> {
  const result = await agent.run(`Review ${filePath} for code quality issues and security vulnerabilities.`);
  // => agent.run() executes the full agentic loop:
  // => 1. Send system prompt + user message to LLM
  // => 2. If LLM calls a tool, execute it, feed result back
  // => 3. Repeat until LLM produces a final text response
  // => 4. Return the final text response
 
  return result.response; // => Final LLM text response (the review)
}
 
// Call the agent
const review = await reviewFile("src/auth/jwt.ts");
console.log(review);
// => "I found 2 issues in src/auth/jwt.ts:
// => 1. [HIGH] The JWT algorithm is not validated against an allowlist...
// => 2. [MEDIUM] The token expiry is set to 7 days — consider shorter-lived tokens..."

The Agent class manages conversation state internally. Successive calls to agent.run() on the same instance continue the same conversation, with history preserved between calls. Create a new Agent instance for each independent conversation or task.

// Using Agent for a multi-turn conversation from code
const agent = new Agent({ provider, systemPrompt, tools });
 
// First turn
await agent.run("Read package.json and tell me the project name");
// => "This project is named 'my-api' (version 1.2.3)."
 
// Second turn — agent has context from the first turn
await agent.run("What scripts are defined in package.json?");
// => "The package.json defines: start, build, test, and lint scripts."
// => (Agent already has package.json contents in context from first turn)
 
// Access conversation history
console.log(agent.messages.length);
// => 6 (2 user messages + 3 assistant messages + 1 tool result from the single read in the first turn)
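To make the loop and the preserved history concrete, here is a self-contained sketch that swaps the LLM for a scripted stand-in. `MiniAgent`, `Model`, and `scriptedModel` are illustrative names invented for this example, not pi-agent-core APIs, and the real Agent class may structure its messages differently.

```typescript
// Illustrative only: none of these names come from pi-agent-core.
type Msg =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; name: string; content: string };

type ModelReply =
  | { type: "text"; content: string }
  | { type: "tool_call"; name: string; args: Record<string, string> };

type Model = (history: Msg[]) => Promise<ModelReply>;
type ToolFn = (args: Record<string, string>) => Promise<string>;

class MiniAgent {
  readonly messages: Msg[] = []; // history survives across run() calls
  private readonly model: Model;
  private readonly tools: Record<string, ToolFn>;
  private readonly maxTurns: number;

  constructor(model: Model, tools: Record<string, ToolFn>, maxTurns = 20) {
    this.model = model;
    this.tools = tools;
    this.maxTurns = maxTurns;
  }

  async run(userMessage: string): Promise<string> {
    this.messages.push({ role: "user", content: userMessage });
    for (let turn = 0; turn < this.maxTurns; turn++) {
      const reply = await this.model(this.messages);
      if (reply.type === "text") {
        this.messages.push({ role: "assistant", content: reply.content });
        return reply.content; // a final text answer ends the loop
      }
      // Record the tool request, execute it, and feed the result back.
      this.messages.push({ role: "assistant", content: `call ${reply.name}` });
      const tool = this.tools[reply.name];
      const result = tool ? await tool(reply.args) : `unknown tool: ${reply.name}`;
      this.messages.push({ role: "tool", name: reply.name, content: result });
    }
    throw new Error(`No final response after ${this.maxTurns} turns`);
  }
}

// Scripted stand-in for an LLM: requests one file read, then answers from context.
const scriptedModel: Model = async (history) => {
  const last = history[history.length - 1];
  if (last.role === "tool") {
    return { type: "text", content: `The file says: ${last.content}` };
  }
  if (history.some((m) => m.role === "tool")) {
    return { type: "text", content: "The answer is already in context." };
  }
  return { type: "tool_call", name: "read_file", args: { path: "package.json" } };
};

async function demo(): Promise<number> {
  const agent = new MiniAgent(scriptedModel, { read_file: async () => '{"name":"my-api"}' });
  await agent.run("What is the project name?");
  await agent.run("Which file did you read?");
  return agent.messages.length; // 6: the second run reuses the first run's history
}
```

The second call needs no tool call because the tool result from the first call is still in `messages`, which is the behavior the multi-turn example relies on.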

Key Takeaway: @earendil-works/pi-agent-core provides the Agent class — instantiate it with a provider, system prompt, and tools, then call agent.run() to execute the full agentic loop from TypeScript code.

Why It Matters: SDK embedding is how you build production systems that use an agent as a component. A code review service, a documentation generator, an automated refactoring pipeline — all of these can be built on pi-agent-core without building the agentic loop yourself.

Section 25: pi-tui: Terminal UI Components

@earendil-works/pi-tui is Pi's terminal UI library, which powers the TUI you see when you run pi interactively. The library provides a component model for terminal UIs with differential rendering — only changed regions of the screen are redrawn, which keeps the display responsive even when output is streaming.

You can use pi-tui independently to build other terminal applications, or use it within a Pi extension to add custom display components to the Pi TUI itself.

import { TUI, Box, Text, Input, ScrollPane } from "@earendil-works/pi-tui";
 
// Create a simple TUI application
const app = new TUI({
  fullscreen: true, // => Take over the entire terminal
  title: "My Pi Extension Dashboard",
});
 
// Define a layout with a scrollable output pane and an input field
const layout = app.createLayout({
  direction: "column", // => Stack children vertically
  children: [
    new ScrollPane({
      id: "output", // => Reference this pane by id to update content
      flex: 1, // => Take all available vertical space
      content: [], // => Start empty — we'll push content here
    }),
    new Box({
      height: 3, // => Fixed height for the input area
      border: "single", // => Draw a single-line border around the input
      children: [
        new Input({
          id: "input",
          placeholder: "Type a command...",
          onSubmit: (value) => handleInput(value),
          // => Called when user presses Enter
        }),
      ],
    }),
  ],
});
 
app.render(layout); // => Draw the initial layout to the terminal
 
// Update the output pane with new content (differential rendering)
function appendOutput(text: string): void {
  const outputPane = app.getComponent<ScrollPane>("output");
  outputPane.push(new Text({ content: text }));
  // => Only the new Text node is redrawn
  // => Existing content is not redrawn (differential)
  outputPane.scrollToBottom(); // => Auto-scroll to show newest content
  app.render(); // => Apply the diff to the terminal
}
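The idea behind differential rendering can be sketched in a few lines: compare the previous frame with the next one and emit patches only for rows that changed. This is a simplified line-level illustration with invented names (`diffFrames`, `LinePatch`); pi-tui's actual diffing strategy is not described in this section and may work at a different granularity.

```typescript
// Sketch of line-level frame diffing; not pi-tui's real algorithm.
interface LinePatch {
  row: number; // zero-based screen row to rewrite
  text: string; // new content for that row ("" clears it)
}

function diffFrames(previous: string[], next: string[]): LinePatch[] {
  const patches: LinePatch[] = [];
  const rows = Math.max(previous.length, next.length);
  for (let row = 0; row < rows; row++) {
    const before = previous[row] ?? "";
    const after = next[row] ?? "";
    if (before !== after) patches.push({ row, text: after });
  }
  return patches;
}

const frame1 = ["Test Status", "RUNNING", ""];
const frame2 = ["Test Status", "PASSING", "12 passed, 0 failed"];
const patches = diffFrames(frame1, frame2);
// patches covers rows 1 and 2 only; the unchanged title row is skipped
```

When output is streaming, most rows are identical from one frame to the next, so the number of patches stays small even on a large screen.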

Building a custom pane in a Pi extension works through api.registerComponent(). Your component renders into a dedicated area in the Pi TUI:

import { PiExtensionAPI, TUIComponent, Box, Text } from "@earendil-works/pi-coding-agent";
 
// A custom TUI component that shows live test status
class TestStatusPane implements TUIComponent {
  private status: "idle" | "running" | "passing" | "failing" = "idle";
  private summary: string = "";
 
  // Pi calls render() when the component needs to draw itself
  render(): Box {
    const statusColor = {
      idle: "gray",
      running: "yellow",
      passing: "green",
      failing: "red",
    }[this.status];
    // => Map status to a terminal color
 
    return new Box({
      border: "single",
      title: "Test Status",
      children: [
        new Text({
          content: this.status.toUpperCase(),
          color: statusColor, // => pi-tui handles terminal color codes
        }),
        new Text({ content: this.summary }),
      ],
    });
  }
 
  // Update the component's state (triggers re-render)
  update(status: typeof this.status, summary: string): void {
    this.status = status;
    this.summary = summary;
    // => Pi will call render() on the next frame
  }
}
 
export default function setup(api: PiExtensionAPI): void {
  const testPane = new TestStatusPane();
 
  // Register the component — Pi adds it to the TUI layout
  api.registerComponent({
    id: "test-status",
    position: "sidebar", // => Display in Pi's sidebar area
    component: testPane,
  });
 
  // Update the component from a tool's execute function
  api.register({
    name: "run_tests",
    description: "Run tests and update the test status pane",
    parameters: { type: "object", properties: {}, required: [] },
    execute: async (_, { bash }) => {
      testPane.update("running", "Running...");
      // => Update pane immediately when tests start
 
      const result = await bash("npx vitest run --reporter=json");
      // => stderr is not merged into stdout, so warnings cannot corrupt the JSON report
 
      try {
        const report = JSON.parse(result.stdout);
        const passed = report.numPassedTests;
        const failed = report.numFailedTests;
 
        testPane.update(failed > 0 ? "failing" : "passing", `${passed} passed, ${failed} failed`);
        // => Update pane with results
        return `Tests: ${passed} passed, ${failed} failed`;
      } catch {
        testPane.update("failing", "Parse error");
        return result.stdout;
      }
    },
  });
}

Key Takeaway: pi-tui provides a component model with differential rendering; use api.registerComponent() in an extension to add custom display areas to the Pi TUI.

Why It Matters: Custom TUI components let you surface information that the agent is tracking — test status, token usage, file change count — in a persistent display that updates in real time without scrolling through conversation history to find the latest state.

Section 26: Supply-Chain Hardening

Pi ships with supply-chain hardening for the CLI package: pinned exact dependency versions and an npm-shrinkwrap.json that locks the full dependency tree including transitive dependencies. For a CLI tool that executes arbitrary shell commands on behalf of an LLM, supply-chain integrity is not optional — a compromised transitive dependency could execute malicious commands in any session.

Understanding Pi's hardening approach helps you apply the same practices to extensions you write and distribute.

# Inspect Pi's shrinkwrap file — this locks every transitive dependency
cat "$(npm root -g)/@earendil-works/pi-coding-agent/npm-shrinkwrap.json" | jq '.lockfileVersion'
# => 3   ← npm lockfile version 3 (npm 7+)
 
# Check the locked dependency tree for known vulnerabilities
npm audit --prefix "$(npm root -g)/@earendil-works/pi-coding-agent"
# => found 0 vulnerabilities
# => (Pi's shrinkwrap pins versions that are CVE-clean as of the release date)
 
# Check an individual package's integrity hash
cat "$(npm root -g)/@earendil-works/pi-coding-agent/npm-shrinkwrap.json" | \
  jq '.packages["node_modules/some-dep"].integrity'
# => "sha512-AbCdEf..." ← SHA-512 hash of the published package tarball
# => npm verifies this hash on install — tampered packages will not install

Apply the same hardening to your own extensions:

# In your extension's package.json, pin ALL dependencies to exact versions
# (No ^ or ~ prefixes — exact pins only)
# package.json:
# {
#   "dependencies": {
#     "some-package": "1.2.3"    ← exact, not "^1.2.3" or "~1.2.3"
#   }
# }
 
# Generate a shrinkwrap file for your extension before publishing
cd your-extension/
npm shrinkwrap
# => wrote npm-shrinkwrap.json with 47 packages
 
# Audit your extension's dependencies
npm audit
# => found 0 vulnerabilities
# => (Fix any findings before publishing — do not ship extensions with known CVEs)
 
# Pin your Node.js version for reproducibility (pin TypeScript as an exact devDependency)
# .nvmrc or .node-version in your extension root:
echo "20.19.0" > .node-version
# => Users running your extension on Node.js 20.19.0 have the same behavior as you
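One way to enforce the exact-pin rule before publishing is a small check over the dependencies object. This is a minimal sketch: `findUnpinned` is a hypothetical helper, and the regular expression accepts only plain `MAJOR.MINOR.PATCH` versions with an optional prerelease or build suffix, so ranges, tags, and git URLs are all flagged.

```typescript
// Sketch: flag dependency specifiers that are not exact versions.
const EXACT_VERSION = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

function findUnpinned(dependencies: Record<string, string>): string[] {
  return Object.entries(dependencies)
    .filter(([, range]) => !EXACT_VERSION.test(range))
    .map(([name, range]) => `${name}@${range}`);
}

const unpinned = findUnpinned({
  "some-package": "1.2.3",
  "caret-package": "^4.17.21",
  "tilde-package": "~2.0.0",
  "tag-package": "latest",
});
// unpinned lists caret-package, tilde-package, and tag-package
```

Running a check like this in CI keeps a stray caret or tilde from reaching a published release.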

// Supply-chain-hardened extension: explicit imports, no dynamic require()
 
// GOOD — static import, resolved at build time
import { execFile } from "child_process";
import { promisify } from "util";
// => Static imports: the bundler can resolve and verify these
 
// BAD — dynamic require with variable input is a supply-chain risk
// const mod = require(userProvidedModuleName);
// => Never do this in an extension — dynamic require with external input
// => can load arbitrary code if the input is not strictly validated
 
// GOOD: if a name must come from outside, validate it against an allowlist at runtime
const ALLOWED_FORMATTERS = ["prettier", "eslint", "tsc"] as const;
type Formatter = (typeof ALLOWED_FORMATTERS)[number];
 
function isFormatter(name: string): name is Formatter {
  // => Types are erased at runtime, so LLM- or user-supplied input needs this explicit check
  return (ALLOWED_FORMATTERS as readonly string[]).includes(name);
}
 
const execFileAsync = promisify(execFile);
 
async function runFormatter(name: string, file: string): Promise<string> {
  if (!isFormatter(name)) throw new Error(`Formatter not allowed: ${name}`);
  // => execFile passes arguments without a shell, so 'file' cannot inject extra commands
  const { stdout } = await execFileAsync("npx", [name, file]);
  return stdout;
}

Key Takeaway: Pi pins all dependencies with exact versions and npm-shrinkwrap.json to lock the full transitive tree; apply the same pattern to extensions you distribute.

Why It Matters: A coding agent runs code on your machine with your shell permissions. A compromised dependency in the agent's package tree could inject malicious tool calls, exfiltrate files, or escalate privileges. Supply-chain hardening is not security theater — it is the concrete mechanism that prevents these attacks.

Section 27: Multi-Agent Patterns

Pi's RPC mode enables multi-agent patterns: multiple Pi instances, each specialized for a task, coordinated by an orchestrator. The orchestrator sends tasks to each agent and aggregates their results. This lets you parallelize work across agents, specialize each agent's tools and system prompt for its task, and isolate risks (an agent reviewing untrusted code does not share state with an agent that has write access).

The most common multi-agent patterns with Pi are: parallel investigation (multiple agents read different parts of the codebase simultaneously), pipeline (agent A produces output that agent B processes), and review (agent A writes code, agent B reviews it).

import { PiRPCClient } from "./pi-rpc-client"; // The Python client pattern from Section 23, in TS
 
// Orchestrator: run two specialized agents in parallel
async function parallelCodeAnalysis(sourcePath: string): Promise<void> {
  // Agent 1: security review — specialized system prompt, read-only tools
  const securityAgent = new PiRPCClient({
    cwd: sourcePath,
    systemPromptFile: ".pi/security-review-system.md",
    // => Loads a system prompt focused on security
    model: "claude-opus-4-5", // => Use stronger model for security analysis
  });
 
  // Agent 2: performance review — different specialization, parallel execution
  const perfAgent = new PiRPCClient({
    cwd: sourcePath,
    systemPromptFile: ".pi/perf-review-system.md",
    model: "claude-sonnet-4-5", // => Lighter model for performance review
  });
 
  // Start both agents in parallel — they run simultaneously, sharing no state
  try {
    const [securityResult, perfResult] = await Promise.all([
      securityAgent.run(`Review ${sourcePath}/src/auth/ for security issues`),
      perfAgent.run(`Review ${sourcePath}/src/db/ for performance issues`),
    ]);
    // => Both RPC calls run concurrently
    // => Each agent has its own Pi session, its own context, its own tool calls
    // => No shared state between agents — fully isolated
 
    // Aggregate results
    console.log("Security findings:", securityResult.response);
    console.log("Performance findings:", perfResult.response);
  } finally {
    // Clean up even if one of the reviews throws
    securityAgent.close();
    perfAgent.close();
  }
}
 
// Pipeline pattern: agent A generates, agent B reviews
async function generateAndReview(spec: string): Promise<void> {
  const generatorAgent = new PiRPCClient({
    cwd: "/project",
    systemPrompt: "You are a TypeScript developer. Write clean, typed code.",
  });
 
  // Step 1: generate
  const generated = await generatorAgent.run(`Implement the following spec and write it to src/: ${spec}`);
  // => Generator agent writes files to src/
 
  // Step 2: review what was generated (new agent, starts fresh)
  const reviewerAgent = new PiRPCClient({
    cwd: "/project",
    systemPrompt: "You are a code reviewer. Identify bugs and improvements.",
  });
 
  const review = await reviewerAgent.run(`Review the files recently added to src/ and report issues.`);
  // => Reviewer agent reads the files the generator wrote and produces findings
  // => Reviewer has no context of the generator's session — independent assessment
 
  console.log("Generated:", generated.response);
  console.log("Review:", review.response);
 
  // Clean up both Pi processes
  generatorAgent.close();
  reviewerAgent.close();
}
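When an orchestrator aggregates results, it often helps to keep the findings from agents that succeeded even if another agent fails. The sketch below uses `Promise.allSettled` for that. `ReviewAgent` and `runAll` are stand-in names for illustration: they only mimic the `run()` shape used above and do not start real Pi processes.

```typescript
// Stand-in for an RPC-backed agent; only the run() shape matters here.
interface ReviewAgent {
  name: string;
  run(task: string): Promise<{ response: string }>;
}

interface AgentOutcome {
  agent: string;
  ok: boolean;
  detail: string; // the response on success, the error message on failure
}

async function runAll(agents: ReviewAgent[], task: string): Promise<AgentOutcome[]> {
  // allSettled waits for every agent instead of rejecting on the first failure.
  const settled = await Promise.allSettled(agents.map((agent) => agent.run(task)));
  return settled.map((result, i) =>
    result.status === "fulfilled"
      ? { agent: agents[i].name, ok: true, detail: result.value.response }
      : { agent: agents[i].name, ok: false, detail: String(result.reason?.message ?? result.reason) },
  );
}

const fakeAgents: ReviewAgent[] = [
  { name: "security", run: async () => ({ response: "2 findings" }) },
  {
    name: "performance",
    run: async () => {
      throw new Error("agent crashed");
    },
  },
];
```

With `Promise.all`, the crash in the second agent would discard the first agent's findings; with `Promise.allSettled`, the orchestrator can report both outcomes and decide whether to retry.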

Key Takeaway: Run multiple Pi instances via RPC with different system prompts and tool sets to parallelize work, create pipelines, or isolate agent capabilities from each other.

Why It Matters: Multi-agent patterns are how you scale beyond what a single agent session can do safely and efficiently. Isolation between agents means a mistake in one agent's session does not corrupt another's context, and parallel execution means independent tasks complete faster than sequential single-agent work.

Section 28: Prompt Templates and Dynamic Injection

Prompt templates in Pi are AGENTS.md files (or inline strings) that contain template variables filled at session start. Dynamic injection goes further: certain content is injected into specific turns rather than at session start, based on what the agent is currently doing.

Template variables allow you to parameterize your context files without hardcoding values that change across environments or runs. Dynamic injection lets you provide large reference documents (API specs, database schemas) only when the agent is working on a turn that requires them, rather than paying their token cost in every turn.

<!-- AGENTS.md with template variables -->
 
## Project: {{PROJECT_NAME}}
 
Environment: {{ENVIRONMENT}}
Database: {{DATABASE_URL}}
API base: {{API_BASE_URL}}
 
## Current Task Context
 
Branch: {{GIT_BRANCH}}
Last commit: {{GIT_COMMIT_SHORT}}
Files changed: {{GIT_CHANGED_FILES}}
 
<!-- Pi fills {{...}} variables from environment variables or a .pi-vars file -->
<!-- Variables not found are left as-is and logged as a warning -->

# Variables filled from environment variables
export PROJECT_NAME="TaskManager API"
export ENVIRONMENT="development"
export DATABASE_URL="postgresql://localhost:5432/taskmanager_dev"
export API_BASE_URL="http://localhost:3000"
 
# Or from a .pi-vars file (not committed — add to .gitignore)
cat .pi-vars
# => PROJECT_NAME=TaskManager API
# => ENVIRONMENT=development
# => DATABASE_URL=postgresql://localhost:5432/taskmanager_dev
 
pi
# => Pi v0.75.4
# => Template variables resolved: PROJECT_NAME, ENVIRONMENT, DATABASE_URL, API_BASE_URL, ...
# => Git variables injected: GIT_BRANCH=main, GIT_COMMIT_SHORT=a1b2c3d, GIT_CHANGED_FILES=3
# => Loaded AGENTS.md (487 tokens — after variable substitution)
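The substitution behavior described above (known variables replaced, unknown ones left as-is and reported) can be sketched as follows. `renderTemplate` is a hypothetical helper written for illustration, not Pi's implementation, and it assumes variable names use uppercase letters, digits, and underscores.

```typescript
// Sketch of {{NAME}} substitution: known variables are replaced,
// unknown ones are left untouched and collected for a warning.
interface RenderResult {
  text: string;
  missing: string[];
}

function renderTemplate(template: string, vars: Record<string, string | undefined>): RenderResult {
  const missing: string[] = [];
  const text = template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (placeholder: string, name: string) => {
    const value = vars[name];
    if (value === undefined) {
      if (!missing.includes(name)) missing.push(name);
      return placeholder; // leave the placeholder as-is
    }
    return value;
  });
  return { text, missing };
}

const rendered = renderTemplate("Project: {{PROJECT_NAME}} on {{GIT_BRANCH}}", {
  PROJECT_NAME: "TaskManager API",
});
// rendered.text is "Project: TaskManager API on {{GIT_BRANCH}}"
// rendered.missing is ["GIT_BRANCH"]
```

Using a replacer function rather than a replacement string matters here: values such as connection URLs may contain `$` sequences that `String.prototype.replace` would otherwise interpret as special patterns.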

// Dynamic injection via an extension — inject API schema only when relevant
import { PiExtensionAPI } from "@earendil-works/pi-coding-agent";
import { readFile } from "fs/promises";
 
export default function setup(api: PiExtensionAPI): void {
  // Register a turn hook — called before each LLM turn
  api.onBeforeTurn(async (turn, { inject }) => {
    const message = turn.userMessage.toLowerCase();
 
    // Only inject the OpenAPI schema when the user is asking about the API
    // => Word boundaries keep "api" from matching inside words such as "rapid"
    if (/\b(api|endpoint|route)s?\b/.test(message)) {
      const schema = await readFile("openapi.yaml", "utf-8");
      // => Read the schema file on demand
 
      inject({
        content: `# API Schema\n\`\`\`yaml\n${schema}\n\`\`\``,
        position: "before-message", // => Inject before the user's message in context
        label: "openapi-schema", // => Shown in verbose logs for debugging
      });
      // => Schema is in context for this turn only
      // => Not injected on turns where the user is not asking about the API
    }
  });
}

Dynamic injection is the mechanism behind Pi's skill system (Section 20). Skills are a structured form of dynamic injection with automatic relevance scoring. The raw injection API in the turn hook is for cases where you need full control: inject a file, a database schema query result, or a web page — only when the current turn specifically needs it.

Key Takeaway: Template variables in AGENTS.md fill from environment variables at session start; turn hooks let extensions inject content into specific turns on demand, keeping large reference documents out of the context when they are not needed.

Why It Matters: Dynamic injection is the mechanism that makes Pi efficient on large codebases. A 50,000-token database schema injected on every turn would consume most of a model's context budget. Injected only on turns where the agent is reasoning about schema, it costs nothing when irrelevant and provides full detail when needed.

Section 29: pi-ai: Unified LLM API

@earendil-works/pi-ai is the LLM abstraction layer that Pi's agent runtime sits on top of. It provides a single TypeScript interface for calling any supported LLM provider, handling authentication, request formatting, response parsing, streaming, and retry logic. You can use pi-ai standalone — without the rest of the Pi stack — as a provider-agnostic LLM client in any TypeScript project.

The core value of pi-ai is that it normalizes different providers' APIs into a single interface. OpenAI, Anthropic, Google, and Bedrock all have different request schemas, authentication mechanisms, and response formats. pi-ai handles all of that internally.

import { createProvider, Message, ToolDefinition } from "@earendil-works/pi-ai";
 
// Create a provider — swap one line to change providers
const provider = createProvider("anthropic", {
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: "claude-sonnet-4-5",
});
// => Identical interface for: createProvider("openai", {...})
// =>                          createProvider("google", {...})
// =>                          createProvider("ollama", {...})
 
// Define tools in pi-ai's unified format
const tools: ToolDefinition[] = [
  {
    name: "get_weather",
    description: "Get current weather for a city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string" },
      },
      required: ["city"],
    },
  },
];
 
// Make a single LLM call with tools
const messages: Message[] = [{ role: "user", content: "What is the weather in Jakarta?" }];
 
const response = await provider.call({
  messages,
  tools, // => Tools are normalized to provider's function-calling format
  systemPrompt: "You are a helpful assistant.",
});
// => response.type is "text" or "tool_calls"
 
if (response.type === "tool_calls") {
  for (const call of response.toolCalls) {
    console.log(call.name); // => "get_weather"
    console.log(call.arguments); // => { city: "Jakarta" }
    // => Execute the tool and feed result back...
  }
}

Failover configuration routes requests to a backup provider when the primary fails. This is useful when a provider has an outage or starts rejecting requests because of a rate limit:

import { createProvider, withFailover } from "@earendil-works/pi-ai";
 
const primary = createProvider("anthropic", {
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: "claude-sonnet-4-5",
});
 
const backup = createProvider("openai", {
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o",
});
 
// Wrap the primary provider with automatic failover
const provider = withFailover(primary, {
  fallback: backup, // => Use backup when primary fails
  retries: 2, // => Retry primary 2 times before switching
  onFailover: (error) => {
    console.warn(`Failover triggered: ${error.message}`);
    // => Log when failover occurs
  },
});
// => provider.call() now automatically retries primary twice, then uses backup
// => Your code doesn't change — same interface whether primary or backup responds
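The retry-then-fallback behavior can be sketched generically. This is not the withFailover implementation: `failover` and `Call` are names invented for the example, and it assumes that `retries: 2` means one initial attempt plus two further attempts on the primary, which the text above does not state precisely.

```typescript
// Sketch of retry-then-fallback; not the actual withFailover implementation.
type Call<Req, Res> = (request: Req) => Promise<Res>;

function failover<Req, Res>(
  primary: Call<Req, Res>,
  fallback: Call<Req, Res>,
  retries: number,
  onFailover?: (error: Error) => void,
): Call<Req, Res> {
  return async (request: Req): Promise<Res> => {
    let lastError = new Error("primary was never called");
    // Assumption: one initial attempt plus `retries` further attempts on the primary.
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await primary(request);
      } catch (error) {
        lastError = error instanceof Error ? error : new Error(String(error));
      }
    }
    onFailover?.(lastError);
    return fallback(request);
  };
}

let primaryCalls = 0;
const flakyPrimary: Call<string, string> = async () => {
  primaryCalls++;
  throw new Error("rate limited");
};
const steadyBackup: Call<string, string> = async (prompt) => `backup answered: ${prompt}`;
const resilient = failover(flakyPrimary, steadyBackup, 2);
```

Because the wrapper returns a function with the same signature as the providers it wraps, calling code does not need to know which provider answered. A production version would usually add a delay between attempts.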

Adding a custom provider for an API not in pi-ai's built-in list requires implementing the Provider interface — a single call() method that accepts the normalized request format and returns the normalized response:

import { Provider, ProviderRequest, ProviderResponse } from "@earendil-works/pi-ai";
 
// Custom provider for a hypothetical company-internal LLM API
class InternalLLMProvider implements Provider {
  constructor(
    private apiEndpoint: string,
    private authToken: string,
  ) {}
 
  async call(request: ProviderRequest): Promise<ProviderResponse> {
    // Translate from pi-ai's normalized format to your API's format
    const internalRequest = {
      // => filter(Boolean) drops an unset system prompt instead of sending "undefined"
      prompt: [request.systemPrompt, ...request.messages.map((m) => m.content)].filter(Boolean).join("\n"),
      functions: request.tools?.map((t) => ({ name: t.name, schema: t.parameters })),
      max_tokens: 4096,
    };
 
    const res = await fetch(this.apiEndpoint, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.authToken}`,
        "Content-Type": "application/json", // => The body is JSON, so declare it
      },
      body: JSON.stringify(internalRequest),
    });
    // => Surface HTTP failures instead of parsing an error body as a model response
    if (!res.ok) {
      throw new Error(`Internal LLM API returned HTTP ${res.status}`);
    }
 
    const data = await res.json();
 
    // Translate response back to pi-ai's normalized format
    if (data.function_call) {
      return {
        type: "tool_calls",
        toolCalls: [{ name: data.function_call.name, arguments: data.function_call.args }],
      };
    }
    return { type: "text", content: data.generated_text };
  }
}

Key Takeaway: @earendil-works/pi-ai provides a single provider.call() interface for 15+ LLM providers; implement the Provider interface to add any provider not in the built-in list.

Why It Matters: Building on pi-ai means your code is not coupled to a specific LLM provider's API. When you switch from OpenAI to Anthropic (or vice versa), you change one createProvider() call — no request-shaping code, no response-parsing code, no authentication handling changes anywhere else.
