Section 2
LLMs & Generative AI
Deep dive into the LLM landscape — model comparisons, API integration, prompt engineering mastery, and production agentic systems.
2.1
LLM Landscape & Comparison
The major model families, their strengths, and when to use each.
Comparison Matrix
| Model | Provider | Context | Cost In ($/1M tok) | Cost Out ($/1M tok) | Vision | Tools | Strengths |
|---|---|---|---|---|---|---|---|
| GPT-4o | OpenAI | 128K | $5.00 | $15.00 | ✓ | ✓ | Coding, Reasoning, Vision |
| Claude 3.5 Sonnet | Anthropic | 200K | $3.00 | $15.00 | ✓ | ✓ | Long context, Safety, Writing |
| Gemini 2.5 Flash | Google | 1M | $0.15 | $0.60 | ✓ | ✓ | Speed, Long context, Cost |
| Gemini 2.5 Pro | Google | 2M | $3.50 | $10.50 | ✓ | ✓ | Reasoning, Huge context, Code |
| Llama 3.3 70B | Meta (OSS) | 128K | Free* | Free* | – | ✓ | Open source, Privacy, Fine-tunable |
| Phi-4 | Microsoft | 16K | Free* | Free* | – | ✓ | On-device, Small size, Reasoning |
* Hosting costs apply for self-hosted open-source models. Pricing is approximate and subject to change.
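To make the pricing columns concrete, here is a small sketch that estimates per-request cost from the table's per-million-token rates. The model keys and hard-coded rates below are illustrative snapshots and will drift as vendors reprice:

```python
# Approximate per-request cost from the table's $/1M-token rates.
PRICES = {  # (input, output) in USD per 1M tokens; approximate
    "gpt-4o": (5.00, 15.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "gemini-2.5-flash": (0.15, 0.60),
}

def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate the dollar cost of a single request."""
    price_in, price_out = PRICES[model]
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

# Example: a 2,000-token prompt with an 800-token reply on Claude 3.5 Sonnet
print(f"${estimate_cost('claude-3.5-sonnet', 2000, 800):.4f}")  # $0.0180
```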
2.2a
Anthropic Claude API
The preferred model for long-context tasks, safety-critical applications, and tool use. Powers this platform's AI assistant.
| Model | Cost (in / out per 1M tokens) | Best for |
|---|---|---|
| claude-3-5-sonnet-20241022 | $3 / $15 | Production workloads |
| claude-3-5-haiku-20241022 | $0.80 / $4 | High-volume tasks, routing |
| claude-3-opus-20240229 | $15 / $75 | Complex reasoning |
```python
import anthropic

client = anthropic.Anthropic(api_key="your-key")

# Basic message
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are a DoD financial analyst AI.",
    messages=[
        {"role": "user", "content": "Summarize FY2025 defense priorities"}
    ]
)
print(message.content[0].text)

# Tool use / function calling
tools = [{
    "name": "search_budget_data",
    "description": "Search DoD budget database",
    "input_schema": {
        "type": "object",
        "properties": {
            "year": {"type": "integer", "description": "Fiscal year"},
            "service": {"type": "string", "description": "Military service branch"}
        },
        "required": ["year"]
    }
}]

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What was Army FY2024 budget?"}]
)
```
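When Claude decides to call a tool, the response comes back with `stop_reason == "tool_use"` and a `tool_use` content block. A minimal continuation of the example above, with a placeholder standing in for the real budget lookup:

```python
# Continuing after the tool-use request above: execute the tool,
# then return the result so Claude can compose a final answer.
if response.stop_reason == "tool_use":
    tool_use = next(b for b in response.content if b.type == "tool_use")

    # Placeholder; in practice, query your budget database here.
    result = {"status": "ok", "note": "placeholder tool output"}

    follow_up = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "What was Army FY2024 budget?"},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [{
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": str(result),
            }]},
        ],
    )
    print(follow_up.content[0].text)
```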
2.2b
Google Gemini API
Powers the MyThing platform agentic AI assistant with Google Search grounding and function calling.
```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

const genai = new GoogleGenerativeAI(process.env.GOOGLE_GEMINI_API_KEY!);

// Function calling setup
const tools = [{
  functionDeclarations: [{
    name: "get_budget_data",
    description: "Retrieve DoD budget data",
    parameters: {
      type: "object",
      properties: {
        fiscal_year: { type: "string" },
        component: { type: "string" }
      }
    }
  }]
}];

async function runAgent(query: string) {
  const model = genai.getGenerativeModel({
    model: "gemini-2.0-flash",
    tools,
    systemInstruction: "You are a federal budget analyst AI."
  });

  const chat = model.startChat();
  const result = await chat.sendMessage(query);

  // Handle function calls
  const response = result.response;
  const parts = response.candidates?.[0]?.content?.parts || [];

  for (const part of parts) {
    if (part.functionCall) {
      console.log("Tool called:", part.functionCall.name);
      // Execute the function, then return its output to the model, e.g.:
      // const toolResult = await executeTool(part.functionCall); // your implementation
      // await chat.sendMessage([{
      //   functionResponse: { name: part.functionCall.name, response: toolResult }
      // }]);
    }
  }

  return response.text();
}
```
2.3
Prompt Engineering Mastery
Production-tested techniques for getting the best results from LLMs.
Zero-Shot
No examples; just a direct instruction. Works well with capable models.
"Classify this document as UNCLASSIFIED, CUI, or SECRET."

Few-Shot
Provide 2-5 examples before the task. Dramatically improves accuracy.
"Example 1: X → Y. Example 2: A → B. Now classify: C → ?"

Chain-of-Thought
Ask the model to reason step-by-step before answering.
"Think step by step about the budget implications..."

System Prompts
Set role, constraints, and output format at the system level.
"You are a DoD analyst. Always cite FM references. Output JSON."

Tree of Thoughts
Explore multiple reasoning paths, then select the best.
Generate 3 approaches, evaluate each, pick the optimal.

Structured Output
Force JSON/XML output for programmatic processing.
"Respond ONLY in JSON: {"risk_level": ..., "factors": [...]}"
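Several of these techniques compose. Here is a short sketch combining few-shot examples with a structured-output contract, reusing the Anthropic client from section 2.2a; the labels and JSON shape are illustrative assumptions:

```python
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Structured-output contract set at the system level (illustrative labels).
SYSTEM = """You are a document classifier.
Respond ONLY in JSON: {"label": "UNCLASSIFIED" | "CUI", "confidence": 0.0-1.0}"""

# Few-shot examples shown as prior conversation turns.
FEW_SHOT = [
    {"role": "user", "content": "Classify: 'Public press release on FY2025 budget.'"},
    {"role": "assistant", "content": '{"label": "UNCLASSIFIED", "confidence": 0.97}'},
    {"role": "user", "content": "Classify: 'Internal memo with procurement details.'"},
    {"role": "assistant", "content": '{"label": "CUI", "confidence": 0.88}'},
]

def classify(text: str) -> dict:
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=100,
        system=SYSTEM,
        messages=FEW_SHOT + [{"role": "user", "content": f"Classify: '{text}'"}],
    )
    return json.loads(response.content[0].text)
```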
2.4
Agentic AI Systems
Building multi-agent systems with tool use, routing, and orchestration — based on the MyThing platform implementation.
ReAct Agent
Reason → Act → Observe loop. Tool use and reasoning interleaved (see the sketch after this list).

Plan & Execute
Generate a full plan first, then execute each step.

Multi-Agent Router
Classify the query, then route it to a specialized agent. (MyThing pattern)

Reflexive Agent
Self-critique and revise before the final output.
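A minimal sketch of the ReAct loop with the Anthropic SDK, reusing the search_budget_data tool idea from section 2.2a; execute_tool is a hypothetical dispatcher you would wire to real data sources:

```python
from anthropic import Anthropic

client = Anthropic()

# Tool schema reused from the section 2.2a example (trimmed).
tools = [{
    "name": "search_budget_data",
    "description": "Search DoD budget database",
    "input_schema": {
        "type": "object",
        "properties": {"year": {"type": "integer"}},
        "required": ["year"],
    },
}]

def execute_tool(name: str, args: dict) -> str:
    """Hypothetical dispatcher; wire this to your real data sources."""
    return f"(placeholder result for {name} with {args})"

def react_loop(query: str, max_turns: int = 5) -> str:
    """Reason -> Act -> Observe until Claude stops requesting tools."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return response.content[0].text  # final answer
        # Act: run the requested tool; Observe: feed the result back.
        tool_use = next(b for b in response.content if b.type == "tool_use")
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": execute_tool(tool_use.name, tool_use.input),
        }]})
    return "Max turns reached without a final answer."

print(react_loop("What was Army FY2024 budget?"))
```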
Case Study: MyThing Platform Agent Architecture
```python
# Multi-Agent System (MyThing Platform Pattern)
from anthropic import Anthropic

client = Anthropic()

def route_query(query: str) -> str:
    """Determine which specialized agent to use"""
    routing = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=50,
        system="Classify the query. Respond with ONLY one word: portfolio, tech, dod, or notes",
        messages=[{"role": "user", "content": query}]
    )
    return routing.content[0].text.strip().lower()

AGENT_CONFIGS = {
    "portfolio": "You are a portfolio analyst for Peter Shang's projects...",
    "tech": "You are a tech trends analyst with expertise in AI/ML...",
    "dod": "You are a DoD policy and federal finance expert...",
    "notes": "You help retrieve and summarize personal notes...",
}

def run_agent(query: str) -> str:
    agent_type = route_query(query)
    system_prompt = AGENT_CONFIGS.get(agent_type, AGENT_CONFIGS["tech"])

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": query}]
    )

    return f"[{agent_type.upper()} Agent] {response.content[0].text}"

# Usage
print(run_agent("What AI projects has Peter built?"))   # → portfolio
print(run_agent("Latest trends in agentic AI?"))        # → tech
print(run_agent("Explain FIAR audit requirements"))     # → dod
```
RAG (Retrieval-Augmented Generation)
```python
from anthropic import Anthropic
import numpy as np

# Simple RAG implementation
client = Anthropic()

def cosine_similarity(a: list, b: list) -> float:
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def simple_rag(query: str, query_embedding: list, documents: list[dict]) -> str:
    """
    Retrieval-Augmented Generation with Claude.
    documents: [{"source": "...", "content": "...", "embedding": [...]}]
    query_embedding: embed the query with the same model used for the documents
    """
    # 1. Rank documents by similarity to the query embedding
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )

    # 2. Build the context from the top 3 most relevant chunks
    context = "\n\n".join(
        f"[Source: {doc['source']}]\n{doc['content']}"
        for doc in ranked[:3]
    )

    # 3. Generate with context
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system="""Answer questions using ONLY the provided context.
Cite sources when possible. If unsure, say so.""",
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {query}"
        }]
    )

    return response.content[0].text
```
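Toy usage of simple_rag with hand-written placeholder embeddings. In practice both the documents and the query would be embedded with a real embedding model; the Anthropic API does not provide one, so an external embedding service is assumed:

```python
# Toy corpus with placeholder 3-dimensional embeddings.
documents = [
    {"source": "FY24-army.pdf", "content": "Army FY2024 topline...", "embedding": [0.9, 0.1, 0.0]},
    {"source": "FY24-navy.pdf", "content": "Navy FY2024 topline...", "embedding": [0.1, 0.9, 0.0]},
    {"source": "FY24-usaf.pdf", "content": "Air Force FY2024 topline...", "embedding": [0.0, 0.1, 0.9]},
]

# In practice: query_embedding = your_embedding_model.embed(query)
answer = simple_rag(
    query="What is the Army's FY2024 topline?",
    query_embedding=[0.85, 0.15, 0.05],  # placeholder vector
    documents=documents,
)
print(answer)
```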