Jobs & Lambdas

Reminders

  • HW1 is due tonight!
  • No class next week, Fall break
  • HW2 will be released tomorrow, due in 2 weeks
  • Reminder: late day policy
    • 5 days total, up to 2 per assignment

Agenda

  1. Jobs: Motivation
  2. Job Queue Overview
  3. Serverless Computing/Lambdas
  4. Agentic AI
  5. Lab

Jobs: Motivation

Jobs: Motivation

  • Within a system, you may find a need for asynchronous or long-running tasks
  • These tasks might need to be triggered in different ways
    • User action, e.g. web request or button press
    • Scheduled job
  • How can we achieve this?

Jobs: Motivation

  • Consider an endpoint like Instagram's post upload. We may want to perform additional actions such as:
    • Resize/compress the image
    • Send notifications/emails to friends
    • Run it through content moderation
  • What could happen if we coded all of this logic within the endpoint handler itself?

Jobs: Motivation

  • If this logic was on the path of the web request handler:
    • The web request might take a long time to return or even timeout
    • This ties up server resources, blocking or slowing other requests on the server
    • What if some parts fail, e.g. notifications?
  • We need a way to offload these tasks from the main thread that handles web requests

Job Queues

Job Queues

A system for asynchronously processing tasks outside the request-response cycle, either offloading or scheduling them.

Often called other names, but they're all basically the same thing:

  • Message Queue
  • Publisher & Subscriber (pub/sub)

Core Components

  1. Producer - Creates jobs and adds them to the queue
  2. Queue/Broker - Stores jobs waiting to be processed
  3. Worker - Consumes and executes jobs
  4. Result Store (optional) - Saves job results/status

The Producer

Enqueues jobs in response to events. Importantly, it only needs to store a description of the task in the queue, not perform the work itself.

Responsibilities:

  1. Serialize job data
  2. Set priority/retry policies
  3. Return quickly to caller
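The producer's responsibilities can be sketched in a few lines. This is a minimal illustration using an in-process `queue.Queue` as a stand-in for a real broker (Redis, RabbitMQ, etc.); the job shape and `enqueue_job` helper are hypothetical, not from any particular library.

```python
import json
import queue
import time

# Stand-in for a real broker (Redis, RabbitMQ, SQS, ...)
job_queue = queue.Queue()

def enqueue_job(job_type: str, payload: dict, max_retries: int = 3) -> str:
    """Serialize a job description, set its policies, and return quickly."""
    job = {
        "id": f"{job_type}-{time.time_ns()}",  # unique job id
        "type": job_type,
        "payload": payload,                    # only the *description* of the work
        "max_retries": max_retries,            # retry policy travels with the job
    }
    job_queue.put(json.dumps(job))             # serialized so any worker can read it
    return job["id"]

job_id = enqueue_job("process_profile_picture", {"file_path": "/uploads/abc.jpg"})
```

Note that the producer never touches the image itself; it returns to the caller as soon as the description is safely in the queue.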

The Queue/Broker

Storage and distribution of jobs.

Responsibilities:

  1. Store job states
  2. Distribute jobs (push or pull)

Additional Features:

  • Persistence (survive crashes)
  • Ordering guarantees
  • Delivery semantics (at-least-once, exactly-once)

The Worker

Executes jobs from the queue.

Responsibilities:

  1. Poll or subscribe to the queue
  2. Process jobs
  3. Handle errors and retries
  4. Report completion status

Importantly, workers are horizontally scalable.
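The worker's four responsibilities can be sketched as a small polling loop. This is an in-process illustration (a real worker would block on a broker connection and keep running forever); the job format and handler registry are assumptions, not a specific library's API.

```python
import json
import queue

job_queue = queue.Queue()

def worker_loop(handlers: dict) -> None:
    """Minimal worker: poll the queue, run the handler, retry on failure."""
    while True:
        try:
            raw = job_queue.get(timeout=0.1)   # 1. poll the queue
        except queue.Empty:
            return                             # no work left (real workers keep waiting)
        job = json.loads(raw)
        try:
            handlers[job["type"]](job["payload"])   # 2. process the job
        except Exception:
            # 3. handle errors and retries
            job["retries"] = job.get("retries", 0) + 1
            if job["retries"] <= job.get("max_retries", 3):
                job_queue.put(json.dumps(job))      # re-enqueue for retry
            # else: dead-letter the job (log, alert, etc.)
```

Because each worker only pulls jobs and reports back, you can run many identical copies of this loop on different machines, which is exactly why workers scale horizontally.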

[Diagram: job queue]

Delivery Guarantees

We can implement various types of guarantees for job delivery/execution.

At-most-once: Job might be lost, never duplicated

At-least-once: Job always delivered, might be duplicated

Exactly-once: Job delivered once and only once
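Exactly-once delivery is hard to guarantee at the broker level, so a common pattern is at-least-once delivery plus an idempotent handler. A minimal sketch of that idea, deduplicating on job id (the `handle_once` helper and in-memory set are illustrative; production systems track processed ids in Redis or a database):

```python
# At-least-once delivery means a job may arrive twice; making the handler
# idempotent (here: dedup on job id) gives effectively-exactly-once processing.
processed_ids = set()  # in production: Redis or a database table

def handle_once(job_id: str, handler, payload):
    if job_id in processed_ids:
        return "skipped (duplicate)"
    result = handler(payload)
    processed_ids.add(job_id)
    return result
```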

Real-World Example

User uploads profile photo:

  1. Web handler saves file, returns 200 OK
  2. Enqueues ProcessProfilePictureJob with file path
  3. Worker picks up job
  4. Generates thumbnails, runs moderation, sends notifications, etc.
  5. Updates database with results
  6. User's browser polls for completion

Aside: Redis

Redis is an in-memory data store.

  • Extremely fast (microsecond latency)
  • Rich data structures (strings, lists, sets, sorted sets, hashes)
  • Optional persistence to disk
  • Pub/sub messaging
  • Atomic operations

Common uses: Cache, session store, real-time analytics, job queues

Serverless Computing & Lambdas

What is Serverless?

"Serverless" is a misnomer: servers still exist, you just don't manage them.

Traditional: You provision and manage servers

  • Rent VMs, install software, scale manually
  • Pay for servers even when idle

Serverless: Cloud provider manages infrastructure

  • Write code, deploy functions
  • Provider handles scaling, availability
  • Pay only for actual execution time

Key Characteristics

⚡ Event-driven: Functions triggered by events, e.g. HTTP request, File upload, Queue message, Timer/schedule

📈 Auto-scaling: From 0 to thousands of instances automatically

💰 Pay-per-use: Billed by execution time (milliseconds)

⏱️ Stateless: Each invocation is independent

Popular Platforms

AWS Lambda: Most mature, largest ecosystem

Google Cloud Functions: Good for GCP integration

Azure Functions: Strong .NET support

Cloudflare Workers: Edge computing, ultra-low latency

Vercel/Netlify Functions: Frontend-focused

Serverless + Job Queues

Serverless makes a lot of sense as the worker layer of a job queue.

Benefits:

  • No idle workers consuming resources
  • Automatic scaling with queue depth
  • Pay only for actual processing
  • Built-in retry mechanisms

Example: Image Processing
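A sketch of what the image-processing worker might look like as a serverless function, using the AWS Lambda `handler(event, context)` entry-point convention. The event shape and the `resize_image` helper are hypothetical; in practice the trigger would be an S3 upload event or an SQS message.

```python
def resize_image(path: str, width: int) -> str:
    """Stand-in for real image processing (e.g. Pillow) -- returns a new path."""
    return f"{path}@{width}w"

def lambda_handler(event, context):
    """Entry point: the platform invokes this once per uploaded image."""
    path = event["file_path"]
    thumbnails = [resize_image(path, w) for w in (128, 512)]
    # A real function would write thumbnails to object storage here
    return {"statusCode": 200, "thumbnails": thumbnails}
```

The platform scales this automatically: one upload means one invocation, a thousand uploads mean up to a thousand concurrent invocations, and zero uploads cost nothing.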

Serverless Limitations

⏰ Execution time limits: 15 min (AWS Lambda)

  • Long-running jobs need different approach

🧊 Cold starts: First invocation can be slow (100ms - 10s)

💾 Limited memory/storage: Ephemeral disk space

🔗 Stateless: Can't maintain connections between calls

💵 Cost at scale: Can get expensive with constant high traffic

When to Use Serverless

✅ Good for:

  • Event-driven processing
  • Irregular/spiky workloads
  • Short-duration tasks
  • Rapid prototyping
  • LLM Inference

❌ Not ideal for:

  • Long-running computations (>15 min)
  • Stateful applications
  • Predictable steady traffic
  • Tasks needing persistent connections

Agentic AI

How do LLMs like ChatGPT and Claude do things like search the internet?

Agentic AI

  • LLMs are effectively glorified "bags of words", spitting out text
  • If we want them to actually execute tasks like run code or search the web, we need to give them "tools"
  • These tools are functions we define that the LLM can call
  • We combine this with the LLM's reasoning ability to accomplish complex tasks autonomously
  • Example: "search for latest Python updates and summarize them"

Simple Example

"Find the cheapest flights from NYC to London next week"

Task Execution:

  1. Search for flights on multiple sites (tool: web search)
  2. Compare prices across results
  3. Check baggage policies
  4. Verify departure times work
  5. Present top 3 options with tradeoffs

Agentic AI

Traditional AI: Responds to single prompts

Agentic AI: Pursues goals through multiple steps and tool usages

  • User gives goal → AI plans → AI executes → AI summarizes

Key difference: Autonomy and iteration toward a goal

Agentic AI: Tools

How does an LLM know what tools it has?

  • Effectively sophisticated prompt engineering
  • We describe every tool in the prompt: name, parameters, types, etc.
  • When the model wants to call a tool, it emits structured output
  • We parse this output, execute the function call, and return the result to the model, which then continues
  • Multiple libraries abstract all of this away for us; we'll be using pydantic-ai
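The loop described above can be sketched in miniature. `fake_llm` stands in for a real model (it hard-codes one tool call, then a final answer), and the JSON tool-call format is an assumption for illustration; libraries like pydantic-ai generate the tool schemas and parse the structured output for you.

```python
import json

# Tool registry: name -> callable. A real system would also send each tool's
# name, parameters, and types to the model as part of the prompt.
TOOLS = {"add": lambda a, b: a + b}

def fake_llm(messages):
    """Stand-in for a real LLM call."""
    last = messages[-1]
    if last["role"] == "user":
        # Model "decides" to call a tool, emitting structured output
        return json.dumps({"tool": "add", "args": {"a": 2, "b": 3}})
    # After seeing the tool result, the model answers in plain text
    return f"The answer is {last['content']}"

def run_agent(user_msg: str) -> str:
    messages = [{"role": "user", "content": user_msg}]
    while True:
        out = fake_llm(messages)
        try:
            call = json.loads(out)          # structured output => tool call
        except json.JSONDecodeError:
            return out                      # plain text => final answer
        result = TOOLS[call["tool"]](**call["args"])
        messages.append({"role": "tool", "content": result})  # feed result back
```

The key point is the loop: the model's output is parsed, any tool call is executed on our side, and the result is appended to the conversation before the model runs again.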

Building Agentic AI: Core Loop

  1. Jarvis receives message and decides if it is actionable
  2. If there is a task, first determine execution plan
  3. Pass plan to execution agent that has access to tools
  4. Execution agent runs step by step and calls tools when necessary
  5. Returns final result to user

Jarvis Job Queue Stack

  • Jarvis uses the following pipeline:

    User Message
        ↓
    ┌──────────────────────┐
    │ TaskDetectionAgent   │ ← Is this actionable?
    └──────────┬───────────┘
        ↓
    ┌──────────────┐
    │ PlanningAgent│ ←──────┐
    └──────┬───────┘        │
        ↓                   │── Job Queue
    ┌──────────────────┐    │
    │ ExecutionAgent   │ ←──┘
    └──────┬───────────┘
        ↓
    [Tools: web_search, mark_progress]
        ↓
    Redis Progress Updates → Final Result
    @self.agent.tool_plain
    async def web_search(search_query: str, freshness: str = "pw") -> str:
        """Perform a real web search and return formatted results."""
        search_results = search_web(search_query, self.context.tools, freshness)
        return format_results(search_results)

    @self.agent.tool_plain
    async def mark_progress(message: str) -> str:
        """Stream a progress update to the user in real-time."""
        update = TaskUpdate(
            status=TaskStatus.EXECUTING,
            content=message,
            timestamp=datetime.now(timezone.utc),
        )
        self.context.execution.redis_client.publish_task_update(
            self.context.agent.chat_id, update
        )
        return "✅ Progress update sent"
    # From agentic_job.py - RQ job function
    def execute_agentic_task_job(task_context: AgentContext, plan: TaskPlan):
        """Job function executed by RQ worker"""
        # Convert to context bundle
        context_bundle = AgentContextBundle.create(
            user_id=task_context.user_id,
            chat_id=task_context.chat_id,
            response_id=task_context.response_id,
            original_message=task_context.original_message,
            redis_client=get_redis_client(settings.redis_url),
        )
        # Create execution agent and run. RQ job functions are synchronous,
        # so we drive the async agent with asyncio.run
        execution_agent = ExecutionAgent(context_bundle)
        result = asyncio.run(execution_agent.execute_plan(plan))
        return result