Containers & Docker

Admin Stuff

  • Assignments will be released via Ed announcements
    • HW1 will be released later today!
    • If you are auditing, email me and I will get you access to Ed

Agenda

  • Why do we need Docker?
  • What is Docker?
  • How do we use Docker?
  • Configurations & Secrets

Scenario: Deploying Code

  • Let's say you have a server that works locally, e.g. uv run fastapi dev main.py
  • How do you actually get this running on another machine?
    • What do we need to do as setup for a new copy of your computer?
    • What if the machine is completely different?
  • What we need is portability, the ability to run code in different environments easily.
    • Package managers ensure we have the right versions of things, but what about Python itself?
    • How can we ensure that system level things are set/installed, e.g. environment variables or curl?
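
To make the question concrete, here is a minimal sketch (not part of the original slides) that reports the things a package manager does not pin: the interpreter version, the OS, a system tool such as curl, and an environment variable. The function name `describe_environment` and the variable name `DATABASE_URL` are illustrative assumptions. Running it on two machines and comparing the output shows how much hidden setup a "working" server depends on.

```python
import os
import platform
import shutil
import sys


def describe_environment(required_tools=("curl",), required_vars=("DATABASE_URL",)):
    """Collect facts about this machine that a lockfile does not capture."""
    return {
        "python_version": platform.python_version(),
        "python_executable": sys.executable,
        "os": platform.system(),
        "architecture": platform.machine(),
        # shutil.which returns the tool's path, or None if it is not on PATH
        "tools": {tool: shutil.which(tool) for tool in required_tools},
        # os.environ.get returns None when the variable is unset
        "env_vars": {name: os.environ.get(name) for name in required_vars},
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```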

What if I just used an exact copy of my machine to deploy to production?

(why does this suck?)

Containerization

  • Intuition: what if we could use something like a virtual machine, but at the application level?
  • We define the environment setup as code and build it into an image
  • Anyone can run an image to spawn a container
    • Think of images as a blueprint and containers as the object created from that blueprint
    • This allows us to create self-contained and portable execution environments
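
The blueprint analogy can be sketched in Python. This is a toy model only: the `Image` and `Container` classes and the `auth-service` name are made up for illustration and say nothing about how Docker is implemented. It shows one read-only image producing several independent containers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Toy stand-in for an image: a read-only description of an environment."""
    name: str
    files: tuple  # (path, contents) pairs baked in at build time

    def run(self) -> "Container":
        # Each container starts from a private copy of the image's files
        return Container(image=self, filesystem=dict(self.files))


@dataclass
class Container:
    """Toy stand-in for a container: a running instance with its own state."""
    image: Image
    filesystem: dict = field(default_factory=dict)


image = Image(name="auth-service", files=(("/app/main.py", "print('hi')"),))
first = image.run()
second = image.run()

# Changes inside one container do not leak into its sibling or the image
first.filesystem["/tmp/scratch.txt"] = "temporary data"
print("/tmp/scratch.txt" in first.filesystem)   # True
print("/tmp/scratch.txt" in second.filesystem)  # False
print(len(image.files))                         # 1
```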

Containers vs VMs

Figure: Docker container stack vs. virtual machine stack

https://www.docker.com/resources/what-container/

What is Docker?

  • "Build once, run anywhere"
  • Industry-standard tool to package applications and their dependencies into lightweight, portable containers
  • Fast and efficient: containers run as isolated processes that share the host OS kernel, whereas each VM boots a full guest OS
  • Rich ecosystem - Docker Hub registry with millions of pre-built images

How does Docker work?

  • We define images through Dockerfiles, which express requirements, configs, binaries, etc. as code
  • Images are created by building a Dockerfile
  • Images are composed of immutable layers, which can be cached and shared between images
  • We can then spawn a container by running an image
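
Why immutable layers make caching possible can be shown with a toy model. This is a sketch under stated assumptions, not Docker's actual algorithm: each layer is identified by a hash of its parent layer plus its own instruction, and the `COPY v1-source` / `COPY v2-source` strings stand in for "the instruction plus the contents of the copied files". The function names `layer_id` and `build` are hypothetical.

```python
import hashlib


def layer_id(parent_id: str, instruction: str) -> str:
    """A layer's identity depends on its parent and on its own instruction."""
    return hashlib.sha256(f"{parent_id}\n{instruction}".encode()).hexdigest()[:12]


def build(instructions, cache):
    """Return (layer_ids, rebuilt_instructions) for a list of instructions."""
    parent, layers, rebuilt = "", [], []
    for instruction in instructions:
        current = layer_id(parent, instruction)
        if current not in cache:
            cache.add(current)  # pretend we executed the step and stored the layer
            rebuilt.append(instruction)
        layers.append(current)
        parent = current
    return layers, rebuilt


cache = set()
v1 = ["FROM alpine:3.22", "WORKDIR /app", "COPY v1-source", "RUN uv sync"]
v2 = ["FROM alpine:3.22", "WORKDIR /app", "COPY v2-source", "RUN uv sync"]

_, rebuilt_first = build(v1, cache)
_, rebuilt_second = build(v2, cache)
print(rebuilt_first)   # all four steps: the cache starts empty
print(rebuilt_second)  # ['COPY v2-source', 'RUN uv sync']
```

In this model, changing the source code invalidates the COPY layer and everything after it, while the layers before it are reused. That is the usual argument for putting rarely changing steps early in a Dockerfile.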

Docker Image Layers

Figure: Docker image layers

https://docs.docker.com/get-started/docker-concepts/building-images/understanding-image-layers

    # FROM picks the base image to build on
    # Alpine is a lightweight Linux distro
    FROM alpine:3.22
    # Install uv
    # COPY --from copies files out of another image, here one pulled from a registry (ghcr.io)
    COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
    # WORKDIR sets the working directory for all subsequent instructions
    WORKDIR /app
    # COPY can also copy files from the build context on our local machine into the image
    COPY services/auth/. ./
    # RUN executes a command at build time and saves the result as a new layer
    RUN uv sync --no-cache --link-mode=copy
    # CMD sets the default command for containers spawned from this image
    # It is optional and can be overridden at runtime
    # Bind to 0.0.0.0 so the server is reachable from outside the container
    CMD ["uv", "run", "fastapi", "dev", "main.py", "--host", "0.0.0.0", "--port", "8000"]
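
Building and running the Dockerfile above comes down to two CLI calls, `docker build` and `docker run`. The sketch below assembles them in Python so the flags can be annotated. The image tag `jarvis-auth` and the Dockerfile path `services/auth/Dockerfile` are assumptions for illustration; the build context is taken to be the repository root because the Dockerfile copies from `services/auth/`. By default the script only prints the commands; pass `--execute` to actually run them, which requires Docker to be installed.

```python
import shlex
import subprocess
import sys


def build_command(tag: str, dockerfile: str, context: str = ".") -> list[str]:
    # -t names the image, -f points at the Dockerfile,
    # and the last argument is the build context
    return ["docker", "build", "-t", tag, "-f", dockerfile, context]


def run_command(tag: str, host_port: int = 8000, container_port: int = 8000) -> list[str]:
    # --rm removes the container on exit;
    # -p publishes container_port on the host's host_port
    return ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", tag]


def main(execute: bool = False) -> None:
    # Hypothetical tag and Dockerfile location
    commands = (
        build_command("jarvis-auth", "services/auth/Dockerfile"),
        run_command("jarvis-auth"),
    )
    for command in commands:
        print("$", shlex.join(command))
        if execute:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    main(execute="--execute" in sys.argv)
```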

Docker for Development

  • Docker is also a great way to create and manage development environments
  • Pain point: onboarding to a new codebase sucks. We have to install tools, fight with errors, add things to our shell, etc.
  • We can eliminate all of these things by developing within a container!
  • We won't explicitly do this in class, but it's good to be aware of
  • Cons: more resource-intensive, persistent storage is harder, not great for GUI-related things

Configurations & Secrets

Configuration

  • Imagine we have Jarvis running in 3 different environments: local, dev, and production
  • How might these environments differ?
  • How would this affect our code?
  • How do we remedy this?

    # services/auth/app/config.py
    class Settings:
        database_url: str = "sqlite:///../../jarvis.db"
        jwt_secret_key: str = "super-secret-key-dont-use-this-in-production"
        jwt_algorithm: str = "HS256"
        access_token_expire_minutes: int = 30
        refresh_token_expire_days: int = 30
        cookie_secure: bool = True
        cookie_samesite: str = "None"


    settings = Settings()

Why does this suck?

Configuration

  • We need a way to deal with two problems:
    • Configurations need to differ across environments
    • Some configuration values need to be kept secret
  • Idea: read these values from the environment at runtime, or fall back to an environment file
    • For local and development environments, secrecy matters less, so a file is the better choice: it makes explicit where each value comes from
    • For production, we will need a more robust setup (encryption and key management!). We will build on this groundwork later
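
One way to implement this idea using only the standard library is sketched below. It is an illustration, not the course's reference solution: the `.env` file name, the upper-case variable names (`DATABASE_URL`, `JWT_SECRET_KEY`, `ACCESS_TOKEN_EXPIRE_MINUTES`), and the helper `read_env_file` are all assumptions. Values from the real environment override values from the file, and a missing secret raises an error instead of falling back to a hard-coded default.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    file = Path(path)
    if not file.is_file():
        return values
    for line in file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


@dataclass(frozen=True)
class Settings:
    database_url: str
    jwt_secret_key: str
    access_token_expire_minutes: int = 30

    @classmethod
    def load(cls, environ=None, env_file=".env"):
        environ = os.environ if environ is None else environ
        # Start from the file, then let real environment variables override it
        merged = {**read_env_file(env_file), **environ}
        return cls(
            # A missing required value raises KeyError rather than silently
            # using a secret that was committed to the repository
            database_url=merged["DATABASE_URL"],
            jwt_secret_key=merged["JWT_SECRET_KEY"],
            access_token_expire_minutes=int(merged.get("ACCESS_TOKEN_EXPIRE_MINUTES", 30)),
        )


# Usage: the file supplies local defaults, the environment overrides one of them
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("DATABASE_URL=sqlite:///local.db\nJWT_SECRET_KEY=local-only\n")
    settings = Settings.load(environ={"JWT_SECRET_KEY": "from-environment"}, env_file=env_path)
    print(settings.database_url)    # sqlite:///local.db
    print(settings.jwt_secret_key)  # from-environment
```

Libraries such as pydantic-settings offer the same pattern with type validation built in; the hand-rolled version here is only meant to show the mechanics.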

Lab: Configs & Docker