Setting up Epithet as of v0.1.7 30 Nov 2025
The Goal
We want to set up a real epithet system, end to end, and use it. Because everything is self hosted, there are a number of steps:
- Install epithet on your local Mac
- Configure an SSO provider
- Set up CA and Policy services
- Configure the epithet agent
- Create a cloud-init config for new VMs to rely on the CA
- Actually SSH to the VM!
- Run epithet as a daemon
- Cleaning up config and playing around
Prerequisites
Have tofu and awscli installed. Make sure you have configured awscli. You will need to run this on a machine capable of spawning a web browser you interact with, so if you are doing it on a VM or such, do it under X (or Wayland, VNC, etc). The article assumes you are on a Mac, because I am, but it should adapt alright if you are on a flavor of Linux or whatnot. I have not tried any of this on Windows yet. If it works, let me know!
Install epithet on your local machine
As mentioned, I'm working on a Mac, and use Homebrew, so this article is kind of oriented around that:
brew install epithet-ssh/tap/epithet
If you are not on a mac, or not using homebrew, you will need to build and install it yourself. Clone epithet and check out v0.1.7 which is the version this article is written against. It's implemented in Go, so you'll need that installed. You can build it with make:
git clone https://github.com/epithet-ssh/epithet/
cd epithet
git checkout v0.1.7
make epithet # or just `make` if you want to run the tests
If you built it yourself, put the binary somewhere on your $PATH. If you don't have it on your $PATH, later stuff we do will be trickier, and changing the examples to use the full path is an exercise left to you.
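If you are unsure where to put it, something like this does the job (this assumes make left the binary at the repo root as ./epithet -- adjust the source path if your build put it elsewhere):
# copy the freshly built binary onto your $PATH; /usr/local/bin is just a common choice
install -m 0755 ./epithet /usr/local/bin/epithet
# sanity check that your shell now finds it
command -v epithet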
Configure an SSO provider
We'll use Google for SSO, but you could as easily use Microsoft, Apple, AWS (Cognito), Facebook, Github, Gitlab, Okta, Yahoo!, Keycloak, Authentik, and so on. OIDC/OAuth2 is pretty widespread nowadays, and you don't need a full IDP (ie, a full SCIM provider) for Epithet, just someone to provide identity at the other end of OIDC. I'm using Google because GMail is widely used, so it is useful for many people.
- Go to Google Cloud Console
- Select or create a project
- Navigate to APIs & Services → Credentials
- Configure a consent screen
- Add a name for your App; I used "Epithet Demo"
- For Audience, if you are using Google Workspace you can make it Internal and be good to go. If using general GMail, then make it External. I am making this demo External.
- This dropped me on a screen with a "Create OAuth Client" button, which I picked. If you got there via a different path that is fine; you just need to create an OAuth Client/Credential.
- Choose Application type: I use Universal Windows Platform (UWP). Despite the name, it is not limited to Windows apps, and it does PKCE. "Desktop App" does not support PKCE, so you would need to embed the OAuth secret in the agent config. The internet tells me this is fine, and secure, and the secret is not really a secret. I want to just use PKCE, so I use UWP here.
- Enter a name: I used "Epithet Agent"
- If you are using UWP you will need to include a Store ID; I don't believe it is used anywhere. I used "epithet-demo"
- Click Create
- Make note of the Client ID.
- Select the "Audience" tab on the left.
- In the "Test Users" section add a couple users (email addresses) who you want to be able to use the app. I added a couple google accounts I have, a general public one which I am using to set up this demo, and a Workspace email that I actually use for things. I set up two so that I can configure different policies for them later. If you created an "Internal" app type earlier, you can skip this step -- your Workspace users should be able to use the app automagically.
You can test the configuration using the epithet auth oidc command:
echo "" | epithet -vvv auth oidc --issuer=https://accounts.google.com --client-id=<OAUTH_CLIENT_ID> 3>/dev/null
Note, use your client id, not the <OAUTH_CLIENT_ID> placeholder. Also note the echo "" | at the beginning and 3>/dev/null at the end: these are for epithet's auth plugin protocol, which expects input on stdin and to write state to FD 3. This should pop up the authentication flow and if you log in will let you know it was successful and show you the auth token.
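If you are curious what that auth plugin state actually looks like, a small variation of the same command redirects FD 3 to a scratch file instead of throwing it away (auth-state.out is just a name I picked):
# same test, but keep whatever the plugin writes to FD 3
echo "" | epithet -vvv auth oidc --issuer=https://accounts.google.com --client-id=<OAUTH_CLIENT_ID> 3>auth-state.out
cat auth-state.out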
In my case, it was successful, so I am moving on!
Set up CA and Policy services
These are generally low volume things which are more or less perfect use cases for FaaS platforms. Because AWS is the most popular, I made Lambda wrappers for epithet's built-in basic CA and Policy services. Start by cloning the repo for them, which contains the terraform (tofu) configs to actually set them up (as well as the wrapper code if you want to play with it). I set it up as a template repo, so you can use the "Use this template" button to clone it in Github if you like. Make sure to set the privacy to "Private" if you do -- it can be public, but Github will complain about you checking in your OAuth Client ID if you do.
git clone https://github.com/epithet-ssh/epithet-aws epithet-demo
cd epithet-demo
Configuring the Policy Server
Now, let's do some basic configuration for the deployment. We do this with some terraform variables; create and edit terraform/terraform.tfvars:
aws_region = "us-east-1"
project_name = "epithet-demo"
I am using us-east-1 for the demo (unofficial designated demo region) and picked a name (epithet-demo).
Now we need to make a basic policy for the Policy service. It should be in config/policy.yaml. I'm going to use:
# CA public key is loaded from SSM Parameter Store at runtime
# This placeholder satisfies config validation
ca_public_key: "placeholder - loaded from SSM"
oidc:
  issuer: "https://accounts.google.com"
  audience: "<OAUTH_CLIENT_ID>"
users:
  "brianm@skife.org": ["wheel"]
  "brian.mccallister@gmail.com": ["dev"]
defaults:
  allow:
    arch: ["dev", "wheel"]
  expiration: "5m"
# hosts:
#   "prod-*.example.com":
#     allow:
#       deploy: ["dev"]
#     expiration: "2m"
Let's talk about what it does and how it works.
The Policy server needs the CA's public key in ca_public_key as it validates requests that come with it (the CA signs them using the CA private key). For the Lambda wrapper we load the public key from AWS's config service (SSM) so the actual configuration value is ignored.
The oidc section needs to know the provider URL (issuer) and the client id (called audience here, for OIDC reasons). Put your client id for the audience attribute.
The users section configures allowed users. The built-in Policy server uses static configurations, so it needs to know which users are allowed to use it, up front. Additionally, it uses a group-style matching system and assumes that "remote-user == principal". For demo purposes I use two groups, wheel and dev, but that choice is arbitrary. It also has no relationship to actual unix groups on the hosts -- it is purely used internally for matching users to principals in the policy server. I configured two users so I can mess around later and experiment.
The defaults section establishes defaults for host configuration. In this case it gives the arch principal to anyone in the dev or wheel group, unless overridden for a specific host. We also set up a 5 minute expiration for certificates.
Finally, the hosts section lets you do per-host configuration. It is a map of host-glob -> config. I have it commented out for now, but am leaving it as a reminder for playing around later.
This overall config means that, assuming one of the users authenticates successfully (which it checks), it will issue a cert for any remote host and remote user with the arch principal. I used that principal because I am going to use Arch Linux cloud images to test things, and those provision the arch user by default.
Deploy the services
I am not going to step through setting up a new AWS account for you, so I'll prefix this next section by saying -- have an AWS account, and have the aws-cli configured to use it!
make init
make apply
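If you would rather run tofu by hand than lean on the Makefile, my assumption (check the Makefile in the repo for the real targets, which may add flags) is that it boils down to roughly the usual pair:
# rough equivalent of make init / make apply, assuming the configs live in terraform/
tofu -chdir=terraform init
tofu -chdir=terraform apply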
Watch tofu set up a pile of stuff in AWS:
- API Gateway for CA
- API Gateway for Policy
- IAM policies
- Secret to hold the private key
- SSM to hold the public key
- S3 bucket to keep certificate log in
- Other stuff -- look in terraform/ of the repo you cloned for everything. I believe there are 32 things in total that it sets up.
When it is finished it will spit out a bunch of information for you:
ca_public_key = <sensitive>
ca_public_key_command = "aws ssm get-parameter --name /epithet-demo-9230323c/ca-public-key --query Parameter.Value --output text"
ca_public_key_parameter = "/epithet-demo-9230323c/ca-public-key"
ca_secret_arn = "arn:aws:secretsmanager:us-east-1:378899212612:secret:epithet-demo-9230323c-ca-key-encO9c"
ca_secret_name = "epithet-demo-9230323c-ca-key"
ca_url = "https://ir6kilkap3.execute-api.us-east-1.amazonaws.com/"
cert_archive_bucket = "epithet-demo-9230323c-cert-archive"
cert_archive_bucket_arn = "arn:aws:s3:::epithet-demo-9230323c-cert-archive"
policy_url = "https://e2np6k2uj5.execute-api.us-east-1.amazonaws.com"
region = "us-east-1"
Save this info somewhere; you will need the CA URL later, and you may want to poke around the S3 bucket where info about issued certs is stored, or look at the various logs.
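If you do lose it, tofu keeps the outputs in state, so you can ask for individual values again later -- a sketch, assuming the configs live in terraform/ as in the repo:
# re-print single output values; -raw strips the surrounding quotes for use in scripts
tofu -chdir=terraform output -raw ca_url
tofu -chdir=terraform output -raw policy_url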
But we are not done -- we still need to actually make the CA key and upload it:
make setup-ca-key
We have to create the key pair after we provision everything else because we need the Secret and SSM to be created in AWS before we can put the private and public keys in them respectively.
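You can sanity check that the keys landed where they belong -- the parameter and secret names below are from my tofu output, so substitute yours:
# the public key should now be in SSM Parameter Store...
aws ssm get-parameter --name /epithet-demo-9230323c/ca-public-key --query Parameter.Value --output text
# ...and the private key secret should exist (just confirm it is there, no need to print it)
aws secretsmanager describe-secret --secret-id epithet-demo-9230323c-ca-key --query Name --output text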
You can test that the CA is running by issuing a GET to it; it should return the public key:
epithet-demo took 3s [!?] on main ❯ xh GET https://ir6kilkap3.execute-api.us-east-1.amazonaws.com/
HTTP/2.0 200 OK
apigw-requestid: U35tYhMyoAMEcRw=
content-length: 81
content-type: text/plain
date: Sun, 30 Nov 2025 20:03:46 GMT
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIiLCOTQVHFqu7zP5j3KCjrfYavXqk7wasuckA6QvQgP
epithet-demo [!?] on main ❯
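I used xh there; plain curl does the same job if that is what you have (substitute your own ca_url):
curl -s https://ir6kilkap3.execute-api.us-east-1.amazonaws.com/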
I should probably put the same kind of smoke test on the policy server, now that I think about it.
Despite having made a pile of AWS resources, most of them are cheap and idle, so for a couple of users authenticating a few times a day, it should cost under a dollar a month. Sometimes the cloud is kind of cool.
Configure the epithet agent
Okay, we have the system up, let's configure our agent to use it! Epithet uses ~/.epithet/ for its various configs and running state. Yes, this is not XDG style. I am open to XDG, but this seems simpler and is not totally unexpected.
cd ~/.epithet
$EDITOR ./config
Add content like:
ca-url <Your CA URL>
match *
auth epithet auth oidc --issuer https://accounts.google.com --client-id <YOUR OAUTH CLIENT ID>
This config is fine for testing, for now, but the match * line is not something we'll want to leave in long term -- it is telling epithet that it should be used for every attempted connection, which we probably don't want. Once we have a VM or three to try against, we'll constrain it down.
We also need to tell SSH to use epithet; to do this we drop a line in ~/.ssh/config:
Include ~/.epithet/run/*/ssh-config.conf
Do this somewhere near the top, and definitely before anywhere you might set up an ssh agent socket.
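As a sketch, the top of a ~/.ssh/config might end up looking like this -- the Host block is purely illustrative filler to show that host-specific settings come after the Include:
Include ~/.epithet/run/*/ssh-config.conf
# host-specific settings follow the Include
Host *
    ServerAliveInterval 60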
Now, to debug we will start the agent manually -- run:
epithet -vvv agent
You should get some debug log output.
Create a cloud-init config for new VMs to rely on the CA
cloud-init
So the easiest way I have found for basic setup of a new VM is cloud-init; every major distribution and provider supports it, including my favorite, vm-bhyve (which is not really a major provider, but is what I use to manage VMs on my machines).
We'll make a user-data config for cloud-init which configures sshd to respect our CA key. Make a file named user-data in your current directory which contains:
#cloud-config
resize_rootfs: True
manage_etc_hosts: localhost
write_files:
  - path: /etc/ssh/epithet_ca.pub
    content: |
      ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIiLCOTQVHFqu7zP5j3KCjrfYavXqk7wasuckA6QvQgP
    owner: root:root
    permissions: '0644'
  - path: /etc/ssh/sshd_config.d/100-epithet.conf
    content: |
      TrustedUserCAKeys /etc/ssh/epithet_ca.pub
    owner: root:root
    permissions: '0644'
But instead of using the public key for my CA (ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIiLCOTQVHFqu7zP5j3KCjrfYavXqk7wasuckA6QvQgP) use the one for your CA.
This creates two files in /etc/ssh/ -- one with the public key, and one which tells sshd to trust that public key.
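Rather than pasting the key by hand, you can also template it in from SSM -- a bash sketch, using the ca_public_key_parameter name from my tofu output (substitute yours):
# fetch the CA public key from SSM...
ca_key=$(aws ssm get-parameter --name /epithet-demo-9230323c/ca-public-key --query Parameter.Value --output text)
# ...and write user-data with it filled in
cat > user-data <<EOF
#cloud-config
resize_rootfs: True
manage_etc_hosts: localhost
write_files:
  - path: /etc/ssh/epithet_ca.pub
    content: |
      ${ca_key}
    owner: root:root
    permissions: '0644'
  - path: /etc/ssh/sshd_config.d/100-epithet.conf
    content: |
      TrustedUserCAKeys /etc/ssh/epithet_ca.pub
    owner: root:root
    permissions: '0644'
EOF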
A VM
Okay, so we have a CA, an agent, and some cloud-init config, but we still need a server to use it on. Since we have used AWS so far, let's keep using it. Sadly, EC2 is pretty painful for basic VM setup nowadays. That said, we'll do it.
For this, I am going to set up in a region I never use, so that I have to do things from scratch. Clean slate makes for good demo. If you have a VPC you like, which allows SSH ingress, then you can skip the setup for the most part and just spin up your instance there. If not, hang on, we have some AWS to do.
I am going to use us-east-2 because I have never used it before. If you pick a different region you will need to find the Arch AMI for that region. You can find them at arch-ami-list.drzee.net. For us-east-2 the AMI is ami-0c87f4f769e675bf8.
If you are in bash or a regular sh:
# Create a default VPC (if one doesn't exist)
aws ec2 create-default-vpc --region us-east-2
# Create security group for SSH
sg_id=$(aws ec2 create-security-group \
--region us-east-2 \
--group-name ssh-access \
--description "SSH access" \
--query 'GroupId' --output text)
# Allow inbound SSH
aws ec2 authorize-security-group-ingress \
--region us-east-2 \
--group-id $sg_id \
--protocol tcp \
--port 22 \
--cidr 0.0.0.0/0
# Launch the instance
instance_id=$(aws ec2 run-instances \
--region us-east-2 \
--image-id ami-0c87f4f769e675bf8 \
--instance-type t3.micro \
--security-group-ids $sg_id \
--user-data file://./user-data \
--query 'Instances[0].InstanceId' --output text)
# Wait for it to be running
aws ec2 wait instance-running --region us-east-2 --instance-ids $instance_id
# Get the public IP
ip=$(aws ec2 describe-instances \
--region us-east-2 \
--instance-ids $instance_id \
--query 'Reservations[0].Instances[0].PublicIpAddress' --output text)
echo $ip
If you are in fish, like me:
# Create a default VPC (if one doesn't exist)
aws ec2 create-default-vpc --region us-east-2
# Create security group for SSH
set sg_id (aws ec2 create-security-group \
--region us-east-2 \
--group-name ssh-access \
--description "SSH access" \
--query 'GroupId' --output text)
# Allow inbound SSH
aws ec2 authorize-security-group-ingress \
--region us-east-2 \
--group-id $sg_id \
--protocol tcp \
--port 22 \
--cidr 0.0.0.0/0
# Launch the instance
set instance_id (aws ec2 run-instances \
--region us-east-2 \
--image-id ami-0c87f4f769e675bf8 \
--instance-type t3.micro \
--security-group-ids $sg_id \
--user-data file://./user-data \
--query 'Instances[0].InstanceId' --output text)
# Wait for it to be running
aws ec2 wait instance-running --region us-east-2 --instance-ids $instance_id
# Get the public IP
set ip (aws ec2 describe-instances \
--region us-east-2 \
--instance-ids $instance_id \
--query 'Reservations[0].Instances[0].PublicIpAddress' --output text)
echo $ip
Note that when we made this VM we did not give it a public key. The only way to access it is via a certificate it trusts!
Actually SSH to the VM!
Given the above, we can ssh in:
ssh arch@$ip
It should pop a browser which asks you to log in, which you should do. When everything authenticates you should be back in ssh, and be able to connect:
epithet-demo [!?] on main ❯ ssh arch@$ip
.
/ \
/ \
/^. \ Arch Linux AMI (std)
/ .-. \ https://archlinux.org/
/ ( ) _\
/ _.~ ~._^\
/.^ ^.\ TM
[arch@ip-172-31-46-139 ~]$
Yea!
If you look over at the agent there will be a bunch of debug output showing what it did.
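If you want to double check that cloud-init wired up the trust on the VM side, you can look from inside the instance (sshd -T dumps the effective config and needs root):
# on the VM, after logging in
cat /etc/ssh/sshd_config.d/100-epithet.conf
sudo sshd -T | grep -i trusteduserca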
If you log out (or open a new terminal) you can inspect the state of the epithet agent:
epithet inspect
Which will show you the ssh agent sockets it has running and the certs it has:
Broker State
============
Socket: /Users/brianm/.epithet/run/d6d98793bf4b428a/broker.sock
Agent Dir: /Users/brianm/.epithet/run/d6d98793bf4b428a/agent
Match Patterns: [*]
Agents (1)
-----------
8fea92165cf18c9783b9b4e9b30ac50c559dd352
Socket: /Users/brianm/.epithet/run/d6d98793bf4b428a/agent/8fea92165cf18c9783b9b4e9b30ac50c559dd352
Expires: 2025-11-30T13:45:52-08:00 (valid, 2m17s)
Certificate: SHA256:cadNJy5p6U7h/tfExl5M5GYZ/kJMpFoPEwf7dvFmGss
Certificates (1)
-----------------
[0]
Fingerprint: SHA256:cadNJy5p6U7h/tfExl5M5GYZ/kJMpFoPEwf7dvFmGss
Identity: brianm@skife.org
Principals: [arch]
Valid: 2025-11-30T13:39:52-08:00 to 2025-11-30T13:45:52-08:00 (valid, 2m17s)
Extensions:
permit-agent-forwarding
permit-pty
permit-user-rc
Policy (HostUsers):
*: [arch]
The cert will expire in a few minutes, but the auth state should last as long as Google doesn't want you to reauthenticate and you don't stop the agent. The auth state, certs, etc. are all kept only in memory, so killing the agent will wipe them out. No big deal -- start the agent again and you can reauthenticate!
Run epithet as a daemon
If you installed via Homebrew, it also added a Homebrew service. You can start it via:
brew services start epithet
You can then stop it with stop instead of start, and so on. You need to restart it when you change the config; epithet agent does not pick up on config changes automagically.
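So, after editing ~/.epithet/config, for example:
brew services restart epithet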
Cleaning up config and playing around
Okay, so now you have a system, go play. The first thing I suggest doing is changing the match statement in ~/.epithet/config to match against just your VM and restarting the agent.
My config now looks like:
ca-url https://ir6kilkap3.execute-api.us-east-1.amazonaws.com/
match 18.220.139.216
match 54.190.23.91
auth epithet auth oidc --issuer https://accounts.google.com --client-id <abc123,etc>.apps.googleusercontent.com
Notice I have two match lines; that is because I spun up a second VM. The match line is used to short-circuit what epithet tries to get a certificate for. In my actual day-to-day config it looks like:
ca-url https://<REDACTED>/
match *.home
match *.barn
match *.brianm.dev
auth /opt/homebrew/bin/epithet auth oidc --issuer https://accounts.google.com --client-id <REDACTED>
Which relies on wildcard matching. Our fooling-around VMs don't have useful names to match on, so there we go. Alternatively, you could have the policy server reject them, which will also cause matches to fail, but that is a different (and not yet written) article.
To change the policy server config you can edit the config/policy.yaml file and redeploy it via make apply.
When you are done playing, if you want to tear down what we set up, you can run make destroy to remove the resources tofu made. You'll need to terminate your VMs yourself (aws --region us-east-2 ec2 terminate-instances --instance-ids $instance_id) -- tofu didn't make them, so it doesn't know about them.
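For completeness, a teardown sketch assuming the shell variables from the launch step are still set (otherwise substitute the instance and security group ids by hand):
# terminate the demo VM and remove the security group we created for it...
aws ec2 terminate-instances --region us-east-2 --instance-ids $instance_id
aws ec2 wait instance-terminated --region us-east-2 --instance-ids $instance_id
aws ec2 delete-security-group --region us-east-2 --group-id $sg_id
# ...then let tofu remove everything it created
make destroy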
Debugging
I can hope everything went smoothly for you, but it may not have. If it didn't go smoothly, I have found Claude Code to be extremely good at helping debug terraform/aws stuff. In the repo with the terraform resources there is a CLAUDE.md and MCP server specs (.mcp.json) for AWS and Terraform respectively. While I used Tofu, not Terraform, the MCP server knows both. If you don't have uv installed, install it (brew install uv) and the MCP servers will work. Fire up claude and ask it to help you debug your issues.
Epithet, v2, Briefly - Part 1 28 Nov 2025
I last talked about epithet two years ago, in 2023, when I started kicking around ideas to make it general purpose. I'm pleased with how it's turned out, so I want to start talking about some of the design decisions. I dare not document it yet, for fear someone may use it, but that's probably coming.
Even though it never had a proper 1.0, I am calling this v2 in my head as it is not at all compatible with the previous iteration, and is so much better. The big pieces are the same shape: an Agent, a CA, and a Policy service:
Agent
The agent handles authenticating you, obtaining certificates, and deciding which certificate to use for a given connection. It uses a small policy protocol to match certificates against connections based on target host and remote username. Authentication is handled by a plugin system, so you can have arbitrary authenticators. There's a built-in OIDC authenticator, and I imagine I'll add SAML at some point.
Policy Service
The Policy Service is the brains of the system. It receives authentication and connection information and decides (1) whether to issue a cert, (2) which principals to put on the cert, (3) the cert parameters (expiration, extensions, etc), and (4) a matching policy telling the agent which connections this cert can be used for.
CA
The CA has access to the CA private key, so I want it to be as simple as possible. As such it delegates decision making about certificates to the policy service and merely issues certs as instructed.
So what's changed and why?
That looks pretty similar to two years ago, but there are a few key changes I'll dive into over a few posts. Today, the big reasons for the changes:
- I want to be able to issue a certificate which allows someone (or some process) to ssh into a specific host as a specific user only, rather than a more wide-open certificate with all of the user's principals. Imagine, if you will, a piece of deployment automation. We want it to be able to ssh as deploy@prod-*.example.com for the next ten minutes. Allowing this should be triggered by a workflow -- either automatically by inspecting if it should be deploying right now, or via a human workflow such as a response in Slack, and so on.
- I don't want to bake in assumptions about how identities, principals, and hosts map to each other. The unix-group style set of principals from the previous version is reasonable for many use cases, but not all. It's overkill for simple cases (my personal stuff) and not powerful enough for the hard cases (see above).
- I want to coexist better with non-epithet ssh scenarios. Previously users had to carefully craft Match blocks in their ssh config to only use epithet when appropriate. Worse, if the CA or policy server was down, users would need to edit their ssh config to use a fallback. The last thing you want to do in a SEV is edit ssh config.
Notes on setting up vm-bhyve 04 May 2024
I'm replacing my general/util server at home and want to manage VMs on the new host with vm-bhyve. Setting it up is well documented, but running linux VMs still steers folks towards grub, which is not really great. These are just my notes (for later, after I forget) on using cloud images and uefi. This is a supplement to the docs, not a replacement!
I created a template, creatively named brian-linux.conf:
loader="uefi"
cpu=2
memory=4G
network0_type="virtio-net"
network0_switch="public"
disk0_type="nvme"
disk0_dev="sparse-zvol"
disk0_name="disk0"
I'm running on ZFS, so I can use sparse-zvol for storage to make disk space only soft allocated, letting me overcommit.
Now, the commands to make things happy and run some cloud images:
# Additional dependencies for vm-bhyve
pkg install cdrkit-genisoimage # to use cloud-init
pkg install qemu-tools # to use cloud images
pkg install bhyve-firmware # to use uefi
# Download some cloud images to use for our VMs
vm img https://dl-cdn.alpinelinux.org/alpine/v3.19/releases/cloud/nocloud_alpine-3.19.1-x86_64-uefi-cloudinit-r0.qcow2
vm img http://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
vm create \ # C-k off comments if you copy this
-t brian-linux \ # our template, from above
-i noble-server-cloudimg-amd64.img \ # the cloud image
-s 50G \ # disk size override
-c 4 \ # cpu count override
-m 32G \ # memory override
-C -k ~brianm/.ssh/id_rsa.pub \ # cloud-init ssh pubkey
v0001 # vm name
Docs cover these, but reminding myself of how I configured it beyond defaults:
vm switch vlan public 4 # use vlan 4 for VMs
vm set console=tmux # tmux for console access
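With that in place, starting the VM and attaching to its console is just (detach from the tmux console with C-b d):
vm start v0001
vm console v0001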
Nice side effect of using cloud-init is that it logs the DHCP assigned IP address as well, so I can fire up the console and go search for it (C-b C-[ C-s Address)! Probably should write a script to extract it, to be honest.