Deploying a homelab HA Vault on Raspberry Pi
High-availability Vault on Raspberry Pi using an Ansible role and Docker, plus the rationale for running Vault outside Kubernetes for cluster bootstrapping.
Vault OSS on a Raspberry Pi (outside Kubernetes) — bootstrap-friendly HA with Ansible
In my homelab, Vault is a foundational dependency: it feeds credentials to GitOps (Argo CD), External Secrets Operator, CI/CD, and various automation workflows. (I began with Sealed Secrets, which is great, but it requires regenerating every sealed secret whenever you reinstall the cluster, and sealing each new secret before you can push it, which makes things a little cumbersome.)
That’s exactly why I decided to run Vault outside my Kubernetes cluster — on dedicated Raspberry Pi nodes — so I can bootstrap Kubernetes from scratch (or recover it) without a chicken-and-egg dependency on “Vault running inside the cluster that needs Vault to start”.
This post documents the approach and the Ansible role I use to deploy Vault OSS in Docker on Raspberry Pi, using integrated storage (Raft) and a simple HA topology.
Why Vault outside Kubernetes
Running Vault inside Kubernetes is totally possible, but it introduces a few practical problems in a homelab:
- Bootstrap problem: if the cluster needs Vault to fetch secrets (ESO, GitOps), but Vault is deployed by the same cluster, you end up with circular dependencies.
- Disaster recovery: if Kubernetes goes down (storage/ingress/node issues), you still want Vault reachable to restore access to credentials and recover services.
- Blast radius reduction: Vault is “too critical” to be dependent on the same control plane it’s securing.
So Vault runs on a separate substrate (Raspberry Pi + Docker + SSD storage), and Kubernetes consumes secrets from it.
High-level architecture
- Node A (RPi4): Vault server + Raft storage on SSD (leader most of the time)
- Node B (RPi3/RPi4): Vault server + Raft follower (standby)
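Expressed as an Ansible inventory, the two-node topology could look like this (the group name, hostnames, and IPs are all placeholders of mine, not part of the role):

```yaml
# inventory/vault.yml -- hypothetical two-node inventory
vault:
  hosts:
    vault-a:                        # Node A (RPi4), usual Raft leader
      ansible_host: CHANGEME_NODE_A_IP
    vault-b:                        # Node B (RPi3/RPi4), Raft follower/standby
      ansible_host: CHANGEME_NODE_B_IP
```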
Deployment approach
I deploy Vault using an Ansible role called rpivaultoss:
- Runs Vault in Docker (hashicorp/vault:1.21)
- Uses bind mounts (my standard) to keep data/config on the SSD
- Uses integrated storage (Raft) for HA and replication
- Has an optional init step, only executed on the “leader” host
- Auto-detects the vault user UID/GID inside the container image to avoid permission pain with bind mounts
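Wiring the role into a play is then a one-liner. A minimal sketch, assuming the inventory group is called vault (the playbook filename and group name are my own placeholders):

```yaml
# vault.yml -- applies the rpivaultoss role to the Pi nodes
- name: Deploy Vault OSS on Raspberry Pi
  hosts: vault
  become: true
  roles:
    - rpivaultoss
```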
Role defaults
Below is the role's defaults/main.yml, with IPs/hosts/users replaced by CHANGEME_* placeholders:
---
# defaults file for rpivaultoss
# ARM64-compatible image
vault_docker_image: "hashicorp/vault:1.21"
vault_config_read_only: false
vault_force_recreate: true # force container recreation if something relevant changes
# Names and paths
vault_container_name: "vault"
vault_network_name: "vault-net"
vault_container_command: ["server", "-config=/vault/config/config.hcl"]
vault_env_extra:
  SKIP_SETCAP: "true"
  SKIP_CHOWN: "true"
  VAULT_DISABLE_MLOCK: "true"
  VAULT_LOCAL_CONFIG: "" # force empty -> entrypoint does NOT append extra config
# Bind mounts (my standard)
vault_base_dir: "/mnt/ssd/vault"
vault_config_dir: "{{ vault_base_dir }}/config"
vault_data_dir: "{{ vault_base_dir }}/data"
# Published ports (host:container)
vault_published_ports:
  - "8200:8200" # API/UI
  # - "8201:8201" # cluster port (usually not necessary to expose)
# Listener and API/cluster addr
vault_listen_address: "0.0.0.0:8200"
vault_tls_disable: true # set to false and add TLS when ready
vault_ui: true
# Default host IP calculation (override in host_vars if desired)
vault_host_ip: "{{ ansible_default_ipv4.address | default('127.0.0.1') }}"
vault_api_addr: "http://{{ vault_host_ip }}:8200"
vault_cluster_addr: "http://{{ vault_host_ip }}:8201"
# Node identity and join (Raft HA)
vault_node_id: "{{ inventory_hostname }}"
vault_retry_join: [] # [{ leader_api_addr: "http://CHANGEME_LEADER_IP:8200" }]
# Healthcheck
vault_healthcheck_enable: true
vault_healthcheck:
  test: ["CMD-SHELL", "vault status -address=http://127.0.0.1:8200 >/dev/null 2>&1 || exit 1"]
  interval: "10s"
  timeout: "5s"
  retries: 8
  start_period: "10s"
# Restart policy
vault_restart_policy: "always"
# Optional init (only on leader and only once)
vault_do_init: false
vault_key_shares: 5
vault_key_threshold: 3
vault_init_dir_controller: "~/.cache/vault/{{ inventory_hostname }}"
vault_init_json_controller: "{{ vault_init_dir_controller }}/init.json"
vault_container_user: "root"
vault_capabilities:
  - "IPC_LOCK"
# Internal port (derived from listen address)
vault_internal_port: "{{ vault_listen_address.split(':')[-1] | int | default(8200) }}"
# Host port (if you publish 8200:8200)
vault_host_port: "{{ (vault_published_ports|default(['8200:8200']))[0].split(':')[0] | int }}"
...
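As a concrete example of per-host overrides: the leader enables the one-shot init, and the follower points retry_join at the leader's API address. A sketch with placeholder file names and IPs (these exact values are not part of the role):

```yaml
# host_vars/CHANGEME_LEADER_HOST.yml
vault_do_init: true

# host_vars/CHANGEME_FOLLOWER_HOST.yml
vault_retry_join:
  - leader_api_addr: "http://CHANGEME_LEADER_IP:8200"
```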
Key implementation detail: UID/GID discovery for bind mounts
When you bind-mount directories, permissions can easily break because the container runs as a specific user (vault) with a UID/GID that may not match your host.
This task discovers the UID/GID inside the image and sets facts so directories are created with correct ownership:
---
# Discover the UID/GID for the 'vault' user in the Vault image
- name: Get vault UID
  ansible.builtin.command: >
    docker run --rm {{ vault_docker_image }} sh -lc 'id -u vault'
  register: _vault_uid_cmd
  changed_when: false

- name: Get vault GID
  ansible.builtin.command: >
    docker run --rm {{ vault_docker_image }} sh -lc 'id -g vault'
  register: _vault_gid_cmd
  changed_when: false

- name: Set facts for vault uid/gid (with sane fallback to 100)
  ansible.builtin.set_fact:
    vault_uid_num: "{{ (_vault_uid_cmd.stdout | default('100')) | int }}"
    vault_gid_num: "{{ (_vault_gid_cmd.stdout | default('100')) | int }}"
...
This small step removes a lot of friction when running Vault with bind mounts. There may be a better alternative, but that's how I handle it for now.
Vault init flow (optional, leader-only)
These tasks:
- Waits for Vault to be reachable
- Checks status in JSON
- If uninitialized and vault_do_init: true, runs vault operator init
- Stores init.json on the controller machine (not on the node)
---
- name: Wait for Vault to listen on published port (host)
  ansible.builtin.wait_for:
    host: "127.0.0.1"
    port: "{{ vault_host_port }}"
    timeout: 60

- name: Query Vault status
  ansible.builtin.command: >
    docker exec {{ vault_container_name }}
    vault status -format=json -address=http://127.0.0.1:8200
  register: vault_status_json
  changed_when: false
  failed_when: false

# vault status exits 0 (unsealed), 2 (sealed or uninitialized), 1 (error);
# it still prints JSON on exit code 2, so parse whenever stdout is non-empty
- name: Parse status
  ansible.builtin.set_fact:
    vault_status: "{{ vault_status_json.stdout | from_json }}"
  when: (vault_status_json.stdout | default('')) | length > 0

# Decide from the parsed flag, not the exit code: a sealed-but-initialized
# node also exits non-zero, and must NOT be re-initialized
- name: Decide if it needs init
  ansible.builtin.set_fact:
    vault_needs_init: "{{ not ((vault_status | default({})).initialized | default(false)) }}"

- name: Initialize Vault (only if needed)
  when: vault_do_init and vault_needs_init | default(false)
  block:
    - name: Run vault operator init
      ansible.builtin.command: >
        docker exec {{ vault_container_name }}
        vault operator init
        -key-shares={{ vault_key_shares }}
        -key-threshold={{ vault_key_threshold }}
        -format=json
        -address=http://127.0.0.1:{{ vault_internal_port }}
      register: vault_init_out
      changed_when: true

    - name: Create local directory {{ vault_init_dir_controller }} to store init.json
      delegate_to: localhost
      ansible.builtin.file:
        path: "{{ vault_init_dir_controller }}"
        state: directory
        mode: "0700"

    - name: Save init.json locally (controller) to {{ vault_init_json_controller }}
      delegate_to: localhost
      ansible.builtin.copy:
        dest: "{{ vault_init_json_controller }}"
        content: "{{ vault_init_out.stdout }}"
        mode: "0600"
...
Operational note: initialization is only part of the story. Unseal and secure handling of recovery keys are critical and should be handled with care (offline storage, password manager, etc.).
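For reference, the init.json written by vault operator init -format=json contains the unseal key shares (unseal_keys_b64) and the initial root token (root_token). A small sketch, not part of the role, that pulls out just enough shares to meet the threshold before the file is moved to safe storage:

```python
import json

def unseal_material(init_json: str, threshold: int = 3):
    """Extract the first `threshold` unseal keys and the root token
    from the JSON emitted by `vault operator init -format=json`."""
    data = json.loads(init_json)
    return data["unseal_keys_b64"][:threshold], data["root_token"]

# Fabricated init.json payload for illustration (real keys are much longer)
sample = json.dumps({
    "unseal_keys_b64": ["k1", "k2", "k3", "k4", "k5"],
    "unseal_keys_hex": ["6b31", "6b32", "6b33", "6b34", "6b35"],
    "root_token": "hvs.CHANGEME",
})

keys, token = unseal_material(sample, threshold=3)
print(keys)   # the first 3 of the 5 shares
print(token)
```

Each of those shares would then be fed to vault operator unseal (one call per share) after a restart.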
Vault container setup (Docker network, dirs, config, container)
This task set:
- Creates a Docker network
- Creates the bind-mount directories with correct ownership
- Renders config.hcl
- Runs the Vault container with:
  - explicit entrypoint vault
  - explicit command server -config=...
  - bind mounts /vault/config and /vault/data
---
- name: Create Docker network for Vault {{ vault_network_name }}
  community.docker.docker_network:
    name: "{{ vault_network_name }}"
    state: present
  tags: [vault_network]

- name: Create Vault directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0755"
  loop:
    - "{{ vault_base_dir }}"
    - "{{ vault_config_dir }}"
    - "{{ vault_data_dir }}"
  tags: [vault_dirs]

- name: Render config.hcl
  ansible.builtin.template:
    src: "config.hcl.j2"
    dest: "{{ vault_config_dir }}/config.hcl"
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0644"
  notify: Restart vault container
  tags: [vault_config]

# Ensure data dir ownership matches the image user (in case it already existed)
- name: Ensure ownership of data dir for vault image user
  ansible.builtin.file:
    path: "{{ vault_data_dir }}"
    state: directory
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0750"
  tags: [vault_dirs]

# Ensure config dir ownership too (avoid warnings)
- name: Ensure ownership of config dir for vault image user
  ansible.builtin.file:
    path: "{{ vault_config_dir }}"
    state: directory
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0755"
  tags: [vault_dirs]

- name: Ensure Vault container is running (ENTRYPOINT override)
  community.docker.docker_container:
    name: "{{ vault_container_name }}"
    image: "{{ vault_docker_image }}"
    state: started
    restart_policy: "{{ vault_restart_policy }}"
    pull: true
    user: "{{ vault_uid_num }}:{{ vault_gid_num }}"
    capabilities: "{{ vault_capabilities }}"
    networks:
      - name: "{{ vault_network_name }}"
    published_ports: "{{ vault_published_ports }}"
    volumes:
      - "{{ vault_config_dir }}:/vault/config{{ ':ro' if vault_config_read_only else '' }}"
      - "{{ vault_data_dir }}:/vault/data"
    # Run Vault directly (no image entrypoint logic)
    entrypoint:
      - "vault"
    command:
      - "server"
      - "-config=/vault/config/config.hcl"
    env: "{{ {'VAULT_ADDR': 'http://127.0.0.1:' ~ vault_internal_port} | combine(vault_env_extra) }}"
    # Honors the role toggle; set vault_healthcheck_enable: false while iterating/debugging
    healthcheck: "{{ vault_healthcheck if vault_healthcheck_enable else omit }}"
    recreate: "{{ vault_force_recreate }}"
...
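The Render config.hcl task above notifies a Restart vault container handler, which isn't shown in this post. A minimal handlers/main.yml sketch that would satisfy it (using docker_container's restart flag to force a stop/start):

```yaml
# handlers/main.yml
- name: Restart vault container
  community.docker.docker_container:
    name: "{{ vault_container_name }}"
    state: started
    restart: true
```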
Vault config template
This is where Raft and HA are defined. The follower uses retry_join to reach the leader:
listener "tcp" {
  address     = "{{ vault_listen_address }}"
  tls_disable = {{ 1 if vault_tls_disable else 0 }}
{% if not vault_tls_disable %}
  # tls_cert_file = "/vault/config/tls/server.crt"
  # tls_key_file  = "/vault/config/tls/server.key"
{% endif %}
}

storage "raft" {
  path    = "/vault/data"
  node_id = "{{ vault_node_id }}"
{% for join in vault_retry_join %}
  retry_join {
    leader_api_addr = "{{ join.leader_api_addr }}"
{% if join.ca_cert_file is defined %}
    leader_ca_cert_file = "{{ join.ca_cert_file }}"
{% endif %}
  }
{% endfor %}
}

api_addr      = "{{ vault_api_addr }}"
cluster_addr  = "{{ vault_cluster_addr }}"
ui            = {{ 'true' if vault_ui else 'false' }}
disable_mlock = true
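For a follower with a single retry_join entry, the template renders to something like this (IPs and node_id are placeholders):

```hcl
listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1
}

storage "raft" {
  path    = "/vault/data"
  node_id = "CHANGEME_FOLLOWER_HOST"
  retry_join {
    leader_api_addr = "http://CHANGEME_LEADER_IP:8200"
  }
}

api_addr      = "http://CHANGEME_FOLLOWER_IP:8200"
cluster_addr  = "http://CHANGEME_FOLLOWER_IP:8201"
ui            = true
disable_mlock = true
```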
With that in place I can set up Vault as code whenever I need to. For now the role is run with plain Ansible, but the idea is to eventually drive it from an AWX instance.
After that, we recover the init.json produced during initialization, store the root token and unseal keys somewhere safe, and we're ready to do some interesting stuff.