Deploy a homelab HA Vault on Raspberry Pi

High-availability Vault on Raspberry Pi using an Ansible role and Docker, plus the rationale for running Vault outside Kubernetes for cluster bootstrapping.

Vault OSS on a Raspberry Pi (outside Kubernetes) — bootstrap-friendly HA with Ansible

In my homelab, Vault is a foundational dependency: it feeds credentials to GitOps (Argo CD), External Secrets Operator, CI/CD, and various automation workflows. (I started with Sealed Secrets, which is great, but it requires regenerating every secret whenever you reinstall the cluster, and re-sealing every time you push a new secret, which makes things a little cumbersome.)

That’s exactly why I decided to run Vault outside my Kubernetes cluster — on dedicated Raspberry Pi nodes — so I can bootstrap Kubernetes from scratch (or recover it) without a chicken-and-egg dependency on “Vault running inside the cluster that needs Vault to start”.

This post documents the approach and the Ansible role I use to deploy Vault OSS in Docker on Raspberry Pi, using integrated storage (Raft) and a simple HA topology.


Why Vault outside Kubernetes

Running Vault inside Kubernetes is totally possible, but it introduces a few practical problems in a homelab:

  • Bootstrap problem: if the cluster needs Vault to fetch secrets (ESO, GitOps), but Vault is deployed by the same cluster, you end up with circular dependencies.
  • Disaster recovery: if Kubernetes goes down (storage/ingress/node issues), you still want Vault reachable to restore access to credentials and recover services.
  • Blast radius reduction: Vault is “too critical” to be dependent on the same control plane it’s securing.

So Vault runs on a separate substrate (Raspberry Pi + Docker + SSD storage), and Kubernetes consumes secrets from it.
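
As a sketch of what "Kubernetes consumes secrets from it" looks like, an External Secrets Operator ClusterSecretStore can point at the external Vault. The address, KV path, and token-based auth below are illustrative assumptions, not part of my actual setup:

```yaml
# Hypothetical ESO ClusterSecretStore pointing at the external Vault.
# Server address, KV mount path, and the token Secret are placeholders.
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-homelab
spec:
  provider:
    vault:
      server: "http://CHANGEME_VAULT_IP:8200"
      path: "secret"        # KV mount
      version: "v2"         # KV v2 engine
      auth:
        tokenSecretRef:
          name: vault-token
          namespace: external-secrets
          key: token
```

In practice you would likely prefer Kubernetes auth over a static token, but the point stands: the store lives outside the cluster, so ESO keeps working across cluster rebuilds.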


High-level architecture

  • Node A (RPi4): Vault server + Raft storage on SSD (leader most of the time)
  • Node B (RPi3/RPi4): Vault server + Raft follower (standby)
```mermaid
flowchart LR
  subgraph L["LAN"]
    A["Raspberry Pi (Node A)\nVault + Raft (leader)\n:8200"] --- B["Raspberry Pi (Node B)\nVault + Raft (standby)\n:8200"]
    K["Kubernetes cluster\n(Argo CD / ESO / Apps)"] -->|reads secrets| A
    K -->|reads secrets| B
  end
```

Deployment approach

I deploy Vault using an Ansible role called rpivaultoss:

  • Runs Vault in Docker (the hashicorp/vault image; the tag is set via vault_docker_image)
  • Uses bind mounts (my standard) to keep data/config on the SSD
  • Uses integrated storage (Raft) for HA and replication
  • Has an optional init step, only executed on the “leader” host
  • Auto-detects the vault user UID/GID inside the container image to avoid permission pain with bind mounts
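
Applying the role is a one-play affair. The playbook and group name below are illustrative, not taken from my actual inventory:

```yaml
# site.yml — illustrative; the vault_nodes group name is an assumption
- name: Deploy Vault OSS on the Pi nodes
  hosts: vault_nodes
  become: true
  roles:
    - rpivaultoss
```

The leader/follower distinction lives entirely in host_vars: the leader host sets vault_do_init: true, the followers set vault_retry_join.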

Role defaults

Below is the role defaults/main.yml, with:

  • IPs/hosts/users replaced by CHANGEME_*
---
# defaults file for rpivaultoss

# ARM64-compatible image
vault_docker_image: "hashicorp/vault:1.21"
vault_config_read_only: false
vault_force_recreate: true           # force container recreation if something relevant changes

# Names and paths
vault_container_name: "vault"
vault_network_name: "vault-net"

vault_container_command: ["server", "-config=/vault/config/config.hcl"]
vault_env_extra:
  SKIP_SETCAP: "true"
  SKIP_CHOWN: "true"
  VAULT_DISABLE_MLOCK: "true"
  VAULT_LOCAL_CONFIG: ""             # force empty -> entrypoint does NOT append extra config

# Bind mounts (my standard)
vault_base_dir: "/mnt/ssd/vault"
vault_config_dir: "{{ vault_base_dir }}/config"
vault_data_dir:  "{{ vault_base_dir }}/data"

# Published ports (host:container)
vault_published_ports:
  - "8200:8200"      # API/UI
#  - "8201:8201"    # cluster port (usually not necessary to expose)

# Listener and API/cluster addr
vault_listen_address: "0.0.0.0:8200"
vault_tls_disable: true              # set to false and add TLS when ready
vault_ui: true

# Default host IP calculation (override in host_vars if desired)
vault_host_ip: "{{ ansible_default_ipv4.address | default('127.0.0.1') }}"
vault_api_addr: "http://{{ vault_host_ip }}:8200"
vault_cluster_addr: "http://{{ vault_host_ip }}:8201"

# Node identity and join (Raft HA)
vault_node_id: "{{ inventory_hostname }}"
vault_retry_join: []                 # [{ leader_api_addr: "http://CHANGEME_LEADER_IP:8200" }]

# Healthcheck
vault_healthcheck_enable: true
vault_healthcheck:
  test: ["CMD-SHELL", "vault status -address=http://127.0.0.1:8200 >/dev/null 2>&1 || exit 1"]
  interval: "10s"
  timeout: "5s"
  retries: 8
  start_period: "10s"

# Restart policy
vault_restart_policy: "always"

# Optional init (only on leader and only once)
vault_do_init: false
vault_key_shares: 5
vault_key_threshold: 3
vault_init_dir_controller: "~/.cache/vault/{{ inventory_hostname }}"
vault_init_json_controller: "{{ vault_init_dir_controller }}/init.json"

vault_container_user: "root"
vault_capabilities:
  - "IPC_LOCK"

# Internal port (derived from listen address)
vault_internal_port: "{{ vault_listen_address.split(':')[-1] | default('8200') | int }}"
# Host port (if you publish 8200:8200)
vault_host_port: "{{ (vault_published_ports|default(['8200:8200']))[0].split(':')[0] | int }}"
...
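
To make the HA wiring concrete, here is a sketch of the per-host overrides. File names and hostnames are placeholders:

```yaml
# host_vars/CHANGEME_FOLLOWER.yml — follower joins the leader via Raft
vault_retry_join:
  - leader_api_addr: "http://CHANGEME_LEADER_IP:8200"

# host_vars/CHANGEME_LEADER.yml — leader runs the one-time init step
vault_do_init: true
```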

Key implementation detail: UID/GID discovery for bind mounts

When you bind-mount directories, permissions can easily break because the container runs as a specific user (vault) with a UID/GID that may not match your host.

This task discovers the UID/GID inside the image and sets facts so directories are created with correct ownership:

---
# Discover the UID/GID for the 'vault' user in the Vault image
- name: Get vault UID
  ansible.builtin.command: >
    docker run --rm {{ vault_docker_image }} sh -lc 'id -u vault'
  register: _vault_uid_cmd
  changed_when: false

- name: Get vault GID
  ansible.builtin.command: >
    docker run --rm {{ vault_docker_image }} sh -lc 'id -g vault'
  register: _vault_gid_cmd
  changed_when: false

- name: Set facts for vault uid/gid (with sane fallback to 100). Detected {{ _vault_uid_cmd.stdout }}
  ansible.builtin.set_fact:
    # default('100', true) also covers an empty stdout, not just an undefined one
    vault_uid_num: "{{ _vault_uid_cmd.stdout | default('100', true) | int }}"
    vault_gid_num: "{{ _vault_gid_cmd.stdout | default('100', true) | int }}"
...

This small step removes a lot of friction when running Vault with bind mounts. There may be a better alternative, but that's how I handle it for now.

Vault init flow (optional, leader-only)

This task:

  • Waits for Vault to be reachable
  • Checks status in JSON
  • If uninitialized and vault_do_init: true, runs vault operator init
  • Stores init.json on the controller machine (not on the node)
---
- name: Wait for Vault to listen on published port (host)
  ansible.builtin.wait_for:
    host: "127.0.0.1"
    port: "{{ vault_host_port }}"
    timeout: 60

- name: Query Vault status
  ansible.builtin.command: >
    docker exec {{ vault_container_name }}
    vault status -format=json -address=http://127.0.0.1:8200
  register: vault_status_json
  changed_when: false
  failed_when: false

# vault status exits non-zero when sealed but still prints valid JSON,
# so parse whenever there is output instead of relying on rc == 0
- name: Parse status
  ansible.builtin.set_fact:
    vault_status: "{{ vault_status_json.stdout | from_json }}"
  when: vault_status_json.stdout | default('') | length > 0

- name: Decide if it needs init
  ansible.builtin.set_fact:
    vault_needs_init: true
  when: vault_status is not defined or (vault_status.initialized is defined and not vault_status.initialized)

- name: Initialize Vault (only if needed)
  when: vault_do_init and vault_needs_init | default(false)
  block:
    - name: Run vault operator init
      ansible.builtin.command: >
        docker exec {{ vault_container_name }}
        vault operator init
        -key-shares={{ vault_key_shares }}
        -key-threshold={{ vault_key_threshold }}
        -format=json
        -address=http://127.0.0.1:{{ vault_internal_port }}
      register: vault_init_out
      changed_when: true

    - name: Create local directory {{ vault_init_dir_controller }} to store init.json
      delegate_to: localhost
      ansible.builtin.file:
        path: "{{ vault_init_dir_controller }}"
        state: directory
        mode: "0700"

    - name: Save init.json locally (controller) to {{ vault_init_json_controller }}
      delegate_to: localhost
      ansible.builtin.copy:
        dest: "{{ vault_init_json_controller }}"
        content: "{{ vault_init_out.stdout }}"
        mode: "0600"
...

Operational note: initialization is only part of the story. Unseal and secure handling of recovery keys are critical and should be handled with care (offline storage, password manager, etc.).
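
As a hedged sketch of that unseal step (not part of the role), the saved init.json could be read back from the controller and fed to vault operator unseal. In a real deployment you would more likely unseal manually or use auto-unseal; this only illustrates the data flow, reusing the role's variable names:

```yaml
# Illustrative unseal tasks — an assumption, not part of the role.
- name: Load init.json from the controller
  ansible.builtin.set_fact:
    vault_init: "{{ lookup('file', vault_init_json_controller) | from_json }}"

# vault operator init's JSON output exposes the keys as unseal_keys_b64;
# feeding the first vault_key_threshold keys is enough to unseal
- name: Unseal with the first {{ vault_key_threshold }} keys
  ansible.builtin.command: >
    docker exec {{ vault_container_name }}
    vault operator unseal -address=http://127.0.0.1:{{ vault_internal_port }} {{ item }}
  loop: "{{ vault_init.unseal_keys_b64[:vault_key_threshold] }}"
  no_log: true
```

no_log keeps the unseal keys out of the Ansible output and logs.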

Vault container setup (Docker network, dirs, config, container)

This task set:

  • Creates a Docker network
  • Creates the bind-mount directories with correct ownership
  • Renders config.hcl
  • Runs Vault container with:
    • explicit entrypoint vault
    • explicit command server -config=...
    • bind mounts /vault/config and /vault/data
---
- name: Create Docker network for Vault {{ vault_network_name }}
  community.docker.docker_network:
    name: "{{ vault_network_name }}"
    state: present
  tags: [vault_network]

- name: Create Vault directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0755"
  loop:
    - "{{ vault_base_dir }}"
    - "{{ vault_config_dir }}"
    - "{{ vault_data_dir }}"
  tags: [vault_dirs]

- name: Render config.hcl
  ansible.builtin.template:
    src: "config.hcl.j2"
    dest: "{{ vault_config_dir }}/config.hcl"
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0644"
  notify: Restart vault container
  tags: [vault_config]

# Ensure data dir ownership matches the image user (in case it already existed)
- name: Ensure ownership of data dir for vault image user
  ansible.builtin.file:
    path: "{{ vault_data_dir }}"
    state: directory
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0750"
  tags: [vault_dirs]

# Ensure config dir ownership too (avoid warnings)
- name: Ensure ownership of config dir for vault image user
  ansible.builtin.file:
    path: "{{ vault_config_dir }}"
    state: directory
    owner: "{{ vault_uid_num | default(100) }}"
    group: "{{ vault_gid_num | default(100) }}"
    mode: "0755"
  tags: [vault_dirs]

- name: Ensure Vault container is running (ENTRYPOINT override)
  community.docker.docker_container:
    name: "{{ vault_container_name }}"
    image: "{{ vault_docker_image }}"
    state: started
    restart_policy: "{{ vault_restart_policy }}"
    pull: yes
    user: "{{ vault_uid_num }}:{{ vault_gid_num }}"
    capabilities: ["IPC_LOCK"]
    networks:
      - name: "{{ vault_network_name }}"
    published_ports: "{{ vault_published_ports }}"
    volumes:
      - "{{ vault_config_dir }}:/vault/config{{ ':ro' if vault_config_read_only else '' }}"
      - "{{ vault_data_dir }}:/vault/data"
    # Run Vault directly (no image entrypoint logic)
    entrypoint:
      - "vault"
    command:
      - "server"
      - "-config=/vault/config/config.hcl"
    env: "{{ {'VAULT_ADDR': 'http://127.0.0.1:' ~ vault_internal_port } | combine(vault_env_extra) }}"
    # Wire the healthcheck from the role defaults (toggle with vault_healthcheck_enable)
    healthcheck: "{{ vault_healthcheck if vault_healthcheck_enable else omit }}"
    recreate: true
...
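
The config task above notifies a "Restart vault container" handler that isn't shown; a minimal version (a sketch, kept deliberately dumb) could be:

```yaml
# handlers/main.yml — minimal handler matching the notify on the config task
- name: Restart vault container
  ansible.builtin.command: docker restart {{ vault_container_name }}
  changed_when: true
```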

Vault config template

This is where Raft and HA are defined. The follower uses retry_join to join the leader.

listener "tcp" {
  address     = "{{ vault_listen_address }}"
  tls_disable = {{ 1 if vault_tls_disable else 0 }}
  {% if not vault_tls_disable %}
  # tls_cert_file = "/vault/config/tls/server.crt"
  # tls_key_file  = "/vault/config/tls/server.key"
  {% endif %}
}

storage "raft" {
  path    = "/vault/data"
  node_id = "{{ vault_node_id }}"
  {% for join in vault_retry_join %}
  retry_join {
    leader_api_addr = "{{ join.leader_api_addr }}"
    {% if join.ca_cert_file is defined %}
    leader_ca_cert_file = "{{ join.ca_cert_file }}"
    {% endif %}
  }
  {% endfor %}
}

api_addr     = "{{ vault_api_addr }}"
cluster_addr = "{{ vault_cluster_addr }}"

ui = {{ 'true' if vault_ui else 'false' }}
disable_mlock = true
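
For reference, on a follower with one retry_join entry the rendered config.hcl would look roughly like this (node IDs and IPs are placeholders):

```hcl
listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1
}

storage "raft" {
  path    = "/vault/data"
  node_id = "CHANGEME_FOLLOWER"
  retry_join {
    leader_api_addr = "http://CHANGEME_LEADER_IP:8200"
  }
}

api_addr     = "http://CHANGEME_FOLLOWER_IP:8200"
cluster_addr = "http://CHANGEME_FOLLOWER_IP:8201"

ui = true
disable_mlock = true
```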

With that in place, I can set up Vault as code whenever I need to. For now it's driven by the plain Ansible engine, but the idea is to eventually run it from an AWX instance.

After that, we recover the initial credentials JSON, store the root token and unseal keys somewhere safe (password manager, offline storage), and we're ready to do some interesting things.