Syncing Files among 3 Nodes with Ansible
Imagine a scenario:
There are three nodes, and each of them runs a daemon. The daemon generates a token file named after its hostname under a specific location, say `/tmp/node-01`. To form a fully functional cluster, the daemon on each node needs to know the tokens of the other two nodes. The only way is to sync those token files across the three nodes so that every node has all three token files, including the one it generated itself.
The Environment
Let’s say the hostnames of the three nodes are `node-01`, `node-02`, and `node-03`. For demonstration, I use Harvester to create the VMs; you can use whatever you have at hand. If you already have a similar setup, just skip this section. Note that there’s one more VM called `tower`: it is the place where we run Ansible.
The token files generated by the daemon will be `/tmp/node-01`, `/tmp/node-02`, and `/tmp/node-03` respectively.
Prerequisites
First things first, install Ansible on the `tower` machine:
```shell
$ sudo apt update
$ sudo apt install -y ansible
$ ansible --version
ansible 2.9.6
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/ubuntu/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.8.10 (default, Mar 15 2022, 12:22:08) [GCC 9.4.0]
```
Create the inventory file `hosts.yml`:
```yaml
---
all:
  hosts:
    node-01:
      ansible_host: 10.52.0.124
      ansible_user: ubuntu
    node-02:
      ansible_host: 10.52.0.125
      ansible_user: ubuntu
    node-03:
      ansible_host: 10.52.0.126
      ansible_user: ubuntu
  vars:
    ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
```
Now validate the inventory file we just created and also test the connectivity between the target nodes and the Ansible host.
```shell
$ ansible -i hosts.yml -m ping all
node-01 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
node-03 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
node-02 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
```
We’re good to go!
Setting Up Demo Scenario
Since there’s no such daemon (it exists only in our imagination), we have to generate the token files ourselves. Let’s do this with Ansible, too! Create the playbook `setup.yml`:
```yaml
---
- name: Set up demo environment
  hosts: all
  tasks:
    - name: Generate the token
      ansible.builtin.set_fact:
        token: "{{ lookup('ansible.builtin.password', '/dev/null') }}"
    - name: Print the token
      ansible.builtin.debug:
        msg: "{{ token }}"
    - name: Create the token file
      ansible.builtin.copy:
        dest: "/tmp/{{ inventory_hostname }}"
        content: "{{ token }}"
```
Then run the playbook:
```shell
$ ansible-playbook -i hosts.yml setup.yml

PLAY [Set up demo environment] *****************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [node-03]
ok: [node-02]
ok: [node-01]

TASK [Generate the token] **********************************************************************************************
ok: [node-01]
ok: [node-02]
ok: [node-03]

TASK [Print the token] *************************************************************************************************
ok: [node-01] => {
    "changed": false,
    "msg": "Awp5JzuQeAyvsSDu6l2a"
}
ok: [node-02] => {
    "changed": false,
    "msg": ",ktiL0H0:YSYBNmiCREy"
}
ok: [node-03] => {
    "changed": false,
    "msg": "bLn7A87q4aqLLzSVlC02"
}

TASK [Create the token file] *******************************************************************************************
changed: [node-02]
changed: [node-01]
changed: [node-03]

PLAY RECAP *************************************************************************************************************
node-01                    : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-02                    : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-03                    : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
```
In case you want to verify that the tokens are actually deployed on the right node and at the right path:

```shell
$ ssh -o StrictHostKeyChecking=no 10.52.0.124 cat /tmp/node-01; echo
Awp5JzuQeAyvsSDu6l2a
$ ssh -o StrictHostKeyChecking=no 10.52.0.125 cat /tmp/node-02; echo
,ktiL0H0:YSYBNmiCREy
$ ssh -o StrictHostKeyChecking=no 10.52.0.126 cat /tmp/node-03; echo
bLn7A87q4aqLLzSVlC02
```
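A caveat before moving on: the `password` lookup produces a fresh random string on every run, so re-running `setup.yml` silently replaces all the tokens. If you want the setup playbook to be idempotent, here is a minimal sketch using `copy`'s `force` parameter (everything else as in the task above):

```yaml
# Sketch: keep an existing token file instead of overwriting it on re-runs.
- name: Create the token file only if it does not exist yet
  ansible.builtin.copy:
    dest: "/tmp/{{ inventory_hostname }}"
    content: "{{ token }}"
    force: no   # leave the destination untouched if it already exists
```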
It’s time for the real work: file synchronization.
Syncing Files with Ansible
Then how do we achieve our goal with Ansible? Basically, we utilize Ansible's synchronize module, and there are two ways of doing that:
- The centralized way
- The ad-hoc way

Each way has its pros and cons; I'll explain them respectively in the following sections.
The Centralized Way
I believe it’s much easier to understand how this works with a diagram.
First, we pull the token files from all the nodes into a local directory on the Ansible host. Then we push the fetched token files back to the nodes. In the end, all the nodes have all the token files. We’ll deliberately skip the “push to itself” part.
Create the playbook `centralized-sync.yml`:
```yaml
---
- name: Centralized sync
  hosts: all
  tasks:
    - name: Create buffer directory
      local_action:
        module: ansible.builtin.file
        path: buffer
        state: directory
      run_once: yes
    - name: Pull the token file to localhost
      synchronize:
        src: "/tmp/{{ inventory_hostname }}"
        dest: "buffer/{{ inventory_hostname }}"
        mode: pull
    - name: Push back token files to each node
      synchronize:
        src: "buffer/{{ item }}"
        dest: "/tmp/{{ item }}"
        mode: push
      loop: "{{ groups['all'] }}"
      when: item != inventory_hostname
```
As you can see, we create a `buffer` directory on the Ansible host for collecting the token files. After that, we initiate connections and pull the token file from each node into the `buffer` directory with the synchronize module. Finally, we push all the token files back to each node. Note the last line: when the name of the token file and the name of the node are the same, the item is simply skipped and we move straight to the next iteration.
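As an aside, the same pull-then-push pattern does not strictly require `rsync` to be installed on the nodes. A sketch of the two sync tasks rewritten with the agentless core modules `fetch` and `copy` (note `flat: yes`, which stops `fetch` from nesting the file under a per-host directory):

```yaml
# Sketch: the centralized pattern with fetch/copy instead of synchronize.
- name: Pull the token file to localhost
  ansible.builtin.fetch:
    src: "/tmp/{{ inventory_hostname }}"
    dest: "buffer/{{ inventory_hostname }}"
    flat: yes    # store as buffer/<host>, not buffer/<host>/tmp/<host>

- name: Push back token files to each node
  ansible.builtin.copy:
    src: "buffer/{{ item }}"
    dest: "/tmp/{{ item }}"
  loop: "{{ groups['all'] }}"
  when: item != inventory_hostname
```

The trade-off is that `fetch`/`copy` always move the whole file over the Ansible connection, while `synchronize` delegates to `rsync` and only transfers deltas.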
To execute the playbook we’ve just created:
```shell
$ ansible-playbook -i hosts.yml centralized-sync.yml

PLAY [Centralized sync] ************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [node-02]
ok: [node-03]
ok: [node-01]

TASK [Create buffer directory] *****************************************************************************************
ok: [node-01 -> localhost]

TASK [Pull the token file to localhost] ********************************************************************************
changed: [node-02]
changed: [node-03]
changed: [node-01]

TASK [Push back token files to each node] ******************************************************************************
skipping: [node-01] => (item=node-01)
changed: [node-01] => (item=node-02)
changed: [node-02] => (item=node-01)
skipping: [node-02] => (item=node-02)
changed: [node-03] => (item=node-01)
changed: [node-02] => (item=node-03)
changed: [node-01] => (item=node-03)
changed: [node-03] => (item=node-02)
skipping: [node-03] => (item=node-03)

PLAY RECAP *************************************************************************************************************
node-01                    : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-02                    : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-03                    : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
```
To verify that each node has all the token files, you can `ssh` to each node and execute a `for` loop to show the file contents.

```shell
$ ssh -o StrictHostKeyChecking=no 10.52.0.124 'for i in {1..3}; do cat /tmp/node-0$i; echo; done'
Awp5JzuQeAyvsSDu6l2a
,ktiL0H0:YSYBNmiCREy
bLn7A87q4aqLLzSVlC02
$ ssh -o StrictHostKeyChecking=no 10.52.0.125 'for i in {1..3}; do cat /tmp/node-0$i; echo; done'
Awp5JzuQeAyvsSDu6l2a
,ktiL0H0:YSYBNmiCREy
bLn7A87q4aqLLzSVlC02
$ ssh -o StrictHostKeyChecking=no 10.52.0.126 'for i in {1..3}; do cat /tmp/node-0$i; echo; done'
Awp5JzuQeAyvsSDu6l2a
,ktiL0H0:YSYBNmiCREy
bLn7A87q4aqLLzSVlC02
```
This is exactly what we want. Now let’s take a look at the other method.
The Ad-hoc Way
As before, the diagram tells it all.
This time, we rely heavily on the `delegate_to` keyword, though it appears only once in the playbook shown below. The point is, we don’t want to install Ansible and execute the playbook on every node; if we did, what would be the difference between copying files manually with `scp` and using Ansible? What the `delegate_to` keyword does is delegate a task to the specified host while keeping the context of the current host, so the task can still reference the other hosts. In our scenario, as `node-01`, we want to fetch the token files on `node-02` and `node-03`. Then we go to `node-02` to fetch the token files on `node-01` and `node-03`. Finally, we fetch `node-01`’s and `node-02`’s token files from `node-03`’s perspective. This is exactly the kind of job `delegate_to` is suited for.
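If `delegate_to` is new to you, a quick way to build intuition is a task like the following (purely illustrative, not part of the sync playbook): every host iterates over all hosts, but the command executes on the delegated host, while variables such as `inventory_hostname` still refer to the host owning the loop iteration:

```yaml
# Illustration only: the command runs on the delegated host ({{ item }}),
# while the task context still belongs to the looping host.
- name: Show where the task actually runs
  ansible.builtin.command: hostname   # returns item's hostname, not inventory_hostname's
  delegate_to: "{{ item }}"
  loop: "{{ groups['all'] }}"
  when: item != inventory_hostname
```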
Create the playbook `adhoc-sync.yml`:
```yaml
---
- name: Ad-hoc sync
  hosts: all
  tasks:
    - name: Create ssh public key buffer directory
      local_action:
        module: ansible.builtin.file
        path: buffer/keys
        state: directory
      run_once: yes
    - name: Generate ssh keypair on each node
      ansible.builtin.user:
        name: "{{ ansible_user }}"
        generate_ssh_key: yes
    - name: Fetch all public keys from each node
      ansible.builtin.fetch:
        src: "/home/{{ ansible_user }}/.ssh/id_rsa.pub"
        dest: "buffer/keys/{{ inventory_hostname }}-id_rsa.pub"
        flat: yes
    - name: Assemble authorized keys from buffer
      local_action:
        module: ansible.builtin.assemble
        src: buffer/keys
        dest: buffer/keys/authorized_keys
      run_once: yes
    - name: Update authorized keys on each node
      ansible.builtin.blockinfile:
        block: "{{ lookup('file', 'buffer/keys/authorized_keys') }}"
        path: "/home/{{ ansible_user }}/.ssh/authorized_keys"
        backup: yes
        create: yes
        mode: 0600
        state: present
    - name: Synchronize files from each other
      ansible.builtin.synchronize:
        src: "/tmp/{{ inventory_hostname }}"
        dest: "/tmp/{{ inventory_hostname }}"
        mode: pull
      delegate_to: "{{ item }}"
      loop: "{{ groups['all'] }}"
      when: item != inventory_hostname
```
This time the playbook is quite lengthy, but the real work, the file synchronization itself, happens in the last task. For `rsync` to move the token files successfully, each node must be able to `ssh` to the other two nodes using SSH keys. So the first step is to generate an SSH key pair on every node and then distribute the public keys. We achieve this by collecting the public keys from each node, assembling them into a temporary file in the `buffer` directory, and updating `authorized_keys` on each node with the content of that temporary file.
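As an aside, if the `ansible.posix` collection is installed, the assemble-plus-blockinfile combination could be replaced with the `authorized_key` module, which manages individual keys idempotently. A sketch, assuming the per-host key files fetched by the previous task:

```yaml
# Sketch: authorize every node's public key on every node,
# one key at a time, instead of assembling a combined block.
- name: Authorize all collected public keys
  ansible.posix.authorized_key:
    user: "{{ ansible_user }}"
    key: "{{ lookup('file', 'buffer/keys/' + item + '-id_rsa.pub') }}"
    state: present
  loop: "{{ groups['all'] }}"
```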
To execute the playbook:
```shell
$ ansible-playbook -i hosts.yml adhoc-sync.yml

PLAY [Ad-hoc sync] *****************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [node-01]
ok: [node-02]
ok: [node-03]

TASK [Create ssh public key buffer directory] **************************************************************************
changed: [node-01 -> localhost]

TASK [Generate ssh keypair on each node] *******************************************************************************
changed: [node-01]
changed: [node-03]
changed: [node-02]

TASK [Fetch all public keys from each node] ****************************************************************************
changed: [node-03]
changed: [node-01]
changed: [node-02]

TASK [Assemble authorized keys from buffer] ****************************************************************************
changed: [node-01 -> localhost]

TASK [Update authorized keys on each node] *****************************************************************************
changed: [node-01]
changed: [node-02]
changed: [node-03]

TASK [Synchronize files from each other] *******************************************************************************
skipping: [node-01] => (item=node-01)
changed: [node-01 -> 10.52.0.125] => (item=node-02)
changed: [node-03 -> 10.52.0.124] => (item=node-01)
changed: [node-02 -> 10.52.0.124] => (item=node-01)
skipping: [node-02] => (item=node-02)
changed: [node-01 -> 10.52.0.126] => (item=node-03)
changed: [node-03 -> 10.52.0.125] => (item=node-02)
skipping: [node-03] => (item=node-03)
changed: [node-02 -> 10.52.0.126] => (item=node-03)

PLAY RECAP *************************************************************************************************************
node-01                    : ok=7    changed=6    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-02                    : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-03                    : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
```
Now all the nodes have all the token files. You can verify that with the `ssh` commands from the last section; I’ll skip the details for simplicity.
Pros and Cons
- The centralized way
  - Advantages
    - Simple logic (pull, then push)
    - No prerequisites
  - Drawbacks
    - Wastes traffic (outbound bandwidth can be expensive in a public cloud environment)
    - Security concerns (the traffic leaves the cluster network)
- The ad-hoc way
  - Advantages
    - Simple in architecture (direct sync among the nodes)
    - Performant for large file synchronization, not just small files like our tokens
    - More secure in terms of data leaks (the traffic remains inside the cluster network)
  - Drawbacks
    - Rather complicated logic (the `delegate_to` keyword can be hard to understand at the beginning)
    - Things to do beforehand (SSH public key distribution)

To put the traffic difference in numbers: with three nodes, the centralized way moves nine file copies through `tower` (three pulls plus six pushes), while the ad-hoc way needs only six direct transfers between the nodes.
Tearing Down Demo Scenario
Things have to be cleaned up after we’ve finished the demonstration. Create the playbook `teardown.yml`:
```yaml
---
- name: Tear down demo environment
  hosts: all
  tasks:
    - name: Remove buffered token files
      local_action:
        module: ansible.builtin.file
        path: "buffer/{{ item }}"
        state: absent
      loop: "{{ groups['all'] }}"
      run_once: yes
    - name: Remove combined authorized keys on each node
      ansible.builtin.blockinfile:
        block: "{{ lookup('file', 'buffer/keys/authorized_keys', errors='ignore') }}"
        path: "/home/{{ ansible_user }}/.ssh/authorized_keys"
        backup: yes
        state: absent
    - name: Remove buffered public keys
      local_action:
        module: ansible.builtin.file
        path: "buffer/keys/"
        state: absent
      run_once: yes
    - name: Remove the token file
      ansible.builtin.file:
        path: "/tmp/{{ item }}"
        state: absent
      loop: "{{ groups['all'] }}"
```
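Note that this teardown leaves the SSH key pairs generated by `adhoc-sync.yml` in place on the nodes. If you want a truly clean slate, an extra task along these lines could remove them (a sketch, assuming the default `id_rsa` file names produced by the `user` module):

```yaml
# Sketch: also remove the key pairs generated during the ad-hoc demo.
- name: Remove the generated ssh keypair
  ansible.builtin.file:
    path: "/home/{{ ansible_user }}/.ssh/{{ item }}"
    state: absent
  loop:
    - id_rsa
    - id_rsa.pub
```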
Run the playbook to clean up the intermediate files and the synced token files if you want to redo the demo; otherwise, you can wipe out the VMs directly.
```shell
$ ansible-playbook -i hosts.yml teardown.yml

PLAY [Tear down demo environment] **************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [node-02]
ok: [node-01]
ok: [node-03]

TASK [Remove buffered token files] *************************************************************************************
ok: [node-01 -> localhost] => (item=node-01)
ok: [node-01 -> localhost] => (item=node-02)
ok: [node-01 -> localhost] => (item=node-03)

TASK [Remove combined authorized keys on each node] ********************************************************************
changed: [node-02]
changed: [node-03]
changed: [node-01]

TASK [Remove buffered public keys] *************************************************************************************
changed: [node-01 -> localhost]

TASK [Remove the token file] *******************************************************************************************
changed: [node-01] => (item=node-01)
changed: [node-02] => (item=node-01)
changed: [node-03] => (item=node-01)
changed: [node-01] => (item=node-02)
changed: [node-02] => (item=node-02)
changed: [node-03] => (item=node-02)
changed: [node-01] => (item=node-03)
changed: [node-02] => (item=node-03)
changed: [node-03] => (item=node-03)

PLAY RECAP *************************************************************************************************************
node-01                    : ok=5    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-02                    : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node-03                    : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
```
References
- How to generate single reusable random password with ansible
- Controlling where tasks run: delegation and local actions - Ansible Documentation
- ansible synchronize: syncing folders (in Chinese)
- How to synchronize a file between two remote servers in Ansible?
- The best way to authorize ssh key of each node to all nodes in the cluster