selfhosted-apps-docker/esxi
2023-04-23 22:34:31 +02:00

Esxi

guide-by-example

logo

Purpose

Type 1 hypervisor hosting virtual machines, running straight on metal, managed through web GUI.

ESXi is made by VMware. It is a commercial product, part of vSphere, but a free version is available after registering on their site.
The free version is limited to 2 physical CPUs and 8 vCPUs per VM, and has no access to the API, which makes backing up VMs harder. For backups there is a workaround, the ghettoVCB script.

ESXi is also considered a bit picky when it comes to hardware. A natively supported network card and disk controller are not a given.

gui-pic

Basic settings

Password complexity

  • Host > Manage > System > Advanced settings > Password quality control
    retry=10 min=1,1,1,1,1

New user

  • Add new user
    Host > Manage > Security & users > Users > Add user
    Set name and password.
  • Set permissions for the new user
    Host > Actions > Permissions
    Set the same user as before, pick role

Interface

  • Right top corner > user name
    • Auto-refresh=60
    • Settings, turn off everything - statistics, recent only, welcome message, visual effects

NTP time sync

  • Host > Manage > System > Time & date > Edit NTP Settings
    • Use Network Time Protocol (enable NTP client)
    • Start and stop with host
    • pool.ntp.org
    • Host > Manage > Services > search for ntpd > Start

Hostname and domain

  • ssh in
  • esxcli system hostname set --host esxi-2023
  • if domain on network
    esxcli system hostname set --domain example.local

Network

Should just work. With a more complex setup, for example when a VM serves as the firewall,
be sure to ssh in and ping google.com to check that both the network and DNS work.
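
A failed ping of google.com does not say whether the network or DNS is at fault. A minimal sketch that separates the two by pinging an IP address first and a hostname second. The function name net_check is made up for this example, and 8.8.8.8 is just a convenient public address.

```shell
# net_check IP NAME: prints "ok", "no-network" or "no-dns".
# Hypothetical helper, not part of ESXi.
net_check() {
    # -c 1 sends a single probe; only the exit code is used
    if ! ping -c 1 "$1" > /dev/null 2>&1; then
        echo "no-network"
        return 1
    fi
    # the IP answered, so a failure here points at name resolution
    if ! ping -c 1 "$2" > /dev/null 2>&1; then
        echo "no-dns"
        return 2
    fi
    echo "ok"
}
```

Paste the function into the ssh session and run net_check 8.8.8.8 google.com. If it prints no-network, look at the gateway settings below. If it prints no-dns, look at the DNS server settings.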

To check and set the default gateway

  • esxcfg-route
  • esxcfg-route 10.65.26.25

To change DNS server

  • esxcli network ip dns server list
  • esxcli network ip dns server add --server=8.8.8.8
  • esxcli network ip dns server remove --server=1.1.1.1
  • esxcli network ip dns server list

To disable ipv6

  • esxcli network ip set --ipv6-enabled=false

Logs

Documentation

Host > Monitor > Logs

The ones worth knowing about

  • shell.log - History of shell commands entered over SSH
  • syslog.log - General info of what's happening on the system.
  • vmkernel.log - Activities of ESXi and VMs

Will update with some actual use, when I use logs.

Logs from systems in VMs are in Virtual Machines > Name-of-VM > Monitor > Logs

logs-pic

Backups using ghettoVCB

The script takes a snapshot of a VM, copies the "old" vmdk and other files to a backup location, then deletes the snapshot.
Because every backup is a full backup, this approach takes up a lot of space. Some form of deduplication might be a solution.

VMs that have any existing snapshot won't get backed up.
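
To spot such VMs before a backup run, one option is to look for snapshot disk files in the VM's directory. This is a rough sketch, not how ghettoVCB itself decides. It assumes snapshot disks follow the usual -delta.vmdk or -sesparse.vmdk naming, and the function name has_snapshot_files is made up.

```shell
# has_snapshot_files DIR: succeeds if DIR contains snapshot disk files.
# Heuristic only; it relies on the *-delta.vmdk / *-sesparse.vmdk naming.
has_snapshot_files() {
    for f in "$1"/*-delta.vmdk "$1"/*-sesparse.vmdk; do
        # an unmatched pattern stays literal, so test that the file exists
        [ -e "$f" ] && return 0
    done
    return 1
}
```

For example: has_snapshot_files /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/OPNsense && echo "snapshot present, backup would be skipped"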

Files that are backed up:

  • vmdk - virtual disk file, every virtual disk has a separate file. In the webgui datastore browser only one vmdk file is shown per disk, but on the filesystem there's blabla.vmdk and blabla-flat.vmdk. The flat one is where the data actually is, the other one is a descriptor file.
  • nvram - bios settings of a VM
  • vmx - virtual machine settings, can be edited

Backup storage locations

  • Local disk datastore
  • NFS share
    For an NFS share on TrueNAS SCALE
    • Maproot User -> root
    • Maproot Group -> nogroup

Note the exact path of your backup datastore, as shown in the webgui.
It looks like this: /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090
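
The path can also be found over ssh, since /vmfs/volumes contains a symlink named after each datastore that points at the directory with the long ID. A small sketch under that assumption. The function name datastore_path is made up, and the optional second argument exists only so the function can be tried outside of ESXi.

```shell
# datastore_path NAME [BASE]: print the real directory behind a datastore name.
# BASE defaults to /vmfs/volumes.
datastore_path() {
    # cd follows the symlink; pwd -P prints the physical path without symlinks
    ( cd "${2:-/vmfs/volumes}/$1" 2>/dev/null && pwd -P )
}
```

Running datastore_path datastore1 should print a path in the same form as the one above. It returns a non-zero exit code if no such datastore exists.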

Install

  • Download the repo files to a PC from which you can upload them to ESXi
  • create a directory on a datastore where the script and configs will reside
    /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script
  • upload the files, should be 6, can skip build directory and readme.md
  • ssh in to esxi
  • cd in to the datastore ghetto directory
  • make all the shell files executable
    chmod +x ./*.sh

Config and preparation

You need to know the basics of editing files with the ancient vi.

  • cd in to the datastore ghetto directory and make a copy of the default config
    cp ./ghettoVCB.conf ./ghetto_1.conf
  • Only edit this copy. For a start, set where to copy backups
    vi ./ghetto_1.conf
    VM_BACKUP_VOLUME=/vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/Backups
  • Create a file that will contain list of VMs to backup
    touch ./vms_to_backup_list
    vi ./vms_to_backup_list
    OPNsense
    Arch-Docker-Host
    
  • Create a shell script that starts ghetto script using this config for listed VMs
    touch ./ghetto_run.sh
    vi ./ghetto_run.sh
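    #!/bin/sh
    
    # directory holding ghettoVCB.sh, the configs and the VM list
    GHETTO_DIR=/vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script
    
    # back up the listed VMs; discard both stdout and stderr
    "$GHETTO_DIR/ghettoVCB.sh" \
        -g "$GHETTO_DIR/ghetto_1.conf" \
        -f "$GHETTO_DIR/vms_to_backup_list" \
        > /dev/null 2>&1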
    
    Make the script executable
    chmod +x ./ghetto_run.sh
  • In my use case the TrueNAS VM can't be snapshotted while running because of a passthrough PCIe HBA card, so it needs another config
  • Make new config copy
    cp ./ghetto_1.conf ./ghetto_2.conf
  • Edit the config, setting it to shut down VMs before backup.
    vi ./ghetto_2.conf
    POWER_VM_DOWN_BEFORE_BACKUP=1
  • edit the run script, add another execution for specific VM using ghetto_2.conf
    vi ./ghetto_run.sh
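    #!/bin/sh
    
    # directory holding ghettoVCB.sh, the configs and the VM list
    GHETTO_DIR=/vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script
    
    # back up the listed VMs while they keep running
    "$GHETTO_DIR/ghettoVCB.sh" \
        -g "$GHETTO_DIR/ghetto_1.conf" \
        -f "$GHETTO_DIR/vms_to_backup_list" \
        > /dev/null 2>&1
    
    # back up the one VM that has to be shut down first
    "$GHETTO_DIR/ghettoVCB.sh" \
        -g "$GHETTO_DIR/ghetto_2.conf" \
        -m TrueNAS_scale \
        > /dev/null 2>&1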
    

To run it manually once:

  • ./ghetto_run.sh
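
Before the first real run it may be worth doing a dry run. ghettoVCB.sh accepts -d dryrun, which walks through the process without copying anything, and -l, which writes the output to a log file. A minimal sketch using the same paths as above. The function name ghetto_dryrun and the default log path are made up.

```shell
# same directory as in ghetto_run.sh
GHETTO_DIR=/vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script

# ghetto_dryrun [LOGFILE]: dry run for the listed VMs, output kept in LOGFILE
ghetto_dryrun() {
    "$GHETTO_DIR/ghettoVCB.sh" \
        -g "$GHETTO_DIR/ghetto_1.conf" \
        -f "$GHETTO_DIR/vms_to_backup_list" \
        -d dryrun \
        -l "${1:-/tmp/ghetto_dryrun.log}"
}
```

Run ghetto_dryrun and then read the log to check that the VM names and the backup path were picked up as expected.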

Scheduled runs

See "Cronjob FAQ" in documentation

Cron is used to execute it periodically. But there's an issue: the cronjob is lost on ESXi restart, which requires a few extra steps to solve.

  • Make a backup of root's crontab
    cp /var/spool/cron/crontabs/root /vmfs/volumes/datastore1/ghetto_script/root_cron.backup
  • Edit root's crontab to execute the run script at 4:00,
    adding the following line at the end, in cron format
    vi /var/spool/cron/crontabs/root
    0    4    *   *   *   /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script/ghetto_run.sh
    
    The file is read-only, so save it in vi with :wq!
  • restart cron service
    kill $(cat /var/run/crond.pid)
    crond

To make the cronjob permanent

  • Make backup of local.sh file
    cp /etc/rc.local.d/local.sh /vmfs/volumes/datastore1/ghetto_script/local.sh.backup
  • Edit the /etc/rc.local.d/local.sh file, adding the following lines at the end, but before the exit 0 line. Replace the part in quotes in the echo line with your cronjob line.
    vi /etc/rc.local.d/local.sh
    /bin/kill $(cat /var/run/crond.pid) > /dev/null 2>&1
    /bin/echo "0    4    *   *   *   /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script/ghetto_run.sh" >> /var/spool/cron/crontabs/root
    /bin/crond
    
    Secure boot must be disabled on the ESXi host for local.sh to execute.
  • Run esxi config backup for change to be saved
    /sbin/auto-backup.sh
  • Restart the host, then check that the cronjob is still there, that cron is running, and that the date is correct
    vi /var/spool/cron/crontabs/root
    ps | grep crond | grep -v grep
    date
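
The echo line in local.sh appends blindly. After a reboot the crontab starts clean, so this is fine, but running local.sh by hand while the line is already present would add it a second time. A small sketch of an append that checks first. The function name add_cron_line is made up.

```shell
# add_cron_line FILE LINE: append LINE to FILE unless an identical line exists.
add_cron_line() {
    # -q quiet, -x match whole lines only, -F treat LINE as plain text (the * are literal)
    grep -qxF -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

Used as add_cron_line /var/spool/cron/crontabs/root "0    4    *   *   *   /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/ghetto_script/ghetto_run.sh", it can be called repeatedly without creating duplicate entries.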

Logs about backups are in /tmp

Restore from backup

Documentation

  • In webgui create a full path where to restore the VM
  • The restore config template is in the ghetto_script directory on the datastore,
    named ghettoVCB-restore_vm_restore_configuration_template. Make a copy of it
    cp ./ghettoVCB-restore_vm_restore_configuration_template ./vms_to_restore_list
  • Edit this file, adding a new line in which the following are separated by ;
    • path to the backup, the directory has a date in its name
    • path where to restore this backup
    • disk type - 1=thick | 2=2gbsparse | 3=thin | 4=eagerzeroedthick
    • optional - new name of the VM
    vi ./vms_to_restore_list
    "/vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/Backups/OPNsense/OPNsense-2023-04-16_04-00-00;/vmfs/volumes/6378107d-b71bee00-873d-b42e99f40944/OPNsense_restored;3;OPNsense-restored"
    
  • Execute the restore script with the config given as a parameter.
    ./ghettoVCB-restore.sh -c ./vms_to_restore_list
  • Register the restored VM.
    If it's in the same location as the original was, it should just go through. If the location is different, then esxi asks if it was moved or copied.
    • Copied - You are planning to use both VMs at the same time, selecting this option generates new UUID for the VM, new MAC address, maybe some other hardware identifiers as well.
    • Moved - All old settings are kept, for restoring backups this is usually the correct choice.
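
A typo in the restore list line is easy to make, since it is one long string. This is a sketch of a sanity check for the format described above: 3 or 4 fields separated by ; and a disk type between 1 and 4. The function name check_restore_line is made up, and it does not check whether the paths exist.

```shell
# check_restore_line LINE: validate one line of the restore list.
check_restore_line() {
    line=${1#\"}; line=${line%\"}     # drop the surrounding double quotes, if any
    old_ifs=$IFS; IFS=';'
    set -- $line                      # split on ; into $1..$n
    IFS=$old_ifs
    if [ $# -lt 3 ] || [ $# -gt 4 ]; then
        echo "expected 3 or 4 fields, got $#"
        return 1
    fi
    case $3 in
        1|2|3|4) ;;
        *) echo "disk type must be 1-4, got '$3'"; return 1 ;;
    esac
    echo "backup=$1 target=$2 type=$3 name=${4:-unchanged}"
}
```

Passing it the example line above prints the four parts back, which makes a mistake in a path or a missing separator easier to see.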

Switching from Thick to Thin disks

A bit of an issue is that a vmdk is actually two files:
a small plain .vmdk that holds some info, and the -flat.vmdk with the actual gigabytes of disk data. The webgui hides this fact.

  • have backups
  • shut down the VM
  • unregister the VM
  • ssh in
  • navigate to where its vmdk files are in datastore
    cd /vmfs/volumes/6187f7e1-c584077c-d7f6-3c4937073090/linux/
  • execute the command that converts the vmdk
    vmkfstools -i "./linux.vmdk" -d thin "./linux-thin.vmdk"
  • punch zero the image file, so that blocks containing only zeros are deallocated.
    vmkfstools --punchzero "./linux-thin.vmdk"
  • remove or move both original files
    rm linux.vmdk
    rm linux-flat.vmdk
  • In webgui navigate to the datastore. Use move command to rename thin version to the original name.
    This changes the values in linux.vmdk to point to correct flat.vmdk
  • register the VM back to esxi gui.
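
To see whether the conversion actually saved space, compare the provisioned size of the flat file with the space it really occupies on the datastore. A minimal sketch using ls and du. The function name vmdk_usage is made up, and the numbers depend on how the filesystem reports allocation.

```shell
# vmdk_usage FILE: print provisioned size vs. space actually used, in KB.
vmdk_usage() {
    # ls -l reports the apparent (provisioned) size in bytes in column 5
    provisioned_kb=$(( $(ls -l "$1" | awk '{print $5}') / 1024 ))
    # du -k reports the blocks really allocated on disk
    used_kb=$(du -k "$1" | awk '{print $1}')
    echo "$1: provisioned ${provisioned_kb} KB, used ${used_kb} KB"
}
```

For example vmdk_usage ./linux-flat.vmdk. On a thin disk the used value should be noticeably smaller than the provisioned one.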

Disk space reclamation

If you run VMs with thin disks, the idea is that they use only as much space as needed. But if you copy a 50GB file to a VM and then delete it, the VMDK does not always seamlessly shrink by 50GB.

Correctly functioning reclamation can save time and space for backups.

  • Modern Windows should just work, though I did just one test with win10.
  • Linux machines need an fstrim run, which marks unused blocks as empty.
  • A Unix machine, like OPNsense based on FreeBSD, needed to be booted from an ISO, so that the partition is not mounted, and then to execute
    fsck_ufs -Ey /dev/da0p3
    Afterwards it needed one more run of vmkfstools --punchzero "./OPNsense.vmdk"
    And it still uses roughly twice as much space as it should.

links