
Provisioning a (new) disk

SMART

See also ./smart.md

Check SMART attributes for failures:

smartctl -a /dev/sda | grep -iE '(error|uncorrect|pending|recovered|fail)' | grep -vE '[[:space:]]0$'
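The one-liner above matches anywhere on the line, so every attribute of type Pre-fail is kept as well. A narrower variant is sketched below. It assumes the ATA attribute table printed by `smartctl -A` (attribute name in column 2, raw value in column 10). The function name `smart_nonzero` is made up for this example.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# name and raw value of matching attributes whose raw value is not 0.
# Assumes the ATA attribute table: ID in column 1, name in column 2,
# raw value in column 10.
smart_nonzero() {
  awk '$1 ~ /^[0-9]+$/ &&
       tolower($2) ~ /error|uncorrect|pending|recovered|fail/ &&
       $10 != "0" { print $2, $10 }'
}

# Usage (needs root and a real device):
#   smartctl -A /dev/sda | smart_nonzero
```

No output means none of the matching attributes has a non-zero raw value. NVMe devices print a different health log, so this sketch does not apply to them.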

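Run SMART self-tests. Each test runs in the background on the drive, so read the self-test log once the test has finished: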

smartctl -t short /dev/sda
smartctl -l selftest /dev/sda

smartctl -t long /dev/sda
smartctl -l selftest /dev/sda
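To use the self-test log in a script, the latest entry can be checked automatically. This is a minimal sketch that assumes the ATA log layout, where the most recent test is the row starting with "# 1" and a passed test reads "Completed without error". The function name `latest_selftest_ok` is invented for this example.

```shell
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# exits 0 only if the most recent entry ("# 1") completed without error.
# Assumes the ATA log layout; NVMe and SCSI devices print a different table.
latest_selftest_ok() {
  awk '/^# *1 / { seen = 1; ok = /Completed without error/ }
       END { exit !(seen && ok) }'
}

# Usage (needs root and a real device):
#   smartctl -l selftest /dev/sda | latest_selftest_ok && echo "self-test passed"
```

The helper also fails when the log has no entries yet, or while the latest test is still in progress.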

Badblocks

If the data on the disk can be destroyed (the write-mode test overwrites the whole device):

badblocks -svw -b 4096 -c 65536 -p 1 /dev/sdX
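Because `-w` is destructive, a guard before the call can help avoid wiping the wrong device. The sketch below checks only the mount table. It does not detect use by swap, LVM, or software RAID. The function name `device_mounted` and the optional second argument are assumptions made for this example.

```shell
# Hypothetical guard: exits 0 if the device, or anything whose name starts
# with it (e.g. its partitions), appears as a mount source in the mount
# table. The table defaults to /proc/mounts; the second argument exists
# so the check can be tried against a sample file.
device_mounted() {
  dev=$1
  mounts=${2:-/proc/mounts}
  awk -v dev="$dev" 'index($1, dev) == 1 { found = 1 }
                     END { exit !found }' "$mounts"
}

# Usage sketch (destructive, left commented out on purpose):
#   device_mounted /dev/sdX || badblocks -svw -b 4096 -c 65536 -p 1 /dev/sdX
```

The prefix match is deliberately conservative: `/dev/sda` also matches `/dev/sdaa`, which can only cause a false refusal, not a false go-ahead.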

Afterwards, check the SMART attributes and error log again.

NVMe / SSD

Do a secure erase; see ./ssd.md

Monitoring

  • Make sure the Prometheus node-exporter exports smartmon_* or nvme_* metrics:

    curl -s localhost:9100/metrics | grep -iE '^(nvme|smartmon)'

  • Make sure these metrics get scraped by Prometheus; look for series such as:
    • smartmon_device_info
    • nvme_critical_warning_total
    • node_nvme_info
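The exporter check above can be turned into a small test that names the missing metrics. This sketch assumes the Prometheus text exposition format, where each sample line starts with the metric name followed by `{` or a space. The function name `missing_metrics` is invented for this example.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints each requested metric name that has no sample line.
missing_metrics() {
  input=$(cat)
  for name in "$@"; do
    printf '%s\n' "$input" | grep -qE "^${name}(\{| )" || echo "$name"
  done
}

# Usage (pass only the metrics expected for this host's disk type):
#   curl -s localhost:9100/metrics | missing_metrics smartmon_device_info
#   curl -s localhost:9100/metrics | missing_metrics nvme_critical_warning_total node_nvme_info
```

Empty output means every requested metric is exported. This only checks the exporter endpoint. Whether Prometheus actually scrapes it still has to be confirmed on the Prometheus side.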