How to Test SSD/HDD Health in Linux?

Introduction

Smartmontools is a set of applications that can test hard drives and read their hardware SMART statistics. Note: SMART data may not accurately predict future drive failure; however, abnormal error rates may indicate possible hardware failure or data inconsistency.

This how-to will help you configure Smartmontools to take actions, such as shutting down the computer or sending an e-mail, when a disk is about to fail.

Prerequisites

  • A modern S.M.A.R.T. capable hard disk

Setting up

Installation

You can install the smartmontools package from the Synaptic Package Manager (see SynapticHowto), or by typing the following into the terminal:

sudo apt-get install smartmontools

Checking a drive for SMART Capability

To ensure that your drive supports SMART, type:

sudo smartctl -i /dev/sda

where /dev/sda is your hard drive. This will give you brief information about your drive. The last two lines may look something like this:

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
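If you want to script this check, the "SMART support is:" lines can be parsed from the command output. The following is a minimal sketch, not part of Smartmontools: the function name smart_status is hypothetical, and it assumes the output wording shown above.

```shell
# Hypothetical helper: reads `smartctl -i` output on stdin and prints
# "enabled", "disabled", "unknown", or "unsupported", based on the
# "SMART support is:" lines.
smart_status() {
    awk '
        /^SMART support is:/ {
            if ($0 ~ /Available/) avail = 1
            if ($0 ~ /Enabled/)   state = "enabled"
            if ($0 ~ /Disabled/)  state = "disabled"
        }
        END {
            if (!avail)          print "unsupported"
            else if (state != "") print state
            else                 print "unknown"
        }'
}

# Real usage (needs root and smartmontools installed):
#   sudo smartctl -i /dev/sda | smart_status
```

A result of "disabled" means the drive supports SMART but it still has to be switched on, as described in the next section.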

Enabling SMART

In the case that SMART is not enabled for your drive, you can enable it by typing:

sudo smartctl -s on /dev/sda

Testing a Drive

There are three types of self-test that can be run on a drive:

  1. Short
  2. Extended (Long)
  3. Conveyance

To find an estimate of the time it takes to conduct each test, type:

sudo smartctl -c /dev/sda
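The time estimates are spread over several lines of the output. As a rough sketch, they can be condensed into one line per test. The function name test_durations is hypothetical, and the parsing assumes that each "... self-test routine" line is followed by a "recommended polling time: ( N) minutes." line, which is how smartctl typically prints them for ATA drives.

```shell
# Hypothetical helper: reads `smartctl -c` output on stdin and prints the
# recommended polling time for each self-test type.
test_durations() {
    awk '
        /^(Short|Extended|Conveyance) self-test routine/ { name = $1 }
        /recommended polling time:/ && name != "" {
            line = $0
            gsub(/[^0-9]/, "", line)   # keep only the digits of the estimate
            print name ": " line " minutes"
            name = ""
        }'
}

# Real usage:
#   sudo smartctl -c /dev/sda | test_durations
```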

The most useful test is the extended test (long). You can initiate the test by typing:

sudo smartctl -t long /dev/sda

Results

You can view a drive’s test statistics by typing:

sudo smartctl -l selftest /dev/sda
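To use the self-test log in a script, you can look at the most recent entry, which smartctl numbers "# 1". The sketch below assumes the usual ATA log layout, where a successful run is reported as "Completed without error". The function name last_selftest_ok is hypothetical.

```shell
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# succeeds only if the newest entry ("# 1") completed without error.
last_selftest_ok() {
    awk '
        /^# *1 / { found = 1; ok = ($0 ~ /Completed without error/) }
        END      { exit !(found && ok) }'
}

# Real usage:
#   sudo smartctl -l selftest /dev/sda | last_selftest_ok || echo "check the drive"
```

The function also fails when no test has been logged yet, so an empty log is not mistaken for a healthy drive.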

To display detailed SMART information for an IDE drive, type:

sudo smartctl -a /dev/sda

To display detailed SMART information for a SATA drive, type:

sudo smartctl -a -d ata /dev/sda

Note: This also works for IDE drives on newer kernels, where they are handled through the SCSI stack and show up as /dev/sdX.
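The detailed report contains an attribute table, and a few attributes (for example Reallocated_Sector_Ct) are often watched more closely than the rest. The following sketch pulls the raw value of one attribute out of the table. It assumes the common ATA layout in which the attribute name is the second column and the raw value starts in the tenth; smart_attr_raw is a hypothetical name.

```shell
# Hypothetical helper: reads `smartctl -a` output on stdin and prints the
# first word of the RAW_VALUE column for the attribute named in $1.
smart_attr_raw() {
    awk -v name="$1" '$2 == name { print $10; exit }'
}

# Real usage:
#   sudo smartctl -a /dev/sda | smart_attr_raw Reallocated_Sector_Ct
```

Some raw values contain extra text after the number; this sketch prints only the first word.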

Suggested application: GSmartControl

Take a look at GSmartControl. It is a graphical frontend to smartctl that shows all SMART values, highlights those that indicate old age or impending failure, and lets you run tests on demand.

As usual, you may install it from Synaptic or by running sudo apt-get install gsmartcontrol.

Advanced: Running Smartmontools as a Daemon

You can run Smartmontools in the background and have it check drives and email when there are issues:

Open the file /etc/default/smartmontools with your favourite text editor. For example (using vim): sudo vim /etc/default/smartmontools. Uncomment the line start_smartd=yes.

How smartd is going to scan the disks and what it will do in case of errors is controlled by the daemon configuration file, /etc/smartd.conf. Again, use your favourite text editor to open this file. There should be one uncommented line, similar to:

DEVICESCAN -m root -M exec /usr/share/smartmontools/smartd-runner
This line tells smartd to:
  • scan for all ATA/SCSI devices (DEVICESCAN); the rest of the file will be ignored;
  • mail a report to the ‘root’ account in case of trouble (-m);
  • but instead of using the mail command, execute /usr/share/smartmontools/smartd-runner and feed the report to it (-M exec program).

/usr/share/smartmontools/smartd-runner is a script that basically saves the report to a temporary file, and then runs anything it finds in /etc/smartmontools/run.d/; take a look there to understand what you already have (there should be a script that mails the report).

There are several -M directives that change when and how often reports are sent. You need to specify (-m something) in order to use them, even if you’re not sending any mail.

You may include some useful options:

DEVICESCAN -H -l error -l selftest -f -s (O/../../5/11|L/../../5/13|C/../../5/15) -m root -M exec /usr/share/smartmontools/smartd-runner
  • check the SMART health status (-H);
  • report increases in both SMART error logs (-l);
  • check for failure of any Usage Attributes (-f);
  • schedule an Offline Immediate Test every Friday at 11 am, a Long Self-Test every Friday at 1 pm, and a Conveyance Self-Test every Friday at 3 pm (-s) — see the smartd manual page for what these tests do so you can choose what suits you.
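Each entry in the -s expression has the form T/MM/DD/d/HH: test type (O, S, L or C), month, day of month, day of week (1 is Monday, 7 is Sunday) and hour, with ".." matching any value. As a small sketch, a weekly entry like the ones above can be generated rather than typed by hand. The function name smartd_schedule is hypothetical.

```shell
# Hypothetical helper: prints one weekly smartd schedule entry.
#   $1: test type letter (O, S, L or C)
#   $2: day of week, 1-7 (1 = Monday)
#   $3: hour, 0-23, written without a leading zero
smartd_schedule() {
    printf '%s/../../%d/%02d\n' "$1" "$2" "$3"
}

# Example: a long self-test every Friday at 1 pm
#   smartd_schedule L 5 13
```

Several entries are combined with "|" inside parentheses, as in the directive above.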

You may also replace DEVICESCAN with the path of the device which you’d like to be monitored (e.g. /dev/sda), and the daemon will only monitor this drive. You’ll need one such line for each device.

Actions in case of trouble

You’ll want to configure the actions smartd will take in case of trouble. If all you want is a notification shown on your desktop, skip to "Personal computer" below.

Most of the time, you only need to place a script in /etc/smartmontools/run.d/. Whenever smartd wants to send a report, it will execute smartd-runner and the latter will run your script.

You have several variables available to your script (again, see the smartd manpage). These come from a test run:

SMARTD_MAILER=/usr/share/smartmontools/smartd-runner
SMARTD_SUBJECT=SMART error (EmailTest) detected on host: XXXXX
SMARTD_ADDRESS=root
SMARTD_TFIRSTEPOCH=1267409738
SMARTD_FAILTYPE=EmailTest
SMARTD_TFIRST=Sun Feb 28 21:45:38 2010 VET
SMARTD_DEVICE=/dev/sda
SMARTD_DEVICETYPE=sat
SMARTD_DEVICESTRING=/dev/sda
SMARTD_FULLMESSAGE=This email was generated by the smartd daemon running on:
SMARTD_MESSAGE=TEST EMAIL from smartd for device: /dev/sda

Your script also has a temporary copy of the report available as "$1". It will be deleted after your script finishes, but the same content is written to /var/log/syslog.
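As an illustration of how a run.d script can combine these variables with the report file, here is a minimal sketch that appends a one-line summary and the full report to a log file of your choice. The function name log_report and the log path in the usage comment are assumptions, not something Smartmontools ships.

```shell
# Hypothetical body of a run.d script: keep a permanent copy of each report.
#   $1: path to the temporary report file (passed in by smartd-runner)
#   $2: log file to append to (an assumed location; choose your own)
log_report() {
    {
        printf '%s: %s on %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "${SMARTD_FAILTYPE:-unknown}" "${SMARTD_DEVICE:-unknown}"
        cat "$1"
    } >> "$2"
}

# In a script placed in /etc/smartmontools/run.d/ you might call:
#   log_report "$1" /var/log/smartd-reports.log
```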

Personal computer

For a visual notification, you may just install smart-notifier. It shows a large popup with the report.

Alternatively, you may create a custom notification (bubble) as seen in other GNOME programs.

You will need to install the libnotify-bin package:

sudo aptitude install libnotify-bin

Now create a text file called 60notify in /etc/smartmontools/run.d:

sudo vi /etc/smartmontools/run.d/60notify

and add the following to the file:

#!/bin/sh
DISPLAY=:0.0 notify-send --icon=important "Possible disk failure" "$SMARTD_DEVICE may have a problem"

(The DISPLAY=:0.0 part is a variable assignment that helps programs to locate your X server. It’s already set for your terminal, but the script lacks it since it is being run inside a different session).

Now give it execute permissions:

sudo chmod +x /etc/smartmontools/run.d/60notify

This will produce a libnotify bubble with a warning icon.

You may also experiment with Zenity:

DISPLAY=:0.0 zenity --text-info --filename="$1" --title="smartd: $SMARTD_DEVICE may have a problem"

Notice: Be very careful with these scripts as they are run under the root account.

Server

Here, you may wish to handle things differently. In this example we want to mail an admin and shut down the server. Comment out the line that contains DEVICESCAN, by adding # to the beginning of the line. Then, add this to the end of the file:

/dev/hda -H -l error -l selftest -f -s (O/../../5/11|L/../../5/13|C/../../5/15) -m admin@somewhere.com -M exec /usr/share/smartmontools/smartd-runner

(If you split this directive across several lines with a trailing backslash, be sure not to add any whitespace after the backslash.)

Now, we are going to make the script which is going to shut down the computer *after* we mail the admin. Create a text file called 99shutdown in /etc/smartmontools/run.d and add the following to the file:

#!/bin/sh
sleep 40
shutdown -h now

The number 99 at the start of the filename is to ensure that it is called last when smartd-runner runs. It will wait 40 seconds and then shut down the computer. Of course, you may customize this at will; you may not wish to turn off the server.

Now, it is time to start the daemon:

sudo service smartmontools start

Testing

If you want to test all these actions, add -M test after exec /usr/share/smartmontools/smartd-runner and restart the daemon (sudo service smartmontools restart). When the daemon comes up, it will execute the script immediately with a test message. Notice: If you included the shutdown -h line, the script will shut down the computer as soon as the service starts. To fix this, you will have to start the computer in recovery mode and remove the -M test option from /etc/smartd.conf.
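One way to avoid that trap is to have the shutdown script ignore test messages. The sketch below relies on the SMARTD_FAILTYPE variable, which is set to EmailTest for -M test runs (as in the variable listing earlier). The function name is_real_failure is hypothetical.

```shell
# Hypothetical guard for a script such as 99shutdown: succeed only when
# smartd reports something other than a test message.
is_real_failure() {
    [ -n "${SMARTD_FAILTYPE:-}" ] && [ "$SMARTD_FAILTYPE" != "EmailTest" ]
}

# In 99shutdown you might then write:
#   if is_real_failure; then
#       sleep 40
#       shutdown -h now
#   fi
```

The guard also refuses to act when SMARTD_FAILTYPE is unset, so running the script by hand does not shut the machine down.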

Note

Before running this, be sure to check that you have a "mail" command, and send a test message to your own address first. On my default Feisty install, the command is missing:

bash: mail: command not found

If you see this, try installing one of the following packages with sudo apt-get install:

  • mailx
  • mailutils

Make sure you have the ‘universe’ component enabled.
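The check for a mail command can be done without triggering the error message. This is a small sketch using the shell's own command -v lookup; the function name have_command is hypothetical.

```shell
# Hypothetical helper: succeed if the command named in $1 is available.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Example:
#   have_command mail || echo "No mail command found; install mailx or mailutils"
```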

Utility: Checking all disks at once

Note: Following the Gentoo Wiki, I made a modified script that checks all the disks in /dev/disk/by-id/. Invoke the script as follows:

./smart.sh short|long|offline

The script creates a directory named smart-logs and stores all the files there.

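#!/bin/bash
# Script by Meliorator. irc://irc.freenode.net/Meliorator
# modified by Ranpha

# Require at least one test type.
[ $# -eq 0 ] && echo "Usage: $0 type [type] [type]" && exit 1

# Create the log directory if needed.
[ ! -e smart-logs ] && mkdir smart-logs
[ ! -d smart-logs ] && echo "Can not create smart-logs dir" && exit 1

a=0
for t in "$@"; do
    case "$t" in
        offline) l=error;;
        short|long) l=selftest;;
        *) echo "$t is an unrecognised test type. Skipping." && continue;;
    esac

    # Start the test on every ATA disk and remember the longest estimate.
    for hd in /dev/disk/by-id/ata*; do
        # Assumption: field 3 of "Please wait N minutes ..." is the estimate.
        r=$(( $(smartctl -t "$t" -d ata "$hd" | grep 'Please wait' | awk '{print $3}') ))
        echo "Check $hd - $t test in $r minutes"
        [ "$r" -gt "$a" ] && a=$r
    done

    echo "Waiting $a minutes for all tests to complete"
    sleep "${a}m"

    for hd in /dev/disk/by-id/ata*; do
        # Assumption: one log per test type and disk id.
        smartctl -l "$l" -d ata "$hd" >> "smart-logs/smart-$t-$(basename "$hd").log" 2>&1
    done
done

# Beep when done (the repeat count of 3 is an assumption).
for i in 1 2 3; do
    sleep .01
    echo -n -e '\a'
done
echo "All tests have completed"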

(Remember to give execute permissions to the script with chmod +x smart.sh).

Testing drives behind MegaRAID

If /dev/sda is a MegaRAID device then straightforward execution of smartctl on it is not effective. It just returns an empty report for the controller itself. To get S.M.A.R.T. attributes of a drive behind the RAID controller you need to use the following command:

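for i in <device IDs>; do sudo smartctl -x /dev/sda -d megaraid,$i >>./OUTPUT; done

Here <device IDs> stands for the space-separated list of device IDs of the drives behind the controller.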
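Before running the loop for real, it can help to print the commands it would execute. The following dry-run sketch does only that. The function name megaraid_commands is hypothetical, and the device IDs you pass in are your own; none are assumed here.

```shell
# Hypothetical dry-run helper: prints one smartctl command per device ID.
#   $1: block device exposed by the RAID controller (e.g. /dev/sda)
#   remaining arguments: device IDs of the drives behind the controller
megaraid_commands() {
    dev=$1
    shift
    for id in "$@"; do
        printf 'smartctl -x %s -d megaraid,%s\n' "$dev" "$id"
    done
}

# Example:
#   megaraid_commands /dev/sda 0 1
```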

Examples of SMART reports

See large collection of smartctl reports for various hard drives here.

Smartmontools (last edited by user vz-out on 2019-12-24 11:07:49)

The material on this wiki is available under a free license, see Copyright / License for details

How to Test SSD/HDD Health in Linux?

SSDs and HDDs have a limited lifespan and are not immune to damage and corruption, which can lead to data loss. To protect your data, experts recommend checking drive health regularly. On Linux, you can identify the drives and then test their health using several methods.

This guide elaborates on the approaches to test SSD/HDD health in Linux effectively.

  • View List of Disks
  • Method 1: Using Smartctl
  • Method 2: Using nvme-cli
  • Method 3: Using Disks Application (GUI)

How to View the List of Disks on Linux?

Many tools can list the disks on Linux, such as lsblk, df, fdisk, and hwinfo. Let’s use the lsblk command to view a list of disks:

$ lsblk

In this example, the output shows the partitions “sda1”, “sda2” and “sda3” of the disk “sda”. A system can have multiple hard drives, each listed with its own partitions.

Note: The above output is from Ubuntu installed on VMware Workstation, and the output on your end can be different.
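If you only need the whole disks and not their partitions, lsblk can print just name and type, and the result can be filtered. A minimal sketch, with list_disks as a hypothetical function name:

```shell
# Hypothetical helper: reads "NAME TYPE" pairs on stdin, as printed by
# `lsblk -d -n -o NAME,TYPE`, and prints the device path of each disk.
list_disks() {
    awk '$2 == "disk" { print "/dev/" $1 }'
}

# Real usage:
#   lsblk -d -n -o NAME,TYPE | list_disks
```

Here -d skips partitions, -n drops the header line, and -o selects the columns.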

Method 1: Using Smartctl on Linux

The smartctl tool comes in the “smartmontools” package. It reads the self-monitoring (SMART) data that drives collect about their own condition. It supports ATA/SATA, SCSI/SAS, and NVMe drives. It is not pre-installed on most Linux distributions, but it can be installed using one of these commands:

$ sudo apt install smartmontools   # Ubuntu/Debian/Linux Mint
$ sudo dnf install smartmontools   # Fedora
$ sudo yum install smartmontools   # RHEL/CentOS

To test the SSD/HDD for health, first understand the parts of the command before executing it:

  • smartctl queries and controls the SMART features of storage devices
  • -t long specifies that a long self-test is to be performed to check the entire surface of the drive for any potential issues
  • -a displays the current status and attributes of the disk
  • /dev/sda is the name of the drive to check
$ sudo smartctl -t long -a /dev/sda

In this example, the current drive on the system does not support self-test logging.
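When self-tests are not available, the drive's overall health verdict from smartctl -H can still be checked. The sketch below assumes the two verdict strings smartctl commonly prints: "test result: PASSED" for ATA drives and "SMART Health Status: OK" for SCSI drives. The function name health_passed is hypothetical.

```shell
# Hypothetical helper: reads `smartctl -H` output on stdin and succeeds
# if the drive reports a passing overall health status.
health_passed() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Real usage:
#   sudo smartctl -H /dev/sda | health_passed && echo "health check passed"
```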

Method 2: Using nvme-cli to Test SSD/HDD Health

Another popular tool, “nvme-cli”, can be used to check drive health. As the name suggests, it is designed specifically for NVMe SSDs. To install it, use one of these commands:

$ sudo apt install nvme-cli   # Ubuntu/Debian
$ sudo dnf install nvme-cli   # Fedora
$ sudo yum install nvme-cli   # RHEL/CentOS

To test the NVMe SSD for health, first understand the parts of the command:

  • The watch utility is used to continuously monitor the SMART log
  • -n 1 tells the watch command to refresh the SMART log every second
  • nvme is used to manage the NVMe devices
  • smart-log is used with the nvme command to view the SMART logs
  • /dev/nvme0n1p6 is the device being monitored
$ sudo watch -n 1 nvme smart-log /dev/nvme0n1p6

In this example, percentage_used is 3%, which indicates good health. If it is over 50%, you should be concerned and consider replacing the drive. The output also shows “power_on_hours”, “unsafe_shutdowns”, and many other values.
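The percentage_used check can be automated with a short filter. This sketch assumes the smart-log line looks like "percentage_used : 3%" and uses the 50% rule of thumb from this guide as the default threshold. The function name nvme_wear_check is hypothetical.

```shell
# Hypothetical helper: reads `nvme smart-log` output on stdin and prints
# "ok N" or "warn N", where N is the percentage_used value.
#   $1: warning threshold in percent (defaults to 50)
nvme_wear_check() {
    awk -F: -v limit="${1:-50}" '
        /^percentage_used/ {
            gsub(/[^0-9]/, "", $2)          # strip spaces and the % sign
            used = $2 + 0
            status = (used > limit + 0) ? "warn" : "ok"
            print status, used
            exit
        }'
}

# Real usage:
#   sudo nvme smart-log /dev/nvme0n1p6 | nvme_wear_check 50
```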

Note: The above command is executed on a dual-boot system, and the drive used is an NVMe drive.

To check the temperature of the NVME, use the grep command to filter it like this:

$ sudo nvme smart-log /dev/nvme0n1p6 | grep "^temperature"

Make sure that the temperature remains between 0 and 70 degrees Celsius. If the temperature exceeds this limit, it could seriously damage the drive and ultimately lead to data loss.
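The temperature check can be scripted in the same spirit. This sketch assumes the line starts with "temperature" and that the first number after the colon is the value in Celsius, which covers both the "38 C" and "38°C (311 Kelvin)" styles. Both function names are hypothetical.

```shell
# Hypothetical helper: reads `nvme smart-log` output on stdin and prints
# the composite temperature in degrees Celsius.
nvme_temperature() {
    awk -F: '/^temperature/ {
        if (match($2, /[0-9]+/)) print substr($2, RSTART, RLENGTH)
        exit
    }'
}

# Hypothetical helper: succeed if $1 lies in the 0-70 range given above.
temp_in_range() {
    [ "$1" -ge 0 ] && [ "$1" -le 70 ]
}

# Real usage:
#   t=$(sudo nvme smart-log /dev/nvme0n1p6 | nvme_temperature)
#   temp_in_range "$t" || echo "NVMe temperature out of range: $t"
```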

Method 3: Using Disks Application (GUI)

To check the SSD/HDD health via the graphical interface, use the GNOME Disks application by following these steps:

Step 1: Open the Disks Application

To open the Disks application, click on “Activities” in the top-left corner of the screen, then type “Disks” in the search bar and open it.

After clicking on “Disks,” a new window opens up.

Step 2: Select Disk and do SMART Data & Self-Tests

In the new window, select the drive you want to test (1), then choose the disk (2). Now click on the three dots and pick the “SMART Data & Self-Tests” option. The disk on the current system does not support this feature, so the option is greyed out. Use the CLI methods discussed above when this option is unavailable.

Conclusion

Linux offers the “smartctl” and “nvme-cli” command-line tools to test SSD/HDD health. They report values such as “percentage_used”, “power_on_hours”, and “unsafe_shutdowns” in the terminal. Users can also use the “Disks” GUI application to test a drive’s health.

This guide explained the methods to test SSD/HDD health on Linux.

Source: https://help.ubuntu.com/community/Smartmontools

Source: https://itslinuxfoss.com/test-ssd-hdd-health-in-linux/
