Created
June 11, 2025 14:25
NVIDIA DGX A100 nvsm cleanup procedure
After a hardware replacement is completed on a system, use the procedure below to clear the existing alerts, and the events that generated them, from the system.
1. sudo systemctl stop nvsm # stop the NVSM services
2. sudo rm /var/lib/nvsm/sqlite/nvsm.db # remove the NVSM alert database
3. sudo ipmitool sel clear # clear the current SEL logs
4. sudo rm /var/log/bmc_sel_archive_for_BMC_*.log # remove any archived SEL logs that may still contain the error
5. sudo systemctl start nvsm # start the NVSM services
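
The five steps above can also be wrapped in a small shell function so they always run in the same order. This is only a sketch, not an NVIDIA-provided tool: the names `run`, `nvsm_cleanup`, and the `DRY_RUN` switch are hypothetical helpers introduced here, and `rm -f` is used instead of plain `rm` (an assumption) so that an already-missing database or archive file does not abort the run.

```shell
#!/bin/sh
# Sketch of the cleanup procedure as one function.
# run, nvsm_cleanup and DRY_RUN are hypothetical names, not part of NVSM.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

nvsm_cleanup() {
    run sudo systemctl stop nvsm                  # stop the NVSM services
    run sudo rm -f /var/lib/nvsm/sqlite/nvsm.db   # remove the NVSM alert database
    run sudo ipmitool sel clear                   # clear the current SEL logs
    # The glob is expanded by the calling shell, not by sudo; if nothing
    # matches, the literal pattern is passed on and rm -f ignores it.
    run sudo rm -f /var/log/bmc_sel_archive_for_BMC_*.log
    run sudo systemctl start nvsm                 # start the NVSM services
}
```

After sourcing the file, `DRY_RUN=1 nvsm_cleanup` prints the commands without touching the system, and `nvsm_cleanup` on its own executes them. Because of `set -e`, a failure in a middle step stops the function before NVSM is restarted, so in that case start the service again by hand.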