Created February 2, 2025 20:26
dcgm-exporter.log
2025/02/02 10:40:41 maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined | |
2025/02/02 10:40:41 INFO Starting dcgm-exporter Version=4.0.0-4.0.1 | |
2025/02/02 10:40:41 INFO Attempting to initialize DCGM. | |
2025/02/02 10:40:41 INFO Initialized base logger [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:5277] [{anonymous}::StartEmbeddedV2] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Not changing to a home directory - 'DCGM_HOME_DIR' is not defined in the environment. [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:5290] [{anonymous}::StartEmbeddedV2] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO version:4.0.0;arch:x86_64;buildtype:RelWithDebInfo;buildid:10349;builddate:2024-12-10;commit:4288e26f9a6fdabf2f48827baca7c26d0bff23f5;branch:v4.0.0;buildplatform:Linux 5.15.0-122-generic #132-Ubuntu SMP Thu Aug 29 13:45:52 UTC 2024 x86_64;;crc:ab4177067a165f926320c6f1623b7337 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:5294] [{anonymous}::StartEmbeddedV2] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO __DCGM_XID_KMSG__ unset. Not loading [/builds/dcgm/dcgm/dcgmlib/src/DcgmKmsgReader.cpp:40] [ReadEnvXidAndUpdate] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO __DCGM_TEST_KMSG_FILENAME__ unset. Not loading [/builds/dcgm/dcgm/dcgmlib/src/DcgmKmsgReader.cpp:149] [ReadEnvKmsgFilenameAndUpdate] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Set m_forceProfMetricsThroughGpm to 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:945] [DcgmCacheManager::DcgmCacheManager] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Parsed driver string is 5603503, IsR450OrNewer: 1, IsR520OrNewer: 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2692] [DcgmCacheManager::ReadAndCacheDriverVersions] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO nvmlDevice 0x7a380d607018 is arch 8 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:1558] [DcgmCacheManager::HelperGetLiveChipArch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Detected 0 NVLinks for GPU 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:1277] [DcgmCacheManager::InitializeNvLinkCount] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO [CacheManager][MIG] nvmlDeviceGetMigMode result: (3) Not Supported. CurrentMode: 0, PendingMode: 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:706] [DcgmCacheManager::InitializeGpuInstances] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Cannot check for MIG devices: Not Supported [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:725] [DcgmCacheManager::InitializeGpuInstances] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Added GPU 0000:00:03.0 with GPU ID 0 to the pciBusGpuIdMap [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:1255] [DcgmCacheManager::MergeNewlyDetectedGpuList] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Allowlist NOT bypassed with env variable [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:1032] [DcgmCacheManager::IsGpuAllowlisted] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO gpuId 0, arch 8 is on the allowlist. [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:1056] [DcgmCacheManager::IsGpuAllowlisted] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO gpuId 0 has migIsEnabledForGpu = 0 migIsEnabledForAnyGpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:12957] [DcgmCacheManager::UpdateNvLinkLinkState] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO gpuId 0 has migIsEnabledForAnyGpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmTopology.cpp:462] [UpdateNvLinkLinkStateFromNvml] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Got 0 excluded GPUs [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:1092] [DcgmCacheManager::ReadAndCacheGpuExclusionList] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO gpuId 0, desiredEvents x8, m_currentEventMask x0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2556] [DcgmCacheManager::ManageDeviceEvents] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Set nvmlIndex 0 event mask to x8 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2611] [DcgmCacheManager::ManageDeviceEvents] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Created thread named "cache_mgr_event" ID 220198464 DcgmThread ptr 0x0x15ea4d50 [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:115] [DcgmThread::Start] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO Created thread named "" ID 211805760 DcgmThread ptr 0x0x15e8e790 [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:115] [DcgmThread::Start] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO Thread handle 220198464 running [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:300] [DcgmThread::RunInternal] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddEntityToGroup groupId 0, eg 1, eid 0 added to the group [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:683] [DcgmGroupInfo::AddEntityToGroup] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO DcgmCacheManagerEventThread started [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:13317] [DcgmCacheManagerEventThread::run] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO Added GroupId 0 name DCGM_ALL_SUPPORTED_GPUS for connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:273] [DcgmGroupManager::AddNewGroup] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO __DCGM_FATAL_XIDS__ unset. Not loading [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6907] [ReadEnvForFatalXids] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Thread handle 211805760 running [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:300] [DcgmThread::RunInternal] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Entering dcgmModuleIdToName(dcgmModuleId_t id, char const **name) (1, 0x7fff392a3180) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:857] [dcgmModuleIdToName] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:857] [dcgmModuleIdToName] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Initialized logging for module 1 [/builds/dcgm/dcgm/modules/DcgmModule.h:90] [DcgmModuleWithCoreProxy<moduleId>::DcgmModuleWithCoreProxy] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Loading NVSDM [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:29] [DcgmNs::createSwitchManager] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Initializing NVSDM Manager [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvsdmManager.cpp:468] [DcgmNs::DcgmNvsdmManager::Init] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Could not load NVSDM [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvsdmManager.cpp:501] [DcgmNs::DcgmNvsdmManager::AttachToNvsdm] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO [[NvSwitch]] AttachToNvsdm() returned -25 [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvsdmManager.cpp:473] [DcgmNs::DcgmNvsdmManager::Init] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Loading NSCQ [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:37] [DcgmNs::createSwitchManager] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Not attached to NVSDM [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvsdmManager.cpp:520] [DcgmNs::DcgmNvsdmManager::DetachFromNvsdm] dcgm_level=WARN | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Could not load NSCQ. dlwrap_attach ret: Can not access a needed shared library (-79): If this system has NvSwitches, please ensure that the package libnvidia-nscq is installed on your system and that the service user has permissions to access it. [/builds/dcgm/dcgm/modules/nvswitch/DcgmNscqManager.cpp:500] [DcgmNs::DcgmNscqManager::AttachToNscq] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO [[NvSwitch]] AttachToNscq() returned -25 [/builds/dcgm/dcgm/modules/nvswitch/DcgmNscqManager.cpp:336] [DcgmNs::DcgmNscqManager::Init] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Could not initialize NSCQ. Ret: DCGM library could not be found [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:45] [DcgmNs::createSwitchManager] dcgm_level=WARN | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Constructing NvSwitch Module [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:55] [DcgmNs::DcgmModuleNvSwitch::DcgmModuleNvSwitch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Created thread named "" ID 4294964800 DcgmThread ptr 0x0x15ea7d78 [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:115] [DcgmThread::Start] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO Loaded module 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:1913] [DcgmHostEngineHandler::LoadModule] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Thread handle 4294964800 running [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:300] [DcgmThread::RunInternal] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Rescanning switch states [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:451] [DcgmNs::DcgmModuleNvSwitch::RunOnce] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Reading switch status for all switches [/builds/dcgm/dcgm/modules/nvswitch/DcgmNscqManager.cpp:672] [DcgmNs::DcgmNscqManager::ReadNvSwitchStatusAllSwitches] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] Not attached to NvSwitches. Aborting [/builds/dcgm/dcgm/modules/nvswitch/DcgmNscqManager.cpp:677] [DcgmNs::DcgmNscqManager::ReadNvSwitchStatusAllSwitches] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO [[NvSwitch]] ReadNvSwitchStatusAllSwitches() returned Object is in an undefined state [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:455] [DcgmNs::DcgmModuleNvSwitch::RunOnce] dcgm_level=WARN | |
2025/02/02 10:40:41 INFO [[NvSwitch]] No fields to update [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvSwitchManagerBase.cpp:344] [DcgmNs::DcgmNvSwitchManagerBase::UpdateFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO [[NvSwitch]] No fields to update [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvSwitchManagerBase.cpp:344] [DcgmNs::DcgmNvSwitchManagerBase::UpdateFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Got 0 entities from GetAllEntitiesOfEntityGroup() of eg 3 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:160] [DcgmGroupManager::AddAllEntitiesToGroup] dcgm_level=WARN | |
2025/02/02 10:40:41 INFO Added GroupId 1 name DCGM_ALL_SUPPORTED_NVSWITCHES for connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:273] [DcgmGroupManager::AddNewGroup] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Added field group id 1, name DCGM_INTERNAL_30SEC, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:172] [DcgmFieldGroupManager::AddFieldGroup] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO GetGroupEntities got 1 entities for dynamic group 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:411] [DcgmGroupManager::GetGroupEntities] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Got 1 entities and 1 fields [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:3704] [DcgmHostEngineHandler::WatchFieldGroup] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1012c00000000 (eg 1, entityId 0, fieldId 300) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 300, mfu 30000000, msa 14400, mka 480, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Skipping waitForUpdate since cache thread hasn't run yet. [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2507] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Added field group id 2, name DCGM_INTERNAL_HOURLY, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:172] [DcgmFieldGroupManager::AddFieldGroup] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Got 1 gpus and 6 fields [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:3962] [DcgmHostEngineHandler::WatchFieldGroupAllGpus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x101f500000000 (eg 1, entityId 0, fieldId 501) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 3600000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 501, mfu 3600000000, msa 14400, mka 4, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x101fd00000000 (eg 1, entityId 0, fieldId 509) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 3600000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 509, mfu 3600000000, msa 14400, mka 4, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x101fe00000000 (eg 1, entityId 0, fieldId 510) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 3600000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 510, mfu 3600000000, msa 14400, mka 4, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x101ff00000000 (eg 1, entityId 0, fieldId 511) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 3600000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 511, mfu 3600000000, msa 14400, mka 4, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1020000000000 (eg 1, entityId 0, fieldId 512) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 3600000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 512, mfu 3600000000, msa 14400, mka 4, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1020100000000 (eg 1, entityId 0, fieldId 513) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Adding new watcher type 1, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 3600000000, minMaxAgeUsec 15840000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 513, mfu 3600000000, msa 14400, mka 4, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Skipping waitForUpdate since cache thread hasn't run yet. [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2507] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Added field group id 3, name DCGM_INTERNAL_JOB, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:172] [DcgmFieldGroupManager::AddFieldGroup] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Created thread named "cache_mgr_main" ID 4286572096 DcgmThread ptr 0x0x15ea0fd0 [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:115] [DcgmThread::Start] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO Waited 0 usec for the cache manager thread to start. [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2382] [DcgmCacheManager::Start] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Thread handle 4286572096 running [/builds/dcgm/dcgm/common/DcgmThread/DcgmThread.cpp:300] [DcgmThread::RunInternal] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Cache manager update thread starting [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6246] [DcgmCacheManager::run] dcgm_level=INFO | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ea8a30, eg 1, eid 0, fieldId 300 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ea8d60, eg 1, eid 0, fieldId 501 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 501, ts 1738492841355298, valueSize 300, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ebb1b0, eg 1, eid 0, fieldId 512 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 512, ts 1738492841355330, valueSize 2048, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ea8e70, eg 1, eid 0, fieldId 509 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 509, ts 1738492841355354, valueSize 4, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ea8f40, eg 1, eid 0, fieldId 510 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 510, ts 1738492841355376, valueSize 64, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ebb0e0, eg 1, eid 0, fieldId 511 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 511, ts 1738492841355402, valueSize 2048, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x15ebb360, eg 1, eid 0, fieldId 513 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 513, ts 1738492841355431, valueSize 4096, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Got 1 field value fields for gpuId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5608] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO fieldId 1 got good value type 3, value 0X1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6013] [DcgmCacheManager::ActuallyUpdateGpuFieldValues] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 300, ts 1738492841355462, value1 1, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO DoOneUpdateAllFields returned 1738492871355462 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2503] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO dcgmStartEmbedded(): Embedded host engine started [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:5318] [{anonymous}::StartEmbeddedV2] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Initialized DCGM Fields module. | |
2025/02/02 10:40:41 INFO DCGM successfully initialized! | |
2025/02/02 10:40:41 INFO Attempting to initialize NVML library. | |
2025/02/02 10:40:41 INFO NVML provider successfully initialized! | |
2025/02/02 10:40:41 INFO Entering dcgmProfGetSupportedMetricGroups(dcgmHandle_t pDcgmHandle, dcgmProfGetMetricGroups_t *metricGroups) (2147483647, 0xc00062e000) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:823] [dcgmProfGetSupportedMetricGroups] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO nvmlGpmQueryDeviceSupport returned isSupportedDevice 0 for nvmlDevice x0x7a380d607018 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGpmManager.cpp:403] [DcgmGpmManager::DoesNvmlDeviceSupportGpm] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO gpuId 0 was not a GPM GPU [/builds/dcgm/dcgm/modules/core/DcgmModuleCore.cpp:2032] [DcgmModuleCore::ProcessProfGetMetricGroups] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Failed to load module 8 - dlopen(libdcgmmoduleprofiling.so.4) returned: libdcgmmoduleprofiling.so.4: cannot open shared object file: No such file or directory [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:1866] [DcgmHostEngineHandler::LoadModule] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO Core module subcommand 51 returned: This request is serviced by a module of DCGM that is not currently loaded [/builds/dcgm/dcgm/modules/core/DcgmModuleCore.cpp:274] [DcgmModuleCore::ProcessMessage] dcgm_level=ERROR | |
2025/02/02 10:40:41 INFO Returning -33 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:823] [dcgmProfGetSupportedMetricGroups] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Not collecting DCP metrics: This request is serviced by a module of DCGM that is not currently loaded | |
2025/02/02 10:40:41 INFO Falling back to metric file '/etc/dcgm-exporter/default-counters.csv' | |
2025/02/02 10:40:41 WARN Skipping line 20 ('DCGM_FI_PROF_GR_ENGINE_ACTIVE'): metric not enabled | |
2025/02/02 10:40:41 WARN Skipping line 21 ('DCGM_FI_PROF_PIPE_TENSOR_ACTIVE'): metric not enabled | |
2025/02/02 10:40:41 WARN Skipping line 22 ('DCGM_FI_PROF_DRAM_ACTIVE'): metric not enabled | |
2025/02/02 10:40:41 WARN Skipping line 23 ('DCGM_FI_PROF_PCIE_TX_BYTES'): metric not enabled | |
2025/02/02 10:40:41 WARN Skipping line 24 ('DCGM_FI_PROF_PCIE_RX_BYTES'): metric not enabled | |
2025/02/02 10:40:41 INFO Initializing system entities of type 'GPU' | |
2025/02/02 10:40:41 INFO Entering dcgmGetAllDevices(dcgmHandle_t pDcgmHandle, unsigned int gpuIdList[DCGM_MAX_NUM_DEVICES], int *count) (2147483647 0xc000420000 0xc0004080d8) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:81] [dcgmGetAllDevices] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:81] [dcgmGetAllDevices] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Entering dcgmGetDeviceAttributes(dcgmHandle_t pDcgmHandle, unsigned int gpuId, dcgmDeviceAttributes_t *pDcgmDeviceAttr) (2147483647 0 0xc0000e0000) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:97] [dcgmGetDeviceAttributes] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 158, ts 1738492841359460, value1 95, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 159, ts 1738492841362047, value1 98, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity double eg 1, eid 0, fieldId 164, ts 1738492841362211, value1 72, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6431] [DcgmCacheManager::AppendEntityDouble] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity double eg 1, eid 0, fieldId 160, ts 1738492841366552, value1 72, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6431] [DcgmCacheManager::AppendEntityDouble] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity double eg 1, eid 0, fieldId 163, ts 1738492841366751, value1 72, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6431] [DcgmCacheManager::AppendEntityDouble] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity double eg 1, eid 0, fieldId 162, ts 1738492841366763, value1 72, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6431] [DcgmCacheManager::AppendEntityDouble] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity double eg 1, eid 0, fieldId 161, ts 1738492841366774, value1 40, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6431] [DcgmCacheManager::AppendEntityDouble] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity blob eg 1, eid 0, fieldId 130, ts 1738492841366781, valueSize 1844, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6637] [DcgmCacheManager::AppendEntityBlob] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 54, ts 1738492841369602, value "GPU-6294038a-1619-f07a-507f-2753da483223", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 85, ts 1738492841369610, value "95.04.29.00.07", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 82, ts 1738492841369780, value "G193.0202.00.01", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 51, ts 1738492841369794, value "NVIDIA", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG | |
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 50, ts 1738492841369800, value "NVIDIA L4", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 53, ts 1738492841369814, value "1325222007319", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 57, ts 1738492841369828, value "00000000:00:03.0", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 58, ts 1738492841369839, value1 666374366, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 59, ts 1738492841369860, value1 384700638, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 90, ts 1738492841369873, value1 32768, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 250, ts 1738492841370089, value1 23034, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 252, ts 1738492841370112, value1 17196, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 251, ts 1738492841370128, value1 5397, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Fixed entityGroupId to be DCGM_FE_NONE fieldId 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5697] [DcgmCacheManager::GetMultipleLatestLiveSamples] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity string eg 1, eid 0, fieldId 1, ts 1738492841370147, value "560.35.03", cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6582] [DcgmCacheManager::AppendEntityString] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 500, ts 1738492841370153, value1 1, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 66, ts 1738492841370158, value1 0, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 67, ts 1738492841370193, value1 0, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 74, ts 1738492841370200, value1 0, value2 0, cached 0, buffered 1 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:97] [dcgmGetDeviceAttributes] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGetAllSupportedDevices(dcgmHandle_t pDcgmHandle, unsigned int gpuIdList[DCGM_MAX_NUM_DEVICES], int *count) (2147483647 0xc000420080 0xc0004080f0) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:89] [dcgmGetAllSupportedDevices] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:89] [dcgmGetAllSupportedDevices] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmFieldGroupCreate(dcgmHandle_t pDcgmHandle, int numFieldIds, unsigned short *fieldIds, const char *fieldGroupName, dcgmFieldGrp_t *dcgmFieldGroupId) (2147483647 4, 0xc000408110, cpuAffFields3277177236626796047, 0xc000408140) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:586] [dcgmFieldGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Added field group id 4, name cpuAffFields3277177236626796047, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:172] [DcgmFieldGroupManager::AddFieldGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:586] [dcgmFieldGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGroupCreate(dcgmHandle_t pDcgmHandle, dcgmGroupType_t type, const char *groupName, dcgmGpuGrp_t *pDcgmGrpId) (2147483647 1 cpuAff12333533104354541714 0xc000408150) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:177] [dcgmGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Added GroupId 2 name cpuAff12333533104354541714 for connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:273] [DcgmGroupManager::AddNewGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:177] [dcgmGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGroupAddDevice(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId, unsigned int gpuId) (2147483647 2 0) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:193] [dcgmGroupAddDevice] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddEntityToGroup groupId 2, eg 1, eid 0 added to the group [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:683] [DcgmGroupInfo::AddEntityToGroup] dcgm_level=INFO
2025/02/02 10:40:41 INFO groupId 2 added eg 1, eid 0. ret 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:490] [DcgmGroupManager::AddEntityToGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:193] [dcgmGroupAddDevice] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmWatchFields(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId, dcgmFieldGrp_t fieldGroupId, long long updateFreq, double maxKeepAge, int maxKeepSamples) (2147483647 2, 4, 30000000, 0, 1) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:562] [dcgmWatchFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Got 1 entities and 4 fields [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:3704] [DcgmHostEngineHandler::WatchFieldGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1004600000000 (eg 1, entityId 0, fieldId 70) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding new watcher type 0, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 33000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 70, mfu 30000000, msa 0, mka 1, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1004700000000 (eg 1, entityId 0, fieldId 71) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding new watcher type 0, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 33000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 71, mfu 30000000, msa 0, mka 1, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1004800000000 (eg 1, entityId 0, fieldId 72) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding new watcher type 0, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 33000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 72, mfu 30000000, msa 0, mka 1, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x1004900000000 (eg 1, entityId 0, fieldId 73) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding new watcher type 0, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 33000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 73, mfu 30000000, msa 0, mka 1, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x7a3808003790, eg 1, eid 0, fieldId 70 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 70, ts 1738492841370955, value1 255, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x7a3808003800, eg 1, eid 0, fieldId 71 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 71, ts 1738492841377385, value1 0, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x7a38080038d0, eg 1, eid 0, fieldId 72 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 72, ts 1738492841377401, value1 0, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x7a38080039a0, eg 1, eid 0, fieldId 73 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 73, ts 1738492841377415, value1 0, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO DoOneUpdateAllFields returned 1738492871355462 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2503] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:562] [dcgmWatchFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmUpdateAllFields(dcgmHandle_t pDcgmHandle, int waitForUpdate) (2147483647 1) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:409] [dcgmUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO DoOneUpdateAllFields returned 1738492871355462 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2503] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:409] [dcgmUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGetLatestValuesForFields(dcgmHandle_t pDcgmHandle, int gpuId, unsigned short fieldIds[], unsigned int count, dcgmFieldValue_v1 values[]) (2147483647 0 0xc000408110 4 0xc0003d8000) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:337] [dcgmGetLatestValuesForFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:337] [dcgmGetLatestValuesForFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGroupDestroy(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId) (2147483647 2) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:186] [dcgmGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Unknown subcommand: 3 [/builds/dcgm/dcgm/modules/core/DcgmModuleCore.cpp:267] [DcgmModuleCore::ProcessMessage] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[NvSwitch]] Unknown subcommand: 3 [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:431] [DcgmNs::DcgmModuleNvSwitch::ProcessCoreMessage] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Removed GroupId 2 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:320] [DcgmGroupManager::RemoveGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:186] [dcgmGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmFieldGroupDestroy(dcgmHandle_t pDcgmHandle, dcgmFieldGrp_t dcgmFieldGroupId) (2147483647 4) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:600] [dcgmFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO dcgmFieldGroupDestroy fieldGroupId 0x4 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:2729] [tsapiFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Removed field group 4 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:237] [DcgmFieldGroupManager::RemoveFieldGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO tsapiFieldGroupDestroy ret 0, fieldGroupId 0x4 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:2744] [tsapiFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:600] [dcgmFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGetDeviceTopology(dcgmHandle_t pDcgmHandle, unsigned int gpuId, dcgmDeviceTopology_t *deviceTopology) (2147483647 0 0xc0001821c0) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:715] [dcgmGetDeviceTopology] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO GetGroupEntities got 1 entities for dynamic group 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:411] [DcgmGroupManager::GetGroupEntities] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Fixing entityGroupId for global field [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:4592] [DcgmCacheManager::GetLatestSample] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO watch key eg 0, eid 0, fieldId 60 doesn't exist. createIfNotExists == false [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2246] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO PrecheckWatchInfoForSamples: not watched [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3758] [DcgmCacheManager::PrecheckWatchInfoForSamples] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Fixed entityGroupId to be DCGM_FE_NONE fieldId 60 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5697] [DcgmCacheManager::GetMultipleLatestLiveSamples] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Two devices not detected on this system [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:10656] [DcgmCacheManager::BufferOrCacheLatestGpuValue] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Error: unable to retrieve PCIe topology information: Success [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:617] [DcgmHostEngineHandler::HelperGetTopologyIO] dcgm_level=ERROR
2025/02/02 10:40:41 INFO helperGetTopologyPci returned 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:3292] [helperGetTopologyPci] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO helperGetTopologyPci returned -6 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:4413] [tsapiEngineGetDeviceTopology] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning -6 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:715] [dcgmGetDeviceTopology] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmFieldGroupCreate(dcgmHandle_t pDcgmHandle, int numFieldIds, unsigned short *fieldIds, const char *fieldGroupName, dcgmFieldGrp_t *dcgmFieldGroupId) (2147483647 2, 0xc000408158, pciBandwidthFields144373375743944463, 0xc000408168) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:586] [dcgmFieldGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Added field group id 5, name pciBandwidthFields144373375743944463, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:172] [DcgmFieldGroupManager::AddFieldGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:586] [dcgmFieldGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGroupCreate(dcgmHandle_t pDcgmHandle, dcgmGroupType_t type, const char *groupName, dcgmGpuGrp_t *pDcgmGrpId) (2147483647 1 pciBandwidth11288183673022212047 0xc000408178) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:177] [dcgmGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Added GroupId 3 name pciBandwidth11288183673022212047 for connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:273] [DcgmGroupManager::AddNewGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:177] [dcgmGroupCreate] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGroupAddDevice(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId, unsigned int gpuId) (2147483647 3 0) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:193] [dcgmGroupAddDevice] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddEntityToGroup groupId 3, eg 1, eid 0 added to the group [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:683] [DcgmGroupInfo::AddEntityToGroup] dcgm_level=INFO
2025/02/02 10:40:41 INFO groupId 3 added eg 1, eid 0. ret 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:490] [DcgmGroupManager::AddEntityToGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:193] [dcgmGroupAddDevice] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmWatchFields(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId, dcgmFieldGrp_t fieldGroupId, long long updateFreq, double maxKeepAge, int maxKeepSamples) (2147483647 3, 5, 30000000, 0, 1) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:562] [dcgmWatchFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Got 1 entities and 2 fields [/builds/dcgm/dcgm/dcgmlib/src/DcgmHostEngineHandler.cpp:3704] [DcgmHostEngineHandler::WatchFieldGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x100eb00000000 (eg 1, entityId 0, fieldId 235) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding new watcher type 0, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 33000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 235, mfu 30000000, msa 0, mka 1, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding WatchInfo on entityKey 0x100ec00000000 (eg 1, entityId 0, fieldId 236) [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2254] [DcgmCacheManager::GetEntityWatchInfo] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Adding new watcher type 0, connectionId 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3018] [DcgmCacheManager::AddOrUpdateWatcher] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO UpdateWatchFromWatchers minMonitorFreqUsec 30000000, minMaxAgeUsec 33000000, hsw 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3065] [DcgmCacheManager::UpdateWatchFromWatchers] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO AddFieldWatch eg 1, eid 0, fieldId 236, mfu 30000000, msa 0, mka 1, sfu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:3206] [DcgmCacheManager::AddEntityFieldWatch] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x7a3808003ae0, eg 1, eid 0, fieldId 235 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 235, ts 1738492841380619, value1 3, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Preparing to update watchInfo 0x7a3808003b50, eg 1, eid 0, fieldId 236 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:5496] [DcgmCacheManager::ActuallyUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Checking status for gpu 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2396] [DcgmCacheManager::GetGpuStatus] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Appended entity i64 eg 1, eid 0, fieldId 236, ts 1738492841380886, value1 16, value2 0, cached 1, buffered 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:6528] [DcgmCacheManager::AppendEntityInt64] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO DoOneUpdateAllFields returned 1738492871355462 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2503] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:562] [dcgmWatchFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmUpdateAllFields(dcgmHandle_t pDcgmHandle, int waitForUpdate) (2147483647 1) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:409] [dcgmUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO DoOneUpdateAllFields returned 1738492871355462 [/builds/dcgm/dcgm/dcgmlib/src/DcgmCacheManager.cpp:2503] [DcgmCacheManager::UpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:409] [dcgmUpdateAllFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGetLatestValuesForFields(dcgmHandle_t pDcgmHandle, int gpuId, unsigned short fieldIds[], unsigned int count, dcgmFieldValue_v1 values[]) (2147483647 0 0xc000408158 2 0xc0003ea000) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:337] [dcgmGetLatestValuesForFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:337] [dcgmGetLatestValuesForFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmFieldGroupDestroy(dcgmHandle_t pDcgmHandle, dcgmFieldGrp_t dcgmFieldGroupId) (2147483647 5) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:600] [dcgmFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO dcgmFieldGroupDestroy fieldGroupId 0x5 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:2729] [tsapiFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Removed field group 5 [/builds/dcgm/dcgm/dcgmlib/src/DcgmFieldGroup.cpp:237] [DcgmFieldGroupManager::RemoveFieldGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO tsapiFieldGroupDestroy ret 0, fieldGroupId 0x5 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:2744] [tsapiFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:600] [dcgmFieldGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGroupDestroy(dcgmHandle_t pDcgmHandle, dcgmGpuGrp_t groupId) (2147483647 3) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:186] [dcgmGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Unknown subcommand: 3 [/builds/dcgm/dcgm/modules/core/DcgmModuleCore.cpp:267] [DcgmModuleCore::ProcessMessage] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[NvSwitch]] Unknown subcommand: 3 [/builds/dcgm/dcgm/modules/nvswitch/DcgmModuleNvSwitch.cpp:431] [DcgmNs::DcgmModuleNvSwitch::ProcessCoreMessage] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Removed GroupId 3 [/builds/dcgm/dcgm/dcgmlib/src/DcgmGroupManager.cpp:320] [DcgmGroupManager::RemoveGroup] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:186] [dcgmGroupDestroy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGetGpuInstanceHierarchy(dcgmHandle_t dcgmHandle, dcgmMigHierarchy_v2 *hierarchy) (2147483647 0xc000016000) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:119] [dcgmGetGpuInstanceHierarchy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Got total GPUs/GPU Instances/GPU Compute Instances back: 0. dcgmReturn: 0 [/builds/dcgm/dcgm/dcgmlib/src/DcgmApi.cpp:3504] [tsapiGetGpuInstanceHierarchy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:119] [dcgmGetGpuInstanceHierarchy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Initializing system entities of type 'NvSwitch'
2025/02/02 10:40:41 INFO Entering dcgmGetEntityGroupEntities(dcgmHandle_t dcgmHandle, dcgm_field_entity_group_t entityGroup, dcgm_field_eid_t *entities, int *numEntities, unsigned int flags) (2147483647 3 0xc000420100, 0xc000408290, x0) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:105] [dcgmGetEntityGroupEntities] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[NvSwitch]] No fields to update [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvSwitchManagerBase.cpp:344] [DcgmNs::DcgmNvSwitchManagerBase::UpdateFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Not collecting NvSwitch metrics; no switches to monitor
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:105] [dcgmGetEntityGroupEntities] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Initializing system entities of type 'NvLink'
2025/02/02 10:40:41 INFO Entering dcgmGetEntityGroupEntities(dcgmHandle_t dcgmHandle, dcgm_field_entity_group_t entityGroup, dcgm_field_eid_t *entities, int *numEntities, unsigned int flags) (2147483647 3 0xc000420180, 0xc000408358, x0) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:105] [dcgmGetEntityGroupEntities] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Not collecting NvLink metrics; no switches to monitor
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:105] [dcgmGetEntityGroupEntities] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Initializing system entities of type 'CPU'
2025/02/02 10:40:41 INFO [[NvSwitch]] No fields to update [/builds/dcgm/dcgm/modules/nvswitch/DcgmNvSwitchManagerBase.cpp:344] [DcgmNs::DcgmNvSwitchManagerBase::UpdateFields] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmGetCpuHierarchy(dcgmHandle_t dcgmHandle, dcgmCpuHierarchy_v1 *cpuHierarchy) (2147483647 0xc0003f8000) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:147] [dcgmGetCpuHierarchy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Entering dcgmModuleIdToName(dcgmModuleId_t id, char const **name) (9, 0x7a382cc6d4a8) [/builds/dcgm/dcgm/dcgmlib/entry_point.h:857] [dcgmModuleIdToName] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO Returning 0 [/builds/dcgm/dcgm/dcgmlib/entry_point.h:857] [dcgmModuleIdToName] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] Initialized logging for module 9 [/builds/dcgm/dcgm/modules/DcgmModule.h:90] [DcgmModuleWithCoreProxy<moduleId>::DcgmModuleWithCoreProxy] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] failed to open file [/sys/devices/soc0/soc_id] [/builds/dcgm/dcgm/common/FileSystemOperator.cpp:32] [FileSystemOperator::Read] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] fail to read file content of path: /sys/devices/soc0/soc_id [/builds/dcgm/dcgm/common/CpuHelpers.cpp:68] [CpuHelpers::ReadCpuVendorAndModel] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] Constructing Sysmon Module [/builds/dcgm/dcgm/modules/sysmon/DcgmModuleSysmon.cpp:56] [DcgmNs::DcgmModuleSysmon::DcgmModuleSysmon] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] failed to open file [/sys/devices/soc0/soc_id] [/builds/dcgm/dcgm/common/FileSystemOperator.cpp:32] [FileSystemOperator::Read] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] fail to read file content of path: /sys/devices/soc0/soc_id [/builds/dcgm/dcgm/common/CpuHelpers.cpp:68] [CpuHelpers::ReadCpuVendorAndModel] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] Child process (36) terminated successfully. [/builds/dcgm/dcgm/common/DcgmUtilities.cpp:844] [DcgmNs::Utils::RunCmdAndGetOutput] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] Error running cmd '/usr/bin/lshw -json': dcgm_level=ERROR
2025/02/02 10:40:41 INFO [[SysMon]] Child process (37) terminated successfully. [/builds/dcgm/dcgm/common/DcgmUtilities.cpp:844] [DcgmNs::Utils::RunCmdAndGetOutput] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] Error running cmd '/usr/sbin/lshw -json': dcgm_level=ERROR
2025/02/02 10:40:41 INFO [[SysMon]] failed to get serials from lshw. [/builds/dcgm/dcgm/common/CpuHelpers.cpp:160] [CpuHelpers::GetCpuSerials] dcgm_level=DEBUG
2025/02/02 10:40:41 INFO [[SysMon]] Could not retrieve serial numbers for CPUs [/builds/dcgm/dcgm/modules/sysmon/DcgmModuleSysmon.cpp:605] [DcgmNs::DcgmModuleSysmon::PopulateCpusIfNeeded] dcgm_level=WARN
2025/02/02 10:40:41 INFO [[SysMon]] Child process (38) terminated successfully. [/builds/dcgm/dcgm/common/DcgmUtilities.cpp:844] [DcgmNs::Utils::RunCmdAndGetOutput] dcgm_level=DEBUG
SIGSEGV: segmentation violation
PC=0x7a38771506aa m=9 sigcode=1 addr=0x4
signal arrived during cgo execution
goroutine 1 gp=0xc0000061c0 m=9 mp=0xc000580008 [syscall]:
runtime.cgocall(0x1975150, 0xc0006cc148)
/usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc0006cc120 sp=0xc0006cc0e8 pc=0x4195cb
github.com/NVIDIA/go-dcgm/pkg/dcgm._Cfunc_dcgmGetCpuHierarchy(0x7fffffff, 0xc0003f8000)
_cgo_gotypes.go:1178 +0x4b fp=0xc0006cc148 sp=0xc0006cc120 pc=0x7f0c8b
github.com/NVIDIA/go-dcgm/pkg/dcgm.GetCpuHierarchy()
/go/pkg/mod/github.com/!n!v!i!d!i!a/[email protected]/pkg/dcgm/cpu.go:42 +0x6b fp=0xc0006ccdf8 sp=0xc0006cc148 pc=0x7f3beb
github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider.dcgmProvider.GetCpuHierarchy(...)
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider/dcgm.go:163
github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider.(*dcgmProvider).GetCpuHierarchy(_)
<autogenerated>:1 +0x7c fp=0xc0006cd028 sp=0xc0006ccdf8 pc=0x17a1b7c
github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo.(*Info).initializeCPUInfo(0xc000448f08, {0x1, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}})
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo/device_info.go:196 +0x9f fp=0xc0006cd3d0 sp=0xc0006cd028 pc=0x17a393f
github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo.Initialize({0x1, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}}, {0x1, {0x0, 0x0, ...}, ...}, ...)
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo/device_info.go:108 +0x349 fp=0xc0006cd440 sp=0xc0006cd3d0 pc=0x17a2cc9
github.com/NVIDIA/dcgm-exporter/internal/pkg/devicewatchlistmanager.(*WatchListManager).CreateEntityWatchList(0xc000520270, 0x7, {0x2140538, 0x33d9600}, 0x7530)
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/devicewatchlistmanager/device_watchlist_manager.go:131 +0x458 fp=0xc0006cd788 sp=0xc0006cd440 pc=0x17cc358
github.com/NVIDIA/dcgm-exporter/pkg/cmd.startDeviceWatchListManager(0xc0002dce40, 0xc000103880)
/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:412 +0x305 fp=0xc0006cd878 sp=0xc0006cd788 pc=0x1972bc5
github.com/NVIDIA/dcgm-exporter/pkg/cmd.startDCGMExporter(0xc00013e600, 0xc00051fce0)
/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:346 +0x34b fp=0xc0006cda70 sp=0xc0006cd878 pc=0x1971f2b
github.com/NVIDIA/dcgm-exporter/pkg/cmd.action.func1()
/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:304 +0x5b fp=0xc0006cdac0 sp=0xc0006cda70 pc=0x19719db
github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture({0x2157448, 0xc000216050}, 0xc000515b78)
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:76 +0x1e6 fp=0xc0006cdb50 sp=0xc0006cdac0 pc=0x196f406
github.com/NVIDIA/dcgm-exporter/pkg/cmd.action(0xc00013e600)
/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:295 +0x67 fp=0xc0006cdba8 sp=0xc0006cdb50 pc=0x1971947
github.com/NVIDIA/dcgm-exporter/pkg/cmd.NewApp.func1(0xc00013e600?)
/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:276 +0x13 fp=0xc0006cdbc0 sp=0xc0006cdba8 pc=0x1974c53
github.com/urfave/cli/v2.(*Command).Run(0xc00023a160, 0xc00013e600, {0xc000052120, 0x3, 0x3})
/go/pkg/mod/github.com/urfave/cli/[email protected]/command.go:279 +0x97d fp=0xc0006cde48 sp=0xc0006cdbc0 pc=0x818ffd
github.com/urfave/cli/v2.(*App).RunContext(0xc000144600, {0x21571e0, 0x33d9600}, {0xc000052120, 0x3, 0x3})
/go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:337 +0x58b fp=0xc0006cdea8 sp=0xc0006cde48 pc=0x81588b
github.com/urfave/cli/v2.(*App).Run(0xc000515f30?, {0xc000052120?, 0x1?, 0x48453a?})
/go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:311 +0x2f fp=0xc0006cdee8 sp=0xc0006cdea8 pc=0x8152af
main.main()
/go/src/github.com/NVIDIA/dcgm-exporter/cmd/dcgm-exporter/main.go:32 +0x5f fp=0xc0006cdf50 sp=0xc0006cdee8 pc=0x1974d7f
runtime.main()
/usr/local/go/src/runtime/proc.go:271 +0x29d fp=0xc0006cdfe0 sp=0xc0006cdf50 pc=0x45185d
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0006cdfe8 sp=0xc0006cdfe0 pc=0x4848e1
goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000090fa8 sp=0xc000090f88 pc=0x451c8e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:408
runtime.forcegchelper()
/usr/local/go/src/runtime/proc.go:326 +0xb3 fp=0xc000090fe0 sp=0xc000090fa8 pc=0x451b13
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000090fe8 sp=0xc000090fe0 pc=0x4848e1
created by runtime.init.6 in goroutine 1
/usr/local/go/src/runtime/proc.go:314 +0x1a
goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000091780 sp=0xc000091760 pc=0x451c8e | |
runtime.goparkunlock(...) | |
/usr/local/go/src/runtime/proc.go:408 | |
runtime.bgsweep(0xc000062070) | |
/usr/local/go/src/runtime/mgcsweep.go:318 +0xdf fp=0xc0000917c8 sp=0xc000091780 pc=0x43c33f | |
runtime.gcenable.gowrap1() | |
/usr/local/go/src/runtime/mgc.go:203 +0x25 fp=0xc0000917e0 sp=0xc0000917c8 pc=0x430c45 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000917e8 sp=0xc0000917e0 pc=0x4848e1 | |
created by runtime.gcenable in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:203 +0x66 | |
goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]: | |
runtime.gopark(0x10000?, 0x21305b0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000091f78 sp=0xc000091f58 pc=0x451c8e | |
runtime.goparkunlock(...) | |
/usr/local/go/src/runtime/proc.go:408 | |
runtime.(*scavengerState).park(0x3377600) | |
/usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000091fa8 sp=0xc000091f78 pc=0x439ce9 | |
runtime.bgscavenge(0xc000062070) | |
/usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000091fc8 sp=0xc000091fa8 pc=0x43a299 | |
runtime.gcenable.gowrap2() | |
/usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc000091fe0 sp=0xc000091fc8 pc=0x430be5 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000091fe8 sp=0xc000091fe0 pc=0x4848e1 | |
created by runtime.gcenable in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:204 +0xa5 | |
goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]: | |
runtime.gopark(0x0?, 0x1f9cc40?, 0x60?, 0xa0?, 0x2000000020?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000090620 sp=0xc000090600 pc=0x451c8e | |
runtime.runfinq() | |
/usr/local/go/src/runtime/mfinal.go:194 +0x107 fp=0xc0000907e0 sp=0xc000090620 pc=0x42fc87 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000907e8 sp=0xc0000907e0 pc=0x4848e1 | |
created by runtime.createfing in goroutine 1 | |
/usr/local/go/src/runtime/mfinal.go:164 +0x3d | |
goroutine 22 gp=0xc000203a40 m=nil [GC worker (idle)]: | |
runtime.gopark(0xc0000927a8?, 0x41b6eb?, 0xf7?, 0xaa?, 0xc00012d0e0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000092750 sp=0xc000092730 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000927e0 sp=0xc000092750 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000927e8 sp=0xc0000927e0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 9 gp=0xc000203c00 m=nil [GC worker (idle)]: | |
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000092f50 sp=0xc000092f30 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc000092fe0 sp=0xc000092f50 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000092fe8 sp=0xc000092fe0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 35 gp=0xc00038c000 m=nil [GC worker (idle)]: | |
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00008c750 sp=0xc00008c730 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc00008c7e0 sp=0xc00008c750 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00008c7e8 sp=0xc00008c7e0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 23 gp=0xc0001028c0 m=nil [GC worker (idle)]: | |
runtime.gopark(0x4967b0d3958?, 0x0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0004d2750 sp=0xc0004d2730 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0004d27e0 sp=0xc0004d2750 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0004d27e8 sp=0xc0004d27e0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 10 gp=0xc000203dc0 m=nil [GC worker (idle)]: | |
runtime.gopark(0x33db080?, 0x1?, 0xa7?, 0x42?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000093750 sp=0xc000093730 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000937e0 sp=0xc000093750 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000937e8 sp=0xc0000937e0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 11 gp=0xc000364000 m=nil [GC worker (idle)]: | |
runtime.gopark(0x4967b0c9dfe?, 0x1?, 0xea?, 0x89?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000093f50 sp=0xc000093f30 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc000093fe0 sp=0xc000093f50 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000093fe8 sp=0xc000093fe0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 24 gp=0xc000102a80 m=nil [GC worker (idle)]: | |
runtime.gopark(0x4967b0aced9?, 0x0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc0004d2f50 sp=0xc0004d2f30 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0004d2fe0 sp=0xc0004d2f50 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0004d2fe8 sp=0xc0004d2fe0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 36 gp=0xc00038c1c0 m=nil [GC worker (idle)]: | |
runtime.gopark(0x4967b0b558f?, 0x0?, 0x0?, 0x0?, 0x0?) | |
/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00008cf50 sp=0xc00008cf30 pc=0x451c8e | |
runtime.gcBgMarkWorker() | |
/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc00008cfe0 sp=0xc00008cf50 pc=0x432d25 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00008cfe8 sp=0xc00008cfe0 pc=0x4848e1 | |
created by runtime.gcBgMarkStartWorkers in goroutine 1 | |
/usr/local/go/src/runtime/mgc.go:1234 +0x1c | |
goroutine 12 gp=0xc0001036c0 m=nil [runnable]: | |
runtime.concatstring2(0xc000621ea8?, {0xc0005cbb60?, 0x55?}, {0x2130600?, 0x1?}) | |
/usr/local/go/src/runtime/string.go:59 +0x6e fp=0xc000621e60 sp=0xc000621e58 pc=0x46bc0e | |
github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture.func2() | |
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:65 +0x245 fp=0xc000621fe0 sp=0xc000621e60 pc=0x196f705 | |
runtime.goexit({}) | |
/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000621fe8 sp=0xc000621fe0 pc=0x4848e1 | |
created by github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture in goroutine 1 | |
/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:57 +0x1d9 | |
rax 0x0 | |
rbx 0x7a380c1fa740 | |
rcx 0x7a3877183a1c | |
rdx 0x1 | |
rdi 0x0 | |
rsi 0x7a3808001e00 | |
rbp 0x4 | |
rsp 0x7a382cc6d000 | |
r8 0x90800 | |
r9 0x7a3808001e00 | |
r10 0x0 | |
r11 0x287 | |
r12 0xffffffffffffff78 | |
r13 0x2 | |
r14 0x0 | |
r15 0x7a382cc6d1f0 | |
rip 0x7a38771506aa | |
rflags 0x10206 | |
cs 0x33 | |
fs 0x0 | |
gs 0x0 |