Sally O'Malley sallyom

llm-d Observability: PromQL Queries and Trace Spans

The tables below map each observability need to a PromQL query and to the trace spans that complement it:

Tier 1: Immediate Failure & Saturation Indicators

Metric Need	PromQL Query	Trace Spans to Enhance Insight
Overall Error Rate (Platform-wide)	`sum(rate(inference_model_request_error_total[5m])) / sum(rate(inference_model_request_total[5m]))`	`gateway.request` with error status codes and error messages
Per-Model Error Rate	`sum by(model) (rate(inference_model_request_error_total[5m])) / sum by(model) (rate(inference_model_request_total[5m]))`	`gateway.request` with `gen_ai.request.model` attribute
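The two queries above differ only in their aggregation clause, so they can be generated from one template. The following is a minimal sketch, not part of llm-d: the helper name `error_rate_query` and its arguments are hypothetical, and only the metric names come from the table.

```shell
# Hypothetical helper: print the error-ratio PromQL expression from the table.
# $1: rate window (defaults to 5m)
# $2: optional grouping label, e.g. "model" for the per-model variant
error_rate_query() {
  window="${1:-5m}"
  agg="sum"
  if [ -n "${2:-}" ]; then
    agg="sum by($2)"
  fi
  # Same expression as in the table: errors divided by total requests.
  printf '%s (rate(inference_model_request_error_total[%s])) / %s (rate(inference_model_request_total[%s]))\n' \
    "$agg" "$window" "$agg" "$window"
}
```

Assuming a Prometheus server is reachable at an address stored in a variable such as `PROM_URL` (an assumption, not something the source specifies), the result could be sent to the instant-query endpoint with `curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$(error_rate_query 5m model)"`.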

llm-d Metrics Documentation

This document provides an overview of all metrics generated by the llm-d components.

Overview

The llm-d system uses Prometheus as the primary metrics collection framework, with metrics covering inference performance, resource utilization, error rates, and energy consumption across multiple components.

	#!/usr/bin/env bash

	set -o errexit

	# Create a container
	container=$(buildah from alpine)

	# Run this from wherever the built binaries are available
	buildah config --label maintainer="Sally O'Malley <[email protected]>" "$container"

	apiVersion: v1
	kind: Namespace
	metadata:
	  name: rekor-system
	  labels:
	    openshift.io/cluster-monitoring: "true"
	---
	apiVersion: rbac.authorization.k8s.io/v1
	kind: Role
	metadata:

	$ df -Th
	Filesystem Type Size Used Avail Use% Mounted on
	devtmpfs devtmpfs 3.8G 0 3.8G 0% /dev
	tmpfs tmpfs 3.8G 0 3.8G 0% /dev/shm
	tmpfs tmpfs 3.8G 25M 3.8G 1% /run
	/dev/mapper/rhel-root xfs 70G 5.4G 64G 8% /
	/dev/vda1 xfs 794M 214M 581M 27% /boot
	/dev/vda2 vfat 200M 8.0K 200M 1% /boot/efi
	tmpfs tmpfs 777M 0 777M 0% /run/user/1000
	----------------------------------------------------------

	#!/bin/bash

	### Run `make sigstore-testenv-up` from local checkout of containers/skopeo before running this script.
	### This script must run from local checkout of sigstore/sigstore

	set -ex

	echo "running tests"

	export VAULT_TOKEN=testtoken

	# This is an example configTarget that is used in testing.
	# Harpoon will start with this config then will load targets from ./examples/config-reload.yaml
	targets:
	- name: config
	  methods:
	    configTarget:
	      configUrl: https://raw.githubusercontent.com/sallyom/harpoon/config-upload/examples/config-reload.yaml
	      schedule: "*/1 * * * *"

	# Helper function for fancy git prompt.
	# Place this in ~/.bashrc
	# Then `source ~/.bashrc` to load the prompt function into the current shell without rebooting
	# (otherwise, ~/.bashrc changes only take effect in new shells, for example after a reboot).
	# Lines 7-38 go in ~/.bashrc

	function parse_git_branch {
	  git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
	}
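A minimal sketch of wiring the helper into the prompt. The function is repeated so the block is self-contained, and the prompt layout (user, host, directory, branch) is an assumption rather than the author's original setting.

```shell
# Print the current branch as "(branch)", or nothing outside a git repository.
parse_git_branch() {
  # Keep only the line marked with "*" (the checked-out branch) and wrap its name in parentheses.
  git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}

# Single quotes defer the command substitution, so bash re-runs it each time the prompt is drawn.
PS1='\u@\h \W $(parse_git_branch)\$ '
```

With this in `~/.bashrc`, a shell inside a checkout of a branch named `main` would show a prompt ending in `(main)$`.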

	mkdir -p ~/.kube && sudo cp /var/lib/microshift/resources/kubeadmin/kubeconfig ~/.kube/config
	sudo chown -R ec2-user:ec2-user ~/.kube
	curl -o oc.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/linux/oc.tar.gz
	tar -xzvf oc.tar.gz
	sudo install -t /usr/local/bin oc
	rm oc.tar.gz oc kubectl README.md

	SNAPSHOTTER_VERSION=v4.2.1
	# Change to the latest supported snapshotter version

Component Metrics