Sally O'Malley sallyom

llm-d Observability: PromQL Queries and Trace Spans

This table maps each observability need to a PromQL query and to the trace spans that complement it:

Tier 1: Immediate Failure & Saturation Indicators

| Metric Need | PromQL Query | Trace Spans to Enhance Insight |
| --- | --- | --- |
| Overall Error Rate (Platform-wide) | `sum(rate(inference_model_request_error_total[5m])) / sum(rate(inference_model_request_total[5m]))` | `gateway.request` with error status codes and error messages |
| Per-Model Error Rate | `sum by(model) (rate(inference_model_request_error_total[5m])) / sum by(model) (rate(inference_model_request_total[5m]))` | `gateway.request` with `gen_ai.request.model` attribute |
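
Both error-rate rows share one shape: the rate of failed requests divided by the rate of all requests, optionally grouped by a label. As a rough sketch, the query text could be generated and sent to Prometheus as shown here. The function names `error_rate_query` and `query_prometheus`, the `PROM_URL` variable, and its `localhost:9090` default are assumptions made for illustration and are not part of the gist; only the metric names and the `/api/v1/query` endpoint come from the table and the Prometheus HTTP API.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the gist): print the error-rate PromQL
# expression, optionally grouped by a label such as "model".
error_rate_query() {
  local label="${1:-}"      # empty means platform-wide
  local window="${2:-5m}"   # rate window; 5m matches the table
  local agg="sum"
  if [ -n "$label" ]; then
    agg="sum by(${label})"
  fi
  printf '%s (rate(inference_model_request_error_total[%s])) / %s (rate(inference_model_request_total[%s]))\n' \
    "$agg" "$window" "$agg" "$window"
}

# Hypothetical helper: run an instant query against Prometheus.
# PROM_URL is an assumed variable; the default is only a placeholder.
query_prometheus() {
  curl -sS --get "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, `query_prometheus "$(error_rate_query model)"` would request the per-model ratio, while `error_rate_query "" 1m` prints the platform-wide query over a one-minute window.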
@sallyom
sallyom / llm-d-metrics-summary.md
Last active July 12, 2025 00:56
llm-d-metrics-overview

llm-d Metrics Documentation

This document provides an overview of all metrics generated by the llm-d components.

Overview

The llm-d system uses Prometheus as the primary metrics collection framework, with metrics covering inference performance, resource utilization, error rates, and energy consumption across multiple components.

Component Metrics

See the tutorial below.
@sallyom
sallyom / build-kn.sh
Created October 19, 2023 15:13
binary-image-ex
#!/usr/bin/env bash
set -o errexit
# Create a container
container=$(buildah from alpine)
# Run this from wherever the built binaries are available
buildah config --label maintainer="Sally O'Malley <[email protected]>" "$container"
@sallyom
sallyom / cluster-monitoring-resources.yaml
Created September 11, 2023 16:11
role,rolebinding,service-monitor
apiVersion: v1
kind: Namespace
metadata:
  name: rekor-system
  labels:
    openshift.io/cluster-monitoring: "true"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
@sallyom
sallyom / describe-node
Last active December 16, 2022 14:45
oc describe node
$ df -Th
Filesystem            Type      Size  Used Avail Use% Mounted on
devtmpfs              devtmpfs  3.8G     0  3.8G   0% /dev
tmpfs                 tmpfs     3.8G     0  3.8G   0% /dev/shm
tmpfs                 tmpfs     3.8G   25M  3.8G   1% /run
/dev/mapper/rhel-root xfs        70G  5.4G   64G   8% /
/dev/vda1             xfs       794M  214M  581M  27% /boot
/dev/vda2             vfat      200M  8.0K  200M   1% /boot/efi
tmpfs                 tmpfs     777M     0  777M   0% /run/user/1000
----------------------------------------------------------
@sallyom
sallyom / e2e-test.sh
Last active May 9, 2022 21:18
test script to run sigstore integration tests with skopeo's `make sigstore-testenv-up`
#!/bin/bash
### Run `make sigstore-testenv-up` from local checkout of containers/skopeo before running this script.
### This script must run from local checkout of sigstore/sigstore
set -ex
echo "running tests"
export VAULT_TOKEN=testtoken
# This is an example configTarget that is used in testing.
# Harpoon will start with this config then will load targets from ./examples/config-reload.yaml
targets:
- name: config
  methods:
    configTarget:
      configUrl: https://raw.githubusercontent.com/sallyom/harpoon/config-upload/examples/config-reload.yaml
      schedule: "*/1 * * * *"
@sallyom
sallyom / section-in-bashrc
Last active March 21, 2022 17:09
fancy git terminal prompt to place in ~/.bashrc
# Helper function for fancy git prompt.
# Place this in ~/.bashrc
# Then run `source ~/.bashrc` to load the prompt function into the current shell
# (otherwise, ~/.bashrc changes only take effect in newly started shells).
# Lines 7-38 go in ~/.bashrc
function parse_git_branch {
  git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}
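
The `sed` filter in `parse_git_branch` drops every line of `git branch` output that does not start with `*`, then rewrites the current-branch line as `(name)`. The sketch below separates that filter so it can be tried on sample input, and shows one way the function might be wired into a prompt. `format_branch` is a hypothetical name, and the `PS1` value is only an assumption: the gist's actual prompt definition is not visible in this preview.

```shell
# Same sed filter as parse_git_branch, reading the branch list from stdin
# so it can be exercised outside a repository. format_branch is a
# hypothetical name, not part of the gist.
format_branch() {
  sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}

# parse_git_branch expressed in terms of the filter above.
parse_git_branch() {
  git branch --no-color 2> /dev/null | format_branch
}

# Assumed minimal prompt: working directory, then the branch in parentheses.
# Single quotes defer the command substitution until the prompt is drawn.
PS1='\w $(parse_git_branch)\$ '
```

Feeding it two lines, `printf '  main\n* feature\n' | format_branch`, prints `(feature)`.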
@sallyom
sallyom / microshift-install-volsync
Last active December 17, 2021 15:37
microshift instance, configure storage and install volsync operator
mkdir -p ~/.kube && sudo cp /var/lib/microshift/resources/kubeadmin/kubeconfig ~/.kube/config
sudo chown -R ec2-user:ec2-user ~/.kube
curl -o oc.tar.gz https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/linux/oc.tar.gz
tar -xzvf oc.tar.gz
sudo install -t /usr/local/bin oc
rm oc.tar.gz oc kubectl README.md
SNAPSHOTTER_VERSION=v4.2.1
# Change to the latest supported snapshotter version