Brent Salisbury (nerdalert)
2025-06-24T04:04:35.7285268Z Current runner version: '2.325.0'
2025-06-24T04:04:35.7315212Z ##[group]Operating System
2025-06-24T04:04:35.7316005Z Ubuntu
2025-06-24T04:04:35.7316512Z 24.04.2
2025-06-24T04:04:35.7316951Z LTS
2025-06-24T04:04:35.7317533Z ##[endgroup]
2025-06-24T04:04:35.7318061Z ##[group]Runner Image
2025-06-24T04:04:35.7318693Z Image: ubuntu-24.04
2025-06-24T04:04:35.7319275Z Version: 20250615.1.0
2025-06-24T04:04:35.7320471Z Included Software: https://github.com/actions/runner-images/blob/ubuntu24/20250615.1/images/ubuntu/Ubuntu2404-Readme.md

Request rates: 10, 30, inf (num prompts max 900)

The only difference between the commands is the metadata (deployment name used for graphing):

./run-bench.sh --model meta-llama/Llama-3.2-3B-Instruct \
  --base_url http://llm-d-inference-gateway.llm-d.svc.cluster.local:80 \
  --dataset-name random \
  --input-len 1000 \
  --output-len 500

$ ENV_METADATA_GPU="4xNVIDIA_L40S" \
  ./e2e-bench-control.sh --4xgpu-minikube --model meta-llama/Llama-3.2-3B-Instruct
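The three-rate sweep described above can be sketched as a simple loop. This is a dry-run sketch, not the actual harness: the flag names `--request-rate` and `--num-prompts` are assumptions (check `run-bench.sh` for the real ones), and the commands are echoed rather than executed.

```shell
# Hypothetical sweep over the request rates from the notes (10, 30, inf).
# Echoes each command instead of running it; adapt flags to run-bench.sh.
for rate in 10 30 inf; do
  echo "./run-bench.sh --model meta-llama/Llama-3.2-3B-Instruct" \
       "--request-rate $rate --num-prompts 900"
done
```

Replacing `echo` with a direct invocation would run the three benchmarks back to back with identical model settings, varying only the rate.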

🌟 LLM Deployment and Benchmark Orchestrator 🌟
-------------------------------------------------
--- Configuration Summary ---
Minikube Start Args (Hardcoded): --driver docker --container-runtime docker --gpus all --memory no-limit --cpus no-limit
LLMD Installer Script (Hardcoded): ./llmd-installer.sh
Test Request Script (Hardcoded): ./test-request.sh (Args: --minikube, Retry: 30s)
ubuntu@ip-172-31-16-33:~/secret-llm-d-deployer/project$ kubectl logs -n kgateway-system kgateway-7c58ddd989-nw5wc -c kgateway --previous --tail=200
{"level":"info","ts":"2025-05-17T18:01:08.979Z","caller":"probes/probes.go:57","msg":"probe server starting at :8765 listening for /healthz"}
{"level":"info","ts":"2025-05-17T18:01:08.979Z","caller":"setup/setup.go:69","msg":"got settings from env: {DnsLookupFamily:V4_PREFERRED EnableIstioIntegration:false EnableIstioAutoMtls:false IstioNamespace:istio-system XdsServiceName:kgateway XdsServicePort:9977 UseRustFormations:false EnableInferExt:true InferExtAutoProvision:false DefaultImageRegistry:cr.kgateway.dev/kgateway-dev DefaultImageTag:v2.0.0 DefaultImagePullPolicy:IfNotPresent}"}
{"level":"info","ts":"2025-05-17T18:01:08.980Z","logger":"k8s","caller":"setup/setup.go:110","msg":"starting kgateway"}
{"level":"info","ts":"2025-05-17T18:01:08.984Z","logger":"k8s","caller":"setup/setup.go:117","msg":"creating krt collections"}
{"level":"info","ts":"2025-05-17T18:01
#!/usr/bin/env bash
# -*- indent-tabs-mode: nil; tab-width: 4; sh-indentation: 4; -*-
set -euo pipefail
### GLOBALS ###
NAMESPACE="llm-d"
PROVISION_MINIKUBE=false
PROVISION_MINIKUBE_GPU=false
STORAGE_SIZE="15Gi"
#!/usr/bin/env python3
"""
transcribe_video_to_srt.py
Transcribe a video or audio file into SRT subtitles using OpenAI Whisper.

Dependencies & Install:
------------------------------------
# 1. Create & activate a virtual environment (optional but recommended):
#    python3 -m venv venv
#    source venv/bin/activate
$ helm template llm-d . --debug --namespace default --values values.yaml
install.go:225: 2025-05-07 17:20:53.000638786 +0000 UTC m=+0.031145623 [debug] Original chart version: ""
install.go:242: 2025-05-07 17:20:53.000679067 +0000 UTC m=+0.031185914 [debug] CHART PATH: /home/ubuntu/tmp/llm-d-deployer/charts/llm-d
---
# Source: llm-d/charts/redis/templates/master/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: false

HW

g6e.12xlarge, or at least 2x NVIDIA L40S
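A quick pre-flight sanity check for the GPU minimum above, assuming `nvidia-smi` is on the PATH. This is a sketch, not part of the installer scripts:

```shell
# Count visible GPUs; warn if fewer than the 2x L40S minimum noted above.
# nvidia-smi absent (e.g. CPU-only box) yields a count of 0 and a warning.
gpu_count=$(nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null | wc -l)
if [ "${gpu_count:-0}" -lt 2 ]; then
    echo "WARNING: found ${gpu_count:-0} GPU(s); at least 2 (e.g. L40S) expected" >&2
fi
```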

Uninstall:

# Remove the whole minikube cluster:
minikube delete

# Or remove just the kube parts:
./llmd-installer-minikube.sh --uninstall --namespace e2e-helm