Created
August 19, 2023 22:40
Setup for locally hosted LLM chat using chat-ui and TGI with WizardLM-70B
MONGODB_URL=mongodb://localhost:27017
HF_ACCESS_TOKEN=<REDACTED>
# 'name', 'userMessageToken', 'assistantMessageToken' are required
MODELS=`[
  {
    "endpoints": [{"url": "http://localhost:8081"}],
    "name": "WizardLM/WizardLM-70B-V1.0",
    "description": "WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions",
    "websiteUrl": "https://huggingface.co/WizardLM/WizardLM-70B-V1.0",
    "userMessageToken": "USER: ",
    "assistantMessageToken": "ASSISTANT: ",
    "messageEndToken": "</s>",
    "preprompt": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT: I am WizardLM.</s>",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024
    }
  }
]`
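The few-shot part of the `preprompt` follows directly from the three tokens configured above. As a rough illustration (this is not chat-ui's own code, and `render_turn` is a hypothetical helper), the sketch below rebuilds those example turns from `userMessageToken`, `assistantMessageToken`, and `messageEndToken`:

```shell
#!/bin/sh
# Token values copied from the MODELS entry above.
USER_TOKEN='USER: '
ASSISTANT_TOKEN='ASSISTANT: '
END_TOKEN='</s>'

# Hypothetical helper: render one finished user/assistant exchange in the
# same layout the preprompt uses (a single space before the assistant token).
render_turn() {
  printf '%s%s %s%s%s' "$USER_TOKEN" "$1" "$ASSISTANT_TOKEN" "$2" "$END_TOKEN"
}

# Rebuild the two example turns that close the preprompt.
FEW_SHOT="$(render_turn 'Hi' 'Hello.')$(render_turn 'Who are you?' 'I am WizardLM.')"
echo "$FEW_SHOT"
```

If the tokens are ever changed for a different model, the example turns inside `preprompt` probably need to be changed to match, since the model sees both.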
# start mongo
docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
# build
npm install
# start chatui listening on all interfaces
npm run dev -- --host
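Before starting the dev server it can help to confirm that the env file actually defines the keys used in the config shown earlier. This is a minimal sketch: it assumes the config was saved as `.env.local` in the chat-ui checkout, and `missing_env_keys` is a hypothetical helper, not something chat-ui ships.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: list expected keys that are absent from an
# env file. Prints nothing and returns 0 when every key is present.
missing_env_keys() {
  env_file="$1"
  status=0
  for key in MONGODB_URL MODELS; do
    if ! grep -q "^${key}=" "$env_file" 2>/dev/null; then
      echo "missing: $key"
      status=1
    fi
  done
  return "$status"
}

# Assumption: the config lives in .env.local in the current directory.
if [ -f .env.local ]; then
  missing_env_keys .env.local || echo "fix .env.local before running npm run dev"
fi
```

The check only looks for the key names at the start of a line; it does not validate the JSON inside `MODELS`.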
docker run --gpus all --shm-size 1g -p 8081:80 \
  -e HUGGING_FACE_HUB_TOKEN=<REDACTED> \
  -v /fastdata/hfcache/transformers:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id WizardLM/WizardLM-70B-V1.0 --num-shard 4
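Once the container has finished loading the model shards, a single request can confirm that TGI answers on host port 8081 before chat-ui is pointed at it. The sketch below uses TGI's `/generate` route with the same `USER: ... ASSISTANT: ` format as the config; the `tgi_payload` helper, the `TGI_URL` variable, and the `RUN_LIVE` guard are illustrative names, not part of TGI.

```shell
#!/bin/sh
# Host port 8081 is mapped to the container's port 80 by the docker command.
TGI_URL="${TGI_URL:-http://localhost:8081}"

# Hypothetical helper: wrap a question in the WizardLM prompt format and emit
# a JSON body for /generate. No JSON escaping is done, so this is only safe
# for text without double quotes or backslashes.
tgi_payload() {
  printf '{"inputs": "USER: %s ASSISTANT: ", "parameters": {"max_new_tokens": 64}}' "$1"
}

# Only contact the server when explicitly asked, e.g. RUN_LIVE=1 sh smoke.sh
if [ "${RUN_LIVE:-0}" = "1" ]; then
  curl -s "$TGI_URL/generate" \
    -H 'Content-Type: application/json' \
    -d "$(tgi_payload 'Who are you?')"
fi
```

A successful call returns a JSON object with a `generated_text` field; a connection error usually means the model is still loading or the port mapping differs.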