Dataset?
#1
by 0xbitches - opened
Hi, the model card claims the model is trained using publicly available datasets, but I cannot seem to find any references to what these datasets are. Are there plans to include the actual datasets used?
The pie charts list each of the datasets.
I see. I guess my question, then, is whether there are plans to organize the dataset and publish the selected subsets so that the model training is reproducible.
Yes, we plan to give more details about the data selection and mixture in our technical report.