
Libraries

How to use next-tat/tat-llm-13b-fft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="next-tat/tat-llm-13b-fft")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("next-tat/tat-llm-13b-fft")
model = AutoModelForCausalLM.from_pretrained("next-tat/tat-llm-13b-fft")

Local Apps

vLLM

How to use next-tat/tat-llm-13b-fft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "next-tat/tat-llm-13b-fft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "next-tat/tat-llm-13b-fft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
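The same request can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on localhost:8000; the helper names `build_completion_request` and `send_completion` are made up for this illustration and are not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="next-tat/tat-llm-13b-fft",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_completion(request):
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `send_completion(build_completion_request("Once upon a time,"))` returns the generated continuation. Passing a different `base_url` (for example one ending in port 30000) targets another OpenAI-compatible server.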

Use Docker

docker model run hf.co/next-tat/tat-llm-13b-fft

SGLang

How to use next-tat/tat-llm-13b-fft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "next-tat/tat-llm-13b-fft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "next-tat/tat-llm-13b-fft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "next-tat/tat-llm-13b-fft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "next-tat/tat-llm-13b-fft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use next-tat/tat-llm-13b-fft with Docker Model Runner:
```
docker model run hf.co/next-tat/tat-llm-13b-fft
```

TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

Paper: https://arxiv.org/abs/2401.13223

Code: https://github.com/fengbinzhu/TAT-LLM

Introduction

We present TAT-LLM, a language model specialized for question answering (QA) over tabular and textual data. It is built by fine-tuning LLaMA 2 on a new dataset generated automatically from existing expert-annotated resources, following a Step-wise Pipeline with three steps: Extraction, Reasoning, and Execution. In our experiments, TAT-LLM outperforms the baselines we compare against, including large general-purpose models such as GPT-4, on three financial QA benchmarks: FinQA, TAT-QA, and TAT-DQA. These results suggest that smaller models fine-tuned for a specialized task can match or exceed much larger general-purpose models on that task.

| Model | Size | FinQA | TAT-QA | TAT-DQA |
| --- | --- | --- | --- | --- |
| GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
| GPT-4 | - | 63.91 | 71.92 | 64.46 |
| TAT-LLM-7B-LORA | 7B | 65.13 | 76.49 | 71.38 |
| TAT-LLM-7B-FFT | 7B | 69.75 | 76.91 | 72.64 |
| TAT-LLM-13B-LORA | 13B | 71.93 | 77.51 | 72.22 |
| TAT-LLM-13B-FFT | 13B | 72.97 | 78.41 | 73.18 |
| TAT-LLM-70B-LORA | 70B | 76.81 | 81.42 | 76.55 |
| TAT-LLM-70B-FFT | 70B | 76.11 | 82.20 | 76.97 |
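
To make the comparison with GPT-4 concrete, the short sketch below computes the margin of TAT-LLM-13B-FFT over GPT-4 on each benchmark. The scores are copied from the table above; the function name `margin_over` is only illustrative.

```python
# Scores copied from the table above, in the order FinQA, TAT-QA, TAT-DQA.
BENCHMARKS = ("FinQA", "TAT-QA", "TAT-DQA")
SCORES = {
    "GPT-4": (63.91, 71.92, 64.46),
    "TAT-LLM-13B-FFT": (72.97, 78.41, 73.18),
}


def margin_over(baseline, model, scores=SCORES):
    """Return the per-benchmark score difference (model minus baseline)."""
    return {
        name: round(m - b, 2)
        for name, b, m in zip(BENCHMARKS, scores[baseline], scores[model])
    }


print(margin_over("GPT-4", "TAT-LLM-13B-FFT"))
```

This gives margins of 9.06 points on FinQA, 6.49 on TAT-QA, and 8.72 on TAT-DQA.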

Training

We train TAT-LLM in three sizes (7B, 13B, and 70B) with two methods, parameter-efficient fine-tuning (LoRA) and full-parameter fine-tuning (FFT) of LLaMA 2, on a combination of financial data from the FinQA, TAT-QA, and TAT-DQA training sets (🤗 HuggingFace Repo). To improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answer. Please refer to the paper for more details.
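
As a rough illustration of the External Executor idea, the sketch below evaluates an arithmetic expression outside the model instead of trusting the model to do the arithmetic itself. This is not the authors' implementation: the model card does not describe the executor's interface, so the sketch assumes the intermediate output is a plain arithmetic expression string, and the function name `execute_expression` is hypothetical. See the paper and code repository for the actual design.

```python
import ast
import operator

# Only these arithmetic operators are allowed (an assumption for this sketch).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def execute_expression(expression):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, function calls, and anything else are rejected.
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))
```

For example, `execute_expression("(150 - 120) / 120")` returns `0.25`. Walking the syntax tree rather than calling `eval()` keeps model-generated text from running arbitrary code.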

Inference & Evaluation

Please refer to the code at https://github.com/fengbinzhu/TAT-LLM.

Citation

If you find this model helpful, please consider citing our paper:

@misc{zhu2024tatllm,
      title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
      author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
      year={2024},
      eprint={2401.13223},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}


Safetensors

Model size: 13B params

Tensor type: BF16
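
The parameter count and tensor type give a quick lower bound on the memory needed just to hold the weights. The sketch below assumes exactly 13 billion parameters (the card's rounded figure) at 2 bytes each for BF16. It ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in GiB (BF16 = 2 bytes)."""
    return num_params * bytes_per_param / 2**30


print(f"{weight_memory_gib(13e9):.1f} GiB")
```

Under these assumptions the weights alone take about 24.2 GiB.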


Paper for next-tat/tat-llm-13b-fft: arXiv 2401.13223, published Jan 24, 2024.