| --- |
| base_model: |
| - meta-llama/Llama-3.1-8B-Instruct |
| library_name: peft |
| license: mit |
| datasets: |
| - Roihn/Einstein-Puzzles-Data |
| language: |
| - en |
| --- |
| # Einstein-Puzzles |
|
|
**Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry** ([arXiv](https://arxiv.org/abs/2510.25595))
|
|
| *Run Peng\*, Ziqiao Ma\*, Amy Pang, Sikai Li, Zhang Xi-Jia, Yingzhuo Yu, Cristian-Paul Bara, Joyce Chai* |
|
|
| ## Model Details |
|
|
For all model fine-tuning, we employ LoRA with a rank of 32, training for 1 epoch with a global batch size of 128 and a learning rate of 2e-4 on a cosine decay schedule. Fine-tuning is conducted with [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), and FlashAttention-2 is used to speed up training. The process takes approximately 30 minutes on 4 A40 GPUs with 48GB of VRAM each.
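
For reference, the LoRA setup above corresponds roughly to the following PEFT configuration. This is a minimal sketch: only the rank of 32 is reported here, while `lora_alpha`, `lora_dropout`, and `target_modules` are assumptions, not values confirmed by the paper.

```python
from peft import LoraConfig

# Sketch of the LoRA hyperparameters described above.
# Only r=32 comes from this card; the other values are assumed defaults.
lora_config = LoraConfig(
    r=32,                  # LoRA rank used for fine-tuning
    lora_alpha=32,         # assumption: common choice of alpha == rank
    lora_dropout=0.0,      # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption: attention projections
    task_type="CAUSAL_LM",
)
```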
|
|
This repo provides the fine-tuned model with full capabilities for information providing, information seeking, and chain-of-thought reasoning.
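
A minimal loading sketch with `transformers` and `peft` is shown below. The adapter repo id is an assumption; substitute this repository's id on the Hub.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.1-8B-Instruct"
ADAPTER = "Roihn/Einstein-Puzzles"  # assumption: replace with this repo's Hub id

# Load the base model, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER)

# Example generation with the Llama chat template.
messages = [{"role": "user", "content": "Let's solve the puzzle together."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```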
|
|
| ## Citation |
| ```bibtex |
@misc{peng2025communicationverificationllmagents,
      title={Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry},
      author={Run Peng and Ziqiao Ma and Amy Pang and Sikai Li and Zhang Xi-Jia and Yingzhuo Yu and Cristian-Paul Bara and Joyce Chai},
      year={2025},
      eprint={2510.25595},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.25595},
}
| ``` |
|
|