Files changed (1)
  1. README.md +13 -10
README.md CHANGED
@@ -3,7 +3,7 @@ license: mit
  base_model:
  - Qwen/Qwen2.5-14B-Instruct
  ---
- <h2 align="center">ICLR25| LLMOPT: Learning to Define and Solve <br> General Optimization Problems from Scratch </h2>
  <p align="center">
  <a href=""><strong>Caigao Jiang</strong></a><sup>*</sup>
  ·
@@ -22,19 +22,18 @@ base_model:
  <sup>*</sup>Equal Contribution, <sup>†</sup>Corresponding Authors.
  </div>
  <p align="center">
- <b>East China Normal University    |    Ant Group   |   Nanjing University </b></p>
- <p align="center">
- <a href="https://openreview.net/pdf?id=9OMvtboTJg"><img src='https://img.shields.io/badge/Paper-LLMOPT-red'></a>
- <a href='https://huggingface.co/ant-opt/LLMOPT-Qwen2.5-14B'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow'></a>
- <!-- <a href=''><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Dataset-yellow'></a> -->
- <a href='https://github.com/ant-opt/LLMOPT/tree/main/data/testset'><img src='https://img.shields.io/badge/Dataset-Testset-blue'></a>
- <a href='https://github.com/ant-opt/LLMOPT'><img src='https://img.shields.io/badge/GitHub-Repo-blue'></a>
- </p>
  </p>

  ## 🤖Model Release

- We release the [LLMOPT-Qwen2.5-14B](https://huggingface.co/ant-opt/LLMOPT-Qwen2.5-14B) model on Hugging Face and conduct comprehensive performance evaluations. Unlike the version described in our paper which used [Qwen1.5-14B](https://huggingface.co/Qwen/Qwen1.5-14B), this release is fine-tuned from the latest [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) model. Additionally, we exclude all Mamo EasyLP and ComplexLP datasets from the training process, reserving them exclusively for the test. The performance metrics for [LLMOPT-Qwen2.5-14B](https://huggingface.co/ant-opt/LLMOPT-Qwen2.5-14B) are as follows:

  | Dataset | NL4Opt | Mamo Easy | Mamo Complex | NLP4LP | ComplexOR | IndustryOR | ICML Competition | OptiBench | OptMath | AVG |
  | :-------------------------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: |
@@ -45,12 +44,16 @@ We release the [LLMOPT-Qwen2.5-14B](https://huggingface.co/ant-opt/LLMOPT-Qwen2.
  | ER w/o self-correction | 97.42% | 98.29% | 77.73% | 97.93% | 88.89% | 61.00% | 93.90% | 73.22% | 31.93% | 80.03% |
  | SA w/o self-correction | 80.28% | 89.53% | 44.08% | 73.42% | 35.29% | 29.00% | 75.35% | 53.83% | 12.50% | 54.81% |

  ## 📊Dataset Release

  ### Data Structure

  To facilitate the evaluation, we process all datasets into a unified data structure. Specifically, each dataset is organized in a `jsonl` file, and each line is an independent piece of data. Each data includes four attributes, `question`, `answer`, `ori`, and `index`. The `question` field is a complete string description of the optimization problem, including complete data that can solve a problem. The `answer` field is a `float` type value, which indicates the objective function value corresponding to the optimal solution of the problem, i.e., the ground truth. The `ori` field indicates the source of the problem, that is, the name of the dataset. In order to facilitate statistical results, we use the `index` field to number the data in each dataset.

  An example: (The first data of the NL4Opt dataset)

  ```json
 
  base_model:
  - Qwen/Qwen2.5-14B-Instruct
  ---
+ <h2 align="center">ICLR25 | LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch </h2>
  <p align="center">
  <a href=""><strong>Caigao Jiang</strong></a><sup>*</sup>
  ·

  <sup>*</sup>Equal Contribution, <sup>†</sup>Corresponding Authors.
  </div>
  <p align="center">
+ <b>East China Normal University | Ant Group | Nanjing University </b></p>
+ <p align="center" style="white-space: nowrap;">
+ <a href="https://openreview.net/pdf?id=9OMvtboTJg" style="display: inline-block;"><img src='https://img.shields.io/badge/Paper-LLMOPT-red'></a>
+ <a href='https://huggingface.co/ant-opt/LLMOPT-Qwen2.5-14B' style="display: inline-block;"><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow'></a>
+ <a href='https://github.com/ant-opt/LLMOPT/tree/main/data/testset' style="display: inline-block;"><img src='https://img.shields.io/badge/Dataset-Testset-blue'></a>
+ <a href='https://github.com/ant-opt/LLMOPT' style="display: inline-block;"><img src='https://img.shields.io/badge/GitHub-Repo-blue'></a>
+ </p>

  </p>

  ## 🤖Model Release

+ We release the [LLMOPT-Qwen2.5-14B](https://huggingface.co/ant-opt/LLMOPT-Qwen2.5-14B) model on Hugging Face and conduct comprehensive performance evaluations. We have updated the model evaluation results as shown in the following table, where the original results correspond to Table 1 and Table 2 in the paper. The differences in results stem from two reasons. Firstly, we exclude all Mamo EasyLP and ComplexLP datasets from the training process, reserving them exclusively for the test. Additionally, unlike the version described in our paper which used [Qwen1.5-14B](https://huggingface.co/Qwen/Qwen1.5-14B), this release is fine-tuned from the latest [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) model. The performance metrics for [LLMOPT-Qwen2.5-14B](https://huggingface.co/ant-opt/LLMOPT-Qwen2.5-14B) are as follows:

  | Dataset | NL4Opt | Mamo Easy | Mamo Complex | NLP4LP | ComplexOR | IndustryOR | ICML Competition | OptiBench | OptMath | AVG |
  | :-------------------------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: | :--------------: |

  | ER w/o self-correction | 97.42% | 98.29% | 77.73% | 97.93% | 88.89% | 61.00% | 93.90% | 73.22% | 31.93% | 80.03% |
  | SA w/o self-correction | 80.28% | 89.53% | 44.08% | 73.42% | 35.29% | 29.00% | 75.35% | 53.83% | 12.50% | 54.81% |

+ In the experiment, we use three performance metrics to comprehensively evaluate the optimization generalization of the algorithm, namely, **Execution Rate (ER), Solving Accuracy (SA), and Average Solving Times (AST)**. Specifically, **ER** refers to the proportion of solutions whose code can run without any errors and has running results output. **SA** refers to the proportion of solutions that correctly solve the optimization problem, i.e., find the optimal solution. **AST** refers to the average number of times the self-correction process is performed during the test.
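As a quick illustration of how these three metrics aggregate per-problem outcomes, here is a minimal sketch; the record format and values are invented for illustration and are not the repository's evaluation code.

```python
# Hypothetical per-problem results: whether the generated solver code executed,
# whether its objective matched the ground truth, and how many self-correction
# rounds were performed. (Illustrative data, not real evaluation output.)
results = [
    {"executed": True,  "correct": True,  "correction_rounds": 0},
    {"executed": True,  "correct": False, "correction_rounds": 2},
    {"executed": False, "correct": False, "correction_rounds": 3},
    {"executed": True,  "correct": True,  "correction_rounds": 1},
]

n = len(results)
er = sum(r["executed"] for r in results) / n            # Execution Rate
sa = sum(r["correct"] for r in results) / n             # Solving Accuracy
ast = sum(r["correction_rounds"] for r in results) / n  # Average Solving Times

print(f"ER={er:.2%}  SA={sa:.2%}  AST={ast:.2f}")
```

Note that SA ≤ ER by construction: a solution can only be counted correct if its code produced output at all.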
+
  ## 📊Dataset Release

  ### Data Structure

  To facilitate the evaluation, we process all datasets into a unified data structure. Specifically, each dataset is organized in a `jsonl` file, and each line is an independent piece of data. Each data includes four attributes, `question`, `answer`, `ori`, and `index`. The `question` field is a complete string description of the optimization problem, including complete data that can solve a problem. The `answer` field is a `float` type value, which indicates the objective function value corresponding to the optimal solution of the problem, i.e., the ground truth. The `ori` field indicates the source of the problem, that is, the name of the dataset. In order to facilitate statistical results, we use the `index` field to number the data in each dataset.
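A minimal sketch of reading such a `jsonl` test set, assuming only the four-field schema described above; the sample problem written to the temporary file is fabricated for illustration.

```python
import json
import os
import tempfile

def load_testset(path):
    """Return one record per non-empty line of a jsonl file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Write a tiny one-line file in the same schema (fabricated content):
sample = {"question": "Maximize 3x subject to x <= 4.",
          "answer": 12.0, "ori": "NL4Opt", "index": 0}
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write(json.dumps(sample) + "\n")
    path = f.name

records = load_testset(path)
os.remove(path)
print(sorted(records[0]))  # the four attributes of each record
```

Because `answer` is a plain float, a solver's result can be checked against it with a numeric tolerance rather than string comparison.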

+ The data are [available](https://github.com/antgroup/LLMOPT/tree/main/data/testset).
+
  An example: (The first data of the NL4Opt dataset)

  ```json