Feedback and Missing File

#1
by Firworks - opened

First off, this is a pretty interesting model! I was able to quantize it to NVFP4 and get it running with your custom vLLM container image. With the quantization, the model fits on a single RTX 5090.

However, one issue I ran into was a missing file in the repo, modeling_step_audio_2.py. I was able to grab a copy from stepfun-ai/Step-Audio-2-mini and it seems to have worked. I ran your provided test script and the responses looked good to me.
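For anyone hitting the same error, here is a minimal sketch of that workaround. It assumes `modeling_step_audio_2.py` sits at the root of stepfun-ai/Step-Audio-2-mini and that you have a local copy of the Step-Audio-R1 files; the helper names and the `./Step-Audio-R1` path are hypothetical, and `hf_hub_download` comes from the `huggingface_hub` package.

```python
import shutil
from pathlib import Path

# Repo the file was borrowed from; its location at the repo root is an assumption.
SOURCE_REPO = "stepfun-ai/Step-Audio-2-mini"
MISSING_FILE = "modeling_step_audio_2.py"


def install_modeling_file(downloaded: Path, model_dir: Path) -> Path:
    """Copy the modeling file into a local model directory.

    An existing file is left untouched, so this is safe to re-run once the
    upstream repo ships its own copy.
    """
    target = Path(model_dir) / MISSING_FILE
    if not target.exists():
        shutil.copy(downloaded, target)
    return target


def fetch_and_install(model_dir: str) -> Path:
    """Download the file from the Hub cache and place it next to the weights."""
    # Imported lazily so install_modeling_file works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    cached = hf_hub_download(repo_id=SOURCE_REPO, filename=MISSING_FILE)
    return install_modeling_file(Path(cached), Path(model_dir))
```

Calling `fetch_and_install("./Step-Audio-R1")` (hypothetical local path) before loading or quantizing should put the file where the loader looks for it. This is only a stopgap until the file is added to the repo itself.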

Would it be possible to include the appropriate modeling_step_audio_2.py file in the Step-Audio-R1 repo? It would help avoid load failures for users working with Transformers or doing quantization.

Thank you so much for your feedback and for trying out the model! We truly appreciate it.

You are absolutely right: the missing modeling_step_audio_2.py file was an oversight on our part, and we apologize for the inconvenience. We have noted the issue and will add the file to the Step-Audio-R1 repository as soon as possible to prevent load failures for other users.

We're also very glad to hear that you successfully quantized the model to NVFP4 and got it running smoothly on a single RTX 5090! It's great that the test responses looked good.

Please don't hesitate to let us know if you encounter any other issues or have further suggestions.

Happy coding
