Chunked inference?
Is there a way to do chunked inference with canary-qwen-2.5b?
This approach works for canary, but I could not get it working for canary-qwen-2.5b: https://github.com/NVIDIA/NeMo/blob/main/examples/asr/asr_chunked_inference/aed/speech_to_text_aed_chunked_infer.py
Yes, there is, please refer to this code in the spaces demo:
https://huggingface.co/spaces/nvidia/canary-qwen-2.5b/blob/main/app.py#L30-L66
No ETA yet, but this feature will also eventually land in the NeMo speechlm2 collection.
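In the meantime, the basic idea from the spaces demo can be sketched as: split the waveform into fixed-length chunks, transcribe each chunk independently, and join the partial transcripts. The sketch below is not the demo code verbatim; `transcribe_chunk` is a hypothetical stand-in for whatever per-chunk call you use (e.g. `model.generate(...)` as shown on the canary-qwen-2.5b model card), and the 40-second default is an assumed chunk length, not an official value.

```python
# Hedged sketch of chunked inference: the chunking arithmetic is plain
# Python; the per-chunk model call is abstracted behind a callable.

def chunk_bounds(num_samples: int, sample_rate: int, chunk_secs: float = 40.0):
    """Return (start, end) sample-index pairs covering the whole signal."""
    step = int(chunk_secs * sample_rate)
    return [(s, min(s + step, num_samples)) for s in range(0, num_samples, step)]

def transcribe_long(audio, sample_rate, transcribe_chunk, chunk_secs: float = 40.0):
    """Transcribe each chunk independently and join the partial transcripts.

    `transcribe_chunk` is a hypothetical callable taking a waveform slice
    and returning its transcript (e.g. wrapping model.generate per chunk).
    """
    parts = [transcribe_chunk(audio[s:e])
             for s, e in chunk_bounds(len(audio), sample_rate, chunk_secs)]
    return " ".join(p.strip() for p in parts if p.strip())
```

Note that naive chunking can split words at chunk boundaries; the demo code linked above is the authoritative reference for how the boundaries are actually handled.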
Tell me exactly what you are trying to do, and I can help you. I have a custom application that implements exactly this kind of inference in a high-performance way, and I'd be happy to advise or show you how to do it.
Is it possible to specify the source and target language here via parameters too, e.g. transcribing to German? Or is the prompt the only option?
PS: I provide German audio samples, and when using the prompt I currently get back a gibberish mix of German and English.