
Optimize Parakeet feature extraction on CUDA #45134

Draft
milesial wants to merge 2 commits into huggingface:main from milesial:codex/parakeet-gpu-transformers


@milesial
Contributor

@milesial milesial commented Mar 31, 2026

What does this PR do?

Adds support for running the Parakeet preprocessor on CUDA, so that STFT and mel spectrogram extraction run on the GPU.
The refactor also speeds up the CPU implementation.

Tested on nvidia/parakeet-ctc-0.6b, B200, 300s audio:

Before this PR, CPU: 28ms
After this PR, CPU: 21ms
After this PR, GPU: 1.7ms

No impact on accuracy (VoxPopuli).
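
For readers who want a concrete picture of what "STFT and mel spectrogram extraction on the GPU" can look like, here is a minimal PyTorch sketch. It is not the code from this PR: the helper names (`mel_filterbank`, `log_mel`), the HTK-style mel scale, and the parameter values (`n_fft=512`, `hop_length=160`, `win_length=400`, 80 mel bins, 16 kHz) are illustrative assumptions. The idea is only that the window, the STFT, and the filterbank matmul all follow the device of the input waveform.

```python
import math
import torch


def hz_to_mel(hz: float) -> float:
    # HTK-style mel scale (an assumption; the real preprocessor may differ).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_filterbank(n_fft: int, n_mels: int, sample_rate: int) -> torch.Tensor:
    """Triangular mel filters of shape (n_mels, n_fft // 2 + 1)."""
    n_freqs = n_fft // 2 + 1
    freqs = torch.linspace(0.0, sample_rate / 2, n_freqs)
    # n_mels + 2 points, evenly spaced on the mel scale, converted back to Hz.
    mel_pts = torch.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    hz_pts = 700.0 * (10.0 ** (mel_pts / 2595.0) - 1.0)
    lower, center, upper = hz_pts[:-2, None], hz_pts[1:-1, None], hz_pts[2:, None]
    rising = (freqs[None, :] - lower) / (center - lower)
    falling = (upper - freqs[None, :]) / (upper - center)
    return torch.clamp(torch.minimum(rising, falling), min=0.0)


def log_mel(
    waveform: torch.Tensor,  # (batch, samples), on CPU or CUDA
    filters: torch.Tensor,   # (n_mels, n_fft // 2 + 1)
    n_fft: int = 512,
    hop_length: int = 160,
    win_length: int = 400,
) -> torch.Tensor:
    """Log-mel features of shape (batch, n_mels, frames), on waveform.device."""
    device = waveform.device
    window = torch.hann_window(win_length, device=device)
    spec = torch.stft(
        waveform,
        n_fft=n_fft,
        hop_length=hop_length,
        win_length=win_length,
        window=window,
        return_complex=True,
    )
    power = spec.abs() ** 2             # (batch, n_freqs, frames)
    mel = filters.to(device) @ power    # broadcasts to (batch, n_mels, frames)
    return torch.log(mel + 1e-6)        # small floor avoids log(0)


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    audio = torch.randn(2, 16000, device=device)  # two 1 s clips at 16 kHz
    feats = log_mel(audio, mel_filterbank(512, 80, 16000))
    print(feats.shape, feats.device)
```

When timing something like this on CUDA, call `torch.cuda.synchronize()` before reading the clock, since kernels launch asynchronously.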

  • I confirm that this is not a pure code agent PR.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: parakeet

@Rocketknight1
Member

cc @eustlb @ebezzam

@eustlb
Contributor

eustlb commented Mar 31, 2026

Hey @milesial, interesting PR! This should be covered out of the box by #44394, so I'll wait for it to land (in the coming days) before doing a full review. Context on what motivated you to do this, and on the limitations you hit with the current implementation, would be golden feedback for us 🙏
