Sample Android App for JNI library #57

Open
wants to merge 7 commits into
base: main

@Acs176 Acs176 commented Feb 19, 2025

Description

This PR adds a sample Android application under the android-app directory. It includes all the .so files needed to run WhisperKit with any delegate. It is meant to have only very basic functionality: record audio, play audio, and transcribe audio.

The only step needed to run the app is copying the openai_whisper-tiny/ folder into the assets folder of the Android project. Currently only this model is supported; support for others can be added in the future.

Features

Screenshot 2025-02-19 at 17 30 37

Audio Transcription

There's a dropdown to select the audio file and a Transcribe button to do just that.

Recording

Press the record button once to start a microphone recording and press it again to stop. The file is stored as MicInput.wav and can be selected from the dropdown to play or transcribe it.

Known issues

  • Model caching doesn't work correctly with the QNN delegate, so loading the model takes several minutes at app startup when that delegate is used.
  • Running in the Android Studio emulator is unreliable. I've experienced some strange crashes there, which I suspect are related to the QNN .so files not loading successfully.

@v-prgmr
Contributor

v-prgmr commented Feb 19, 2025

This is awesome, I am going to try it out now. Thank you so much 🚀

@v-prgmr
Contributor

v-prgmr commented Feb 19, 2025

The .apk works like a charm. Just ran the tiny model on an S24 Ultra (SM8650). Great work, thank you! @Acs176

@Acs176
Author

Acs176 commented Feb 20, 2025

Great news @v-prgmr! Could you share some latency data from running the transcription on that device?
So far I've only been able to run it on an SM7450 (I added the SoC manually to the allowed list), and I get around 1400 ms transcribing the jfk.wav file.
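
For comparable numbers across devices, here is one way the latency could be measured. This is only a minimal sketch in plain Java: `LatencyProbe` is a hypothetical helper that is not part of this PR, and the call being timed stands in for whatever the app uses to run a transcription.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical helper, not part of the PR: wall-clock timing of a single call.
final class LatencyProbe {

    // The value returned by the timed call plus the elapsed time in milliseconds.
    static final class Timed<T> {
        final T value;
        final long elapsedMs;

        Timed(T value, long elapsedMs) {
            this.value = value;
            this.elapsedMs = elapsedMs;
        }
    }

    // Runs the call once and measures it with the monotonic clock.
    static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        return new Timed<>(value, elapsedMs);
    }

    // Repeats the call and returns the median sample (upper median for even counts),
    // which is less sensitive to a slow first run than a single measurement.
    static <T> long medianMs(Supplier<T> call, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            samples[i] = time(call).elapsedMs;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }
}
```

Reporting the median of a few runs on the same file (for example jfk.wav) alongside the SoC name should make results from different devices easier to compare.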

@Acs176
Author

Acs176 commented Feb 20, 2025

I updated the PR because I noticed a bug. The lib directory used to access the .so files was hardcoded on the C++ side. The Android app sets this directory at runtime when loading the libraries (it puts a hash in the path), so the .so files were not being loaded correctly into the app, and you couldn't use QNN unless you had previously run the adb-push.sh script, which puts the files in the folder that TranscribeTask.cpp pointed to.

The WhisperKitRunner now receives the libs path from the Android app at runtime, passed through NativeWhisperKit. This ensures that the QNN libraries load correctly.
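
A rough sketch of the idea in plain Java, not the code from this PR: the caller supplies the library directory at runtime, and paths are resolved against it instead of against a hardcoded folder. `NativeLibLocator` and any library file names passed to it are hypothetical. On Android the directory would typically be read from `ApplicationInfo.nativeLibraryDir`; whether the app uses that exact field is an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PR: resolves .so paths against a
// directory that is only known at runtime.
final class NativeLibLocator {
    private final File libDir;

    NativeLibLocator(String libDir) {
        if (libDir == null || libDir.isEmpty()) {
            throw new IllegalArgumentException("libDir must be provided at runtime");
        }
        this.libDir = new File(libDir);
    }

    // Full path for one library file inside the runtime directory.
    String pathFor(String soName) {
        return new File(libDir, soName).getPath();
    }

    // Names from the list that are not present as regular files in the directory.
    List<String> missing(List<String> soNames) {
        List<String> absent = new ArrayList<>();
        for (String name : soNames) {
            if (!new File(libDir, name).isFile()) {
                absent.add(name);
            }
        }
        return absent;
    }
}
```

Checking `missing(...)` before handing the directory to the native side would surface a wrong path as an explicit error rather than as a delegate that quietly fails to load.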

I also added a section to the app screen that displays tflite logs from logcat. That way you get some more feedback when you run from the APK.
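
As an illustration only, the filtering step for such a log panel could look like the sketch below. It assumes the log lines are already available as strings and uses a simple case-insensitive substring match on the tag; `LogcatFilter` is a hypothetical name, and how the app actually reads logcat is not shown here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the PR: keeps the most recent log lines
// that mention a given tag.
final class LogcatFilter {

    // Returns at most maxLines matching lines, oldest first.
    static List<String> lastMatching(List<String> lines, String tag, int maxLines) {
        if (maxLines < 0) {
            throw new IllegalArgumentException("maxLines must not be negative");
        }
        String needle = tag.toLowerCase(Locale.ROOT);
        ArrayDeque<String> kept = new ArrayDeque<>();
        for (String line : lines) {
            if (line.toLowerCase(Locale.ROOT).contains(needle)) {
                kept.addLast(line);
                // Drop the oldest match once the window is full.
                if (kept.size() > maxLines) {
                    kept.removeFirst();
                }
            }
        }
        return new ArrayList<>(kept);
    }
}
```

Capping the number of lines keeps the on-screen panel bounded even when the log is long.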

Try it out again @v-prgmr, you should get much faster responses if QNN was not loading correctly before.

@v-prgmr
Contributor

v-prgmr commented Feb 20, 2025

@Acs176 I just pulled the latest commit from your fork, rebuilt the app, and ran it on the SM8650.

Here are the screenshots for jfk.wav and english_test2.wav

Screenshot_20250220_155324.jpg

Screenshot_20250220_155343.jpg

@bpkeene
Contributor

bpkeene commented Feb 26, 2025

Review is in progress! Will post feedback shortly

@yuguolong

Great work!


4 participants