add blank_penalty for offline transducer #542

chiiyeh · 2024-01-24T08:11:33Z

Fix #541
Tested with sherpa-onnx-offline. A higher blank penalty makes the decoder emit more words.

csukuangfj

Thanks! Left some minor comments. Otherwise, it looks good to me.

csukuangfj · 2024-01-24T08:26:59Z

sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.cc

-    const float *p_logit = logit.GetTensorData<float>();
+    float *p_logit = logit.GetTensorMutableData<float>();
+    if (blank_penalty_ > 0.0) {
+      p_logit[0] -= blank_penalty_; // assuming blank id is 0


Shall we also consider the case when batch_size > 1?

p_logit[0] is only the blank logit of the first utterance.

We need to process

p_logit[vocab_size*i + 0] // for i in range(n)

You can move this if statement into the for statement below.
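
A minimal sketch of the batched version suggested above, assuming the logits are stored row by row as batch_size x vocab_size floats and that the blank id is 0, as in the reviewed snippet. ApplyBlankPenalty and GreedyTokens are hypothetical helpers written for illustration; they are not functions from sherpa-onnx, and the merged code may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of sherpa-onnx.
// logits holds batch_size rows of vocab_size floats, stored row by row.
// Assumption: the blank token has id 0.
void ApplyBlankPenalty(float *logits, int32_t batch_size,
                       int32_t vocab_size, float blank_penalty) {
  assert(vocab_size > 0);
  if (blank_penalty <= 0.0f) return;  // nothing to do

  for (int32_t i = 0; i < batch_size; ++i) {
    // Blank logit of utterance i, not only of utterance 0.
    logits[i * vocab_size + 0] -= blank_penalty;
  }
}

// Hypothetical helper: greedy choice (argmax) for every utterance.
std::vector<int32_t> GreedyTokens(const float *logits, int32_t batch_size,
                                  int32_t vocab_size) {
  assert(vocab_size > 0);
  std::vector<int32_t> tokens(batch_size);
  for (int32_t i = 0; i < batch_size; ++i) {
    const float *row = logits + i * vocab_size;
    tokens[i] =
        static_cast<int32_t>(std::max_element(row, row + vocab_size) - row);
  }
  return tokens;
}
```

For example, with logits {2.0, 1.5, 0.1, 3.0, 2.5, 0.2} for two utterances and a vocabulary of three, the greedy choice is blank for both rows. After a penalty of 1.0 the choice becomes token 1 for both rows, which is how a higher penalty forces out more non-blank tokens.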

csukuangfj · 2024-01-24T08:33:25Z

python-api-examples/non_streaming_server.py

@@ -862,6 +876,7 @@ def create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:
            max_active_paths=args.max_active_paths,
            hotwords_file=args.hotwords_file,
            hotwords_score=args.hotwords_score,
+            blank_penalty=args.blank_penalty,


Could you also update

sherpa-onnx/sherpa-onnx/python/sherpa_onnx/offline_recognizer.py

Line 38 in b929124

def from_transducer(

to add an extra argument

blank_penalty: float = 0

and also add a docstring for it?

Note that you need to pass the argument to

sherpa-onnx/sherpa-onnx/python/sherpa_onnx/offline_recognizer.py

Line 114 in b929124

recognizer_config = OfflineRecognizerConfig(

chiiyeh

Fixed!
I also noticed that the hotwords parameters have no docstring, and that max_active_paths is not propagated into OfflineRecognizerConfig. Should I raise two separate PRs to fix these?

csukuangfj

Should I raise two separate PRs to fix these?

Yes, please do that in a separate PR. Thanks!

csukuangfj · 2024-01-24T11:25:35Z

By the way, please merge the master branch into your current branch so the CI can pass.

csukuangfj · 2024-01-25T04:01:05Z

Could you fix the C++ code style check?
https://github.com/k2-fsa/sherpa-onnx/actions/runs/7648706009/job/20842763975#step:4:38

Check last commit
sherpa-onnx/csrc/math.h:100:  Lines should be <= 80 characters long  [whitespace/line_length] [2]
Done processing sherpa-onnx/csrc/math.h
Total errors found: 1
[FAILED] sherpa-onnx/csrc/math.h
Error: Process completed with exit code 1

You can run

pip install clang-format
cd /path/to/sherpa-onnx
./scripts/check_style_cpplint.sh
./scripts/check_style_cpplint.sh 1
./scripts/check_style_cpplint.sh 2

to check the style locally.

csukuangfj · 2024-01-25T06:59:53Z

Thank you for your first-time contribution!

add blank_penalty for offline transducer

b929124

csukuangfj reviewed Jan 24, 2024

chiiyeh and others added 3 commits January 25, 2024 09:24

Merge branch 'k2-fsa:master' into master

a4fc90b

Shift blank penalty into the loop for greedy search

b1c61cf

add blank_penalty to offline_recognizer.py

d1992b8

Fix style check

5d16622

csukuangfj merged commit 3bb3849 into k2-fsa:master Jan 25, 2024
171 of 179 checks passed

XiaYucca pushed a commit to XiaYucca/sherpa-onnx that referenced this pull request Jan 9, 2025

add blank_penalty for offline transducer (k2-fsa#542)

20313fe
