
Failing on_message retries message indefinitely #867

Open
rubenbaer opened this issue Oct 31, 2024 · 1 comment
Labels
Status: Available (No one has claimed responsibility for resolving this issue.)

Comments


rubenbaer commented Oct 31, 2024

Hi!

I was trying to understand the behaviour of message processing when on_message fails, with respect to packet acknowledgement. If on_message fails, the message is not acknowledged, and I was wondering how this affects the client. What I found surprised me.

If on_message(...) fails, the message is kept in the client and retried (by the client) as soon as another message is received. If it fails again, the message is retried indefinitely and the client fails to process any new messages. Eventually, this causes a connection timeout. Even after the timeout, the message is retried indefinitely.

I was expecting that if on_message(...) fails, the message is dropped.

Furthermore, I was expecting that if on_message(...) fails for QoS 1 and QoS 2 messages, the message would still be acknowledged, since it was already delivered by the client to the application. The client might fail to send an acknowledgement for external reasons, e.g. network issues.

So, what is the reason behind this behaviour?

Note that I was expecting behaviour similar to suppress_exceptions = True, except that the exceptions are passed through to the application.
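To illustrate, this is roughly the behaviour I was expecting. It can be emulated today by catching the exception inside the callback (untested sketch; handle_message and processing_errors are placeholder names I made up for this example):

import queue

# Collects processing errors so the application can inspect them later.
processing_errors: queue.Queue = queue.Queue()

def handle_message(msg):
    # Placeholder for the real processing code, which may fail.
    raise Exception("Error: Processing failed")

def on_message(client, userdata, msg):
    try:
        handle_message(msg)
    except Exception as exc:
        # The error is surfaced to the application, but the callback
        # returns normally, so the client can still acknowledge the message.
        processing_errors.put(exc)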

How to reproduce

Start your MQTT broker. You can use the Docker Compose file and Mosquitto configuration below.

Start the script below.

Send a message to t/qos_0 and t/qos_1, respectively, e.g.: mosquitto_pub -t t/qos_0 -m "msg_1" -q 0 -V mqttv5 and mosquitto_pub -t t/qos_1 -m "msg_1" -q 1 -V mqttv5

The client below reuses the MQTT session (the session expiry interval is set to the maximum). So if a QoS 1 message fails to process and the client restarts, the message is redelivered and the client is stuck in a loop immediately.

import paho.mqtt.client as mqtt

client = mqtt.Client(
    protocol=mqtt.MQTTv5,
    callback_api_version=mqtt.CallbackAPIVersion.VERSION2,
    client_id="my-client-id",
)

# Keep the session on the broker (maximum session expiry interval) so that
# unacknowledged QoS 1 messages survive a client restart.
properties = mqtt.Properties(mqtt.PacketTypes.CONNECT)
properties.SessionExpiryInterval = 0xFFFFFFFF

client.connect(host="localhost", clean_start=False, properties=properties)


@client.message_callback()
def on_message(client, userdata, msg: mqtt.MQTTMessage):
    # Simulate a processing failure for every received message.
    raise Exception("Error: Processing failed")


@client.connect_callback()
def on_connect(client, userdata, flags, reason_code, properties):
    client.subscribe("t/qos_0", qos=0)
    client.subscribe("t/qos_1", qos=1)


# Restart the network loop whenever the exception raised in on_message
# propagates out of loop_forever().
while True:
    try:
        client.loop_forever()
    except Exception as e:
        print(e)

docker-compose.yml

version: "3.7"

services:
  mosquitto:
    image: eclipse-mosquitto
    hostname: mosquitto
    container_name: mosquitto
    restart: unless-stopped
    ports:
      - "1883:1883"
      - "9001:9001"
    volumes:
      - ./mosquitto:/etc/mosquitto
      - ./mosquitto/mosquitto.conf:/mosquitto/config/mosquitto.conf

mosquitto.conf

persistence false
allow_anonymous true
connection_messages true
log_type all
listener 1883

Environment

  • Python version: 3.11.9
  • Library version: 2.1.0
  • Operating system (including version): Windows 11, Ubuntu 24.10
  • MQTT server (name, version, configuration, hosting details): Mosquitto 2.0.20, eclipse-mosquitto Docker image
github-actions bot added the Status: Available label on Oct 31, 2024
rubenbaer changed the title from "Failing on_message can cause" to "Failing on_message retries message indefinitely" on Oct 31, 2024

PierreF commented Mar 4, 2025

Hi,

The retry being done by the client is the part that surprises me. The message being retried seems like the wanted behaviour.

Use-case (which I've already implemented): you are stopping a consumer that must not lose messages. During shutdown, an output connector is already stopped (possibly the shutdown is happening because this output connector stopped, e.g. you lost database access, became unhealthy and killed yourself). You need to tell the broker that you do NOT ack the message (by raising an exception in on_message), so the broker will redeliver this message to a future client.

Agreed, it means (and IMO it's the design of the protocol) that if a client always fails on a specific message (and therefore never acks it), then the broker will re-send this message forever.

Note: for QoS 2, this behaviour is tricky (and probably needs a user decision): does re-sending the message violate exactly-once delivery or not? It probably depends on why the user's on_message callback raised an error (e.g. if the DB refused the connection I want a retry; if my DB request timed out, it might depend).
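Rough sketch of the kind of decision I mean (untested; write_to_database and the exception mapping are just placeholders):

def write_to_database(payload):
    # Placeholder for the real output connector.
    raise ConnectionRefusedError("database is down")

def on_message(client, userdata, msg):
    try:
        write_to_database(msg.payload)
    except ConnectionRefusedError:
        # Transient failure: re-raise so the message is NOT acked and can
        # be delivered again later.
        raise
    except Exception as exc:
        # Permanent or unknown failure: swallow the error so the message is
        # still acked and the client keeps processing new messages.
        print(f"dropping message after processing error: {exc}")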

But clearly:

  • I was surprised that it's the client library that does the retry and not the broker. It might be due to MQTT's missing "unack" feature (if the MQTT client can't nack a message that failed to be processed, the retry won't occur before... I think client disconnection. And since this message is added to the pending messages, you might reach max_inflight messages and stop receiving new messages).
  • suppress_exceptions = True should fix your use-case, since it basically catches all exceptions and just logs them.

Maybe a better option than suppress_exceptions (which mostly makes errors close to invisible, which is why #365 changed the behaviour) would be an option (enabled by default?) that causes messages to always be acked even in case of error.
If you care a lot about not losing a message, you could disable this option / use manual ack. But for other usages, this option would still ack the message even in case of error, which would guarantee that you continue to process future messages.
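For reference, a minimal sketch of the suppress_exceptions workaround mentioned above (untested):

import paho.mqtt.client as mqtt

client = mqtt.Client(
    protocol=mqtt.MQTTv5,
    callback_api_version=mqtt.CallbackAPIVersion.VERSION2,
    client_id="my-client-id",
)
# Exceptions raised inside callbacks are logged by the client instead of
# propagating out of the network loop, so the message is still acknowledged
# and the client keeps processing new messages.
client.suppress_exceptions = True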
