Kafka Consumers Are Disrupting Production — And Go Fails to Help

We trusted Go to keep our Kafka pipeline reliable. Instead, it quietly let bugs through until everything collapsed.

Kafka offers speed, scale, and decoupling, and it forms the core of many modern systems, ours included. Writing our consumers in Go seemed ideal: fast, clean, easy to deploy, with just enough control and no excess complexity.

However, speed doesn’t guarantee safety. Despite Go’s strengths, it won’t prevent subtle, dangerous errors.

We found out the hard way.

Initially, things ran smoothly for months. Our consumers handled thousands of events per second with minimal latency issues. Life was good.

Then, one Friday night, a seemingly harmless deployment triggered a partition rebalance. Some consumers restarted, and Kafka reassigned partitions — standard procedure. But over the next few hours, anomalies emerged: users received duplicate emails, invoices were generated twice, and some refunds were processed twice.

Tracing the incident revealed that our consumers were reprocessing the same messages, and our code wasn't built to handle that.

The uncomfortable truth is that Kafka guarantees at-least-once delivery by default. Duplicates aren't an edge case; they're inevitable whenever a consumer restarts, a session times out, or an offset commit fails.

We knew this in theory. In practice, our handler looked like this:

```
func handleInvoiceCreated(msg kafka.Message) error {
    var event InvoiceCreated
    json.Unmarshal(msg.Value, &event) // error ignored
    return db.InsertInvoice(event)    // Whoops
}
```

No deduplication. No idempotency. Blind trust that every message was new.

Here's the failure mode we hit:

```
Kafka Broker
     |
     v
Consumer A (crashes)
     |
     v
Consumer B takes over
     |
     v
Same message reprocessed
```

This is standard Kafka behavior. Because our handlers weren't prepared for it, we got duplicate side effects, corrupted state, and hours of manual cleanup.

After cleaning up the mess, we didn't rewrite everything; we hardened the critical paths.

1. Idempotency at the Database Layer

We made our inserts safe by adding a unique constraint on `event_id`:
```
CREATE UNIQUE INDEX invoice_event_id_idx ON invoices(event_id);
```
Then we updated the handler to treat duplicates as no-ops:

```
err := db.InsertInvoice(event)
if isUniqueViolation(err) {
    log.Println("Duplicate event, skipping")
    return nil
}
return err
```

While not elegant, it was effective.
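
The `isUniqueViolation` helper isn't from any library; it's our own check. Here's a minimal sketch, assuming Postgres with the lib/pq driver, where a unique-constraint violation surfaces as SQLSTATE 23505:

```
import (
    "errors"

    "github.com/lib/pq"
)

// isUniqueViolation reports whether err is a Postgres unique-constraint
// violation (SQLSTATE 23505), which is what the event_id index raises
// on a duplicate insert.
func isUniqueViolation(err error) bool {
    var pqErr *pq.Error
    return errors.As(err, &pqErr) && pqErr.Code == "23505"
}
```

A tidier alternative, if you control the SQL, is `INSERT ... ON CONFLICT (event_id) DO NOTHING`, which skips the error round-trip entirely.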

2. Manual Offset Commits

We had been running with auto-commit (the client default), which commits offsets on a timer, sometimes before a message has actually been processed; a crash at the wrong moment meant silently lost work. We switched to manual commits, issued only after successful processing:

```
// With segmentio/kafka-go (which this reader API matches), FetchMessage,
// unlike ReadMessage, never commits the offset itself.
msg, err := r.FetchMessage(ctx)
if err != nil {
    return err
}

if err := handle(msg); err != nil {
    return err // don't commit; the message will be redelivered
}

return r.CommitMessages(ctx, msg)
```

This approach significantly reduced silent data loss.
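
For context, here's a minimal sketch of the loop around that snippet, assuming segmentio/kafka-go; the broker address, topic, and group ID are placeholders:

```
package main

import (
    "context"

    "github.com/segmentio/kafka-go"
)

func consume(ctx context.Context, handle func(kafka.Message) error) error {
    r := kafka.NewReader(kafka.ReaderConfig{
        Brokers: []string{"localhost:9092"}, // placeholder broker
        GroupID: "invoice-consumers",        // placeholder group ID
        Topic:   "invoices",                 // placeholder topic
    })
    defer r.Close()

    for {
        // FetchMessage never commits; the group offset only advances
        // once CommitMessages succeeds below.
        msg, err := r.FetchMessage(ctx)
        if err != nil {
            return err
        }
        if err := handle(msg); err != nil {
            return err // uncommitted message is redelivered after restart/rebalance
        }
        if err := r.CommitMessages(ctx, msg); err != nil {
            return err
        }
    }
}
```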

3. Dead Letter Queues (Finally)

Some events were simply bad: malformed payloads, missing required fields. Previously, we retried them forever. Now, errors we classify as fatal push the message to a DLQ instead:

```
if isFatalError(err) {
    // dlqProducer is a kafka.Writer pointed at the dead-letter topic
    if werr := dlqProducer.WriteMessages(ctx, kafka.Message{
        Key:   msg.Key,
        Value: msg.Value,
    }); werr != nil {
        return werr // couldn't park the message; surface the failure
    }
}
```

This stopped the log spam and the retry storms that used to choke the system, and the DLQ gave us a clear window into upstream data problems.
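
Like `isUniqueViolation`, `isFatalError` is our own classifier, not a library call. A minimal sketch, assuming malformed JSON and failed validation are the unretryable cases (`ErrValidation` is a hypothetical sentinel):

```
import (
    "encoding/json"
    "errors"
)

// ErrValidation marks events that parsed but failed business validation,
// e.g. a required field was missing. Hypothetical sentinel for this sketch.
var ErrValidation = errors.New("event validation failed")

// isFatalError reports whether a retry can never succeed, meaning the
// message should be parked in the DLQ rather than retried forever.
func isFatalError(err error) bool {
    var syntaxErr *json.SyntaxError
    var typeErr *json.UnmarshalTypeError
    return errors.As(err, &syntaxErr) ||
        errors.As(err, &typeErr) ||
        errors.Is(err, ErrValidation)
}
```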

So, Was Go the Problem?

No. But Go didn't rescue us either. The language gave us performance without protection. Go's Kafka libraries are minimal by design, which is both a strength and a hazard: they don't impose opinions about offset handling, retries, or idempotency, so it's easy to assume everything is working.

It was, until it wasn’t.

Final Thoughts

If you’re developing Kafka consumers in Go — or any language — here’s advice I wish I had earlier:

– Don't trust throughput metrics; prioritize correctness.
– Kafka will redeliver messages. Design for it from day one.
– Idempotency isn't optional; it's table stakes.
– Auto-commit fails silently; avoid it unless you understand exactly when offsets get committed.
– DLQs aren't an afterthought; build one in from the start.

Our consumers still operate in Go. We didn’t change languages but shifted mindsets — prioritizing safety over speed.

And now? We sleep better at night.
