Bobcares

How to Fix NOT_ENOUGH_REPLICAS Error in AWS MSK

Mar 4, 2025

Learn how to fix NOT_ENOUGH_REPLICAS Error in AWS MSK. Our AWS Support team is here to help you with your questions and concerns.

According to our Experts, the NOT_ENOUGH_REPLICAS error in AWS MSK occurs when the number of in-sync replicas (ISRs) for a partition falls below the topic's `min.insync.replicas` setting while a producer is writing with `acks=all`.

This error typically arises due to broker failures, network issues, or misconfigurations that prevent Kafka from keeping enough replicas in sync.
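The condition behind the error can be sketched as a small shell check. This is an illustration only, not broker code: the function name `write_accepted` and its arguments are hypothetical, and it assumes a producer using `acks=all`.

```shell
# Illustrative only: mirrors the check a partition leader applies to acks=all writes.
# Usage: write_accepted ISR_COUNT MIN_INSYNC_REPLICAS
write_accepted() {
    isr_count=$1
    min_isr=$2
    if [ "$isr_count" -ge "$min_isr" ]; then
        echo "accepted"
    else
        echo "NOT_ENOUGH_REPLICAS"
    fi
}

write_accepted 3 2   # three replicas in sync, two required: prints "accepted"
write_accepted 1 2   # only the leader in sync, two required: prints "NOT_ENOUGH_REPLICAS"
```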

Today, we will examine the impact of this error and discuss easy solutions for resolving it.

Impacts of the Error

When this error appears, it can cause several critical issues, including:

  • Messages may not be fully replicated, increasing the risk of data loss.
  • Producers configured with `acks=all` will fail to send messages if replication requirements are unmet.
  • Applications like Kafka Connect or Debezium may struggle to commit records, disrupting data pipelines.
  • Producers may experience retries, leading to higher latency.
  • The cluster becomes more vulnerable to further failures due to reduced replica availability.

Causes and Fixes

Let’s break down the causes of the NOT_ENOUGH_REPLICAS error and how to resolve them.

1. Low Replication Factor

The topic may have a low replication factor (e.g., 1), which does not provide redundancy.

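Increase the replication factor by following these steps. The commands use the `--zookeeper` option; on Kafka versions where that option has been removed from these tools, pass `--bootstrap-server` with a broker address instead.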

  1. Identify the Topic:


    kafka-topics.sh --list --zookeeper zookeeper_host

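  2. Create a JSON File for Reassignment (saved here as increase-replication-factor.json). Replace the broker IDs in "replicas" with IDs that exist in your cluster:

    {
      "version": 1,
      "partitions": [
        {
          "topic": "topic_name",
          "partition": 0,
          "replicas": [0, 1, 2]
        },
        {
          "topic": "topic_name",
          "partition": 1,
          "replicas": [0, 1, 2]
        }
      ]
    }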

  3. Execute the Reassignment:

    kafka-reassign-partitions.sh --zookeeper zookeeper_host --reassignment-json-file increase-replication-factor.json --execute

  4. Finally, verify the Changes:

    kafka-topics.sh --describe --zookeeper zookeeper_host --topic topic_name
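For topics with many partitions, writing the reassignment file by hand is error-prone. The following is a minimal sketch of generating it, not part of the Kafka tooling: `make_reassignment` is a hypothetical helper, and the broker IDs 1, 2, 3 in the example call are assumptions. It rotates the broker list per partition because the first replica listed is the preferred leader, which spreads leadership across brokers.

```shell
# Hypothetical helper: print a reassignment plan for one topic as JSON.
# Usage: make_reassignment TOPIC PARTITION_COUNT BROKER_ID...
make_reassignment() {
    topic=$1
    partitions=$2
    shift 2
    n=$#   # number of broker IDs supplied
    printf '{"version":1,"partitions":['
    p=0
    while [ "$p" -lt "$partitions" ]; do
        [ "$p" -gt 0 ] && printf ','
        # Rotate the broker list by p so preferred leaders are spread out.
        replicas=""
        i=0
        while [ "$i" -lt "$n" ]; do
            idx=$(( (p + i) % n + 1 ))
            eval "id=\${$idx}"
            replicas="${replicas:+$replicas,}$id"
            i=$((i + 1))
        done
        printf '{"topic":"%s","partition":%d,"replicas":[%s]}' "$topic" "$p" "$replicas"
        p=$((p + 1))
    done
    printf ']}\n'
}

# Example: two partitions across assumed broker IDs 1, 2 and 3.
make_reassignment topic_name 2 1 2 3
```

Redirecting the output to increase-replication-factor.json produces a file that can be passed to `--reassignment-json-file`.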

2. Brokers are Down

One or more brokers in the cluster may be down.

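  • Use Amazon CloudWatch or Kafka’s internal metrics to monitor broker status.
  • Restart failed brokers. MSK brokers are managed, so you cannot log in and restart the service yourself; reboot the broker from the MSK console or with the AWS CLI (on a self-managed cluster the equivalent is `sudo systemctl restart kafka`):

    aws kafka reboot-broker --cluster-arn cluster_arn --broker-ids broker_id

  • Check broker configurations. In MSK these are set through the cluster’s MSK configuration rather than by editing `server.properties` on the broker.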
  • If broker failures are frequent, consider adding more brokers to the cluster.
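To see which partitions a down broker has affected, the output of `kafka-topics.sh --describe` can be filtered for partitions whose ISR is smaller than their replica list. The sketch below is a hypothetical helper, `under_replicated`, that assumes each partition line contains `Topic:`, `Partition:`, `Replicas:` and `Isr:` fields followed by their values; the exact layout may differ between Kafka versions.

```shell
# Hypothetical helper: read `kafka-topics.sh --describe` output on stdin and
# print "topic partition" for every partition with fewer ISRs than replicas.
under_replicated() {
    awk '
        /Partition:/ && /Replicas:/ && /Isr:/ {
            topic = ""; part = ""; replicas = ""; isr = ""
            for (i = 1; i < NF; i++) {
                if ($i == "Topic:")     topic = $(i + 1)
                if ($i == "Partition:") part = $(i + 1)
                if ($i == "Replicas:")  replicas = $(i + 1)
                if ($i == "Isr:")       isr = $(i + 1)
            }
            # Compare the number of comma-separated broker IDs in each list.
            if (split(isr, a, ",") < split(replicas, b, ","))
                print topic, part
        }
    '
}
```

A typical call pipes the describe output into the function, for example `kafka-topics.sh --describe --zookeeper zookeeper_host | under_replicated`. The `--under-replicated-partitions` option of `kafka-topics.sh --describe` gives a similar view directly.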

3. High min.insync.replicas Setting

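The `min.insync.replicas` setting might be too high relative to the number of available replicas. For example, if it equals the replication factor, a single unavailable replica is enough to block `acks=all` writes.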

  • Review current configuration:

    kafka-configs.sh --describe --entity-type topics --entity-name topic_name --zookeeper zookeeper_host

  • Modify min.insync.replicas Setting:

    kafka-configs.sh --alter --entity-type topics --entity-name topic_name --add-config min.insync.replicas=new_value --zookeeper zookeeper_host

  • Then, verify changes:

    kafka-configs.sh --describe --entity-type topics --entity-name topic_name --zookeeper zookeeper_host
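When choosing new_value, it helps to work out how many replicas can fall out of sync before `acks=all` writes start failing. The sketch below does that arithmetic; `tolerated_failures` is a hypothetical helper name, and it assumes all replicas are in sync to begin with.

```shell
# Hypothetical helper: how many replicas can leave the ISR before acks=all writes fail.
# Usage: tolerated_failures REPLICATION_FACTOR MIN_INSYNC_REPLICAS
tolerated_failures() {
    rf=$1
    min_isr=$2
    if [ "$min_isr" -gt "$rf" ]; then
        echo "invalid: min.insync.replicas exceeds replication factor" >&2
        return 1
    fi
    echo $((rf - min_isr))
}

tolerated_failures 3 2   # prints 1: one replica may be unavailable
tolerated_failures 3 3   # prints 0: any unavailable replica blocks writes
```

Lowering `min.insync.replicas` raises this number but also reduces how many copies of a message must exist before it is acknowledged, so it trades durability for availability.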

4. Network Issues

Network partitions or connectivity issues can prevent brokers from communicating.

  • Review security groups, firewalls, and routing configurations in AWS.
  • Ensure brokers are in the same VPC or have proper peering arrangements.
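  • Test connectivity to each broker from a client machine in the cluster’s VPC (MSK brokers listen on port 9092 for plaintext and 9094 for TLS):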


    ping broker_ip
    telnet broker_ip 9092
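The same port check can be run against every broker in a bootstrap string. This is a rough sketch under stated assumptions: `split_brokers` and `check_brokers` are hypothetical helpers, the hostnames in the example are made up, and `check_brokers` assumes a netcat (`nc`) build that supports `-z` and `-w`.

```shell
# Hypothetical helper: split "host1:port1,host2:port2" into one "host port" pair per line.
split_brokers() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=: read -r host port; do
        [ -n "$host" ] && echo "$host $port"
    done
}

# Hypothetical helper: attempt a TCP connection to each broker (assumes nc is installed).
check_brokers() {
    split_brokers "$1" | while read -r host port; do
        if nc -z -w 3 "$host" "$port" </dev/null >/dev/null 2>&1; then
            echo "$host:$port reachable"
        else
            echo "$host:$port unreachable"
        fi
    done
}

# Example with made-up hostnames: list the endpoints that would be checked.
split_brokers "b-1.example:9092,b-2.example:9094"
```

Calling `check_brokers` with the cluster's bootstrap broker string then reports each endpoint as reachable or unreachable.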

Prevention Strategies

To prevent future NOT_ENOUGH_REPLICAS errors:

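  • Always configure topics with a replication factor greater than 1 based on fault tolerance needs. A common choice is a replication factor of 3 with `min.insync.replicas=2`.
  • Use Amazon CloudWatch to track broker health and partition status, for example the `UnderReplicatedPartitions` and `UnderMinIsrPartitionCount` metrics.
  • Configure storage auto scaling for your MSK cluster, and add brokers ahead of sustained increases in load.
  • Adjust Kafka settings such as `min.insync.replicas` and the producer `acks` level based on application requirements.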
  • Identify bottlenecks and ensure your infrastructure can handle peak loads.

Conclusion

We can maintain data reliability, reduce latency, and enhance fault tolerance by understanding the root causes of the NOT_ENOUGH_REPLICAS error in AWS MSK and implementing these fixes.

In brief, our Support Experts demonstrated how to fix the NOT_ENOUGH_REPLICAS error in AWS MSK.
