Support Article

Cassandra exception: Cannot achieve consistency level ONE

SA-25046

Summary



The user has 13 Pega 7 nodes in a cluster that has a Pega Marketing 7.13 application installed. The user has exposed an API that invokes a Strategy to return offers.

During load testing, the user observes that around 5% of requests return HTTP 500 while a few DNodes are still starting up.

Error Messages



10:49:55,288 INFO [stdout] (Dispatcher-Thread-123) com.pega.dsm.dnode.api.dataflow.StageException: Exception in stage: Next Best Activity.
...

10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.cassandra.CassandraDao.executeStatement(CassandraDao.java:172)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.cassandra.CassandraDataRepository.insert(CassandraDataRepository.java:252)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.dataset.cassandra.CassandraSaveOperation$2.emit(CassandraSaveOperation.java:101)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.stream.DataObservableImpl$SafeDataSubscriber.subscribe(DataObservableImpl.java:320)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.stream.DataObservableImpl.subscribe(DataObservableImpl.java:52)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.stream.DataObservableImpl.await(DataObservableImpl.java:98)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.stream.DataObservableImpl.await(DataObservableImpl.java:87)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.dataflow.strategy.DelayedLearning.saveResultsForDelayedLearning(DelayedLearning.java:53)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at com.pega.dsm.dnode.impl.dataflow.strategy.StrategyStageProcessor.onNext(StrategyStageProcessor.java:145)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) ... 189 more
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) Caused by: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:292)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:117)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:382)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:191)
10:49:55,294 INFO [stdout] (Dispatcher-Thread-123) at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:128)


Steps to Reproduce



1. Set up a cluster of five nodes and stop all the nodes.
2. Start Node1 and use a load test to call an HTTP API that uses a Strategy.
3. At the same time, start the rest of the nodes.

Root Cause



The user has set the Cassandra consistency level to ONE, which means that “A write must be written to the commit log and memtable of at least one replica node.”

Because the replication factor is currently set to 1, each row is stored on only one node. When the system tries to write to the commit log and memtable for a row whose node is not available, no other replica can accept the write, which causes the “Cannot achieve consistency level ONE” Cassandra exception to appear in the logs.
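
The failure described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the node names, the `replicas_for` placement rule (a plain modulo over a fixed node list), and the `can_achieve` helper are hypothetical and do not reflect Cassandra's real partitioner or the DSM implementation.

```python
# Hypothetical five-node cluster, matching the reproduction steps.
NODES = ["node1", "node2", "node3", "node4", "node5"]


def replicas_for(key_hash, replication_factor, nodes=NODES):
    """Pick the nodes that hold a row: the owner plus the next nodes in the list.

    Simplified placement rule, assumed for illustration only.
    """
    start = key_hash % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def can_achieve(required_acks, replicas, live_nodes):
    """Return True if enough replicas of the row are up to accept the write."""
    live = [node for node in replicas if node in live_nodes]
    return len(live) >= required_acks


# Only node1 has started; the other nodes are still coming up.
live_nodes = {"node1"}

# Replication factor 1: the row lives only on node3, which is down.
print(can_achieve(1, replicas_for(2, 1), live_nodes))  # False

# Replication factor 5: every node holds a copy, so node1 can accept the write.
print(can_achieve(1, replicas_for(2, 5), live_nodes))  # True
```

With a replication factor of 1, any row whose single owner is still starting cannot be written even at consistency level ONE, which is consistent with only a fraction of the requests failing during the load test.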

Resolution



Refer to the DSM D-Node Operations Guide.
This guide provides details on the replication factor and how to alter it. If you have a cluster size of 5 or higher, the replication factor should be 5 to specify that each row has its original instance and four replicas.

Note: Replication factor describes how many copies of your data exist, whereas consistency level describes the behavior seen by the client. Replication factor and consistency level are interrelated.

For example, when you write with a replication factor of 3, three copies are always stored, assuming enough nodes are up. Each row has its original instance and two replicas.

When a node is down, writes for that node are stashed away and written when it comes back up, unless it is down long enough for Cassandra to decide that the node is gone for good.


Continuing with this example, when you write with a consistency level of ONE, the client receives a success acknowledgement after the write is done on one node, without waiting for the other two writes.
If you write with a consistency level of ALL, the acknowledgement to the client waits until all three copies are written.
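
As a rough sketch of how the consistency level translates into the number of replica acknowledgements the client waits for, consider the helper below. The function name `required_acks` is hypothetical, and QUORUM (a majority of replicas) is included only as one of the additional levels described in the DataStax documentation.

```python
def required_acks(level, replication_factor):
    """Number of replicas that must acknowledge before the client gets a reply."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: " + level)


# Replication factor 3, as in the example.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_acks(level, 3))
# ONE 1
# QUORUM 2
# ALL 3
```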

There are many more consistency level options, too many to discuss here.

For additional information, see
http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html.

In the same example, if you read with a consistency level of ONE, the response is sent to the client after a single replica responds.

Another replica may have newer data, in which case the response will not be up-to-date. In many contexts, that is quite sufficient.

In other cases, if the client needs the most up-to-date information, you specify a different consistency level on the read, perhaps ALL. In this way, the consistency of Cassandra and other post-relational databases is tunable in ways that relational databases typically are not.
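
The trade-off can be checked with the commonly cited overlap rule: a read is guaranteed to include at least one replica that holds the latest acknowledged write when the number of replicas written plus the number of replicas read exceeds the replication factor. The sketch below assumes that rule; the function name is hypothetical, and the model ignores node failures and repair.

```python
def read_sees_latest_write(write_acks, read_acks, replication_factor):
    """True if the read set and the write set must share at least one replica."""
    return write_acks + read_acks > replication_factor


# Replication factor 3, write at ONE (1 acknowledgement).
print(read_sees_latest_write(1, 1, 3))  # False: read at ONE may return stale data
print(read_sees_latest_write(1, 3, 3))  # True: read at ALL always includes the written replica
```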

Published June 29, 2016 - Updated August 23, 2017

