NBAA Simulation run not processing records as expected
Slow record processing was reported during an NBAA Simulation run. Throughput was nine records per second in the development environment and seven records per second in simulation.
The two environments were compared and no significant difference was found. The JVM heap size was increased, but this did not improve processing throughput. The user expected throughput of around 30 to 40 records per second.
No error messages.
Steps to Reproduce
- Go to NBA Studio > Decisioning > Decisions > Simulations.
- Run the Simulation. Wait for the results.
- Average speed is 13 to 14 records per second.
The data set contained 3267379 records, so at this rate the simulation would take days to complete.
This was the first time users ran these Simulation tests on a large data set, after replicating the data volume from the Production environment.
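As a rough check of the "days" estimate, run time is the record count divided by the sustained throughput. The sketch below uses the figures quoted in this article. The function name `estimated_days` is illustrative only, and the calculation assumes the rate stays constant for the whole run.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimated_days(total_records: int, records_per_second: float) -> float:
    """Return the run time in days, assuming a constant throughput."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return total_records / records_per_second / SECONDS_PER_DAY


TOTAL_RECORDS = 3267379  # record count quoted in this article

# Observed rates (13, 14) and the user's expected rates (30, 40).
for rate in (13, 14, 30, 40):
    print(f"{rate} records/sec -> {estimated_days(TOTAL_RECORDS, rate):.2f} days")
```

At 13 to 14 records per second this works out to roughly 2.7 to 2.9 days, while the expected 30 to 40 records per second would bring the run down to about a day.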
The investigation was based on in-house local tests and a comparison with the problematic environment:
- The decision logic in the Strategies was investigated first, and no issue was found.
- The number of Propositions was also analyzed and ruled out as a cause.
- The issue was suspected to be long database interaction time. One table in the user's environment took far longer to access (126.38 sec maximum) than in the local tests (0.07 sec).
- Database indexes and parallelism were checked, and the database was tuned.
The following changes were made in the user's environment:
- JVM Heap size was increased from 4 GB to 8 GB
- The database was moved from the old database server to the NFT DB server, which has 3 CPUs
- The maximum speed obtained during the investigation and after these changes was 18 records per second.
- Introducing partitioning on one node with 12 CPUs, 10 partitions, 10 threads, and a batch size of 250 achieved a maximum speed of 83 records per second. A full run over the entire data set of 3201251 records completed in 19 hours at an average speed of 47 records per second.
- Batch sizes of 50, 100, 150, 250, 500, and 1000 were tested; a batch size of 100 gave a significant improvement, reaching 73 records per second.
An average throughput of 50 to 60 records per second was observed in a single-node environment after the infrastructure and database upgrades and the introduction of partitioning.
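The figures for the partitioned run can be cross-checked with simple arithmetic: 3201251 records in 19 hours is about 47 records per second. The sketch below also estimates how many batches each partition would handle if the records were split evenly across the 10 partitions. The even split is an assumption made for illustration, not a description of how the product distributes work, and both function names are hypothetical.

```python
import math


def average_throughput(total_records: int, hours: float) -> float:
    """Average records per second over a run of the given length."""
    return total_records / (hours * 3600)


def batches_per_partition(total_records: int, partitions: int, batch_size: int) -> int:
    """Batches handled by one partition, assuming an even split (illustrative)."""
    records_per_partition = math.ceil(total_records / partitions)
    return math.ceil(records_per_partition / batch_size)


TOTAL_RECORDS = 3201251  # full-run record count quoted in this article

print(f"average: {average_throughput(TOTAL_RECORDS, 19):.1f} records/sec")
for batch_size in (100, 250):
    count = batches_per_partition(TOTAL_RECORDS, partitions=10, batch_size=batch_size)
    print(f"batch size {batch_size}: {count} batches per partition")
```

Under the even-split assumption, a smaller batch size means more, shorter batches per partition, which is one reason the tested batch sizes produced different throughput; the best value has to be found by measurement, as was done here.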