Issue
While trying to test the solution's solvers using a benchmark configuration, I encounter the follow exception :
2021-12-22 15:24:37.328 WARN 22684 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The warm up singleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
2021-12-22 15:24:37.329 WARN 22684 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The warm up singleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
2021-12-22 15:24:37.330 WARN 22684 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The warm up singleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.Long.valueOf(Long.java:1207)
at myrostering.solver.PEC.LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.apply(LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.java:69)
at myrostering.solver.PEC.LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.apply(LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.java:1)
at org.drools.model.functions.Function1$Impl.apply(Function1.java:35)
at org.drools.modelcompiler.constraints.LambdaReadAccessor.getValue(LambdaReadAccessor.java:42)
at org.drools.core.rule.Declaration.getValue(Declaration.java:258)
at org.drools.core.rule.Declaration.getValue(Declaration.java:253)
at org.drools.modelcompiler.constraints.BindingEvaluator.getArgument(BindingEvaluator.java:59)
at org.drools.modelcompiler.constraints.ConstraintEvaluator$InnerEvaluator.getArgument(ConstraintEvaluator.java:242)
at org.drools.modelcompiler.constraints.ConstraintEvaluator$InnerEvaluator$_2.evaluate(ConstraintEvaluator.java:309)
at org.drools.modelcompiler.constraints.ConstraintEvaluator.evaluate(ConstraintEvaluator.java:124)
at org.drools.modelcompiler.constraints.LambdaConstraint.isAllowedCachedLeft(LambdaConstraint.java:187)
at org.drools.core.common.SingleBetaConstraints.isAllowedCachedLeft(SingleBetaConstraints.java:132)
at org.drools.core.phreak.PhreakAccumulateNode.doLeftInserts(PhreakAccumulateNode.java:178)
at org.drools.core.phreak.PhreakAccumulateNode.doNode(PhreakAccumulateNode.java:89)
at org.drools.core.phreak.RuleNetworkEvaluator.switchOnDoBetaNode(RuleNetworkEvaluator.java:591)
at org.drools.core.phreak.RuleNetworkEvaluator.evalBetaNode(RuleNetworkEvaluator.java:558)
at org.drools.core.phreak.RuleNetworkEvaluator.evalNode(RuleNetworkEvaluator.java:385)
at org.drools.core.phreak.RuleNetworkEvaluator.innerEval(RuleNetworkEvaluator.java:345)
at org.drools.core.phreak.RuleNetworkEvaluator.outerEval(RuleNetworkEvaluator.java:181)
at org.drools.core.phreak.RuleNetworkEvaluator.evaluateNetwork(RuleNetworkEvaluator.java:139)
at org.drools.core.phreak.RuleExecutor.reEvaluateNetwork(RuleExecutor.java:235)
at org.drools.core.phreak.RuleExecutor.evaluateNetworkAndFire(RuleExecutor.java:91)
at org.drools.core.concurrent.AbstractRuleEvaluator.internalEvaluateAndFire(AbstractRuleEvaluator.java:33)
at org.drools.core.concurrent.SequentialRuleEvaluator.evaluateAndFire(SequentialRuleEvaluator.java:43)
at org.drools.core.common.DefaultAgenda.fireLoop(DefaultAgenda.java:753)
at org.drools.core.common.DefaultAgenda.internalFireAllRules(DefaultAgenda.java:700)
at org.drools.core.common.DefaultAgenda.fireAllRules(DefaultAgenda.java:692)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.internalFireAllRules(StatefulKnowledgeSessionImpl.java:1225)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:1216)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:1200)
at org.optaplanner.core.impl.score.director.drools.DroolsScoreDirector.calculateScore(DroolsScoreDirector.java:105)
Here is the test class I ran:
@SpringBootTest(classes = MyApplication.class)
@EnableConfigurationProperties({ApplicationProperties.class, MyRosterProperties.class})
public class SolverBenchmarkTest {
private PlannerBenchmarkFactory benchmarkFactory = PlannerBenchmarkFactory.createFromXmlResource(
"myrostering/benchmark/benchmarkSolverConfig.xml");
@Autowired
MyRosterGenerator myRosterGenerator;
@Test
public void benchmarkBasicRostering() {
MyRoster mr = myRosterGenerator.createMyRoster();
PlannerBenchmark benchmark = benchmarkFactory.buildPlannerBenchmark(mr);
benchmark.benchmarkAndShowReportInBrowser();
}
}
Here is the benchmark configuration file :
<?xml version="1.0" encoding="UTF-8"?>
<plannerBenchmark xmlns="https://www.optaplanner.org/xsd/benchmark" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://www.optaplanner.org/xsd/benchmark https://www.optaplanner.org/xsd/benchmark/benchmark.xsd">
<benchmarkDirectory>local/benchmark/data/my-roster</benchmarkDirectory>
<parallelBenchmarkCount>AUTO</parallelBenchmarkCount>
<warmUpSecondsSpentLimit>30</warmUpSecondsSpentLimit>
<inheritedSolverBenchmark>
<solver>
<!-- This part of the solver configuration must be the same as the one used by the planner, otherwise, the benchmark test is pointless -->
<moveThreadCount>4</moveThreadCount>
<solutionClass>myrostering.domain.MyRoster</solutionClass>
<entityClass>myrostering.domain.Assignment</entityClass>
<scoreDirectorFactory>
<scoreDrl>myrostering/solver/myRosteringScoreRules.drl</scoreDrl>
</scoreDirectorFactory>
<termination>
<!-- Adding this secondsSpentLimit (contrary to no limit set for the planner) to avoid the benchmark running for too long -->
<secondsSpentLimit>60</secondsSpentLimit>
<bestScoreLimit>0hard/0medium/0soft</bestScoreLimit>
</termination>
<constructionHeuristic>
<constructionHeuristicType>STRONGEST_FIT</constructionHeuristicType>
</constructionHeuristic>
</solver>
</inheritedSolverBenchmark>
<solverBenchmark>
<name>Currently used</name>
<solver>
<localSearch>
<unionMoveSelector>
<moveListFactory>
<cacheType>PHASE</cacheType>
<moveListFactoryClass>
myrostering.solver.move.factory.ChangeMoveFactory
</moveListFactoryClass>
</moveListFactory>
<moveListFactory>
<cacheType>PHASE</cacheType>
<moveListFactoryClass>
myrostering.solver.move.factory.SwapMoveFactory
</moveListFactoryClass>
</moveListFactory>
</unionMoveSelector>
<acceptor>
<entityTabuSize>5</entityTabuSize>
<simulatedAnnealingStartingTemperature>15000hard/10medium/1000soft</simulatedAnnealingStartingTemperature>
</acceptor>
<forager>
<acceptedCountLimit>4</acceptedCountLimit>
</forager>
</localSearch>
</solver>
</solverBenchmark>
</plannerBenchmark>
Also, I'd like to add that we run the solver.solve() without an issue - even if the dataset is quite large (150 to 300 Mo for the file containing the solution when serialized). So I'm a bit surprised when the benchmark fails on warm up...
EDIT:
I've changed the configuration for these two parameters :
<parallelBenchmarkCount>1</parallelBenchmarkCount>
...
<secondsSpentLimit>600</secondsSpentLimit>
But I still got the following exception :
2022-01-03 10:53:49.850 INFO 21696 --- [nchmarkThread-1] o.d.c.kie.builder.impl.KieContainerImpl : Start creation of KieBase: defaultKieBase
2022-01-03 10:53:49.909 INFO 21696 --- [nchmarkThread-1] o.d.c.kie.builder.impl.KieContainerImpl : End creation of KieBase: defaultKieBase
2022-01-03 10:54:32.585 INFO 21696 --- [nchmarkThread-1] o.o.core.impl.solver.DefaultSolver : Solving started: time spent (41506), best score (-38295462hard/38260medium/3640soft), environment mode (REPRODUCIBLE), move thread count (4), random (JDK with seed 0).
2022-01-03 10:54:33.611 ERROR 21696 --- [nchmarkThread-1] o.o.core.impl.solver.thread.ThreadUtils : Multithreaded Local Search's ExecutorService didn't terminate within timeout (1 seconds).
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (0)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (1)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (2)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (3)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.612 INFO 21696 --- [nchmarkThread-1] .c.i.c.DefaultConstructionHeuristicPhase : Construction Heuristic phase (0) ended: time spent (42533), best score (-38295462hard/38260medium/3640soft), score calculation speed (0/sec), step total (0).
2022-01-03 10:56:00.115 WARN 21696 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The subSingleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3480)
at java.base/java.util.ArrayList.grow(ArrayList.java:237)
at java.base/java.util.ArrayList.grow(ArrayList.java:244)
at java.base/java.util.ArrayList.add(ArrayList.java:454)
at java.base/java.util.ArrayList.add(ArrayList.java:467)
at be.myrostering.solver.move.factory.MySwapMoveFactory.createMoveList(MySwapMoveFactory.java:50)
at be.myrostering.solver.move.factory.MySwapMoveFactory.createMoveList(MySwapMoveFactory.java:30)
at org.optaplanner.core.impl.heuristic.selector.move.factory.MoveListFactoryToMoveSelectorBridge.constructCache(MoveListFactoryToMoveSelectorBridge.java:72)
at org.optaplanner.core.impl.heuristic.selector.common.SelectionCacheLifecycleBridge.phaseStarted(SelectionCacheLifecycleBridge.java:51)
at org.optaplanner.core.impl.phase.event.PhaseLifecycleSupport.firePhaseStarted(PhaseLifecycleSupport.java:37)
at org.optaplanner.core.impl.heuristic.selector.AbstractSelector.phaseStarted(AbstractSelector.java:50)
at org.optaplanner.core.impl.phase.event.PhaseLifecycleSupport.firePhaseStarted(PhaseLifecycleSupport.java:37)
at org.optaplanner.core.impl.heuristic.selector.AbstractSelector.phaseStarted(AbstractSelector.java:50)
at org.optaplanner.core.impl.localsearch.decider.LocalSearchDecider.phaseStarted(LocalSearchDecider.java:94)
at org.optaplanner.core.impl.localsearch.decider.MultiThreadedLocalSearchDecider.phaseStarted(MultiThreadedLocalSearchDecider.java:92)
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.phaseStarted(DefaultLocalSearchPhase.java:141)
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.solve(DefaultLocalSearchPhase.java:82)
at org.optaplanner.core.impl.solver.AbstractSolver.runPhases(AbstractSolver.java:99)
at org.optaplanner.core.impl.solver.DefaultSolver.solve(DefaultSolver.java:192)
at org.optaplanner.benchmark.impl.SubSingleBenchmarkRunner.call(SubSingleBenchmarkRunner.java:122)
at org.optaplanner.benchmark.impl.SubSingleBenchmarkRunner.call(SubSingleBenchmarkRunner.java:42)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at java.base/java.lang.Thread.run(Thread.java:832)
2022-01-03 10:56:00.603 INFO 21696 --- [ Test worker] o.o.b.impl.report.BenchmarkReport : Generating benchmark report...
VERSION_2_3_31
java.lang.NoSuchFieldError: VERSION_2_3_31
at org.optaplanner.benchmark.impl.report.BenchmarkReport.writeHtmlOverviewFile(BenchmarkReport.java:828)
at org.optaplanner.benchmark.impl.report.BenchmarkReport.writeReport(BenchmarkReport.java:318)
at org.optaplanner.benchmark.impl.DefaultPlannerBenchmark.benchmarkingEnded(DefaultPlannerBenchmark.java:311)
at org.optaplanner.benchmark.impl.DefaultPlannerBenchmark.benchmark(DefaultPlannerBenchmark.java:100)
at org.optaplanner.benchmark.impl.DefaultPlannerBenchmark.benchmarkAndShowReportInBrowser(DefaultPlannerBenchmark.java:424)
On a final note, it appears that the problem might not be linked to Optaplanner (because the out-of-memory is triggered in MySwapMoveFactory) - if so, I'll close this post. But it would be still odd that it works when running the solver but not the benchmark...
Solution
MoveListFactory
scales badly, consuming a lot of memory and cpu.
Some kind of moves have billions of possible moves. For example a 3 swap on 10000 shifts has 1 trillion moves. That doesn't fit into a few GB RAM and it takes ages to generate.
Use a MoveIteratorFactory
instead and don't generate a list of moves, but generate them Just In Time, just like the default selectors do. See docs.
Answered By - Geoffrey De Smet
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.