Why does AMR force me into a weaker trinket?

So I’m using a 203 trinket currently, and I’ve found a 207 trinket (different trinket). The 203 is on-use and the 207 is a passive trinket.

AMR always recommends the 203 trinket in any setup, and claims I will lose up to 1.89% DPS if I swap to the 207 trinket. However, using AMR’s own simulation, the 207 trinket actually yields a few hundred more DPS in any setup: single target, 2 target cleave, AoE, etc.

So, why would AMR ask me to continue with a weaker trinket?

We need more information to comment on this. Use the help link above the gear table in BiB to generate a snapshot id and then post that id for us.

Hey thank you. The ID is: d0311193ae5b478e9988e176c36e3799

The trinkets in question are:
(A) 203 Empyreal Ordnance
(B) 207 Infinitely Divisible Ooze

AMR first recommended that for single target, I swap my Soulbind to one that most guides do not recommend. I first ran a Simulation comparing the two soulbinds (my former vs the one AMR suggested) and AMR was indeed correct. I do sim better with the second soulbind. This could be because of the way my secondary stats are distributed.

However, I then got curious about my trinkets. AMR recommends that I use (A), but I ran a whole bunch of simulations last night, and swapping (A) for (B) yields better results in just about every scenario.

Edit: Just re-ran quick ST simulations to confirm:
With trinket A: 5,294
With trinket B: 5,306

The only thing I’m swapping around is the trinkets. I’m not changing my soulbinds, gear, etc. between simulations. There are some setups where Trinket B yields an even larger gap in DPS vs Trinket A. However, AMR always recommends Trinket A, no matter the DPS scenario I’m up against.

Those differences are extremely small. 12 DPS difference is just noise in the simulation data, to be honest. Simulators, in our opinion, aren’t good for making comparisons on that small a scale. We use a lot of data and then look at the trends to generate a scoring function that will consistently and reliably suggest gear that is highly likely to be good.
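To put the "12 DPS is just noise" point in numbers, here is a minimal sketch that compares the quoted gap with the sampling error of two averaged simulations. The per-run standard deviation and the iteration count are not given anywhere in this thread, so the values used below are made-up assumptions, and the function name is hypothetical. It also assumes the two simulations are independent.

```python
import math

def gap_vs_noise(dps_a, dps_b, per_run_std, runs, z=1.96):
    """Compare a DPS gap with the sampling noise of two averaged simulations."""
    gap = dps_b - dps_a
    relative_gap_pct = 100.0 * gap / dps_a
    # Standard error of one averaged result, then of the difference of two.
    std_error_one = per_run_std / math.sqrt(runs)
    std_error_gap = math.sqrt(2.0) * std_error_one
    # Rough 95% margin on the difference (z = 1.96).
    noise_margin = z * std_error_gap
    return gap, relative_gap_pct, noise_margin

# 5,294 and 5,306 are the single-target results quoted in the thread.
# per_run_std=500 and runs=1000 are assumed values for illustration only.
gap, relative_gap_pct, noise_margin = gap_vs_noise(
    5294, 5306, per_run_std=500, runs=1000
)
print(f"gap: {gap} DPS ({relative_gap_pct:.2f}%), "
      f"noise margin: +/-{noise_margin:.1f} DPS")
```

With these assumed inputs the 12 DPS gap is about 0.23% and sits well inside a margin of roughly 44 DPS. Different variance or iteration counts would change the margin, so treat this as a way to frame the question rather than a verdict on the two trinkets.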

In your case… you aren’t going to notice a difference between those two trinkets if you make use of the Ordnance optimally. Some fights might be better or worse depending on how it lines up with cooldowns/fight length. Our model (based on simulation data) will closely follow the simulator, but not necessarily match every spot simulation perfectly, by design. We want to give consistent advice that smooths out the noise in the simulation data.

I get and totally respect that, but there are other simulations that open the gap between the two, yet AMR always asks me to opt for the weaker trinket.

Example: 2 target ranged cleave
Trinket A: 6,661
Trinket B: 6,799

Now the difference opens to beyond 100 DPS. I get that it’s a random proc on Trinket B depending on my Haste, but still… with it being completely passive and offering better results, why is the engine not recommending it, and claiming it’s a loss instead?

The optimization is not based on 2 target ranged cleave simulations. We have an optimization based on data from the single target script and data from the Mythic+ (multi-target) script.

If you imagine all the data we collect as this noisy morass of data points jumping up and down by small amounts - our scoring function draws a smoother line through that and reports the % DPS change as the predicted score. That might fall slightly above or below what any one particular simulation/data point shows.
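The idea of a smooth line through noisy points can be sketched with an ordinary least-squares fit. This is only an illustration of the general idea and not AMR's actual scoring function, which is not described in this thread. The data, the single "stat amount" input, and the helper name `fit_line` are all hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: DPS rises by 10 per stat point, plus up to +/-15 DPS of noise.
rng = random.Random(0)
stat_values = list(range(0, 21))
noisy_dps = [5000 + 10 * x + rng.uniform(-15, 15) for x in stat_values]

slope, intercept = fit_line(stat_values, noisy_dps)

# Each residual is how far one "spot simulation" sits from the smooth line.
residuals = [y - (intercept + slope * x) for x, y in zip(stat_values, noisy_dps)]
print(f"fitted slope: {slope:.2f} DPS per stat point")
print(f"largest miss on a single point: {max(abs(r) for r in residuals):.1f} DPS")
```

Some residuals come out positive and some negative, which is the behavior described above: the smoothed prediction lands slightly above some individual data points and slightly below others.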

When we look at the scale of the problem (we’re talking billions, trillions, quadrillions of possible gear setups), an optimizer that always follows the simulator for the given script within 1-2% is extremely accurate, and that margin is well below the threshold of what a player could perceive when playing the game.
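For a rough sense of how the number of gear setups grows, the sketch below multiplies the number of candidate items per slot. The slot and candidate counts are made-up assumptions, the function name is hypothetical, and the count ignores real constraints such as unique-equipped items, as well as extra choices such as gems and enchants.

```python
import math

def count_setups(candidates_per_slot):
    """Number of combinations when each slot is filled independently."""
    return math.prod(candidates_per_slot)

# Assumed example: 14 gear slots with 5 candidate items each.
small_bag = count_setups([5] * 14)      # 5**14 = 6,103,515,625
# Assumed example: 16 gear slots with 10 candidate items each.
large_bag = count_setups([10] * 16)     # 10**16

print(f"{small_bag:,} setups with 5 options in each of 14 slots")
print(f"{large_bag:,} setups with 10 options in each of 16 slots")
```

Even the smaller assumed case is already in the billions, which gives a feel for why simulating every setup individually is not practical.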