Simulation results are different than what BiB suggests


#1

I’m using a custom gearing strategy for BiB and the same settings for simulation.
I wanted to check BiB’s numbers, so I used the test combination method in the simulator.
Here is the report:
https://www.askmrrobot.com/wow/simulator/report/f8fea1d1fe7b44c8aed001de14a6d293
As you can see, Deep Fathom’s Bite simulates better than Mother’s Twin Gaze.
BiB suggests equipping Mother’s Twin Gaze in the OH.
Deep Fathom’s Bite is 1.12% worse.
Could you explain this discrepancy between BiB and the test combination?

Gearing strategy report:
https://www.askmrrobot.com/wow/simulator/report/0b56e4c3602344d7be5bce9862905c6f?weights=true&strategyId=582c4416314f454ca05fd014d5ac2ee2

Here’s my string from the game:
$64;EU;Burning Legion;Avereyn;Supreme VooDoo;10;2;120;26;13:1,15:925,12:2,8:150,2:150;2;.s1;25;2212321;.s2;26;2112323;.s3;27;3132312;.q1;155860s5b1512b3274b33a273007a6919;2087s3b-3287b3608a454a-13443;128s2b-3634b3423b4b3;78s14b-3414b3259b22;7s11b-3286b3268b17b1x153711;204s1b-3291b3274b33a14577a-15334;771s16b-3307b3267b7e5965;159s15b-3289b3281b7;27s10b-3273b3267b7e-31;22s8b-3274b3267b7;285s13b-3274b3267b7;97s6b-3274b3267b7;1782s7;9s9;2352s17b-3279b3278b340;66s12b-5083b1430b3314e10;.q2;158075s2b1522b3407b1b6;1050s14;182s3b-3409b3259b259a278676a-11739a13083;310s13b-3533b3267b7;1001s10b-3309b3309b12e5934;1s5b-3306b3294b37a535a-13618a-2975;6s7b-3326b311b2975b16x154126;18s15b-3307b316b2978b13x-418;2s12b-3307b3294b13e8;37s16b-3322b3309b12e23;0s17b-3321b3309b12e-3;35s6b-3321b3309b12;355s9b-3301b3286b16;287s11b-3322b3309b12e-24;2079s8b-3231b3216b343;158s1b-3634b3294b37a16597a-13623a13084;.q3;128476s16b743b738b4b4x143696y-10012z6743p1489p336p1703q-2026q1834q79r-1915r1836r150;3s17;409s7b-889b79b10b1022b1697;5307s10b-1905b1913b112;124s2b-2025b2025b45;140s9b-3533b1456b231b1609;65s12b-1834b1835b73;2525s11b49b171x11156e5428;488s13b-2152b248b1611;2489s14b-1859b330b1528b137x-21364;6643s8b-14b171x0y0z0;4300s1b-2135b313b1588b132x0;608s3b-2928b1099b1709b200b1;781s5b-2137b2056b86;42s15;356s6b-3571b1429b213b1843b101;.inv;2459;4489;132;4211;10586;695;3;2;1;1215;685;1171;50;1;7;271;5688;1626;776;1212;5492;2330;2481;1;18811;1;488;390;631;5178;551;1;1;0;0;1;1;77;0;1;8153;6256;1274;1713;572u451;394;4143;1;11;1140;8943;4123;637;177;331;976;51;796;1;2167;571;4;28;31;1;2205;1151;3379;1393;2;278;19;1464;3;5;4;0;0;0;0;0;2;0;2;1;0;2;1;3;314;0;0;0;1;0;0;0;0;0;2;1;4034b743b738b4b4x143696y-10012z6743p1489p336p1703q-2026q1834q79r-1915r1836r150;394b-748b751b2b0x-3050y7085z6545p-1966p1817p80q-1905q1824q237r-2027r1791r59;2b-752;404;471;317;113;726;1;1;1539b1069b1819x-20760;8b-171b171x23466;1707b-3026b909b1823b237;2183;444;263b-114b171x-2130e5428;20b-171b171;573;286;183;24;592;857;4;226;378;23;0;796;3;3;27
8;56;254;47;683;1;1325;124;451b-171b171;2433b-171b171x2128y0z-1;832;332;3311;422;2;396b-2143b2041b83;357;176;16;28;0;1;0;0;0;0;0;0;5;29;55;7;30;382;0;1;0;0;0;387;230;32;11;415;2;1;25b-1925;1b-1;9;0;1;0;0;0;0;0;0;0;0;1;3;146b-1645b1482b3263b18;159b-3281b3259b21b1;248;0;0;0;0;0;0;160;3;1;11;0;1;1;0;90;1;281b-3276b3256b20;297b-3286b3268b18;270;22b-3291b3274b33a273452a-6516;9b-3302b3269b16b144;661;1348b-3414b3606a6516a-6516a1658;3b-3606b3604;29b-3604b3606;1b-3616b3263b18;1b-4761b1475b3264b22;3b-3271b3608a4341a-8948a-25;30b-3608b3604a14714a-11740a1499;2b-3604b3608a11945a-13443a1659;34b-3608b3252b19;104b-3286b3268b18;18b-3286b3268b17b1;11b-3276b3256b20;4b-3266b3246b20;2b-3271b3254b17;32b-3266b3247b19;1b-3281b3259b22;1b-3281b3263b18;1b-3276b3256b20;5b-3286b3268b17b1x416e510;0b-3291b3273b18;2b-3281b3259b21b1x-415;1b-3271b3252b19;1b-3281b3259b22;40;14b-3281b3263b18;100b-3271b3251b222;1b-3488b3263b6;45b-3274b3274b33a12918a-15334a-2218;14;3;0;333b-3307b3267b7e12;31;171;219b-3274b3267b7e15;153b-3264b3264b216;7b-3480b3264b216;4b-3490b3274b33;2b-3302b3262b4;2b-3271b3274b33a8973a-5998a1657;5b-3292b3256b19b144x415;13b-3434b3267b7e-31;2b-3274b3274b33a11813a-13470;1b-3302b3269b160;6b-3449b3288b33;5b-3306b3274b33a6213a-6213;4b-3307b3274b33;1b-3307b3267b7b16x0;119b-3285b3263b6;1b-3254b3251b222;3b-3483b3258b3;151b-3271b3267b7;14b-3264b3261b163;7b-3434b3267b7;10b-3274b3267b7;80b-3274b3267b7;328;6;155b-3269b3262b4;84;153b-3266b3268b18;28;23;118b-3326b3309b36;3b-3345b3309b36a13618a-13618a-2975;2b-3330b316b2978b13x0;21b-4758b1451b316b2978b13x0e4;2b-3322b3309b12;91;14;80;114b-3281b3268b17b1;5b-3281b3259b22;101;19b-3326b3309b12;41b-3286b3273b17b1;6b-3291b3273b18;6b-3286b3268b18;74;145;34b-3326b331b2978b12x0;3b-3321b3309b12;128;3;4;155b-3266b3254b196b147;345b-3592b3249b324b18;444;0;1;6b-3427;104;13;314b-179b3259;138;0;177;70b-3249b3593b13;15b-3601b3246b342;11b-3593b3254b338;51b-3562b3224b340;75b-3594b3254b338;2b-3592b3254b16b322x0e-4;49;83;27b-3647b3298b11b36a16597a-13623a13084a-16042;8;15;84b
-3295b3256b20;40;12;1;1;3;30;76b-3266b3246b341;2b-3592b3254b339;66b-5083b1430b3314e4$


#2

Simulation is best used to generate large quantities of data and then analyze that data in aggregate. Our optimization algorithm will follow the trend in that data very closely.

It is possible to set up specific tests, like yours, that show the algorithm to be “off”, but this is well within a reasonable margin for any statistical model.

We discourage using small numbers of data points to compare gear, because simulation models are not perfect models of WoW. On top of that, it is impossible to measure the actual error in the simulation model compared to the real game. We know we are “close”, but exactly how close (1%? 2%? 5%?) is not possible to determine. Given this unknown structural error, simulating two sets of gear that are very close in value and comparing just those two data points is not sufficient to determine which set of gear is actually going to be better in-game.
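To make that reasoning concrete, here is a minimal sketch in Python. It is not how AMR works internally. The function name `can_distinguish`, the DPS numbers, and the symmetric error band are all assumptions made for illustration, and real structural error is unknown and may well affect both gear sets in a correlated way.

```python
def can_distinguish(dps_a, dps_b, model_error):
    """Return True only if two simulated values differ by more than an
    assumed relative model error could explain.

    model_error is a hypothetical fraction (0.01 means 1%). The real
    error of a simulation versus the game cannot actually be measured.
    """
    low_a, high_a = dps_a * (1 - model_error), dps_a * (1 + model_error)
    low_b, high_b = dps_b * (1 - model_error), dps_b * (1 + model_error)
    # The results are distinguishable only if the two bands do not overlap.
    return high_a < low_b or high_b < low_a


# Hypothetical values: the second set simulates about 1.12% lower.
set_a_dps = 15000.0
set_b_dps = 14832.0

for assumed_error in (0.001, 0.01, 0.02, 0.05):
    verdict = can_distinguish(set_a_dps, set_b_dps, assumed_error)
    print(f"assumed error {assumed_error:.1%}: distinguishable = {verdict}")
```

Under these assumptions, a gap of roughly 1% only counts as a real difference if the model error is far smaller than 1%, which is exactly what cannot be verified.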

Simulation data should be thought of as showing us trends: all the combinations of gear that simulate to within 1, 2, even 3% of the highest data point found are functionally equivalent in-game, assuming the player can play optimally in all cases. The goal of a gear optimizer is to find one of the sets of gear that falls in that top 1 or 2% of simulated results, from the millions, billions, or quadrillions of possibilities. This will ensure that in-game, we will have the best chance possible of doing optimal damage.
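The idea of a “top tier” of functionally equivalent results can be sketched as follows. This is only an illustration: the function `top_tier`, the set names, and the DPS values are hypothetical, and the real optimizer searches a vastly larger space with its own methods.

```python
def top_tier(results, tolerance=0.02):
    """Return the names of gear sets whose simulated DPS is within
    `tolerance` (a fraction, 0.02 means 2%) of the best result found."""
    if not results:
        return []
    best = max(results.values())
    cutoff = best * (1.0 - tolerance)
    return sorted(name for name, dps in results.items() if dps >= cutoff)


# Hypothetical simulation results, not taken from any real report.
results = {
    "set_a": 15000.0,
    "set_b": 14832.0,  # about 1.12% below set_a
    "set_c": 14000.0,
}

print(top_tier(results, tolerance=0.02))
print(top_tier(results, tolerance=0.01))
```

With a 2% tolerance, both `set_a` and `set_b` land in the top tier, so an optimizer that picks either one has done its job by this criterion.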

So, the TLDR answer to your question is probably: Your test simulation shows that BiB is functioning correctly. It has correctly chosen one of the sets of gear that simulation data shows will most likely result in optimal performance.


#3

As another side note: usually the AMR gearing strategies will get slightly closer to the simulated values than custom gearing strategies now. It is on my list to see if we can bring some of those improvements in the AMR strategies to custom strategies… but it is a complex problem to do so without exploding the amount of data required to make a custom gearing strategy (which we want to keep “reasonable” for users).

I’d call an optimization within ~1% of a simulation pretty good for an AMR strategy, and very good for a custom strategy.

Just for kicks I tried this out using the AMR strategy – it uses both of your Mother’s Twin Gaze weapons, but the main difference is that it favors Snake Eyes and picks a different shoulder item. This simulates better with the default settings (full buffs/consumables, etc.), but not as well with your custom settings (shorter fight, no raid buffs, different consumables, etc.). It seems that your custom fight length has the most noticeable impact: the value of the azerite traits causes the setup with Snake Eyes vs. Ace Up Your Sleeve to flip-flop by ~1% due to the shorter fight length.

That says to me… when you translate this to your in-game experience, you’re going to have a hard time telling which setup is actually better. It’s tempting to treat the number that the simulator spits out as “gospel” (or treat the data spit out by the optimizer as “gospel”), but when you get down to it, there’s always going to be a degree of uncertainty when you translate either result into the real game.


#4

Thank you very much for explaining in detail how it works :slight_smile: