Warlock BiB not quite right

nelasha · July 22, 2020, 1:19am

I have a number of problems with the BiB recommendations for my warlock main:

AMR insists 475 Psyche Shredder is better than a 430 socketed Azshara’s Font of Power. This isn’t backed up by the broader Warlock community’s recommendation, raidbots, or simc. Any Font (heroic or higher) is basically BiS, along with a mythic Vita in the second slot.
AMR values the Twisted Appendage corruption way too high (for single target). Expedient is basically the way to go for Warlocks, especially if you stack the Flashpoint azerite trait. Even for multi-target scenarios.
Speaking of Azerite traits, AMR always recommends a mish-mash of chaotic inferno and heart of darkness, but again, these rarely sim better than the tried and true Flashpoint + Rolling Havoc combo.
Essences aren’t quite right either. Crucible of Flame might be good in a simulation for an ST context, but it is almost never recommended in a raid or M+ environment.

I know it is possible to override each of these (Azerite traits, trinket weights, corruption weights manually), but it defeats the purpose of using AMR at all - I have to do all this independent research and come with my own gearset and then make AMR “fit” what I chose. That’s no fun, because the human does all the work instead of the robot!

Which brings me to my meta-feedback. It seems AMR is optimizing for “what set will sim the highest based on AMR rotation” based on the “Single Target” and “Multiple Target” simulation parameters. I would argue this isn’t really the service most of your users want. What I really want is:

3 simulation parameters: “Raid Boss” in current tier, and “Mythic+” (Tyrannical / Bolstering).
Preset “popular” picks of Azerite traits, corruption effects, and other selections (in Shadowlands this would be things like Soulbinds and such).

Basically, I want to optimize for “don’t make me look stupid to my raid leader or mythic+ group”. If that means I use cookie-cutter traits/builds from warcraftlogs, so be it, they clearly work because the world’s best players of that class have picked those traits. This is the service worth paying for since I won’t have to do my own research into this and have peace of mind that I’m using the (widely acknowledged) best gear + traits for the content I’m currently doing.

Been a loyal customer for a while and love the work you’re doing here, but I have to admit that as of late (especially this expansion) I’ve come to trust AMR BiB less and less. Its recommendations just don’t make common sense and then I end up having to do a bunch of independent research, play with raidbots for a bit, then come back to AMR and fiddle with customization until it spits out the output I want – at which point it doesn’t really matter since I already know what the ‘best’ configuration is.

I hope you will take a look at these issues more deeply before Shadowlands. Really praying that AMR will reach parity with raidbots and various class discords, and looking forward to continuing my subscription!

asashdor · July 22, 2020, 1:39pm

So… huge wall of text incoming, sorry.

SimCraft assumes a perfect execution of its (often overly) complicated priority lists which almost always include even the tiniest dps gains. That’s something almost nobody can do in practice - and is completely unreasonable in almost every fight. AMR uses more “sensible” priority lists that a decent or strong player can realistically replicate in game.

Even with that in mind, the difference between those two trinkets is completely negligible (using your toon, about 240 dps, or about 0.26% difference, in favor of Azshara’s Font of Power according to SimCraft, and about 720 dps, or 0.83%, in favor of the Psyche Shredder on AMR). In practice you’d most likely lose more dps if you had to cancel channeling Azshara’s Font of Power once because of bad timing.

Of course, that can change a bit if you’re able to line up Azshara’s Font of Power with specific phases where a boss takes more damage, but neither tool includes that. For all intents and purposes, both trinkets are absolutely identical and “forcing” players to use one over the other is absolutely ridiculous - regardless of recommendations by “the broader community”.
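
To put the two trinket gaps in perspective, the quoted figures (240 dps at 0.26% and 720 dps at 0.83%) can be turned back into the total dps each tool must be reporting. This is only a rough sketch: the inputs are the rounded numbers from the post, and the helper name `implied_baseline` is made up for illustration.

```python
def implied_baseline(dps_delta, pct_delta):
    """Total dps implied by a gap given both in dps and as a percentage."""
    return dps_delta / (pct_delta / 100.0)

# Rounded figures quoted in the post above.
simc_total = implied_baseline(240, 0.26)  # SimCraft, gap in favor of the Font
amr_total = implied_baseline(720, 0.83)   # AMR, gap in favor of Psyche Shredder

# How far apart the two tools are on total dps, relative to the AMR figure.
tool_gap_pct = (simc_total - amr_total) / amr_total * 100.0

print(round(simc_total), round(amr_total), round(tool_gap_pct, 1))
```

If the rounded inputs are close to the real ones, the two tools disagree about the character's total dps by several percent, which is much more than either trinket gap. That fits the point that a sub-1% difference between two trinkets is not a meaningful signal.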

Using two completely independent tools will always show some differences but that doesn’t necessarily make one of them “right” and the other “wrong” - as both tools are developed by just a couple people who have to figure out things by themselves and without seeing Blizzard’s code nobody knows which tool might be closer to reality.

That may be a somewhat valid criticism although I don’t think AMR values Twisted Appendage too high - it’s more likely its algorithm and data might still have some problems with extreme amounts of stats caused by stacking of passive bonuses. Generally speaking that wasn’t intended when Blizzard implemented corruptions and has only popped up rather recently due to corruptions being purchasable.

AMR uses a huge amount of data to estimate values without having to always run a complete set of simulations so it can show a result with over 97% (I think, don’t quote me on that though) confidence within a few seconds. That data does have lower and upper bounds (i.e. item level of at least 400 or so) but due to purchasable corruption some values disregarded completely for being unreachable even a couple months ago might now be considered “normal” and the AMR team has mentioned several times that might still cause some issues.

Definitely not exactly ideal, but with the current complexity of talents, gear, essences, traits, corruption, etc. the amount of calculations necessary is just stupidly high and recalculating everything every time something changes is not necessarily possible…

Looking at data from Warcraftlogs that does look perfectly reasonable for a fight similar to AMR’s “Single Target”, e.g. Wrathion or Shad’har. Most destruction warlocks appear to be using those traits (1x Chaotic Inferno, 1x Crashing Chaos, 1-2x Flashpoint, 2-3x Heart of Darkness) in situations like that.

On fights with multiple targets it’s obviously different but even then AMR does quite a good job recommending azerite traits (ignoring Blood Rite, as I’m not sure how that effect works in practice and might be overvalued).
Of course, you’d want to have 3x Flashpoint but the amount of Rolling Havoc (especially 2 vs. 3) doesn’t seem to matter that much and AMR still does a reasonable job there, even more so considering that e.g. using both Flashpoint and Rolling Havoc on a helm requires you to get a high level pvp item which might be excluded by your settings.

Essences will always be a point of contention especially with AMR showing Unbound Force as quite strong - something which absolutely nobody has been able to disprove, even though almost nobody wants to use that essence. That also means there’s just not enough hard data on that essence…

You’re contradicting yourself a bit there though: Previously you insisted that simulations are correct when deciding between trinkets or essences but now you completely disregard their results just “because it’s not recommended” - even though the difference isn’t that much larger. Crucible of Flame is definitely usable and by far not a “bad” option.

You can’t have it both ways: either simulations are a valid tool to see differences or they’re not. You can’t just use them for some things and disregard others because someone (i.e. “the broader community”) doesn’t like the result.

SimCraft and class discords are just really huge echo chambers where everyone jumps on some arbitrary number and treats that as the be-all and end-all while simultaneously bashing every alternative - and the overall community just seems to accept that as the correct way to do things. That’s never a reasonable way though, especially with the current complexity of loot and gameplay, and huge differences in raid/group performance and player skill.

Differences between setups can be very small and differences in fights (even the duration!) or execution can be much more noticeable than using different talents, traits, essences, and/or items. Class discords and guides often treat those differences as far larger than they really are, especially if there’s a difference in complexity which might cause a lesser player to not be able to use a theoretically stronger option well enough.

If you think SimCraft and Raidbots are the better tools, then just use them - there’s nobody forcing you to accept AMR’s recommendations. “Widely acknowledged” doesn’t mean “correct in every regard” though and it has been shown time and time again that AMR’s recommendations can still match real logs by top players.

Of course, there’s always room for optimization like even more realistic or different fight types which the AMR team has generally been very open to and tried to implement as much as possible but they’re still only two programmers who have to adhere to 24 hours per day and 7 days a week.

So… you’re saying it’s completely reasonable for SimCraft/Raidbots to provide exactly that data and everybody to use it as gospel but for AMR it’s not? What do you think Raidbots does?

If people are so narrow-minded to only accept what some random guy (yes, even a top player writing guides is just “some random guy on the internet”) says, that’s the root of the problem which sadly has been more and more prevalent over the years. Personally I’d switch that sentiment around though: It just shows that the raid leader or Mythic+ group is stupid for not accepting completely valid alternatives.

That’s a basic problem on the internet though and even Blizzard has to deal with that regularly when designing new content - a great example for that is one of the most recent Dark Legacy Comics on covenants: Covenant

Swol · July 22, 2020, 4:58pm

I hear what you are saying. Over the last two expansions, and especially over BfA, WoW has become more and more meta-driven when it comes to gear preferences. There are a number of people who I think would like to see Best in Bag able to provide a quick optimization that follows the current meta for their spec. We are working on putting that into the Shadowlands version of the site. We are always listening to the feedback of our users and trying to adapt the site to what they want. I think some sort of pre-canned custom builds that follow the meta where it diverges from our own suggestions would be worth adding.

On the other hand, we are also theorycrafters doing our own original work. So, we are always going to have our own take on what gear is good. I am not convinced that the on-meta gear selections are necessarily the only gear that can be good in the current game. The community tends to find a build that works well and then everyone uses that build because it is a safe choice. That does not prove there are not other good builds that could be comparable. Proving a negative is almost impossible. If we convinced a large portion of top players to try a different build that simulates well, we’d see a lot of top parses using those builds too.

The meta also tends to center around a set of fairly specific assumptions. For example, destruction warlocks favoring flashpoint is highly dependent upon very short fights where the openers account for a huge portion of total damage output. Or, fights where there are adds constantly spawning for you to make use of it. Azshara’s Font of Power is another item that excels in short encounters that are opener-heavy. Our data does not assume players are all in a situation where they are crushing fights, looking for high parses. Raidbots sets the default fight length now to 2 minutes, which is relatively short. Also, the exact rotation used in simc is rarely exactly what players are actually doing, even very good players. It is very difficult for most people to parse all the logic from the rotations.

I think that our advice on gear is more relevant to a wider range of players than the meta tends to be. The meta is centered on much higher level play than most people in WoW can or prefer to achieve. Also, the people who work on simc tend to be the people largely influential in setting the meta, so the on-meta builds are highly optimized in simc compared to off-meta builds. The class discords heavily promote using raidbots (simc) which then uses default settings that further favor the meta. The whole ecosystem reinforces the creation and perpetuation of a social build structure. I don’t like that. That is a personal preference of mine - I know a lot of people want there to be a meta to follow.

On top of just not liking the meta, I also think that using a simulator in the guise of an “optimizer” is not a good way to use a simulator. Testing 1000 variations of a build by using raidbots to run simc does not provide statistically useful results. It might make us feel good to pick the set that simulated to the highest number, but it doesn’t actually prove anything in a lot of cases. Simulation models are imperfect models of the game. It is not possible for us to measure how much they diverge from the real game. The scripts being simulated are not the fights we are actually doing. Comparing 1000 sets of gear by simulating each set… will only tell us in very rough terms which one is best. If all those sets are within 3% of each other, or maybe even 5%… we learn essentially nothing. Yet, the community will always pick the one that simulates highest and tell everyone else to do the same. This would be called bad science in most other contexts!
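
The argument about comparing 1000 near-identical sets can be illustrated with a toy experiment. This is a hedged sketch, not how simc, raidbots, or AMR work internally: it assumes 1000 hypothetical gear sets whose true dps values are spread uniformly within 3% of each other, and it assumes every reported result carries a Gaussian error with a standard deviation of 1% (standing in for model error and fight variance combined). All names and numbers in the code are illustrative.

```python
import random

def pick_by_simulation(true_dps, noise_sd, rng):
    """Index of the set with the highest single noisy 'simulated' result."""
    observed = [d + rng.gauss(0, noise_sd) for d in true_dps]
    return max(range(len(observed)), key=observed.__getitem__)

def fraction_correct(n_sets=1000, spread=0.03, noise_sd_frac=0.01,
                     trials=200, seed=1):
    """How often 'pick the highest result' finds the truly best set."""
    rng = random.Random(seed)
    base = 90_000.0  # assumed baseline dps, purely illustrative
    hits = 0
    for _ in range(trials):
        # True values: all sets within `spread` of each other.
        true_dps = [base * (1 + rng.uniform(0, spread)) for _ in range(n_sets)]
        best = max(range(n_sets), key=true_dps.__getitem__)
        if pick_by_simulation(true_dps, base * noise_sd_frac, rng) == best:
            hits += 1
    return hits / trials

print(fraction_correct())                   # with the assumed 1% error
print(fraction_correct(noise_sd_frac=0.0))  # with no error at all
```

With no error, picking the highest number always finds the best set. Once the assumed error is on the same scale as the differences between sets, the "winner" is frequently not the truly best one, which is the sense in which we learn very little from the top result alone.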

We at AMR think that simulators should be used to generate large amounts of data (millions of data points) and then that data can be analyzed to find trends. The noise in the data is not significant and should be smoothed out, not sought out as advice. For this reason, you will often not see our default gear suggestions matching what you would get if you plug in a subset of gear to simc using raidbots. Chances are the gear that simulates highest will also rank highly using our optimizer, but the optimizer can only pick one set of gear - it may not settle on the “meta” choice. By design, our gear advice is ignorant of the meta.
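
As a rough illustration of "analyze lots of data for trends instead of chasing individual results", the sketch below fits a straight line through many noisy points. It is not AMR's actual method. The linear relationship, the stat range, the noise level, and the function names are all assumptions chosen only to show how a trend survives noise that would swamp any single comparison.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def noisy_samples(n, true_slope, true_base, noise_sd, seed=7):
    """Hypothetical (stat amount, simulated dps) pairs with Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 1000) for _ in range(n)]
    ys = [true_base + true_slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys

# Assumed ground truth: 10 dps per stat point, with noise of 1000 dps per sample.
xs, ys = noisy_samples(10_000, true_slope=10.0, true_base=50_000.0, noise_sd=1_000.0)
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Any single pair of points here can easily rank in the wrong order, yet the fitted slope lands close to the assumed value because the noise averages out across thousands of samples.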

And lastly… you will only look stupid to your raid leader or mythic+ group if you do low damage. If you use a slightly different build than what is popular and do good damage, no one will care - they might even think it’s cool. This is a video game based on developing a personalized character - we aren’t all meant to be the same! So, yes, we will be offering some on-meta choices to help people out… but we’re also going to try really hard to convince you that WoW does not actually need a meta and that maybe you would have more fun if you ignore it.

nelasha · July 23, 2020, 5:58am

Thanks to both @asashdor and @Swol for their thoughtful replies. Deep discussions like these make me like this community a lot!

I agree with most of both your points. Not arguing that AMR recommendations won’t result in good performing parses. Definitely not arguing that play, rng, and skill won’t have a much bigger role to play than the loadout when it comes to real world raid or M+ performance. Some clarifications to consider though:

The problem is that there is no option to simulate “raid fight with multiple targets” that will recommend Flashpoint along with decent corruption options I would actually use in raid. On these fights AMR will always suggest Twilight Devastation, which is usually not a good option for ranged classes (like my warlock). This works fine for M+ though.

Not trying to have it both ways, my argument is simply that because simc/raidbots are accepted as meta, they don’t have to do anything beyond crunch the numbers using simple “single target” simulation to be accepted by the community. AMR has the opportunity to surpass them in this regard by modeling simulations that match closer to reality (current raid tier, M+ on both tyrannical and fortified), which is what players want anyway. Yes, I know it sucks and is unfair to be held to a different and higher standard. I remember AMR had some realistic boss simulation way back (was it in WoD? don’t remember now) and it would be cool to see that come back.

The problem is a lot more complex than that. I understand your argument on what is “theoretically” correct but WoW is a social game and we have to understand the very real constraints most mythic raiders operate under. As a concrete example, when I rolled on the 475 Psyche Shredder, I was met with emotions like surprise or dismay because nobody in my raid could imagine I’d want to replace my socketed 430 Font with it. I could technically argue “AMR tells me this is an upgrade”, but that is hard to pull off when loot council have only a few minutes to decide before the next pull. This is what I mean by “don’t make me look stupid”.

I am so happy to hear that! Thank you. This is exactly the feature I was arguing for with my post. I think it will be very good for AMR as a community and business - by hopefully attracting more players to use it.

I think this is an awesome goal and wholeheartedly agree with the spirit of your argument. Once you get more and more players on your platform (by giving them a “close to meta” optimizer), there will be a lot of opportunity to woo them into non-meta builds that perform equally well and create some diversity.

In fact, one thing I just noticed today, and feel a bit silly about, is that the meta has actually shifted in talents. With all the expedient corruption going around it appears that Eradication / Internal Combustion outperforms my standard Flashover / Reverse Entropy setup by a rather large margin (especially on a fight like N’zoth). I only discovered this by looking at Warcraft Logs again. While I am dreaming, it would be really cool for AMR to make suggestions like this for me (it doesn’t appear to be possible to have AMR also optimize talents for you in the current site unless I am mistaken). I got a solid ~5% boost in dps just by switching out talents on our N’zoth rekills. Just the kind of thing a friendly robot could suggest!

Thanks again for the thoughtful responses, appreciate the discussion. Keep up the great work!

asashdor · July 23, 2020, 8:34am

Sure, but that’s most likely less of a problem caused by the available “encounter types” but rather because of the problems AMR has with stacking passive stat corruptions. AMR is not really equipped to deal with those as that’s something that was quite unreasonable before purchasable corruptions.

Personally I’ll just use the customization options to set the value of Twilight Devastation to 0 to “fix” that problem. Not ideal, but that seems to be the best way to do so. When doing that for your toon, AMR will recommend your currently equipped gear exactly (well, ignoring the “issues” already mentioned above with Psyche Shredder and Unbound Force).

I’ll definitely agree that using only two encounter types might not be ideal though, especially with the multi-target one more tailored towards a generic mythic+ experience with a couple differently sized trash groups and a single target boss fight (you can take a look at how that’s implemented here). While this will generally give reasonable results to use for a raid boss fight with some adds, it obviously can’t be perfect. That might just be a tradeoff necessary because of the stupidly high complexity gear has right now and the amount of data AMR needs for each gearing strategy though…

But just playing devil’s advocate for a bit: Even with the pretty big difference between raid bosses and mythic+ how many different strategies would be enough?

Just taking a rough look at Ny’alotha, there are quite a few different types of encounters:

Single boss with light movement (Shad’har) and probably some single adds (Maut, Skitra).
Single boss with high movement and some adds (Xanesh, Ra-den).
Single boss with light movement and phase with inactive boss but many adds (Wrathion).
Single boss with high movement, many adds, and “inactive” boss (Vexiona, Drest’agath, Il’gynoth).
Two bosses with loads of adds and high movement (Hivemind).
Complex fights with different phases (Carapace, N’zoth).

In addition to that, you’d probably also need to include different combat durations then to get a more realistic simulation because of phases and add spawns - e.g. having one less “flying phase” on Wrathion will cause quite a big difference in dps and fight style in itself compared to someone getting another “flying phase” at 10% or so.

I really hope Blizzard will come to their senses and massively reduce complexity with Shadowlands so we can get our own Gearing Strategies on AMR back, which had to be scrapped previously because everything got far too complex. Combining that with the ability to build completely custom encounters and share those, we’d have everything necessary to optimize for many different situations while not unnecessarily bloating the main website for “normal” users.
Currently it does look like Blizzard is on a pretty good track for that, but I’m still somewhat sceptical - especially with things like re-adding ability damage variances and so on…

Oh, of course. I totally get that, especially after having been an extremely strict guild / raid leader in the past and, for example, not even considering several specs (Shadow priest? Balance/feral druid? Retribution paladin? Dps warrior? Get out of my guild!) during early vanilla myself.

That also confirms my sentiment though: People don’t want to deal with alternatives or think for themselves but like to follow what someone else said. It’s obviously a much more complex situation than you could describe in a forum post alone, one that may or may not even have a “correct” answer and needs to be evaluated on a case-by-case basis as every situation and guild/raid is completely different. But I’d also say that someone on a loot council needs to keep an open mind even if they personally don’t agree with everything and to not dismiss something outright.

Maybe that situation could have gone better if you had said something like “Simulations have shown it to be at least equal to my Font for our combat duration and I think I could use that more effectively” instead of “AMR recommends that”. Obviously, another important detail is how many people also showed interest in that item and how much they’d gain from it, which is something the loot council needs to keep in mind. Best way might just be to deal with it and take items like that only if nobody else wants them, as infuriating as that might be for oneself…

Swol · July 23, 2020, 2:07pm

Creating gearing strategies for more boss types/scripts has always been something we’ve wanted to do. The problem is really creating the scripts themselves, which is non-trivial to put it mildly… and then running the data. We need to find a way to create gearing strategies with significantly fewer data points in order to crank out more strategies. In BfA this proved to just not be possible. As you have seen… we are already stretching the limits of the data we have. The amount of combos we need to deal with talent + corruption + azerite + essence… it’s just crazy.

So far in Shadowlands we’ll still have to deal with talents. There are going to be legendary items - right now I see up to 8 effects on the beta for a spec, but some are more utility based. The soulbind will also provide many conduits. So, it’s a little better… but we’re still talking about a lot of data. We will see what we can make happen.

We have generally not wanted to optimize talents for people because those are really a play style choice that we intentionally treat as an input to the optimizer. Mainly because they can be changed at will with no penalty. There is some info on the potential power of talents in the guide section of the site - we will still provide that and maybe beef it up a bit so people who don’t have a strong preference for talent can pick the ones that have the highest potential.

As a side note - our classic optimizer does not use simulation. I built a mathematical model of the game that estimates performance for each class. This allows us to have a lot of input options: fight length, consumables, raid buffs, etc. The result is a very configurable optimizer experience that is even a bit better than the retail version. Simulation is so slow. The game is just too complicated to make a good mathematical model without simulation now, though. The main thing that allows that approach to work for classic is that there is no haste on gear and only a couple of effects in the game that affect haste/attack speed (flurry fuck you). Haste is the one thing in WoW that makes simulation necessary. With no haste, you could go with a formula approach instead.
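
To make the "formula instead of simulation" idea concrete, here is a deliberately tiny closed-form estimate of auto-attack dps. It is not AMR's classic model, which is not shown in this thread. It assumes the commonly cited Classic rule that 14 attack power adds 1 damage per second of weapon speed, assumes crits deal double damage, and ignores misses, glancing blows, and armor. The function names are hypothetical.

```python
def white_dps(weapon_min, weapon_max, weapon_speed, attack_power, crit_chance):
    """Closed-form auto-attack dps under the simplifying assumptions above."""
    # Average hit: weapon damage plus the attack power contribution per swing.
    avg_hit = (weapon_min + weapon_max) / 2 + attack_power / 14 * weapon_speed
    # A crit is assumed to deal double damage, so expected damage scales by (1 + crit).
    return avg_hit * (1 + crit_chance) / weapon_speed

def total_damage(dps, fight_length_seconds):
    """With a fixed swing timer, total damage is just dps times fight length."""
    return dps * fight_length_seconds

print(white_dps(100, 200, 3.0, 1400, 0.25))
print(total_damage(white_dps(100, 200, 3.0, 1400, 0.25), 120))
```

The formula works because the swing timer is constant. Once haste can change the time between swings or casts, and procs can change haste in turn, the number of actions in a fight depends on the order of events, which is the reason given above for needing simulation.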

nelasha · July 24, 2020, 12:51am

Just throwing this out there as an idea: I wonder if we can build more of a community around this if you open source your tooling and the scripts that describe a class rotation and encounter type. If we could contribute to this on Github or something, I wouldn’t mind spending a weekend optimizing warlock’s rotation or coding in a boss encounter.

One of the reasons I think simc ends up being adopted by the class discords is that you can run the tool locally and the source code is available for anyone to play with. I know it won’t make sense to open source all of AMR since you still have to run the business – but I do wonder if you can open up the rotation/encounter parts and charge for the cloud service of crunching large numbers and running the massive BiB optimization calculations. (I mean raidbots is charging for basically “simc in the cloud” and doing pretty well!).

If I really dream, one could even think of a mechanism to scrape warcraftlogs for any given encounter and learn the script for a given fight (average duration, targets, rotation etc.) and then work backwards to a talent/gear optimization…

Swol · July 24, 2020, 1:33am

It already is open source! We designed our simulator from the start to be accessible by the whole community. The rotation editor, in particular, is a significant improvement compared to simc. We came from doing a lot of work in simc and with simc rotations - so we wanted to improve on that so that more people could work on rotations. The rotation editor has a lot of features that make rotation editing easier - the biggest being type-ahead with all available functions.

https://www.askmrrobot.com/wow/theory/rotation/amrwarlockdestructionmain?spec=WarlockDestruction&version=live

All you have to do is make a copy of that and edit it as much as you would like. You can run your simulations against your private version of the rotation. You can also share that rotation with other people. The shared rotation is a “live link” that will always provide them with the version that includes your updates.

We also made a rotation viewer that parses the simulator rotations and makes them even more legible/understandable for people not well-versed in simulator logic:
https://www.askmrrobot.com/wow/guide/class/warlock/destruction#generic (go to the rotation section)

The scripts are also editable, although admittedly the script editor never got fleshed out as much. We can help people if they want to make certain scripts, though. Look in the “Bosses” category of the theorycraft wiki to find the scripts.

We do have some users who contribute to the rotations and the simulator. If you look at that wiki - every single game mechanic is completely open source. A lot of people use our wiki to view how spells and the game works.

The only closed-source portion of our simulator is the “engine” that drives the actual simulation. All rotations, game mechanics, etc. are completely open source and, in our opinion, much more accessible to non-programmers.

nelasha · July 24, 2020, 6:51am

This is awesome, didn’t even know all this existed! Thanks for the pointers. Looking forward to tinkering with the scripts.

Swol · July 24, 2020, 4:10pm

Yeah, most people don’t know about it, unfortunately. The community around simc and the class discords actively discourages anyone from using our tools, even the free ones.