A neat idea for testing...

Over on MTGS, I ran across this in a discussion of the Blood Baron:
"In the end, the numbers heavily favoured the Baron during the period where we had the Baron // Obzedat split card (and as a result, the Obzedat mode was never used)."

In other words, the cuber tests new cards by printing out proxies of the new card alongside the one it might replace--i.e., as a split card--and tracks how often each side is used. And that's kind of brilliant, I have to say.

How do others test potential new cards? How long do you keep in underperformers?
 

Jason Waddell

Administrator
Staff member
I do think it's a nice idea, but one to be used very sparingly. I think MTGS is really plagued by this notion that picking the stronger card from a player perspective is always the correct design decision, when in fact design is very much about creating good and enjoyable decision dynamics with the proper risk/reward tradeoffs.

One of the great things about retail drafts is that you get to play with all these cute interaction cards, and you have to because that's all there is. Blood Baron is very much a blunt force tool, and not one that I am super excited to use. Their casting costs are also radically different, and that comes down to taste and knowing your dynamic. Do you want a spell in that slot that is harder or easier to cast? What are you trying to achieve?

Toss in the fact that both protection and lifelink are not mechanics that most cubes really need more of (especially considering the balance problems that the community at large complains about), and I think that you really run the risk of missing the forest for the trees with this kind of approach.

It does give information, but if you think about this from the perspective of designing, say, a brand new board game, you would end up with something pretty terrible if you just took this "what would the player want most" philosophy and applied it throughout. The reason it "kind of" (very big quotes there) works in cube is that a) Wizards has a tendency to appropriately cost their cards and b) people use singleton to protect themselves from having to make design decisions like "how many Black Lotus do I want".

TL;DR: Jason grumbles (again) about cubers who mis-equate design with card power level evaluation.
 

Jason Waddell

Administrator
Staff member
That said, understanding power level is useful and design can go poorly if you're far off the mark (Skullclamp and Jitte say hello). I just think it's like Step 0 in a long design process. Sure, I could say Blood Baron is stronger than Obzedat, but that's just a sliver of the information I should be considering.

I also think that one of the big problems in cube design is that there are so few synergies that you get a real linear scale of card power. Rather than cards whose value is highly context sensitive, there's a large density of just "raw card power" dudes like Blood Baron. Without context concerns, a player will always pick Card A over Card B if A has the higher raw card power. When there are more synergies and things involved, you can start to think of the card as filling a role or a slot, which puts you much more in the mindset of evaluating things holistically (yes Usman, your check is in the mail) as opposed to just on raw power.
 

Eric Chan

Hyalopterous Lemure
Staff member
First things first. Blood Baron is an absolutely awful card for cube. Protection is just about the most swingy, high-variance, non-interactive cube mechanic ever printed. So, you know what's better than single protection? Double protection! Gee golly whiz, that should make for some grand ol' times. Lucksacks, ahoy!

Sorry. Just had to go off on my daily rant there. Ahem.

I think the idea is ok, but as Jason wisely explains, this technique heavily favours power maximization. If this were 2010, and I wasn't sure about the merits of Grave Titan, I might use this method to slap him on a card alongside, say, Kokusho. And then in the heat of battle, I'd probably cast the Grave Titan half 90% of the time. But does the inclusion of Grave Titan in my cube make it a better design? Does it improve the overall drafting and gameplay experience for everyone? Those metrics are much tougher to measure.

I don't really have a dedicated test method at the moment, unless "throw crap at the wall and see what sticks" counts. It's not very scientific, I know. Player feedback is one of the best tools I have for knowing what to remove, though - when it comes to underperformers (Sylvok Lifestaff was in here for a while) as well as overperformers (hi there, Jitte!). A lot of times, getting useful feedback is like pulling teeth. But in the absence of better analytical data, I figure it's worth the effort to poll my group continually to get a handle on what's working and what isn't.
 

Jason Waddell

Administrator
Staff member
I will say, I hope we didn't come across as overly negative. This is something that I think is a really innovative idea, and I look forward to using it for cards that I really am unsure about. Stuff like Blood Scrivener comes to mind. One of the best things it does is that it forces you to evaluate the card in a gameplay setting even if you don't choose to play that side. You learn something more than you would from sitting at home and theorycrafting about the card.

Most everything I've read about design, even from really mathy people (e.g. David Sirlin, Reiner Knizia), says that data is great and all, but what really gets results is having a keenly tuned design instinct.
 

Eric Chan

Hyalopterous Lemure
Staff member
Yeah, I suppose this is just another tool in your shed, that you have to make sure you use for the right purpose. Like that chainsaw in there that you use all the time for cutting the hedges, but really shouldn't bring down upon the neighbour's annoying chihuahua.

Hmmm. What was I talking about..? Man, that went somewhere dark.

I remember using a variant of this when I was testing a Standard control deck some years ago, and Everflowing Chalice was doing a whole pile of nothing. Craig Wescoe wrote an article about how he'd allowed himself to change his own Chalices to Preordains during testing, to see which was better suited for the deck. I tried the exact same exercise, and lo and behold, Preordain became just about my favourite thing ever.

Perhaps the Blood Baron vs Obzedat example wasn't the greatest illustration of the technique at hand, but put to better use, it could come in handy.
 
Yes, there's no need to use this technique purely to decide which card is generically stronger. I'd argue it works best with odd, archetype-dependent cards. It can be quite difficult to remember, in the middle of a match, that your Shuko used to be a Sylvok Lifestaff, say, and decide which would be more useful.
 

VibeBox

Contributor
good read. i like that line about distinguishing "good for" and "good in" your cube. i'll probably start using that now when i explain my banned list to people.

(pssst....there's a "he's" where there should be a "his" just above the wall of reverence pic)
 

FlowerSunRain

Contributor
This is very true. The "raw power" standard seems markedly absurd: a card could always have bigger numbers, more keywords, lower casting cost or fewer restrictions. Striving for a higher power level is an endless and empty goal, because you can always go "one higher".

The goal shouldn't be to find which card is the most powerful (though such information can of course be useful), but to find which card is appropriately powerful.

Game designers totally understand that they COULD make lightning bolt do 20 damage, make Ryu's fireball as tall as the screen, make barracks take 10 seconds to build or make militia discard each opponent's entire hand. The important thing, though, is that they understand why they shouldn't.
 

VibeBox

Contributor
the rosewater/forsythe regime lost sight of that a long time ago though. and now cube designers are going along with the new ridiculous power levels without stopping to question it. so frustrating to watch
 

FlowerSunRain

Contributor
I don't know, I'm still happily rocking over 200 old border cards in my cube and there are a ton more I don't run that are certainly powerful cards (moxes, ABUR duals, ancestral, balance, demonic tutor, etc.). I think they are doing a good job.
 

Jason Waddell

Administrator
Staff member
I'm not really in a position to weigh in on the Rosewater/Forsythe thing, but I do agree with you that the power level of cards Wizards prints should not have a negative impact on the quality of cube design. It's the not stopping to question things that's a problem. Wizards could print Blacker Lotus in black border and it wouldn't really affect me.
 