Delta Vector: Game Design #17: Playtesting Wargames

Wednesday, 17 December 2014

Game Design #17: Playtesting Wargames - A "Fair Test"

For someone who spends so much time testing and reviewing rules, and chatting with rules designers, I do very little official playtesting. Simple reason: good playtesting is too much like hard work.

It's also something almost no one does properly - in fact it's almost an impossible task.

According to the old science curriculum, there are a few elements that make a "fair test" of something. Amongst them are some pretty tough hurdles for a game designer to overcome:

*All variables should be considered and controlled
*The experiment must deal with the question being studied
*Only one variable should be altered at a time
*The experimenters should not be biased
*Conclusions drawn must be better than those from chance
*Experiments must be able to be replicated

So to fairly or properly playtest something in a scientific manner, there are a few things to consider:

Playtesting is difficult when the subject involves chance, and the testers themselves are both a "variable" and likely biased.

*All variables should be considered and controlled

This is almost impossible in a wargame where the testers themselves are directly involved rather than independent observers. They vary in playing skill, experience and their comprehension of the rules - the testers themselves are an uncontrolled variable!

*The experiment (game) must deal with the question being studied
Well obviously, the playtest deals with the game. But what is the "question" the game dev is asking of his testers? Is he wanting to know if the game is fun? Realistic? Too slow? Too confusing? Does he want feedback on a particular mechanic? To get specific feedback, the designer needs to pose specific questions for the playtesters.

*Only one variable should be altered at a time
This article was inspired by the v3 Infinity rules. There were changes in the "to hit" modifiers of weapons at different range bands. As shooting is very important in Infinity, the designers took a very careful approach. They would change the ranges on a single weapon (keeping all the other weapon ranges and rules the same), play a bunch of games, then change and test another weapon. In a game with dozens of weapons, this means a LOT of playtest games. In addition, they broke it down further - they experimented with the modifiers within each range band, for each weapon. That is hundreds of playtest games - something Infinity, with its large, established and loyal fanbase, can do; but a bit harder for the average self-published game designer. Most games go through various "alpha" and "beta" incarnations but I doubt many companies would play so many playtest games simply to fine-tune a single weapon.

The problem with changing many variables is you do not know which change had the most impact on gameplay. Even companies with the capability to test properly often miss this piece of common sense. Video game companies can easily "patch" a particular weapon by simply typing a few lines of code, yet when they "balance" things they often mess with several variables at once. For example: I played a PC game where armed dune buggies ran rampant. They were so fast they could engage and flee at will, they could be repaired while moving, and they were tough. In addition, they wielded a powerful bazooka. These buggies were death machines, able to take out MBTs and infantry, their self-repairing and agility making them nigh-immortal to all foes - even gunships and aircraft. The company responded. But they changed multiple variables. They reduced the weapon power, made the buggy very fragile, reduced the ability to repair while moving, and lowered the speed. The buggy then became useless. And the company could not point to the exact reason, due to the mass of changes they made. The company should have reduced ONE variable (such as speed), tested it, then reduced another variable (say weapon damage) until the buggy reached a state of "useful but not overpowered."
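To make the buggy example concrete, here is a rough sketch of testing one change at a time. Everything in it is made up for illustration: the stat names, the numbers, and especially the power_score formula, which is only a stand-in for "play a batch of test games and record how the unit performed."

```python
# Hypothetical buggy stats and proposed nerfs, invented for illustration only.
BASELINE = {"speed": 10, "armour": 8, "damage": 9, "repair": 6}
NERFS = {"speed": 7, "armour": 4, "damage": 5, "repair": 2}

def power_score(stats):
    """Stand-in for a batch of playtest games; the formula is an assumption."""
    return stats["speed"] * stats["damage"] + stats["armour"] * stats["repair"]

def one_at_a_time(baseline, candidates):
    """Apply each candidate change on its own and record its effect on the score."""
    base = power_score(baseline)
    effects = {}
    for name, new_value in candidates.items():
        trial = dict(baseline)      # copy, so every trial starts from the baseline
        trial[name] = new_value     # alter exactly one variable
        effects[name] = power_score(trial) - base
    return effects

effects = one_at_a_time(BASELINE, NERFS)
for name, delta in sorted(effects.items(), key=lambda item: item[1]):
    print(f"{name} alone: {delta:+d}")

# Changing everything together gives one number and no way to apportion it.
all_at_once = power_score(NERFS) - power_score(BASELINE)
print(f"all four together: {all_at_once:+d}")
```

With the one-at-a-time results you can see which single change bites hardest and stop as soon as the unit is "useful but not overpowered." The combined figure only tells you the buggy got worse, not why.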

*The experimenters should not be biased
Most playtesters are drawn from local clubs and friends. Even ones you only deal with via email may form an emotional attachment to, and investment in, you and your project; the loyalty that makes them a "good" playtester may also bias their opinions.

*Conclusions drawn must be better than those from chance
The very nature of our card-based and dice-based wargames means chance plays a large part in any game, and lucky rolling can change gameplay outcomes a lot. Whilst we can usually recognize unusually lucky/unlucky dice rolling, the strong element of chance in wargames makes feedback less reliable - which brings even more emphasis to the following:

*Experiments must be able to be replicated
I'd reword this simply as "experiments must BE replicated." Wargame playtesting has problems: the element of chance, and playtesters who are likely biased and are themselves a "variable", differing in ability and understanding. This is where repeating the "test" leads to more reliable results. If you flip a coin once, and it lands on heads, you might conclude the coin will land on heads every time, or the majority of the time. However, flip the coin 100 times, and you will probably realize it is likely to be around 50/50. Flip it 1000 times, and you will increase the accuracy of your experiment. Obviously game designers are limited in the number of tests they can make - but the point is, the more tests (games), the more accurate the data.
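The coin example is easy to check with a short simulation. This is only a sketch: the function names and the seed are my own choices, and a fair coin stands in for any chance-driven game outcome. The standard_error function uses the usual formula for the spread of an observed proportion, sqrt(p(1-p)/n).

```python
import math
import random

def estimate_heads_rate(flips, rng):
    """Flip a fair coin `flips` times and return the observed share of heads."""
    heads = sum(1 for _ in range(flips) if rng.random() < 0.5)
    return heads / flips

def standard_error(flips, p=0.5):
    """Typical gap between the observed and true rate after `flips` trials."""
    return math.sqrt(p * (1 - p) / flips)

rng = random.Random(17)  # fixed seed (arbitrary) so the run is repeatable
for flips in (1, 100, 1000, 10000):
    observed = estimate_heads_rate(flips, rng)
    print(f"{flips:>6} flips: observed {observed:.3f}, "
          f"typical error +/- {standard_error(flips):.3f}")
```

Note that the typical error shrinks with the square root of the number of trials, so playing four times as many test games only halves the noise. More games always help, but each extra game helps a little less than the last.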

TL;DR
As you can see, it is well-nigh impossible to test a wargame "scientifically." Many elements of a proper test are outside the game designer's control. However, we can learn from the principles of a "fair test" - I'd like to draw attention to those aspects which ARE controllable:

*Have a focus for feedback. The game dev should seek specific feedback, perhaps by posing focus questions that playtesters can respond to.
*Only change one thing at a time. Change one or two small things, then test. Overhauling masses of features and mechanics can make it unclear what is causing changes/aspects of gameplay.
*Test repeatedly. The more test games, the better the results.

Sadly, these often involve more effort than people are willing (or even able) to spend; it's little wonder we have so many wargames which are confusing, frustrating or exploitable. In fact it's remarkable we have so many good games, when you consider the difficulties. Hats off to the valiant game devs and playtesters!

24 comments:

Nurglitch 17 December 2014 at 07:16
Something many play-testers miss is testing parameters. Sometimes it's useful to go through a game playing the expected values, and at other times the best and worst-case scenarios.
Weasel 17 December 2014 at 08:25
I'm going to go ahead and call it. 80% of the games we play had terrible playtesting :-)
I've run into plenty of things that don't even work on paper (not in the hysteria-driven "THIS IS BROKENZ" way but in a factual "this rule does not actually function" way) and if an inquiry is made and answered, it turns out no one in the designer's group ever uses that rule anyway.

It's the usual problem of trying to do any group project (and playtesting is essentially a big group project): 100 people want to help. Of those, only 10 of them will actually read the game, let alone provide meaningful feedback.
You can test in your own group but that loses its value very quickly because you're the wrong person to do the testing. One game by total strangers is worth 10 played on your own.
Paul O'G 17 December 2014 at 11:55
I have done a range of play-testing, which is indeed different to experimentation and authoring. Much of the time, the requirement for play-testers isn't really articulated. Most of the time it's 'play this, let me know what you think'. Sometimes there are a few pointed questions about particular rules or components of game play. Testing rules to the maximum left and right limits is difficult to do and needs to be specifically requested as it's not something the average gamer can do well.

I totally agree, external advice and playing is vital; one gets clouded with one's own works. Quickly. At least, I know I do!
Weasel 17 December 2014 at 12:58
I wonder too if this isn't one of those instances where what people say isn't what they want.

It's like coffee. Everyone says they like dark roasts with a lot of flavour because that's what you're supposed to like, but they buy the weak bland stuff in bulk.

We all say we want more Crossfire but the game we build armies for is Bolt Action ;)
evilleMonkeigh 17 December 2014 at 15:45
You should be proud - it was well done and quite polished for a freebie. The sci fi game sounds interesting - Tomorrow's War is my most-played 15mm platoon game, but I always reckon it is much more complex than it first appears. The mechanics seem elegant, but there are so many exceptions to the rules (i.e. how vehicles and infantry interact) and so many reactions going on it bogs down a bit.

I'd contradict the "can't scratch the big boys" - I find SoBH universally recommended for fantasy skirmish, and it is somewhat an acquired taste (i.e. ex-GW players will not necessarily instantly warm to it as it's a bit too left-field - it is more for the "indie" crowd). No, I'm talking about a similar-to-GW engine-but-with-a-few-cool-twists ruleset that competitive players might like as well. There's definitely a market.

There's also still more room for true FANTASY sci fi skirmish rules (aka Necromunda/Rogue Trader) as 90% of recent systems seem to be on the current "hard sci fi" bandwagon (how the wheel turns!) which is basically moderns with a few cool toys.
Weasel 17 December 2014 at 16:00
FAD was basically my reaction to reading Stargrunt and realizing that "holy smokes, games can be not warhammer" :-)

Pop me an email at runequester@gmail.com and I'll send you a comp copy if you wouldn't mind considering a review down the road. I'll warn ahead of time that it's stat-free but it does have points values ;)

As far as big boys, I think it's more a question of production values. Though the 2HW guys started out with incomprehensible PDF's so who knows :-)

I'd argue more that we moved from space-fantasy to .. I don't know what to call it. "Anime-fantasy" ?
The big scifi titles tend to follow a "dark, gritty look" with quite heroic game play. Basically trying to be Warzone.

Though "Vietnam in space" will probably always be there, since it's what military scifi (and not a few political fantasies) are based heavily on and for a medium that is quite far-sighted, scifi gamers are a pretty conservative bunch when it comes to their games.
evilleMonkeigh 17 December 2014 at 16:22
FAD was basically my reaction to reading Stargrunt and realizing that "holy smokes, games can be not warhammer" :-)

^It's amazing how many people have never moved past this. When people started playing Warmachine locally they thought they were all edgy and cool - this is another mass market game. It's like saying "I don't like Star Wars anymore - have you heard of this cool new movie called LoTR?"
blacksmith 18 December 2014 at 05:05
Talking about SOBH, something funny happened to me. I was among the first ones to proofread and test the rules and I liked them very much when I read them; there are even a few rules in the book which come from me... but when I started to play it I realised I didn't like SOBH at all. I tried it several times, leaving time between the games, but it never worked for me.
Actually I am amazed that whenever somebody asks for a nice skirmish fantasy ruleset almost everybody recommends SBH. I'd love to see something like that coming from mr. Soresen.
Veni Vidi Vici 26 August 2015 at 05:15
Well I have been a systems analyst and am used to testing software. First I test for the maximum and minimum, then test some of the bits in between.

So maximum would be, is any unit invincible? If it was it would be out.

Minimum would be, is any unit totally useless? If so it would have to have a points value approaching zero.
Veni Vidi Vici 26 August 2015 at 05:20
Oh and we played the Tiger scenario from the film Fury (a Tiger I in ambush against 4 Shermans, with the Tiger getting the first shot in). Played it 20 times (does not take long). Without a hero in the Sherman 76, all the Shermans die, always. With a level 2 hero in the Sherman 76 the Tiger always loses, one or sometimes two Shermans survive. Conclusion, Brad Pitt is a hero.
SteveHolmes1112 December 2016 at 16:02
Playtesting certainly won't give you scientific reliability.
(I'll skip the possibility of having AIs run bazillions of console games - where you may get a lot closer)

What it does (at least for me) is provide that "before it leaves the garage" inspection.
* Do the mechanisms play smoothly and complement each other.
* Can the game be completed within a target timeframe.
* Do various match-ups usually turn out as expected.
* Is there a "Pareto" mechanism (Burns 80% of the game time for only 20% of the outcome)? If so can it be discarded or streamlined.
* Finally provide a few rules of thumb about army size, table size and the dreaded "Balance".