Wednesday 17 December 2014

Game Design #17: Playtesting Wargames - A "Fair Test"

For someone who spends so much time testing and reviewing rules, and chatting with rules designers, I do very little official playtesting.  Simple reason: good playtesting is too much like hard work.

It's also something almost no one does properly - in fact it's an almost impossible task.

According to the old science curriculum, there are a few elements that make a "fair test" of something.  Amongst them are some pretty tough hurdles for a game designer to overcome:

*All variables should be considered and controlled
*The experiment must deal with the question being studied
*Only one variable should be altered at a time
*The experimenters should not be biased
*Conclusions drawn must be better than those from chance
*Experiments must be able to be replicated

So to fairly or properly playtest something in a scientific manner, there are a few things to consider:

Playtesting is difficult when the subject involves chance, and the testers themselves are both a "variable" and likely biased.

*All variables should be considered and controlled

This is almost impossible in a wargame, where the testers are themselves directly involved rather than independent observers. They vary in playing skill, experience and comprehension of the rules - the testers themselves are an uncontrolled variable!

*The experiment (game) must deal with the question being studied
Well obviously, the playtest deals with the game.  But what is the "question" the game dev is asking of his testers?  Does he want to know if the game is fun? Realistic?  Too slow?  Too confusing? Is he after feedback on a particular mechanic? To get specific feedback, the designer needs to pose specific questions to the playtesters.

*Only one variable should be altered at a time
This article was inspired by the v3 Infinity rules. There were changes in the "to hit" modifiers of weapons at different range bands.  As shooting is very important in Infinity, the designers took a very careful approach.  They would change the ranges on a single weapon (keeping all the other weapon ranges and rules the same), play a bunch of games, then change and test another weapon. In a game with dozens of weapons, this means a LOT of playtest games.  In addition, they broke it down further - they experimented with the modifiers within each range band, for each weapon.  That is hundreds of playtest games - something Infinity, with its large, established and loyal fanbase, can do; but a bit harder for the average self-published game designer.  Most games go through various "alpha" and "beta" incarnations, but I doubt many companies would play so many playtest games simply to fine-tune a single weapon.

The problem with changing many variables at once is you do not know which change had the most impact on gameplay.  Even companies with the capability to test changes one at a time often miss this piece of common sense. Video game companies can easily "patch" a particular weapon by simply typing a few lines of code, yet when they "balance" things they often mess with several variables at once. For example: I played a PC game where armed dune buggies ran rampant.  They were so fast they could engage and flee at will, they could be repaired while moving, and they were tough. In addition, they wielded a powerful bazooka.  These buggies were death machines, able to take out MBTs and infantry, their self-repairing and agility making them nigh-immortal to all foes - even gunships and aircraft.  The company responded. But they changed multiple variables. They reduced the weapon power, made the buggy very fragile, reduced the ability to repair while moving, and lowered the speed.  The buggy then became useless.  And the company could not point to the exact reason, due to the mass of changes they made.  The company should have reduced ONE variable (such as speed), tested it, then reduced another variable (say weapon damage), and so on until the buggy reached a state of "useful but not overpowered."
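
To make the "one variable at a time" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the Buggy stats, the candidate nerfs, and especially strength_score, which is a made-up stand-in for "play a batch of test games and see how the unit performs." It is not how any real game was balanced; it just shows the bookkeeping.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Buggy:
    # Hypothetical stats, invented for this example.
    speed: int = 10
    armour: int = 8
    damage: int = 9
    repair_on_move: bool = True


def strength_score(unit):
    """Made-up stand-in for a batch of playtest games: higher = stronger."""
    score = unit.speed * 2 + unit.armour * 3 + unit.damage * 3
    if unit.repair_on_move:
        score += 15
    return score


def one_change_at_a_time(baseline, candidate_changes, measure):
    """Apply each change ALONE to the baseline and record its individual effect."""
    base_score = measure(baseline)
    effects = {}
    for stat_name, new_value in candidate_changes.items():
        variant = replace(baseline, **{stat_name: new_value})  # only one stat differs
        effects[stat_name] = measure(variant) - base_score
    return effects


# Hypothetical nerfs under consideration.
nerfs = {"speed": 7, "armour": 4, "damage": 6, "repair_on_move": False}
for stat_name, effect in one_change_at_a_time(Buggy(), nerfs, strength_score).items():
    print(f"{stat_name}: {effect:+d}")
```

Because each variant differs from the baseline in exactly one stat, any shift in the score can be pinned on that stat. Apply all four nerfs together and you only learn the combined effect, which is the trap the buggy patch fell into.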

*The experimenters should not be biased
Most playtesters are drawn from local clubs and friends, so they are rarely neutral. Even ones you only deal with via email may form an emotional attachment to, and investment in, you and your project; the loyalty that makes them a "good" playtester may also bias their opinions.

*Conclusions drawn must be better than those from chance
The very nature of our card-based and dice-based wargames means chance plays a large part in any game, and lucky rolling can change gameplay outcomes a lot.  Whilst we can usually recognize unusually lucky/unlucky dice rolling, the strong element of chance in wargames makes feedback less reliable - which brings even more emphasis to the following:

*Experiments must be able to be replicated
I'd reword this simply as "experiments must BE replicated."  Wargame playtesting has problems - the element of chance, and our playtesters, who are likely biased and are themselves a "variable," differing in ability and understanding.    This is where repeating the "test" leads to more reliable results.  If you flip a coin once, and it lands on heads, you might conclude the coin will land on heads every time, or the majority of the time.  However, flip the coin 100 times, and you will probably realize it is likely to be around 50/50.  Flip it 1000 times, and you will increase the accuracy of your experiment.  Obviously game designers are limited in the number of tests they can make - but the point is, the more tests (games), the more accurate the data.
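
The coin example is easy to try for yourself. Below is a rough Python sketch, assuming a fair coin and a seeded pseudo-random generator; the function names estimate_heads_rate and standard_error are just labels made up for this illustration.

```python
import math
import random


def estimate_heads_rate(flips, rng):
    """Flip a fair coin `flips` times and return the observed share of heads."""
    heads = sum(1 for _ in range(flips) if rng.random() < 0.5)
    return heads / flips


def standard_error(p, n):
    """Typical sampling error of an observed proportion p after n trials."""
    return math.sqrt(p * (1 - p) / n)


rng = random.Random(17)  # fixed seed so a run can be repeated
for flips in (1, 100, 1000):
    observed = estimate_heads_rate(flips, rng)
    typical_error = standard_error(0.5, flips)
    print(f"{flips:>5} flips: observed {observed:.3f}, typical error +/- {typical_error:.3f}")
```

The typical error shrinks with the square root of the number of trials: about 0.05 at 100 flips and about 0.016 at 1000. So ten times as many test games buys roughly three times the precision, not ten times, which is part of why thorough playtesting is so much work.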

TL;DR
As you can see, it is well-nigh impossible to test a wargame "scientifically."  Many elements of a proper test are outside the game designer's control.  However, we can learn from the principles of a "fair test" - I'd like to draw attention to those aspects which ARE controllable:

*Have a focus for feedback. The game dev should seek specific feedback, perhaps by posing focus questions that playtesters can respond to.
*Only change one thing at a time. Change one or two small things, then test.  Overhauling masses of features and mechanics can make it unclear what is causing changes/aspects of gameplay.
*Test repeatedly.  The more test games, the better the results.

Sadly, these often involve more effort than people are willing (or even able) to spend; it's little wonder we have so many wargames which are confusing, frustrating or exploitable.  In fact it's remarkable we have so many good games, when you consider the difficulties.  Hats off to the valiant game devs and playtesters!

24 comments:

  1. Something many play-testers miss is testing parameters. Sometimes it's useful to go through a game playing the expected values, and at other times the best and worst-case scenarios.

  2. I'm going to go ahead and call it. 80% of the games we play had terrible playtesting :-)
    I've run into plenty of things that don't even work on paper (not in the hysteria-driven "THIS IS BROKENZ" way but in a factual "this rule does not actually function" way), and if an inquiry is made and answered, it turns out no one in the designer's group ever uses that rule anyway.

    It's the usual problem of trying to do any group project (and playtesting is essentially a big group project): 100 people want to help. Of those, only 10 of them will actually read the game, let alone provide meaningful feedback.
    You can test in your own group but that loses its value very quickly because you're the wrong person to do the testing. One game by total strangers is worth 10 played on your own.

    Replies
    1. "You can test in your own group but that loses its value very quickly because you're the wrong person to do the testing. One game by total strangers is worth 10 played on your own."

      ^ This is a major issue. Your local club/mates not only know the rule, but also the INTENT behind it. This is "background" information a stranger lacks. And because it's something the core testing group intuitively knows, they don't realize it is "missing." Our brain automatically fills in the gaps.

      Two random examples from my job: When editing/marking schoolwork, the writer usually misses mistakes others can find; this is because the work is correct/properly explained/all there "in his head." This is quite common in younger students (but also extends to teens+): the work is often orally sound but incoherent on paper.

      Also, we can easily "assume" a shared knowledge. Place value (decimals, 10s, 100s, use of zero) is often poorly taught in schools compared to operations (x, divide, fractions), as to the teacher this knowledge is so "obvious" it is assumed - whereas to the student it is not obvious at all and needs to be explicitly explained.

  3. I have done a range of play-testing, which is indeed different to experimentation and authoring. Much of the time, the requirement for play-testers isn't really articulated. Most of the time it's 'play this, let me know what you think'. Sometimes there are a few pointed questions about particular rules or components of game play. Testing rules to the maximum left and right limits is difficult to do and needs to be specifically requested, as it's not something the average gamer can do well.

    I totally agree, external advice and playing is vital; one gets clouded with one's own works. Quickly. At least, I know I do!

    Replies
    1. "Most of the time it's 'play this, let me know what you think'. "

      ^ This is almost verbatim the comment that always comes attached to any rules I am sent. "Think about what?" For example I'm not a huge fan of Bolt Action's recycled 40K mechanics, but I thought using them was a good commercial move...

      Often I love a particular mechanic, but sometimes games can be "overcrowded" with new ideas for their own sake (a bit like Ivan said, where a game is usually evolutionary not revolutionary - i.e. it has only one or two unique mechanics or concepts compared to the standard - clever but not "too clever.")

  4. I wonder too if this isn't one of those instances where what people say isn't what they want.

    It's like coffee. Everyone says they like dark roasts with a lot of flavour because that's what you're supposed to like, but they buy the weak bland stuff in bulk.

    We all say we want more Crossfire but the game we build armies for is Bolt Action ;)

    Replies
    1. I'm currently struggling with how to word a review - I really like all the interesting concepts but I think there's "too much going on." Your coffee example kinda hits it.

    2. Irony bonus if it's something I wrote you're reviewing :-)

      I don't know if you're much of an RPGer but the indie RPG scene is great at coming up with crazy game mechanics, not so great with pulling it back to earth again and picking the one or two really great ideas.

    3. I rather cynically presumed anything you had done would be a 5Core spin-off - then I realised you did FAD, correct? Thumbs up (it's one of my top 2 recommended free rulesets for sci fi). Let me know when you do the new Necromunda/Mordheim, please - I'd like to be one of the cool kids before it outsells Bolt Action and FoW....

      My interest in RPGs is restricted to RPG-wargame hybrids like Savage Worlds, 2HW or indeed your own 5 Parsecs type games. If you come across any wargame-style combat focused RPGs that can handle troops quickly, I'd be interested, but I'm not an RPGer per se.

    4. FAD indeed. Still proud of that one.
      I recently did a different, original system for cold war and Scifi (well, two games but the two are close enough I felt good using the same engine) as well.
      In fact, I just did the scifi game (*hint hint*). A bit of a cross between Stargrunt and Tomorrow's War, in terms of feel, but without being quite as clunky in spots.

      I'll likely tackle warband fantasy in the new year but as a one man band, I won't be able to scratch the big boys :)

  5. You should be proud - it was well done and quite polished for a freebie. The sci fi game sounds interesting - Tomorrow's War is my most-played 15mm platoon game, but I always reckon it is much more complex than it first appears. The mechanics seem elegant, but there are so many exceptions to the rules (i.e. how vehicles and infantry interact) and so many reactions going on it bogs down a bit.

    I'd contradict the "can't scratch the big boys" - I find SoBH universally recommended for fantasy skirmish, and it is somewhat an acquired taste (i.e. ex-GW players will not necessarily instantly warm to it as it's a bit too left-field - it is more for the "indie" crowd.) No, I'm talking about a similar-to-GW engine-but-with-a-few-cool-twists ruleset that competitive players might like as well. There's definitely a market.

    There's also still more room for true FANTASY sci fi skirmish rules (aka Necromunda/Rogue Trader) as 90% of recent systems seem to be on the current "hard sci fi" bandwagon (how the wheel turns!) which is basically moderns with a few cool toys.

  6. FAD was basically my reaction to reading Stargrunt and realizing that "holy smokes, games can be not warhammer" :-)

    Pop me an email at runequester@gmail.com and I'll send you a comp copy if you wouldn't mind considering a review down the road. I'll warn ahead of time that it's stat-free but it does have points values ;)

    As far as big boys, I think it's more a question of production values. Though the 2HW guys started out with incomprehensible PDFs so who knows :-)

    I'd argue more that we moved from space-fantasy to .. I don't know what to call it. "Anime-fantasy" ?
    The big scifi titles tend to follow a "dark, gritty look" with quite heroic game play. Basically trying to be Warzone.

    Though "Vietnam in space" will probably always be there, since it's what military scifi (and not a few political fantasies) is based heavily on, and for a medium that is quite far-sighted, scifi gamers are a pretty conservative bunch when it comes to their games.

  7. FAD was basically my reaction to reading Stargrunt and realizing that "holy smokes, games can be not warhammer" :-)

    ^ It's amazing how many people have never moved past this. When people started playing Warmachine locally they thought they were all edgy and cool - but this is another mass market game. It's like saying "I don't like Star Wars anymore - have you heard of this cool new movie called LoTR?"

    Replies
    1. You could adapt that old Unix quote.
      "Those who do not understand Warhammer are doomed to reinvent it, poorly" :)

  8. Talking about SOBH, something funny happened to me. I was among the first ones to proofread and test the rules and I liked them very much when I read them; there are even a few rules in the book which come from me... but when I started to play it I realised I didn't like SOBH at all. I tried it several times, leaving time between the games, but it never worked for me.
    Actually I am amazed that whenever somebody asks for a nice skirmish fantasy ruleset almost everybody recommends SBH. I'd love to see something like that coming from Mr. Sorensen.

    Replies
    1. Most people enjoy making warbands but it seems more a "casual" game to play once in a blue moon; it doesn't seem to have the x factor needed to become a gaming "staple"

    2. I have some sketched up notes for a "large warband" fantasy game based around leaders activating and pulling troops with them. Maybe 30'ish figures. The intention was to provide a fairly loose fitting campaign structure around it, rather than the highly regimented Mordheim option.

      I think giant robots is up first though :)

    3. On the barge to Fraser I was sketching what a "new" Mordheim-esque game might look like. Apart from stat lines, it included using 20ish minis and had leaders giving extra activations and pulling troops with them (TFL-style meets LoTR).

    4. okay, that's kind of uncanny.

    5. I'll probably do a series of posts on "making a new Mordheim?", but I emailed you the notes I made while on hols.

  9. Well I have been a systems analyst and am used to testing software. First I test for the maximum and minimum, then test some of the bits in between.

    So maximum would be, is any unit invincible? If it was it would be out.

    Minimum, is any unit totally useless? If so it would have to have a points value approaching zero.

  10. Oh and we played the Tiger scenario from the film Fury (a Tiger I in ambush against 4 Shermans, with the Tiger getting the first shot in). Played it 20 times (does not take long). Without a hero in the Sherman 76, all the Shermans die, always. With a level 2 hero in the Sherman 76 the Tiger always loses, one or sometimes two Shermans survive. Conclusion, Brad Pitt is a hero.

  11. Playtesting certainly won't give you scientific reliability.
    (I'll skip the possibility of having AIs run bazillions of console games - where you may get a lot closer)

    What it does (at least for me) is provide that "before it leaves the garage" inspection.
    * Do the mechanisms play smoothly and complement each other.
    * Can the game be completed within a target timeframe.
    * Do various match-ups usually turn out as expected.
    * Is there a "Pareto" mechanism (Burns 80% of the game time for only 20% of the outcome)? If so can it be discarded or streamlined.
    * Finally provide a few rules of thumb about army size, table size and the dreaded "Balance".

    Replies
    1. Absolutely agree. I don't even think true balance is possible; e.g. in Chess I might be really good with Knights, making them worth more (to me) than their 3-point value suggests. Think I did a post on it somewhere, even.

      That said, it's surprising how haphazard playtesting is, even in large Hobby(tm) companies who at one stage were promoting a competitive scene.
