Jtdylktuy

Question

CHALLENGE STATUS: OPEN

Comment or open a PR if I'm missing your bot.

Someone write handshake before I do.

Prisoner's dilemma ... with three choices. Crazy, huh?

Here's our payoff matrix. Player A on the left, B on the top

A,B| C | N | D

---|---|---|---

 C |3,3|4,1|0,5

 N |1,4|2,2|3,2

 D |5,0|2,3|1,1

The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.

Here's some (competing) example bots.

# turns out if you don't actually have to implement __init__(). TIL!



class AllC:

    def round(self, _): return "C"

class AllN:

    def round(self, _): return "N"

class AllD:

    def round(self, _): return "D"

class RandomBot:

    def round(self, _): return random.choice(["C", "N", "D"])



# Actually using an identically-behaving "FastGrudger".

class Grudger:

    def __init__(self):

        self.history = 

    def round(self, last):

        if(last):

            self.history.append(last)

            if(self.history.count("D") > 0):

                return "D"

        return "C"



class TitForTat:

    def round(self, last):

        if(last == "D"):

            return "D"

        return "C"

Your bot is a Python3 class. A new instance is created for every game, and round() is called each round, with your opponent's choice from last round (or None, if it's the first round)

There's a 50 rep bounty for the winner in like a month.

Specifics

Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.

Standard loopholes disallowed.

No messing with anything outside your class or other underhanded shenanigains.

Yes, you can submit multiple bots.

Yes, you can implement handshake.

Any response other than C, N, or D will be silently taken as N.

Each bot's points from every game they play will be totaled up and compared.

Controller

Check!

Other Languages

I'll throw together an API if anyone needs it.

Scores: 2018-11-13 08:45 EST

19 bots, 361 games.



PatternFinder   - 6293

EvaluaterBot    - 5812

Ensemble        - 5517

Nash2           - 5448

LastOptimalBot  - 5340

FastGrudger     - 5336

HistoricAverage - 5165

Number6         - 5113

OldTitForTat    - 4809

TitForTat       - 4677

Jade            - 4510

Nash            - 4359

RandomBot       - 4246

AllD            - 4214

CopyCat         - 4048

TatForTit       - 3730

AllC            - 3671

NeverCOOP       - 3064

AllN            - 3030

How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points? — 21 hours ago
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses. — 21 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers — 15 hours ago
If you wrap most of main.py in while len(botlist) > 1: with botlist.remove(lowest_scoring_bot) at the bottom of the loop, you get an elimination tournament with interesting results. — 7 hours ago

score 7 · Answer 1 · 2018-11-12 19:31:51Z

EvaluaterBot

class EvaluaterBot:

    def __init__(self):

        self.c2i = {"C":0, "N":1, "D":2}

        self.i2c = {0:"C", 1:"N", 2:"D"}

        self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

        self.last = [None, None]



    def round(self, last):

        if self.last[0] == None:

            ret = 2

        else:

            # Input the latest enemy action (the reaction to my action 2 rounds ago)

            # into the history

            self.history[self.last[0]][self.c2i[last]] += 1

            # The enemy will react to the last action I did

            prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])

            ret = (prediction - 1) % 3

        self.last = [self.last[1], ret]

        return self.i2c[ret]

Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.

Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago — 31 mins ago

score 7 · Answer 2 · 2018-11-13 07:20:00Z

TatForTit

class TatForTit:

    def round(self, last):

        if(last == "C"):

            return "N"

        return "D"

This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.

Blacksilver 413214 · Answer 3 · 2018-11-13 03:44:59Z

NashEquilibrium

This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.

class NashEquilibrium:

    def round(self, _):

        a = random.random()

        if a <= 0.2:

            return "C"

        elif a <= 0.6:

            return "N"

        else:

            return "D"

Constant Abusing Nash Equilibrium

This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.

It's only downside is that it averages 2.2 points per turn playing against itself.

class NashEquilibrium2:



    def __init__(self):

        self.opphistory = [None, None, None]

        self.titfortatcounter = 0

        self.titfortatflag = 0

        self.mylast = "C"

        self.constantflag = 0

        self.myret = "C"



    def round(self, last):

        self.opphistory.pop(0)

        self.opphistory.append(last)



        # check if its a constant bot, if so exploit

        if self.opphistory.count(self.opphistory[0]) == 3:

            self.constantflag = 1

            if last == "C":

                 self.myret = "D"

            elif last == "N":

                 self.myret = "C"

            elif last == "D":

                 self.myret = "N"



        # check if its a titfortat bot, if so exploit

        # give it 2 chances to see if its titfortat as it might happen randomly

        if self.mylast == "D" and last == "D":

            self.titfortatcounter = self.titfortatcounter + 1



        if self.mylast == "D" and last!= "D":

            self.titfortatcounter = 0



        if self.titfortatcounter >= 3:

            self.titfortatflag = 1



        if self.titfortatflag == 1:

            if last == "C":

                 self.myret = "D"

            elif last == "D":

                 self.myret = "N"    

            elif last == "N":

                # tit for tat doesn't return N, we made a mistake somewhere

                 self.titfortatflag = 0

                 self.titfortatcounter = 0



        # else play the single game nash equilibrium

        if self.constantflag == 0 and self.titfortatflag == 0:

            a = random.random()

            if a <= 0.2:

                self.myret = "C"

            elif a <= 0.6:

                self.myret = "N"

            else:

                self.myret = "D"





        self.mylast = self.myret

        return self.myret

NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype. — 11 hours ago

Mnemonic 4,6271630 · Answer 4 · 2018-11-12 20:56:19Z

Jade

class Jade:

    def __init__(self):

        self.dRate = 0.001

        self.nRate = 0.003



    def round(self, last):

        if last == 'D':

            self.dRate *= 1.1

            self.nRate *= 1.2

        elif last == 'N':

            self.dRate *= 1.03

            self.nRate *= 1.05

        self.dRate = min(self.dRate, 1)

        self.nRate = min(self.nRate, 1)



        x = random.random()

        if x > (1 - self.dRate):

            return 'D'

        elif x > (1 - self.nRate):

            return 'N'

        else:

            return 'C'

Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.

Joshua 2,385918 · Answer 5 · 2018-11-12 21:53:42Z

OldTitForTat

Old school player is too lazy to update for the new rules.

class OldTitForTat:

    def round(self, last):

        if(last == None)

            return "C"

        if(last == "C"):

            return "C"

        return "D"

glietz 716 · Answer 6 · 2018-11-12 18:17:30Z

up vote
3
down vote

NeverCOOP

class NeverCOOP:

    def round(self, last):

        try:

            if last in "ND":

                return "D"

            else:

                return "N"

        except:

            return "N"

If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...

answered 20 hours ago

glietz

716

New contributor

What's the try/except for?
– Blacksilver
19 hours ago

1

@Blacksilver I'd assume it serves the same purpose as the if(last): in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago

ahh, I see. None in "ND" errors.
– Blacksilver
19 hours ago

Because if last and last in "ND": was too complicated?
– immibis
17 hours ago

add a comment |

score 3 · Answer 7 · 2018-11-12 22:43:18Z

up vote
3
down vote

LastOptimalBot

class LastOptimalBot:

    def round(self, last):

        return "N" if last == "D" else ("D" if last == "C" else "C")

Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.

Averages:

Me   Opp

2.6  2    vs TitForTat

5    0    vs AllC

4    1    vs AllN

3    2    vs AllD

3.5  3.5  vs Random

3    2    vs Grudger

2    2    vs LastOptimalBot

1    3.5  vs TatForTit

4    1    vs NeverCOOP

1    4    vs EvaluaterBot

2.28 2.24 vs NashEquilibrium



2.91 average overall

edited 16 hours ago

answered 20 hours ago

Spitemaster

3214

oof. Maybe T4T would do better as return last.
– Blacksilver
20 hours ago

I'd like that! If TitForTat were return last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago

return last would be a better T4T for this challenge, I think
– Sparr
20 hours ago

Just tried -- the if(last): return last; else: return "C" is worse.
– Blacksilver
19 hours ago

Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago

add a comment |

score 3 · Answer 8 · 2018-11-13 05:54:58Z

PatternFinder

class PatternFinder:

    def __init__(self):

        import collections

        self.size = 10

        self.moves = [None]

        self.other = 

        self.patterns = collections.defaultdict(list)

        self.counter_moves = {"C":"D", "N":"C", "D":"N"}

        self.initial_move = "D"

        self.pattern_length_exponent = 1

        self.pattern_age_exponent = 1

        self.debug = False

    def round(self, last):

        self.other.append(last)

        best_pattern_match = None

        best_pattern_score = None

        best_pattern_response = None

        self.debug and print("match so far:",tuple(zip(self.moves,self.other)))

        for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):

            # record patterns ending with the move that just happened

            pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))

            if len(pattern_full) > 1:

                pattern_trunc = pattern_full[:-1]

                pattern_trunc_result = pattern_full[-1][1]

                self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])

            if pattern_full in self.patterns:

                # we've seen this pattern at least once before

                self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])

                for [response,turn_num] in self.patterns[pattern_full]:

                    score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent

                    if best_pattern_score == None or score > best_pattern_score:

                        best_pattern_match = pattern_full

                        best_pattern_score = score

                        best_pattern_response = response

                    # this could be much smarter about aggregating previous responses

        if best_pattern_response:

            move = self.counter_moves[best_pattern_response]

        else:

            # fall back to playing nice

            move = "C"

        self.moves.append(move)

        self.debug and print("I choose",move)

        return move

This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.

When you get the time, mind giving her an optimization pass? It's easily the largest time-sink. — 11 hours ago
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds — 9 hours ago

Blacksilver 413214 · Answer 9 · 2018-11-13 03:38:12Z

up vote
2
down vote

CopyCat

class CopyCat:

    def round(self, last):

        if last:

            return last

        return "C"

Copies the opponent's last move.

I don't expect this to do well, but no one had implemented this classic yet.

edited 11 hours ago

Blacksilver

413214

answered 12 hours ago

MannerPots

212

New contributor

add a comment |

MegaTom 3,4321324 · Answer 10 · 2018-11-12 23:33:47Z

class HistoricAverage:

    PAYOFFS = {

        "C":{"C":3,"N":1,"D":5},

        "N":{"C":4,"N":2,"D":2},

        "D":{"C":0,"N":3,"D":1}}

    def __init__(self):

        self.history = 

    def round(self, last):

        if(last != None):

            self.history.append(last)

        payoffsum = {"C":0, "N":0, "D":0}

        for move in self.history:

            for x in payoffsum:

               payoffsum[x] += HistoricAverage.PAYOFFS[move][x]

        return max(payoffsum, key=payoffsum.get)

Looks at history and finds the action that would have been best on average. Starts cooperative.

score 1 · Answer 11 · 2018-11-13 07:59:47Z

Ensemble

This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.

Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.

(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)

It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).

from collections import defaultdict

import random

class Number6:

    class Choices:

        def __init__(self, C = 0, N = 0, D = 0):

            self.C = C

            self.N = N

            self.D = D



    def __init__(self, strategy = "maxExpected", markov_order = 3):

        self.MARKOV_ORDER = markov_order;

        self.my_choices = "" 

        self.opponent = defaultdict(lambda: self.Choices())

        self.choice = None # previous choice

        self.payoff = {

            "C": { "C": 3-3, "N": 4-1, "D": 0-5 },

            "N": { "C": 1-4, "N": 2-2, "D": 3-2 },

            "D": { "C": 5-0, "N": 2-3, "D": 1-1 },

        }

        self.total_payoff = 0



        # if random, will choose in proportion to payoff.

        # otherwise, will always choose argmax

        self.strategy = strategy

        # maxExpected: maximize expected relative payoff

        # random: like maxExpected, but it chooses in proportion to E[payoff]

        # argmax: always choose the option that is optimal for expected opponent choice



    def update_opponent_model(self, last):

        for i in range(0, self.MARKOV_ORDER):

            hist = self.my_choices[i:]

            self.opponent[hist].C += ("C" == last)

            self.opponent[hist].N += ("N" == last)

            self.opponent[hist].D += ("D" == last)



    def normalize(self, counts):

        sum = float(counts.C + counts.N + counts.D)

        if 0 == sum:

            return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)

        return self.Choices(

            counts.C / sum, counts.N / sum, counts.D / sum)



    def get_distribution(self):

        for i in range(0, self.MARKOV_ORDER):

            hist = self.my_choices[i:]

            #print "check hist = " + hist

            if hist in self.opponent:

                return self.normalize(self.opponent[hist])



        return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)



    def choose(self, dist):

        payoff = self.Choices()

        # We're interested in *beating the opponent*, not

        # maximizing our score, so we optimize the difference

        payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D

        payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D

        payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D



        # D has slightly better payoff on uniform opponent,

        # so we select it on ties

        if self.strategy == "maxExpected":

            if payoff.C > payoff.N:

                return "C" if payoff.C > payoff.D else "D"

            return "N" if payoff.N > payoff.D else "D"

        elif self.strategy == "randomize":

            payoff = self.normalize(payoff)

            r = random.uniform(0.0, 1.0)

            if (r < payoff.C): return "C"

            return "N" if (r < payoff.N) else "D"

        elif self.strategy == "argMax":

            if dist.C > dist.N:

                return "D" if dist.C > dist.D else "N"

            return "C" if dist.N > dist.D else "N"



        assert(0) #, "I am not a number! I am a free man!")



    def update_history(self):

        self.my_choices += self.choice

        if len(self.my_choices) > self.MARKOV_ORDER:

            assert(len(self.my_choices) == self.MARKOV_ORDER + 1)

            self.my_choices = self.my_choices[1:]



    def round(self, last):

        if last: self.update_opponent_model(last)



        dist = self.get_distribution()

        self.choice = self.choose(dist)

        self.update_history()

        return self.choice



class Ensemble:

    def __init__(self):

        self.models = 

        self.votes = 

        self.prev_choice = 

        for order in range(0, 6):

            self.models.append(Number6("maxExpected", order))

            self.models.append(Number6("randomize", order))

            #self.models.append(Number6("argMax", order))

        for i in range(0, len(self.models)):

            self.votes.append(0)

            self.prev_choice.append("D")



        self.payoff = {

            "C": { "C": 3-3, "N": 4-1, "D": 0-5 },

            "N": { "C": 1-4, "N": 2-2, "D": 3-2 },

            "D": { "C": 5-0, "N": 2-3, "D": 1-1 },

        }



    def round(self, last):

        if last:

            for i in range(0, len(self.models)):

                self.votes[i] += self.payoff[self.prev_choice[i]][last]



        # vote. Sufficiently terrible models get negative votes

        C = 0

        N = 0

        D = 0

        for i in range(0, len(self.models)):

            choice = self.models[i].round(last)

            if "C" == choice: C += self.votes[i]

            if "N" == choice: N += self.votes[i]

            if "D" == choice: D += self.votes[i]

            self.prev_choice[i] = choice



        if C > D and C > N: return "C"

        elif N > D: return "N"

        else: return "D"

Test Framework

In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.

import sys, inspect

import opponents

from ensemble import Ensemble



def count_payoff(label, them):

    if None == them: return

    me = choices[label]

    payoff = {

        "C": { "C": 3-3, "N": 4-1, "D": 0-5 },

        "N": { "C": 1-4, "N": 2-2, "D": 3-2 },

        "D": { "C": 5-0, "N": 2-3, "D": 1-1 },

    }

    if label not in total_payoff: total_payoff[label] = 0

    total_payoff[label] += payoff[me][them]



def update_hist(label, choice):

    choices[label] = choice



opponents = [ x[1] for x 

    in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]



for k in opponents:

    total_payoff = {}



    for j in range(0, 100):

        A = Ensemble()

        B = k()

        choices = {}



        aChoice = None

        bChoice = None

        for i in range(0, 100):

            count_payoff(A.__class__.__name__, bChoice)

            a = A.round(bChoice)

            update_hist(A.__class__.__name__, a)



            count_payoff(B.__class__.__name__, aChoice)

            b = B.round(aChoice)

            update_hist(B.__class__.__name__, b)



            aChoice = a

            bChoice = b

    print total_payoff

@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time. — 11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things. — 11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :) — 7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up. — 6 hours ago

Sparr 4,9781633 · Answer 12 · 2018-11-13 08:56:07Z

Weighted Average

class WeightedAverageBot:

  def __init__(self):

    self.C_bias = 1/4

    self.N = self.C_bias

    self.D = self.C_bias

    self.prev_weight = 1/2

  def round(self, last):

    if last:

      if last == "C" or last == "N":

        self.D *= self.prev_weight

      if last == "C" or last == "D":

        self.N *= self.prev_weight

      if last == "N":

        self.N = 1 - ((1 - self.N) * self.prev_weight)

      if last == "D":

        self.D = 1 - ((1 - self.D) * self.prev_weight)

    if self.N <= self.C_bias and self.D <= self.C_bias:

      return "D"

    if self.N > self.D:

      return "C"

    return "N"

The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.

score 7 · Answer 13 · 2018-11-12 19:31:51Z

EvaluaterBot

class EvaluaterBot:

    def __init__(self):

        self.c2i = {"C":0, "N":1, "D":2}

        self.i2c = {0:"C", 1:"N", 2:"D"}

        self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

        self.last = [None, None]



    def round(self, last):

        if self.last[0] == None:

            ret = 2

        else:

            # Input the latest enemy action (the reaction to my action 2 rounds ago)

            # into the history

            self.history[self.last[0]][self.c2i[last]] += 1

            # The enemy will react to the last action I did

            prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])

            ret = (prediction - 1) % 3

        self.last = [self.last[1], ret]

        return self.i2c[ret]

Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.

Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago — 31 mins ago

score 7 · Answer 14 · 2018-11-13 07:20:00Z

TatForTit

class TatForTit:

    def round(self, last):

        if(last == "C"):

            return "N"

        return "D"

This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.

Blacksilver 413214 · Answer 15 · 2018-11-13 03:44:59Z

NashEquilibrium

This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.

class NashEquilibrium:

    def round(self, _):

        a = random.random()

        if a <= 0.2:

            return "C"

        elif a <= 0.6:

            return "N"

        else:

            return "D"

Constant Abusing Nash Equilibrium

This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.

It's only downside is that it averages 2.2 points per turn playing against itself.

class NashEquilibrium2:



    def __init__(self):

        self.opphistory = [None, None, None]

        self.titfortatcounter = 0

        self.titfortatflag = 0

        self.mylast = "C"

        self.constantflag = 0

        self.myret = "C"



    def round(self, last):

        self.opphistory.pop(0)

        self.opphistory.append(last)



        # check if its a constant bot, if so exploit

        if self.opphistory.count(self.opphistory[0]) == 3:

            self.constantflag = 1

            if last == "C":

                 self.myret = "D"

            elif last == "N":

                 self.myret = "C"

            elif last == "D":

                 self.myret = "N"



        # check if its a titfortat bot, if so exploit

        # give it 2 chances to see if its titfortat as it might happen randomly

        if self.mylast == "D" and last == "D":

            self.titfortatcounter = self.titfortatcounter + 1



        if self.mylast == "D" and last!= "D":

            self.titfortatcounter = 0



        if self.titfortatcounter >= 3:

            self.titfortatflag = 1



        if self.titfortatflag == 1:

            if last == "C":

                 self.myret = "D"

            elif last == "D":

                 self.myret = "N"    

            elif last == "N":

                # tit for tat doesn't return N, we made a mistake somewhere

                 self.titfortatflag = 0

                 self.titfortatcounter = 0



        # else play the single game nash equilibrium

        if self.constantflag == 0 and self.titfortatflag == 0:

            a = random.random()

            if a <= 0.2:

                self.myret = "C"

            elif a <= 0.6:

                self.myret = "N"

            else:

                self.myret = "D"





        self.mylast = self.myret

        return self.myret

NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype. — 11 hours ago

Mnemonic 4,6271630 · Answer 16 · 2018-11-12 20:56:19Z

Jade

class Jade:

    def __init__(self):

        self.dRate = 0.001

        self.nRate = 0.003



    def round(self, last):

        if last == 'D':

            self.dRate *= 1.1

            self.nRate *= 1.2

        elif last == 'N':

            self.dRate *= 1.03

            self.nRate *= 1.05

        self.dRate = min(self.dRate, 1)

        self.nRate = min(self.nRate, 1)



        x = random.random()

        if x > (1 - self.dRate):

            return 'D'

        elif x > (1 - self.nRate):

            return 'N'

        else:

            return 'C'

Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.

Joshua 2,385918 · Answer 17 · 2018-11-12 21:53:42Z

OldTitForTat

Old school player is too lazy to update for the new rules.

class OldTitForTat:

    def round(self, last):

        if(last == None)

            return "C"

        if(last == "C"):

            return "C"

        return "D"

glietz 716 · Answer 18 · 2018-11-12 18:17:30Z

up vote
3
down vote

NeverCOOP

class NeverCOOP:

    def round(self, last):

        try:

            if last in "ND":

                return "D"

            else:

                return "N"

        except:

            return "N"

If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...

answered 20 hours ago

glietz

716

New contributor

What's the try/except for?
– Blacksilver
19 hours ago

1

@Blacksilver I'd assume it serves the same purpose as the if(last): in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago

ahh, I see. None in "ND" errors.
– Blacksilver
19 hours ago

Because if last and last in "ND": was too complicated?
– immibis
17 hours ago

add a comment |

score 3 · Answer 19 · 2018-11-12 22:43:18Z

up vote
3
down vote

LastOptimalBot

class LastOptimalBot:

    def round(self, last):

        return "N" if last == "D" else ("D" if last == "C" else "C")

Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.

Averages:

Me   Opp

2.6  2    vs TitForTat

5    0    vs AllC

4    1    vs AllN

3    2    vs AllD

3.5  3.5  vs Random

3    2    vs Grudger

2    2    vs LastOptimalBot

1    3.5  vs TatForTit

4    1    vs NeverCOOP

1    4    vs EvaluaterBot

2.28 2.24 vs NashEquilibrium



2.91 average overall

edited 16 hours ago

answered 20 hours ago

Spitemaster

3214

oof. Maybe T4T would do better as return last.
– Blacksilver
20 hours ago

I'd like that! If TitForTat were return last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago

return last would be a better T4T for this challenge, I think
– Sparr
20 hours ago

Just tried -- the if(last): return last; else: return "C" is worse.
– Blacksilver
19 hours ago

Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago

add a comment |

score 3 · Answer 20 · 2018-11-13 05:54:58Z

PatternFinder

class PatternFinder:

    def __init__(self):

        import collections

        self.size = 10

        self.moves = [None]

        self.other = 

        self.patterns = collections.defaultdict(list)

        self.counter_moves = {"C":"D", "N":"C", "D":"N"}

        self.initial_move = "D"

        self.pattern_length_exponent = 1

        self.pattern_age_exponent = 1

        self.debug = False

    def round(self, last):

        self.other.append(last)

        best_pattern_match = None

        best_pattern_score = None

        best_pattern_response = None

        self.debug and print("match so far:",tuple(zip(self.moves,self.other)))

        for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):

            # record patterns ending with the move that just happened

            pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))

            if len(pattern_full) > 1:

                pattern_trunc = pattern_full[:-1]

                pattern_trunc_result = pattern_full[-1][1]

                self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])

            if pattern_full in self.patterns:

                # we've seen this pattern at least once before

                self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])

                for [response,turn_num] in self.patterns[pattern_full]:

                    score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent

                    if best_pattern_score == None or score > best_pattern_score:

                        best_pattern_match = pattern_full

                        best_pattern_score = score

                        best_pattern_response = response

                    # this could be much smarter about aggregating previous responses

        if best_pattern_response:

            move = self.counter_moves[best_pattern_response]

        else:

            # fall back to playing nice

            move = "C"

        self.moves.append(move)

        self.debug and print("I choose",move)

        return move

This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.

When you get the time, mind giving her an optimization pass? It's easily the largest time-sink. — 11 hours ago
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds — 9 hours ago

Blacksilver 413214 · Answer 21 · 2018-11-13 03:38:12Z

up vote
2
down vote

CopyCat

class CopyCat:

    def round(self, last):

        if last:

            return last

        return "C"

Copies the opponent's last move.

I don't expect this to do well, but no one had implemented this classic yet.

edited 11 hours ago

Blacksilver

413214

answered 12 hours ago

MannerPots

212

New contributor

add a comment |

MegaTom 3,4321324 · Answer 22 · 2018-11-12 23:33:47Z

class HistoricAverage:

    PAYOFFS = {

        "C":{"C":3,"N":1,"D":5},

        "N":{"C":4,"N":2,"D":2},

        "D":{"C":0,"N":3,"D":1}}

    def __init__(self):

        self.history = 

    def round(self, last):

        if(last != None):

            self.history.append(last)

        payoffsum = {"C":0, "N":0, "D":0}

        for move in self.history:

            for x in payoffsum:

               payoffsum[x] += HistoricAverage.PAYOFFS[move][x]

        return max(payoffsum, key=payoffsum.get)

Looks at history and finds the action that would have been best on average. Starts cooperative.

score 1 · Answer 23 · 2018-11-13 07:59:47Z

Ensemble

This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.

Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.

(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)

It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).

from collections import defaultdict

import random

class Number6:

    class Choices:

        def __init__(self, C = 0, N = 0, D = 0):

            self.C = C

            self.N = N

            self.D = D



    def __init__(self, strategy = "maxExpected", markov_order = 3):

        self.MARKOV_ORDER = markov_order;

        self.my_choices = "" 

        self.opponent = defaultdict(lambda: self.Choices())

        self.choice = None # previous choice

        self.payoff = {

            "C": { "C": 3-3, "N": 4-1, "D": 0-5 },

            "N": { "C": 1-4, "N": 2-2, "D": 3-2 },

            "D": { "C": 5-0, "N": 2-3, "D": 1-1 },

        }

        self.total_payoff = 0



        # if random, will choose in proportion to payoff.

        # otherwise, will always choose argmax

        self.strategy = strategy

        # maxExpected: maximize expected relative payoff

        # random: like maxExpected, but it chooses in proportion to E[payoff]

        # argmax: always choose the option that is optimal for expected opponent choice



    def update_opponent_model(self, last):

        for i in range(0, self.MARKOV_ORDER):

            hist = self.my_choices[i:]

            self.opponent[hist].C += ("C" == last)

            self.opponent[hist].N += ("N" == last)

            self.opponent[hist].D += ("D" == last)



    def normalize(self, counts):

        sum = float(counts.C + counts.N + counts.D)

        if 0 == sum:

            return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)

        return self.Choices(

            counts.C / sum, counts.N / sum, counts.D / sum)



    def get_distribution(self):

        for i in range(0, self.MARKOV_ORDER):

            hist = self.my_choices[i:]

            #print "check hist = " + hist

            if hist in self.opponent:

                return self.normalize(self.opponent[hist])



        return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)



    def choose(self, dist):

        payoff = self.Choices()

        # We're interested in *beating the opponent*, not

        # maximizing our score, so we optimize the difference

        payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D

        payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D

        payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D



        # D has slightly better payoff on uniform opponent,

        # so we select it on ties

        if self.strategy == "maxExpected":

            if payoff.C > payoff.N:

                return "C" if payoff.C > payoff.D else "D"

            return "N" if payoff.N > payoff.D else "D"

        elif self.strategy == "randomize":

            payoff = self.normalize(payoff)

            r = random.uniform(0.0, 1.0)

            if (r < payoff.C): return "C"

            return "N" if (r < payoff.N) else "D"

        elif self.strategy == "argMax":

            if dist.C > dist.N:

                return "D" if dist.C > dist.D else "N"

            return "C" if dist.N > dist.D else "N"



        assert(0) #, "I am not a number! I am a free man!")



    def update_history(self):

        self.my_choices += self.choice

        if len(self.my_choices) > self.MARKOV_ORDER:

            assert(len(self.my_choices) == self.MARKOV_ORDER + 1)

            self.my_choices = self.my_choices[1:]



    def round(self, last):

        if last: self.update_opponent_model(last)



        dist = self.get_distribution()

        self.choice = self.choose(dist)

        self.update_history()

        return self.choice



class Ensemble:

    def __init__(self):

        self.models = 

        self.votes = 

        self.prev_choice = 

        for order in range(0, 6):

            self.models.append(Number6("maxExpected", order))

            self.models.append(Number6("randomize", order))

            #self.models.append(Number6("argMax", order))

        for i in range(0, len(self.models)):

            self.votes.append(0)

            self.prev_choice.append("D")



        self.payoff = {

            "C": { "C": 3-3, "N": 4-1, "D": 0-5 },

            "N": { "C": 1-4, "N": 2-2, "D": 3-2 },

            "D": { "C": 5-0, "N": 2-3, "D": 1-1 },

        }



    def round(self, last):

        if last:

            for i in range(0, len(self.models)):

                self.votes[i] += self.payoff[self.prev_choice[i]][last]



        # vote. Sufficiently terrible models get negative votes

        C = 0

        N = 0

        D = 0

        for i in range(0, len(self.models)):

            choice = self.models[i].round(last)

            if "C" == choice: C += self.votes[i]

            if "N" == choice: N += self.votes[i]

            if "D" == choice: D += self.votes[i]

            self.prev_choice[i] = choice



        if C > D and C > N: return "C"

        elif N > D: return "N"

        else: return "D"

Test Framework

In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.

import sys, inspect

import opponents

from ensemble import Ensemble



def count_payoff(label, them):

    if None == them: return

    me = choices[label]

    payoff = {

        "C": { "C": 3-3, "N": 4-1, "D": 0-5 },

        "N": { "C": 1-4, "N": 2-2, "D": 3-2 },

        "D": { "C": 5-0, "N": 2-3, "D": 1-1 },

    }

    if label not in total_payoff: total_payoff[label] = 0

    total_payoff[label] += payoff[me][them]



def update_hist(label, choice):

    choices[label] = choice



opponents = [ x[1] for x 

    in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]



for k in opponents:

    total_payoff = {}



    for j in range(0, 100):

        A = Ensemble()

        B = k()

        choices = {}



        aChoice = None

        bChoice = None

        for i in range(0, 100):

            count_payoff(A.__class__.__name__, bChoice)

            a = A.round(bChoice)

            update_hist(A.__class__.__name__, a)



            count_payoff(B.__class__.__name__, aChoice)

            b = B.round(aChoice)

            update_hist(B.__class__.__name__, b)



            aChoice = a

            bChoice = b

    print total_payoff

@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time. — 11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things. — 11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :) — 7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up. — 6 hours ago

Sparr 4,9781633 · Answer 24 · 2018-11-13 08:56:07Z

Weighted Average

class WeightedAverageBot:

  def __init__(self):

    self.C_bias = 1/4

    self.N = self.C_bias

    self.D = self.C_bias

    self.prev_weight = 1/2

  def round(self, last):

    if last:

      if last == "C" or last == "N":

        self.D *= self.prev_weight

      if last == "C" or last == "D":

        self.N *= self.prev_weight

      if last == "N":

        self.N = 1 - ((1 - self.N) * self.prev_weight)

      if last == "D":

        self.D = 1 - ((1 - self.D) * self.prev_weight)

    if self.N <= self.C_bias and self.D <= self.C_bias:

      return "D"

    if self.N > self.D:

      return "C"

    return "N"

The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.

Iterated Prisoner's Trilemma

Specifics

Controller

Other Languages

Scores: 2018-11-13 08:45 EST

Specifics

Controller

Other Languages

Scores: 2018-11-13 08:45 EST

Specifics

Controller

Other Languages

Scores: 2018-11-13 08:45 EST

Specifics

Controller

Other Languages

Scores: 2018-11-13 08:45 EST

12 Answers 12

EvaluaterBot

TatForTit

NashEquilibrium

Jade

OldTitForTat

NeverCOOP

LastOptimalBot

PatternFinder

CopyCat

Ensemble

Test Framework

Weighted Average

Your Answer

Sign up or log in

Post as a guest

Post as a guest

12 Answers 12

12 Answers 12

EvaluaterBot

EvaluaterBot

EvaluaterBot

EvaluaterBot

TatForTit

TatForTit

TatForTit

TatForTit

NashEquilibrium

NashEquilibrium

NashEquilibrium

NashEquilibrium

Jade

Jade

Jade

Jade

OldTitForTat

OldTitForTat

OldTitForTat

OldTitForTat

NeverCOOP

NeverCOOP

NeverCOOP

NeverCOOP

LastOptimalBot

LastOptimalBot

LastOptimalBot

LastOptimalBot

PatternFinder

PatternFinder

PatternFinder

PatternFinder

CopyCat

CopyCat

CopyCat

CopyCat

Ensemble

Test Framework

Ensemble

Test Framework

Ensemble

Test Framework

Ensemble

Test Framework

12 Answers
12

12 Answers
12

12 Answers
12