Iterated Prisoner's Trilemma
up vote
16
down vote
favorite
CHALLENGE STATUS: OPEN
Comment or open a PR if I'm missing your bot.
Someone write handshake before I do.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round() is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- Yes, you can submit multiple bots.
- Yes, you can implement handshake.
- Any response other than
C,N, orDwill be silently taken asN. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 08:45 EST
19 bots, 361 games.
PatternFinder - 6293
EvaluaterBot - 5812
Ensemble - 5517
Nash2 - 5448
LastOptimalBot - 5340
FastGrudger - 5336
HistoricAverage - 5165
Number6 - 5113
OldTitForTat - 4809
TitForTat - 4677
Jade - 4510
Nash - 4359
RandomBot - 4246
AllD - 4214
CopyCat - 4048
TatForTit - 3730
AllC - 3671
NeverCOOP - 3064
AllN - 3030
king-of-the-hill python
|
show 8 more comments
up vote
16
down vote
favorite
CHALLENGE STATUS: OPEN
Comment or open a PR if I'm missing your bot.
Someone write handshake before I do.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round() is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- Yes, you can submit multiple bots.
- Yes, you can implement handshake.
- Any response other than
C,N, orDwill be silently taken asN. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 08:45 EST
19 bots, 361 games.
PatternFinder - 6293
EvaluaterBot - 5812
Ensemble - 5517
Nash2 - 5448
LastOptimalBot - 5340
FastGrudger - 5336
HistoricAverage - 5165
Number6 - 5113
OldTitForTat - 4809
TitForTat - 4677
Jade - 4510
Nash - 4359
RandomBot - 4246
AllD - 4214
CopyCat - 4048
TatForTit - 3730
AllC - 3671
NeverCOOP - 3064
AllN - 3030
king-of-the-hill python
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
21 hours ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
21 hours ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
21 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers
– Sparr
15 hours ago
1
If you wrap most of main.py inwhile len(botlist) > 1:withbotlist.remove(lowest_scoring_bot)at the bottom of the loop, you get an elimination tournament with interesting results.
– Sparr
7 hours ago
|
show 8 more comments
up vote
16
down vote
favorite
up vote
16
down vote
favorite
CHALLENGE STATUS: OPEN
Comment or open a PR if I'm missing your bot.
Someone write handshake before I do.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round() is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- Yes, you can submit multiple bots.
- Yes, you can implement handshake.
- Any response other than
C,N, orDwill be silently taken asN. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 08:45 EST
19 bots, 361 games.
PatternFinder - 6293
EvaluaterBot - 5812
Ensemble - 5517
Nash2 - 5448
LastOptimalBot - 5340
FastGrudger - 5336
HistoricAverage - 5165
Number6 - 5113
OldTitForTat - 4809
TitForTat - 4677
Jade - 4510
Nash - 4359
RandomBot - 4246
AllD - 4214
CopyCat - 4048
TatForTit - 3730
AllC - 3671
NeverCOOP - 3064
AllN - 3030
king-of-the-hill python
CHALLENGE STATUS: OPEN
Comment or open a PR if I'm missing your bot.
Someone write handshake before I do.
Prisoner's dilemma ... with three choices. Crazy, huh?
Here's our payoff matrix. Player A on the left, B on the top
A,B| C | N | D
---|---|---|---
C |3,3|4,1|0,5
N |1,4|2,2|3,2
D |5,0|2,3|1,1
The payoff matrix is engineered so that it's best for both players to always cooperate, but you can gain (usually) by choosing Neutral or Defection.
Here's some (competing) example bots.
# turns out if you don't actually have to implement __init__(). TIL!
class AllC:
def round(self, _): return "C"
class AllN:
def round(self, _): return "N"
class AllD:
def round(self, _): return "D"
class RandomBot:
def round(self, _): return random.choice(["C", "N", "D"])
# Actually using an identically-behaving "FastGrudger".
class Grudger:
def __init__(self):
self.history =
def round(self, last):
if(last):
self.history.append(last)
if(self.history.count("D") > 0):
return "D"
return "C"
class TitForTat:
def round(self, last):
if(last == "D"):
return "D"
return "C"
Your bot is a Python3 class. A new instance is created for every game, and round() is called each round, with your opponent's choice from last round (or None, if it's the first round)
There's a 50 rep bounty for the winner in like a month.
Specifics
- Every bot plays every other bot (1v1), including itself, in [REDACTED] rounds.
- Standard loopholes disallowed.
- No messing with anything outside your class or other underhanded shenanigains.
- Yes, you can submit multiple bots.
- Yes, you can implement handshake.
- Any response other than
C,N, orDwill be silently taken asN. - Each bot's points from every game they play will be totaled up and compared.
Controller
Check!
Other Languages
I'll throw together an API if anyone needs it.
Scores: 2018-11-13 08:45 EST
19 bots, 361 games.
PatternFinder - 6293
EvaluaterBot - 5812
Ensemble - 5517
Nash2 - 5448
LastOptimalBot - 5340
FastGrudger - 5336
HistoricAverage - 5165
Number6 - 5113
OldTitForTat - 4809
TitForTat - 4677
Jade - 4510
Nash - 4359
RandomBot - 4246
AllD - 4214
CopyCat - 4048
TatForTit - 3730
AllC - 3671
NeverCOOP - 3064
AllN - 3030
king-of-the-hill python
king-of-the-hill python
edited 25 mins ago
asked 21 hours ago
Blacksilver
413214
413214
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
21 hours ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
21 hours ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
21 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers
– Sparr
15 hours ago
1
If you wrap most of main.py inwhile len(botlist) > 1:withbotlist.remove(lowest_scoring_bot)at the bottom of the loop, you get an elimination tournament with interesting results.
– Sparr
7 hours ago
|
show 8 more comments
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
21 hours ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
21 hours ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
21 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers
– Sparr
15 hours ago
1
If you wrap most of main.py inwhile len(botlist) > 1:withbotlist.remove(lowest_scoring_bot)at the bottom of the loop, you get an elimination tournament with interesting results.
– Sparr
7 hours ago
1
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
21 hours ago
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
21 hours ago
1
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
21 hours ago
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
21 hours ago
1
1
Done. This was on the sandbox for like a month!
– Blacksilver
21 hours ago
Done. This was on the sandbox for like a month!
– Blacksilver
21 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers
– Sparr
15 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers
– Sparr
15 hours ago
1
1
If you wrap most of main.py in
while len(botlist) > 1: with botlist.remove(lowest_scoring_bot) at the bottom of the loop, you get an elimination tournament with interesting results.– Sparr
7 hours ago
If you wrap most of main.py in
while len(botlist) > 1: with botlist.remove(lowest_scoring_bot) at the bottom of the loop, you get an elimination tournament with interesting results.– Sparr
7 hours ago
|
show 8 more comments
12 Answers
12
active
oldest
votes
up vote
7
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
Yep, beats everything.
– Blacksilver
19 hours ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
add a comment |
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
add a comment |
up vote
5
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
add a comment |
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
add a comment |
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
add a comment |
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
What's the try/except for?
– Blacksilver
19 hours ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago
ahh, I see.None in "ND"errors.
– Blacksilver
19 hours ago
Becauseif last and last in "ND":was too complicated?
– immibis
17 hours ago
add a comment |
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
oof. Maybe T4T would do better asreturn last.
– Blacksilver
20 hours ago
I'd like that! If TitForTat werereturn last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago
return lastwould be a better T4T for this challenge, I think
– Sparr
20 hours ago
Just tried -- theif(last): return last; else: return "C"is worse.
– Blacksilver
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
add a comment |
up vote
3
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
1
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
add a comment |
up vote
2
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.history =
def round(self, last):
if(last != None):
self.history.append(last)
payoffsum = {"C":0, "N":0, "D":0}
for move in self.history:
for x in payoffsum:
payoffsum[x] += HistoricAverage.PAYOFFS[move][x]
return max(payoffsum, key=payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
add a comment |
up vote
1
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
add a comment |
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
add a comment |
12 Answers
12
active
oldest
votes
12 Answers
12
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
7
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
Yep, beats everything.
– Blacksilver
19 hours ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
add a comment |
up vote
7
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
Yep, beats everything.
– Blacksilver
19 hours ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
add a comment |
up vote
7
down vote
up vote
7
down vote
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
EvaluaterBot
class EvaluaterBot:
def __init__(self):
self.c2i = {"C":0, "N":1, "D":2}
self.i2c = {0:"C", 1:"N", 2:"D"}
self.history = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
self.last = [None, None]
def round(self, last):
if self.last[0] == None:
ret = 2
else:
# Input the latest enemy action (the reaction to my action 2 rounds ago)
# into the history
self.history[self.last[0]][self.c2i[last]] += 1
# The enemy will react to the last action I did
prediction,_ = max(enumerate(self.history[self.last[1]]), key=lambda l:l[1])
ret = (prediction - 1) % 3
self.last = [self.last[1], ret]
return self.i2c[ret]
Wins against all previously submitted bots except (maybe) the random bot (but it could have an advantage, because it picks D in a draw and D should optimal) and plays a constant draw against themself.
edited 19 hours ago
answered 19 hours ago
Black Owl Kai
5017
5017
Yep, beats everything.
– Blacksilver
19 hours ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
add a comment |
Yep, beats everything.
– Blacksilver
19 hours ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
Yep, beats everything.
– Blacksilver
19 hours ago
Yep, beats everything.
– Blacksilver
19 hours ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
Scratch that, PatternFinder beats it by a bit.
– Blacksilver
31 mins ago
add a comment |
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
add a comment |
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
add a comment |
up vote
7
down vote
up vote
7
down vote
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
TatForTit
class TatForTit:
def round(self, last):
if(last == "C"):
return "N"
return "D"
This bot will alternate picking D N D N while TitForTat alternates C D C D, for an average net gain of 3 points per round if I have read the payout matrix correctly. I think this might be optimal against TitForTat. Obviously it could be improved to detect a non-TFT opponent and adopt other strategies, but I was just aiming for the original bounty.
edited 7 hours ago
answered 20 hours ago
Sparr
4,9781633
4,9781633
add a comment |
add a comment |
up vote
5
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
add a comment |
up vote
5
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
add a comment |
up vote
5
down vote
up vote
5
down vote
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
NashEquilibrium
This bot has taken a game theory class in college but was lazy and didn't go to the class where they covered iterated games. So he only plays single game mixed nash equilibrium. Turns out 1/5 2/5 2/5 is the mixed NE for the payoffs.
class NashEquilibrium:
def round(self, _):
a = random.random()
if a <= 0.2:
return "C"
elif a <= 0.6:
return "N"
else:
return "D"
Constant Abusing Nash Equilibrium
This bot picked up a lesson or two from his lazy brother. His lazy brother's problem was that he didn't take advantage of fixed strategies. This version checks if the opponent is a constant player or titfortat and plays accordingly, else it plays the regular nash equilibrium.
It's only downside is that it averages 2.2 points per turn playing against itself.
class NashEquilibrium2:
def __init__(self):
self.opphistory = [None, None, None]
self.titfortatcounter = 0
self.titfortatflag = 0
self.mylast = "C"
self.constantflag = 0
self.myret = "C"
def round(self, last):
self.opphistory.pop(0)
self.opphistory.append(last)
# check if its a constant bot, if so exploit
if self.opphistory.count(self.opphistory[0]) == 3:
self.constantflag = 1
if last == "C":
self.myret = "D"
elif last == "N":
self.myret = "C"
elif last == "D":
self.myret = "N"
# check if its a titfortat bot, if so exploit
# give it 2 chances to see if its titfortat as it might happen randomly
if self.mylast == "D" and last == "D":
self.titfortatcounter = self.titfortatcounter + 1
if self.mylast == "D" and last!= "D":
self.titfortatcounter = 0
if self.titfortatcounter >= 3:
self.titfortatflag = 1
if self.titfortatflag == 1:
if last == "C":
self.myret = "D"
elif last == "D":
self.myret = "N"
elif last == "N":
# tit for tat doesn't return N, we made a mistake somewhere
self.titfortatflag = 0
self.titfortatcounter = 0
# else play the single game nash equilibrium
if self.constantflag == 0 and self.titfortatflag == 0:
a = random.random()
if a <= 0.2:
self.myret = "C"
elif a <= 0.6:
self.myret = "N"
else:
self.myret = "D"
self.mylast = self.myret
return self.myret
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
edited 11 hours ago
Blacksilver
413214
413214
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
answered 17 hours ago
Omer Faruk Yatkin
514
514
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
Omer Faruk Yatkin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
add a comment |
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
1
1
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
NashEquilibrium.round needs to take arguments even if it doesn't use them, so as to fit the expected function prototype.
– Ray
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
Thank you fixed it
– Omer Faruk Yatkin
11 hours ago
add a comment |
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
add a comment |
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
add a comment |
up vote
4
down vote
up vote
4
down vote
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
Jade
class Jade:
def __init__(self):
self.dRate = 0.001
self.nRate = 0.003
def round(self, last):
if last == 'D':
self.dRate *= 1.1
self.nRate *= 1.2
elif last == 'N':
self.dRate *= 1.03
self.nRate *= 1.05
self.dRate = min(self.dRate, 1)
self.nRate = min(self.nRate, 1)
x = random.random()
if x > (1 - self.dRate):
return 'D'
elif x > (1 - self.nRate):
return 'N'
else:
return 'C'
Starts out optimistic, but gets progressively more bitter as the opponent refuses to cooperate. Lots of magic constants that could probably be tweaked, but this probably isn't going to do well enough to justify the time.
answered 18 hours ago
Mnemonic
4,6271630
4,6271630
add a comment |
add a comment |
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
add a comment |
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
add a comment |
up vote
4
down vote
up vote
4
down vote
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
OldTitForTat
Old school player is too lazy to update for the new rules.
class OldTitForTat:
def round(self, last):
if(last == None)
return "C"
if(last == "C"):
return "C"
return "D"
answered 17 hours ago
Joshua
2,385918
2,385918
add a comment |
add a comment |
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
What's the try/except for?
– Blacksilver
19 hours ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago
ahh, I see.None in "ND"errors.
– Blacksilver
19 hours ago
Becauseif last and last in "ND":was too complicated?
– immibis
17 hours ago
add a comment |
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
What's the try/except for?
– Blacksilver
19 hours ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago
ahh, I see.None in "ND"errors.
– Blacksilver
19 hours ago
Becauseif last and last in "ND":was too complicated?
– immibis
17 hours ago
add a comment |
up vote
3
down vote
up vote
3
down vote
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
NeverCOOP
class NeverCOOP:
def round(self, last):
try:
if last in "ND":
return "D"
else:
return "N"
except:
return "N"
If the opposing bot defects or is neutral, choose defect. Otherwise if this is the first turn or the opposing bot cooperates, choose neutral. I'm not sure how good this will work...
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
answered 20 hours ago
glietz
716
716
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
glietz is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
What's the try/except for?
– Blacksilver
19 hours ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago
ahh, I see.None in "ND"errors.
– Blacksilver
19 hours ago
Becauseif last and last in "ND":was too complicated?
– immibis
17 hours ago
add a comment |
What's the try/except for?
– Blacksilver
19 hours ago
1
@Blacksilver I'd assume it serves the same purpose as theif(last):in your Grudger bot, detecting whether there was a previous round.
– ETHproductions
19 hours ago
ahh, I see.None in "ND"errors.
– Blacksilver
19 hours ago
Becauseif last and last in "ND":was too complicated?
– immibis
17 hours ago
What's the try/except for?
– Blacksilver
19 hours ago
What's the try/except for?
– Blacksilver
19 hours ago
1
1
@Blacksilver I'd assume it serves the same purpose as the
if(last): in your Grudger bot, detecting whether there was a previous round.– ETHproductions
19 hours ago
@Blacksilver I'd assume it serves the same purpose as the
if(last): in your Grudger bot, detecting whether there was a previous round.– ETHproductions
19 hours ago
ahh, I see.
None in "ND" errors.– Blacksilver
19 hours ago
ahh, I see.
None in "ND" errors.– Blacksilver
19 hours ago
Because
if last and last in "ND": was too complicated?– immibis
17 hours ago
Because
if last and last in "ND": was too complicated?– immibis
17 hours ago
add a comment |
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
oof. Maybe T4T would do better asreturn last.
– Blacksilver
20 hours ago
I'd like that! If TitForTat werereturn last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago
return lastwould be a better T4T for this challenge, I think
– Sparr
20 hours ago
Just tried -- theif(last): return last; else: return "C"is worse.
– Blacksilver
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
add a comment |
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
oof. Maybe T4T would do better asreturn last.
– Blacksilver
20 hours ago
I'd like that! If TitForTat werereturn last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago
return lastwould be a better T4T for this challenge, I think
– Sparr
20 hours ago
Just tried -- theif(last): return last; else: return "C"is worse.
– Blacksilver
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
add a comment |
up vote
3
down vote
up vote
3
down vote
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
LastOptimalBot
class LastOptimalBot:
def round(self, last):
return "N" if last == "D" else ("D" if last == "C" else "C")
Assumes that the opposing bot will always play the same move again, and chooses the one that has the best payoff against it.
Averages:
Me Opp
2.6 2 vs TitForTat
5 0 vs AllC
4 1 vs AllN
3 2 vs AllD
3.5 3.5 vs Random
3 2 vs Grudger
2 2 vs LastOptimalBot
1 3.5 vs TatForTit
4 1 vs NeverCOOP
1 4 vs EvaluaterBot
2.28 2.24 vs NashEquilibrium
2.91 average overall
edited 16 hours ago
answered 20 hours ago
Spitemaster
3214
3214
oof. Maybe T4T would do better asreturn last.
– Blacksilver
20 hours ago
I'd like that! If TitForTat werereturn last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago
return lastwould be a better T4T for this challenge, I think
– Sparr
20 hours ago
Just tried -- theif(last): return last; else: return "C"is worse.
– Blacksilver
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
add a comment |
oof. Maybe T4T would do better asreturn last.
– Blacksilver
20 hours ago
I'd like that! If TitForTat werereturn last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.
– Spitemaster
20 hours ago
return lastwould be a better T4T for this challenge, I think
– Sparr
20 hours ago
Just tried -- theif(last): return last; else: return "C"is worse.
– Blacksilver
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
oof. Maybe T4T would do better as
return last.– Blacksilver
20 hours ago
oof. Maybe T4T would do better as
return last.– Blacksilver
20 hours ago
I'd like that! If TitForTat were
return last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.– Spitemaster
20 hours ago
I'd like that! If TitForTat were
return last, LOB would go 18-9 over 6 rounds rather than the 13-10 over 5 rounds it's currently getting. I think it's fine as is - don't worry about optimizing the example bots.– Spitemaster
20 hours ago
return last would be a better T4T for this challenge, I think– Sparr
20 hours ago
return last would be a better T4T for this challenge, I think– Sparr
20 hours ago
Just tried -- the
if(last): return last; else: return "C" is worse.– Blacksilver
19 hours ago
Just tried -- the
if(last): return last; else: return "C" is worse.– Blacksilver
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
Right, but as @Sparr was saying, it might be more appropriate. Up to you, I suppose.
– Spitemaster
19 hours ago
add a comment |
up vote
3
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
1
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
add a comment |
up vote
3
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
1
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
add a comment |
up vote
3
down vote
up vote
3
down vote
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
PatternFinder
class PatternFinder:
def __init__(self):
import collections
self.size = 10
self.moves = [None]
self.other =
self.patterns = collections.defaultdict(list)
self.counter_moves = {"C":"D", "N":"C", "D":"N"}
self.initial_move = "D"
self.pattern_length_exponent = 1
self.pattern_age_exponent = 1
self.debug = False
def round(self, last):
self.other.append(last)
best_pattern_match = None
best_pattern_score = None
best_pattern_response = None
self.debug and print("match so far:",tuple(zip(self.moves,self.other)))
for turn in range(max(0,len(self.moves)-self.size),len(self.moves)):
# record patterns ending with the move that just happened
pattern_full = tuple(zip(self.moves[turn:],self.other[turn:]))
if len(pattern_full) > 1:
pattern_trunc = pattern_full[:-1]
pattern_trunc_result = pattern_full[-1][1]
self.patterns[pattern_trunc].append([pattern_trunc_result,len(self.moves)-1])
if pattern_full in self.patterns:
# we've seen this pattern at least once before
self.debug and print("I've seen",pattern_full,"before:",self.patterns[pattern_full])
for [response,turn_num] in self.patterns[pattern_full]:
score = len(pattern_full) ** self.pattern_length_exponent / (len(self.moves) - turn_num) ** self.pattern_age_exponent
if best_pattern_score == None or score > best_pattern_score:
best_pattern_match = pattern_full
best_pattern_score = score
best_pattern_response = response
# this could be much smarter about aggregating previous responses
if best_pattern_response:
move = self.counter_moves[best_pattern_response]
else:
# fall back to playing nice
move = "C"
self.moves.append(move)
self.debug and print("I choose",move)
return move
This bot looks for previous occurrences of the recent game state to see how the opponent responded to those occurrences, with a preference for longer pattern matches and more recent matches, then plays the move that will "beat" the opponent's predicted move. There's a lot of room for it to be smarter with all the data it keeps track of, but I ran out of time to work on it.
edited 9 hours ago
answered 15 hours ago
Sparr
4,9781633
4,9781633
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
1
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
add a comment |
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
1
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
When you get the time, mind giving her an optimization pass? It's easily the largest time-sink.
– Blacksilver
11 hours ago
1
1
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
@Blacksilver I just reduced the maximum pattern length from 100 to 10. It should run almost instantly now if you're running <200 rounds
– Sparr
9 hours ago
add a comment |
up vote
2
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
up vote
2
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
up vote
2
down vote
up vote
2
down vote
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
CopyCat
class CopyCat:
def round(self, last):
if last:
return last
return "C"
Copies the opponent's last move.
I don't expect this to do well, but no one had implemented this classic yet.
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
edited 11 hours ago
Blacksilver
413214
413214
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
answered 12 hours ago
MannerPots
212
212
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
MannerPots is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |
add a comment |
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.history =
def round(self, last):
if(last != None):
self.history.append(last)
payoffsum = {"C":0, "N":0, "D":0}
for move in self.history:
for x in payoffsum:
payoffsum[x] += HistoricAverage.PAYOFFS[move][x]
return max(payoffsum, key=payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
add a comment |
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.history =
def round(self, last):
if(last != None):
self.history.append(last)
payoffsum = {"C":0, "N":0, "D":0}
for move in self.history:
for x in payoffsum:
payoffsum[x] += HistoricAverage.PAYOFFS[move][x]
return max(payoffsum, key=payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
add a comment |
up vote
1
down vote
up vote
1
down vote
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.history =
def round(self, last):
if(last != None):
self.history.append(last)
payoffsum = {"C":0, "N":0, "D":0}
for move in self.history:
for x in payoffsum:
payoffsum[x] += HistoricAverage.PAYOFFS[move][x]
return max(payoffsum, key=payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
class HistoricAverage:
PAYOFFS = {
"C":{"C":3,"N":1,"D":5},
"N":{"C":4,"N":2,"D":2},
"D":{"C":0,"N":3,"D":1}}
def __init__(self):
self.history =
def round(self, last):
if(last != None):
self.history.append(last)
payoffsum = {"C":0, "N":0, "D":0}
for move in self.history:
for x in payoffsum:
payoffsum[x] += HistoricAverage.PAYOFFS[move][x]
return max(payoffsum, key=payoffsum.get)
Looks at history and finds the action that would have been best on average. Starts cooperative.
answered 15 hours ago
MegaTom
3,4321324
3,4321324
add a comment |
add a comment |
up vote
1
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
add a comment |
up vote
1
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
add a comment |
up vote
1
down vote
up vote
1
down vote
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
Ensemble
This runs an ensemble of related models. The individual models consider different amounts of history, and have the option of either always choosing the move that will optimize the expected payout difference, or will randomly select a move in proportion to expected payout difference.
Each member of the ensemble then votes on their preferred move. They get a number of votes equal to how much more they've won than the opponent (which means that terrible models will get negative votes). Whichever move wins the vote is then selected.
(They should probably split their votes among the moves in proportion to how much they favor each, but I don't care enough to do that right now.)
It beats everything posted so far except EvaluaterBot and PatternFinder. (One-on-one, it beats EvaluaterBot and loses to PatternFinder).
from collections import defaultdict
import random
class Number6:
class Choices:
def __init__(self, C = 0, N = 0, D = 0):
self.C = C
self.N = N
self.D = D
def __init__(self, strategy = "maxExpected", markov_order = 3):
self.MARKOV_ORDER = markov_order;
self.my_choices = ""
self.opponent = defaultdict(lambda: self.Choices())
self.choice = None # previous choice
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
self.total_payoff = 0
# if random, will choose in proportion to payoff.
# otherwise, will always choose argmax
self.strategy = strategy
# maxExpected: maximize expected relative payoff
# random: like maxExpected, but it chooses in proportion to E[payoff]
# argmax: always choose the option that is optimal for expected opponent choice
def update_opponent_model(self, last):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
self.opponent[hist].C += ("C" == last)
self.opponent[hist].N += ("N" == last)
self.opponent[hist].D += ("D" == last)
def normalize(self, counts):
sum = float(counts.C + counts.N + counts.D)
if 0 == sum:
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
return self.Choices(
counts.C / sum, counts.N / sum, counts.D / sum)
def get_distribution(self):
for i in range(0, self.MARKOV_ORDER):
hist = self.my_choices[i:]
#print "check hist = " + hist
if hist in self.opponent:
return self.normalize(self.opponent[hist])
return self.Choices(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
def choose(self, dist):
payoff = self.Choices()
# We're interested in *beating the opponent*, not
# maximizing our score, so we optimize the difference
payoff.C = (3-3) * dist.C + (4-1) * dist.N + (0-5) * dist.D
payoff.N = (1-4) * dist.C + (2-2) * dist.N + (3-2) * dist.D
payoff.D = (5-0) * dist.C + (2-3) * dist.N + (1-1) * dist.D
# D has slightly better payoff on uniform opponent,
# so we select it on ties
if self.strategy == "maxExpected":
if payoff.C > payoff.N:
return "C" if payoff.C > payoff.D else "D"
return "N" if payoff.N > payoff.D else "D"
elif self.strategy == "randomize":
payoff = self.normalize(payoff)
r = random.uniform(0.0, 1.0)
if (r < payoff.C): return "C"
return "N" if (r < payoff.N) else "D"
elif self.strategy == "argMax":
if dist.C > dist.N:
return "D" if dist.C > dist.D else "N"
return "C" if dist.N > dist.D else "N"
assert(0) #, "I am not a number! I am a free man!")
def update_history(self):
self.my_choices += self.choice
if len(self.my_choices) > self.MARKOV_ORDER:
assert(len(self.my_choices) == self.MARKOV_ORDER + 1)
self.my_choices = self.my_choices[1:]
def round(self, last):
if last: self.update_opponent_model(last)
dist = self.get_distribution()
self.choice = self.choose(dist)
self.update_history()
return self.choice
class Ensemble:
def __init__(self):
self.models =
self.votes =
self.prev_choice =
for order in range(0, 6):
self.models.append(Number6("maxExpected", order))
self.models.append(Number6("randomize", order))
#self.models.append(Number6("argMax", order))
for i in range(0, len(self.models)):
self.votes.append(0)
self.prev_choice.append("D")
self.payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
def round(self, last):
if last:
for i in range(0, len(self.models)):
self.votes[i] += self.payoff[self.prev_choice[i]][last]
# vote. Sufficiently terrible models get negative votes
C = 0
N = 0
D = 0
for i in range(0, len(self.models)):
choice = self.models[i].round(last)
if "C" == choice: C += self.votes[i]
if "N" == choice: N += self.votes[i]
if "D" == choice: D += self.votes[i]
self.prev_choice[i] = choice
if C > D and C > N: return "C"
elif N > D: return "N"
else: return "D"
Test Framework
In case anyone else finds it useful, here's a test framework for looking at individual matchups. Python2. Just put all the opponents you're interested in in opponents.py, and change the references to Ensemble to your own.
import sys, inspect
import opponents
from ensemble import Ensemble
def count_payoff(label, them):
if None == them: return
me = choices[label]
payoff = {
"C": { "C": 3-3, "N": 4-1, "D": 0-5 },
"N": { "C": 1-4, "N": 2-2, "D": 3-2 },
"D": { "C": 5-0, "N": 2-3, "D": 1-1 },
}
if label not in total_payoff: total_payoff[label] = 0
total_payoff[label] += payoff[me][them]
def update_hist(label, choice):
choices[label] = choice
opponents = [ x[1] for x
in inspect.getmembers(sys.modules['opponents'], inspect.isclass)]
for k in opponents:
total_payoff = {}
for j in range(0, 100):
A = Ensemble()
B = k()
choices = {}
aChoice = None
bChoice = None
for i in range(0, 100):
count_payoff(A.__class__.__name__, bChoice)
a = A.round(bChoice)
update_hist(A.__class__.__name__, a)
count_payoff(B.__class__.__name__, aChoice)
b = B.round(aChoice)
update_hist(B.__class__.__name__, b)
aChoice = a
bChoice = b
print total_payoff
edited 6 hours ago
answered 11 hours ago
Ray
1,428511
1,428511
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
add a comment |
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
The controller is ready, you didn't have to do all that...
– Blacksilver
11 hours ago
1
1
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
@Blacksilver I realized that just as I was about to submit. But this one works in versions before 3.6, and gives information about individual matchups that can help identify weak spots, so it wasn't a complete waste of time.
– Ray
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
Fair enough; running now. I'll probably add options to my controller to do similar things.
– Blacksilver
11 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
"It beats everything posted so far except Ensemble and PatternFinder" I'm honored :)
– Sparr
7 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
@Sparr Oops. That was supposed to say EvaluaterBot and PatternFinder. But that's when comparing total score against the entire field. PatternFinder remains the only one that beats this in a direct match up.
– Ray
6 hours ago
add a comment |
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
add a comment |
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
add a comment |
up vote
1
down vote
up vote
1
down vote
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
Weighted Average
class WeightedAverageBot:
def __init__(self):
self.C_bias = 1/4
self.N = self.C_bias
self.D = self.C_bias
self.prev_weight = 1/2
def round(self, last):
if last:
if last == "C" or last == "N":
self.D *= self.prev_weight
if last == "C" or last == "D":
self.N *= self.prev_weight
if last == "N":
self.N = 1 - ((1 - self.N) * self.prev_weight)
if last == "D":
self.D = 1 - ((1 - self.D) * self.prev_weight)
if self.N <= self.C_bias and self.D <= self.C_bias:
return "D"
if self.N > self.D:
return "C"
return "N"
The opponent's behavior is modeled as a right triangle with corners for C N D at 0,0 0,1 1,0 respectively. Each opponent move shifts the point within that triangle toward that corner, and we play to beat the move indicated by the point (with C being given an adjustably small slice of the triangle). In theory I wanted this to have a longer memory with more weight to previous moves, but in practice the current meta favors bots that change quickly, so this devolves into an approximation of LastOptimalBot against most enemies. Posting for posterity; maybe someone will be inspired.
answered 6 hours ago
Sparr
4,9781633
4,9781633
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodegolf.stackexchange.com%2fquestions%2f175794%2fiterated-prisoners-trilemma%23new-answer', 'question_page');
}
);
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
1
How are bots put against each other? I get from the Grudger that there are always two bots against/with each other and the enemy's last choice is passed to the bot. How many rounds are played? And for a game: Does only the result count (i.e. who won) or also the points?
– Black Owl Kai
21 hours ago
1
You would get more entries if you made this language-agnostic, or at least broader. You could have a wrapper python class that spawns a process and sends it text commands to get back text responses.
– Sparr
21 hours ago
1
Done. This was on the sandbox for like a month!
– Blacksilver
21 hours ago
proposal: normalize scores so that adding more bots or more rounds doesn't inflate the numbers
– Sparr
15 hours ago
1
If you wrap most of main.py in
while len(botlist) > 1:withbotlist.remove(lowest_scoring_bot)at the bottom of the loop, you get an elimination tournament with interesting results.– Sparr
7 hours ago