3.6 Updating the Evaluation Parameters
This section describes Evaluator_Update(), the function that updates the evaluation parameters.
We start with Evaluator_UpdatePattern(), the function that updates the value of a single pattern.
static void Evaluator_UpdatePattern(Evaluator *self, int in_pattern, int in_id, int in_mirror, int in_diff)
{
    /* Add in_diff, keeping the result within [-MAX_PATTERN_VALUE, MAX_PATTERN_VALUE]. */
    if (MAX_PATTERN_VALUE - in_diff < self->Value[in_pattern][in_id]) {
        self->Value[in_pattern][in_id] = MAX_PATTERN_VALUE;
    } else if (-MAX_PATTERN_VALUE - in_diff > self->Value[in_pattern][in_id]) {
        self->Value[in_pattern][in_id] = -MAX_PATTERN_VALUE;
    } else {
        self->Value[in_pattern][in_id] += in_diff;
    }
    /* Copy the new value to the symmetric pattern, if there is one. */
    if (in_mirror >= 0) {
        self->Value[in_pattern][in_mirror] = self->Value[in_pattern][in_id];
    }
}
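The arguments are as follows.
self : pointer to the Evaluator object
in_pattern : kind of pattern to update
in_id : index of the pattern to update
in_mirror : index of the pattern symmetric to the one being updated (negative if there is none)
in_diff : amount by which to change the value
The function adds in_diff to the value of the pattern identified by in_pattern and in_id.
The result is kept within the range from -MAX_PATTERN_VALUE to MAX_PATTERN_VALUE.
If a symmetric pattern exists, its value is updated as well.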
Now let's look inside Evaluator_Update(), the function that updates the evaluation parameters.
This function is also rather long, but it is listed here in full.
void Evaluator_Update(Evaluator *self, const Board *in_board, int in_value)
{
    int index, diff;

    diff = (int)((in_value - Evaluator_Value(self, in_board)) * UPDATE_RATIO);

    /*
     * Rows and columns.
     * In the calls that use a mirror table, the mirrored index is passed as
     * in_id and index as in_mirror; both entries end up with the same value.
     */
    index = BOARD_INDEX_8(in_board, A4, B4, C4, D4, E4, F4, G4, H4);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE4, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, A5, B5, C5, D5, E5, F5, G5, H5);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE4, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, D1, D2, D3, D4, D5, D6, D7, D8);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE4, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, E1, E2, E3, E4, E5, E6, E7, E8);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE4, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, A3, B3, C3, D3, E3, F3, G3, H3);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE3, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, A6, B6, C6, D6, E6, F6, G6, H6);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE3, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, C1, C2, C3, C4, C5, C6, C7, C8);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE3, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, F1, F2, F3, F4, F5, F6, F7, F8);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE3, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, A2, B2, C2, D2, E2, F2, G2, H2);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE2, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, A7, B7, C7, D7, E7, F7, G7, H7);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE2, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, B1, B2, B3, B4, B5, B6, B7, B8);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE2, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, G1, G2, G3, G4, G5, G6, G7, G8);
    Evaluator_UpdatePattern(self, PATTERN_ID_LINE2, self->MirrorLine[index], index, diff);

    /* Diagonals of length 8 down to 4. */
    index = BOARD_INDEX_8(in_board, A1, B2, C3, D4, E5, F6, G7, H8);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG8, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_8(in_board, A8, B7, C6, D5, E4, F3, G2, H1);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG8, self->MirrorLine[index], index, diff);
    index = BOARD_INDEX_7(in_board, A2, B3, C4, D5, E6, F7, G8);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG7, self->MirrorLine[index * POW_3_1], index, diff);
    index = BOARD_INDEX_7(in_board, B1, C2, D3, E4, F5, G6, H7);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG7, self->MirrorLine[index * POW_3_1], index, diff);
    index = BOARD_INDEX_7(in_board, A7, B6, C5, D4, E3, F2, G1);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG7, self->MirrorLine[index * POW_3_1], index, diff);
    index = BOARD_INDEX_7(in_board, B8, C7, D6, E5, F4, G3, H2);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG7, self->MirrorLine[index * POW_3_1], index, diff);
    index = BOARD_INDEX_6(in_board, A3, B4, C5, D6, E7, F8);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG6, self->MirrorLine[index * POW_3_2], index, diff);
    index = BOARD_INDEX_6(in_board, C1, D2, E3, F4, G5, H6);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG6, self->MirrorLine[index * POW_3_2], index, diff);
    index = BOARD_INDEX_6(in_board, A6, B5, C4, D3, E2, F1);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG6, self->MirrorLine[index * POW_3_2], index, diff);
    index = BOARD_INDEX_6(in_board, C8, D7, E6, F5, G4, H3);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG6, self->MirrorLine[index * POW_3_2], index, diff);
    index = BOARD_INDEX_5(in_board, A4, B5, C6, D7, E8);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG5, self->MirrorLine[index * POW_3_3], index, diff);
    index = BOARD_INDEX_5(in_board, D1, E2, F3, G4, H5);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG5, self->MirrorLine[index * POW_3_3], index, diff);
    index = BOARD_INDEX_5(in_board, A5, B4, C3, D2, E1);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG5, self->MirrorLine[index * POW_3_3], index, diff);
    index = BOARD_INDEX_5(in_board, D8, E7, F6, G5, H4);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG5, self->MirrorLine[index * POW_3_3], index, diff);
    index = BOARD_INDEX_4(in_board, A5, B6, C7, D8);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG4, self->MirrorLine[index * POW_3_4], index, diff);
    index = BOARD_INDEX_4(in_board, E1, F2, G3, H4);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG4, self->MirrorLine[index * POW_3_4], index, diff);
    index = BOARD_INDEX_4(in_board, A4, B3, C2, D1);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG4, self->MirrorLine[index * POW_3_4], index, diff);
    index = BOARD_INDEX_4(in_board, E8, F7, G6, H5);
    Evaluator_UpdatePattern(self, PATTERN_ID_DIAG4, self->MirrorLine[index * POW_3_4], index, diff);

    /* Edges (seven edge squares plus the adjacent diagonal square); no mirror entry. */
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, A1, B1, C1, D1, E1, F1, G1, B2), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, H1, G1, F1, E1, D1, C1, B1, G2), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, A8, B8, C8, D8, E8, F8, G8, B7), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, H8, G8, F8, E8, D8, C8, B8, G7), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, A1, A2, A3, A4, A5, A6, A7, B2), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, A8, A7, A6, A5, A4, A3, A2, B7), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, H1, H2, H3, H4, H5, H6, H7, G2), -1, diff);
    Evaluator_UpdatePattern(self, PATTERN_ID_EDGE8, BOARD_INDEX_8(in_board, H8, H7, H6, H5, H4, H3, H2, G7), -1, diff);

    /* Corners. */
    index = BOARD_INDEX_8(in_board, A1, B1, C1, A2, B2, C2, A3, B3);
    Evaluator_UpdatePattern(self, PATTERN_ID_CORNER8, self->MirrorCorner[index], index, diff);
    index = BOARD_INDEX_8(in_board, H1, G1, F1, H2, G2, F2, H3, G3);
    Evaluator_UpdatePattern(self, PATTERN_ID_CORNER8, self->MirrorCorner[index], index, diff);
    index = BOARD_INDEX_8(in_board, A8, B8, C8, A7, B7, C7, A6, B6);
    Evaluator_UpdatePattern(self, PATTERN_ID_CORNER8, self->MirrorCorner[index], index, diff);
    index = BOARD_INDEX_8(in_board, H8, G8, F8, H7, G7, F7, H6, G6);
    Evaluator_UpdatePattern(self, PATTERN_ID_CORNER8, self->MirrorCorner[index], index, diff);

    /* Parity of the number of empty squares. */
    Evaluator_UpdatePattern(self, PATTERN_ID_PARITY, Board_CountDisks(in_board, EMPTY) & 1, -1, diff);
}
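What this function does is also simple.
First, it determines the amount of change from the following expression:
((given position value) - (position value computed by Evaluator)) × (update ratio)
Next, it extracts the patterns contained in the position and updates the value of each one.
As a result, the value produced by Evaluator gradually approaches the given position value.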
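The update ratio in the expression (UPDATE_RATIO in the code) determines how strongly each value is raised or lowered.
If it is too small, the evaluation parameters are barely updated and do not reach suitable values.
If it is too large, on the other hand, the evaluation parameters do not stabilize.
Here we use 0.003.
This is not necessarily the optimal value, but we have confirmed that it yields reasonably good evaluation parameters.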