March 29, 2023

# Prisoner’s dilemma game model based on historical strategy information

Monte Carlo numerical simulation is used to simulate the evolution of the whole population on a regular grid with $$L \times L = 200 \times 200$$ vertices. Each individual has four neighbors, the initial cooperation probability is 50%, the noise factor is $$K = 0.1$$, and the system reaches a stable state after $$10^4$$ iterations.
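As a rough illustration of this setup, the following minimal sketch runs the memory-free baseline ($$M = 0$$) on a small lattice. Several details are assumptions made only for illustration and are not taken from the text: the weak prisoner's dilemma payoffs ($$R = 1$$, $$S = P = 0$$, temptation T), periodic boundaries, asynchronous updating, the standard Fermi imitation probability, and all function names.

```python
import math
import random

C, D = 1, 0  # illustrative strategy labels: cooperate / defect


def init_lattice(L, p_coop, rng):
    """Build an L x L grid in which each site cooperates with probability p_coop."""
    return [[C if rng.random() < p_coop else D for _ in range(L)] for _ in range(L)]


def neighbors(i, j, L):
    """Four nearest neighbors; periodic boundaries are an assumption."""
    return [((i - 1) % L, j), ((i + 1) % L, j), (i, (j - 1) % L), (i, (j + 1) % L)]


def payoff(grid, i, j, T):
    """Accumulated payoff under an assumed weak PD: R = 1, S = P = 0, temptation T."""
    L = len(grid)
    n_coop = sum(grid[a][b] for a, b in neighbors(i, j, L))
    return n_coop * (1.0 if grid[i][j] == C else T)


def fermi_probability(p_self, p_other, K):
    """Standard Fermi form: probability of adopting the neighbor's strategy."""
    return 1.0 / (1.0 + math.exp((p_self - p_other) / K))


def monte_carlo_step(grid, T, K, rng):
    """One Monte Carlo step: L * L asynchronous imitation attempts."""
    L = len(grid)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        a, b = rng.choice(neighbors(i, j, L))
        if grid[i][j] != grid[a][b]:
            w = fermi_probability(payoff(grid, i, j, T), payoff(grid, a, b, T), K)
            if rng.random() < w:
                grid[i][j] = grid[a][b]


def cooperation_rate(grid):
    """Fraction of cooperators in the population, P_c."""
    L = len(grid)
    return sum(map(sum, grid)) / (L * L)


rng = random.Random(0)
grid = init_lattice(20, 0.5, rng)  # small lattice for a quick run; the text uses L = 200
for _ in range(50):
    monte_carlo_step(grid, T=1.05, K=0.1, rng=rng)
print(cooperation_rate(grid))
```

The sketch covers only the memory-free update. The history-based rule studied here would additionally weight the imitation decision by each player's strategy stability.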

To verify whether introducing historical strategy information can promote group cooperation, the trend of the cooperation rate $$P_c$$ with the defect temptation T under different historical memory lengths M is explored first, as shown in Fig. 1. The figure shows that when $$M = 0$$ (i.e. the traditional Fermi strategy update rule, refs. 49 and 52), group cooperation cannot be maintained: the rule cannot promote cooperation, and defectors finally occupy the whole group. When $$M > 0$$ (i.e. the strategy update rule that considers historical strategy information), the cooperation rate of the group gradually increases and the number of defectors gradually decreases, especially when $$M > 3$$. Across the different historical memory lengths, the cooperation rate of the group increases with M. When $$M = 1$$, the individual has little historical strategy information and therefore little information about the stability of a strategy, so the effect on cooperation is small. Compared with $$M = 0$$, however, the threshold of the defect temptation T increases. The figure also shows that, however large M is, the whole group is completely occupied by defectors once T increases to 1.6. With more historical memory information, individuals can better understand the strategy information of themselves and their neighbors and thus make better decisions. A neighbor that has defected before cannot sustain a high return, so the individual will not imitate the strategy of that low-return neighbor. High returns can only be obtained through cooperation, and individuals are more inclined to learn the strategy of neighbors with high stability. Therefore, considering historical strategy information can raise the cooperation level of the group. How should the threshold of defect temptation for promoting cooperation be defined? Next, the relationship between the cooperation rate and M under different defect temptations T is explored.

To better study the impact of memory length on group cooperation, Fig. 2a–c shows the simulation results obtained when the temptation factor T is 1.04, 1.05 and 1.06, respectively. The red, blue and green curves represent the irrational influence on individuals in strategy selection, that is, noise factor values K of 0.1, 0.2 and 0.3. According to the simulation results in Fig. 2, the cooperation rate $$P_c$$ increases with the memory length M. As M increases from 1 to 100, $$P_c$$ gradually saturates: once the memory length reaches a certain value, the cooperation rate reaches its highest point for the first time and then remains basically unchanged as the memory length increases further. Therefore, it can be concluded that introducing the memory effect into individual decision-making effectively improves the cooperation rate within the group. This shows that a long memory length can promote the evolution of the population. As described in the model, individuals may frequently change strategies by “resetting memory”, which makes cooperative behavior between individuals appear faster. From a behavioral perspective, however, if this pressure is too great, it is not conducive to cooperation. In the public goods game, M increases as the iterations proceed, which promotes the emergence of cooperation. In this study, M does not increase with time.
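One way to make the saturation point concrete is to sweep M and report the smallest memory length whose cooperation rate is already within a small tolerance of the maximum. The helper below is a hypothetical sketch: the name `saturation_memory_length`, the tolerance `tol` and the dictionary layout of the sweep are assumptions, not part of the model.

```python
def saturation_memory_length(sweep, tol=0.01):
    """Return the smallest M whose cooperation rate is within `tol` of the maximum.

    `sweep` maps a memory length M to the steady-state cooperation rate P_c
    measured for that M (for example, averaged over the final iterations).
    """
    if not sweep:
        raise ValueError("sweep is empty")
    p_max = max(sweep.values())
    # Every M at or above the threshold counts as saturated; take the first one.
    return min(M for M, p_c in sweep.items() if p_c >= p_max - tol)
```

Calling the helper on the measured sweep for each noise value K would give one saturation length per curve in a plot such as Fig. 2.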

Secondly, to better explore the impact of historical strategy information on group evolution, Fig. 3 depicts the evolution of the cooperation rate over time under different M. As the figure shows, the overall cooperation rate of the group first decreases and then increases. The initial decline shows that traditional spatial or network reciprocity (refs. 47, 50) is not sufficient for group cooperation and cannot evolve cooperation on its own. After a period of time, if the temptation to defect is not too great, cooperative clusters can form. Under the current parameter setting, however, cooperative clusters cannot expand under traditional reciprocity alone. It is therefore necessary to provide additional information, namely historical strategy information, to break this situation. If too little additional information is provided, the expansion of cooperation still cannot be maintained. When an appropriate amount of additional information is provided, the situation changes greatly: with the help of historical strategy information, the emergence and expansion of cooperative clusters can be maintained, so that the level of cooperation gradually improves after about 100 time steps, as shown in Fig. 3. Because memory has a certain limit, more information does not necessarily promote cooperation further. Finally, as M continues to increase, the cooperation rate remains at the same level. These results show that an appropriate amount of historical strategy information can support the formation of cooperative clusters.

From the evolution curve of $$P_c$$, it can be inferred that after historical strategy information is introduced, cooperators can effectively resist the invasion of defectors by forming cooperative clusters. Fig. 4 shows the distribution of cooperators and defectors after the system has evolved to a stable state at $$T = 1.05$$, with cooperators in red and defectors in blue. The figure shows that when historical strategy information is not considered (i.e. $$M = 0$$), the population finally evolves into defectors only. When there is little historical strategy information (i.e. $$M = 1$$ or $$M = 2$$), defectors also occupy the whole population in the end. When there is more strategy information ($$M > 2$$), however, some sporadic cooperative clusters begin to appear, as in Fig. 4d,e. As M increases from 10 to 30, the area of the cooperative clusters expands further, as shown in Fig. 4f–h. When the amount of historical strategy information is too large (i.e. $$M > 30$$), the cooperative clusters cannot continue to expand. This shows again that introducing historical strategy information can promote cooperation, but because memory is limited, more strategy information does not continue to promote cooperation; it may instead lead to the emergence of defection and does not create a better environment for the development of cooperation.

Lastly, generally speaking, people’s historical memory information is limited. To reflect this realistic factor objectively and avoid its disadvantages, a memory weight parameter is introduced. Figures 5 and 6 show heat maps of cooperators and defectors in the steady state under different memory weights. From the two parts above, by observing the evolution curve of $$P_c$$, it can be inferred that after the memory effect is introduced, cooperators can effectively resist the invasion of defectors by forming cooperator clusters. A snapshot of the distribution of cooperators and defectors is taken when the population has evolved into a stable state, as shown in Fig. 6. Fig. 6a–c shows that when the degree of influence $$\beta$$ is 0.1, 0.2 or 0.3, cooperators and defectors largely coexist in the stable state, and the number of cooperators increases with $$\beta$$. Fig. 6d,e shows that when $$\beta$$ is 0.4 or 0.5, only a few sporadic defectors remain, while cooperators occupy most of the network. As $$\beta$$ increases further, the cooperator clusters continue to expand. In Fig. 6f, the cooperator clusters reach their maximum size and do not expand further, and there are almost no defectors. Overall, the number of cooperators increases with $$\beta$$, which is conducive to the emergence of cooperation and the improvement of cooperation efficiency. At the same time, this shows once again that a long historical memory length alone does not bring further promotion to the group; other conditions are required to jointly promote cooperation and allow better cooperation to emerge. It also suggests that in real life, when making decisions, people do not necessarily rely entirely on memory or experience but combine several factors, of which memory and experience are only a part.

To further study how the cooperation frequency of the system evolves over time, we conducted simulation experiments on the change of the strategy stability ($$P = \frac{n_x}{M}$$) over time under different historical memory lengths M. The experimental results are shown in Fig. 7.

In the system, the strategy stability P decreases rapidly at the beginning and, after reaching a minimum, gradually increases to a relatively stable value. The minimum value of the strategy stability P is 0.15, and the maximum stability of the system can reach 0.95 in the equilibrium state. In the mechanism based on historical strategy information proposed in this paper, when $$M > 0$$, the strategy stability of the system increases with M. Only when individuals properly adhere to the cooperation strategy can the cooperation frequency of the whole system reach its highest level.
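The strategy stability $$P = \frac{n_x}{M}$$ of a single individual can be sketched as follows, taking $$n_x$$ as the number of remembered rounds in which the individual played its current strategy. Keeping the memory in a fixed-length `deque`, and whether the current round itself is part of the memory, are assumptions made for illustration.

```python
from collections import deque


def strategy_stability(history, current):
    """Return P = n_x / M for one individual.

    `history` holds the last M strategies played; n_x counts how many of
    them equal `current`. M = 0 has no memory, so P is undefined there.
    """
    M = len(history)
    if M == 0:
        raise ValueError("strategy stability is undefined for M = 0")
    n_x = sum(1 for s in history if s == current)
    return n_x / M


M = 5
history = deque(maxlen=M)  # older entries fall out automatically
for s in "DCCDCCC":        # illustrative sequence, not simulation output
    history.append(s)
print(strategy_stability(history, "C"))  # last five rounds are C, D, C, C, C -> 0.8
```

Averaging this quantity over all individuals at each time step would give a system-level curve comparable to the one described above.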

The evolution over time of the retention rates of the two strategies in the system is also studied. The strategy retention rates correspond to the strategy stability of the whole system, namely $$\frac{n_x}{M}$$, where $$n_x$$ is the number of past rounds in which the individual's strategy was the same as its current strategy. The greater $$n_x$$ is, the greater the probability that an individual will keep the current strategy. $$P_{C \longrightarrow C}$$ represents the probability that a cooperator continues to maintain the cooperation strategy, while $$P_{D \longrightarrow D}$$ represents the probability that a defector continues to choose the defect strategy. The results are shown in Fig. 8.

For different values of M, the trend of each probability is basically the same, but the curvature differs. At the beginning, the retention rate of the cooperation strategy is small, about 0.54, while the retention rate of the defect strategy is about 0.7. As the evolution proceeds, $$P_{C \longrightarrow C}$$ decreases while $$P_{D \longrightarrow D}$$ increases, which means that defection is the mainstream strategy of the system at this stage. After some time, however, when the retention rates of the two strategies both reach 0.75, the evolution reaches a turning point. From then on, $$P_{C \longrightarrow C}$$ surpasses $$P_{D \longrightarrow D}$$, indicating that cooperation is gradually becoming the dominant strategy. The larger M is, the earlier this turning point occurs. The figure also shows that under the rules proposed in this paper, cooperators are more inclined to maintain the cooperation strategy, while most defectors tend to choose the cooperation strategy and become cooperators by the end of the evolution. The experimental results in this section are basically consistent with those in Fig. 3, which shows that the mechanism based on historical strategy information proposed in this paper can promote the emergence of cooperative behavior in the system.
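The retention rates discussed above can also be estimated empirically from two consecutive strategy snapshots, as the fraction of cooperators (or defectors) that keep their strategy in the next step. This is only one plausible estimator, given here as a sketch. The function name and the flat list representation of the population are assumptions, and the text itself relates the retention rates to $$\frac{n_x}{M}$$ rather than to this one-step count.

```python
def retention_rates(prev, curr):
    """Estimate (P_C->C, P_D->D) from two consecutive strategy snapshots.

    `prev` and `curr` are equal-length sequences of "C" / "D", one entry
    per individual. A rate is NaN when no individual held that strategy.
    """
    stay_c = stay_d = n_c = n_d = 0
    for before, after in zip(prev, curr):
        if before == "C":
            n_c += 1
            stay_c += after == "C"
        else:
            n_d += 1
            stay_d += after == "D"
    p_cc = stay_c / n_c if n_c else float("nan")
    p_dd = stay_d / n_d if n_d else float("nan")
    return p_cc, p_dd
```

Tracking both values over time and locating the first step at which the cooperator rate exceeds the defector rate would give the turning point for a given M.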

Finally, we also studied how the retention rate $$P_{C \longrightarrow C}$$ of the cooperation strategy and the retention rate $$P_{D \longrightarrow D}$$ of the defect strategy evolve over time under different memory lengths M and memory proportions $$\beta$$ in the system, as shown in Fig. 9.

$$P_{C \longrightarrow C}$$ represents the probability that a cooperator will continue to cooperate, and $$P_{D \longrightarrow D}$$ represents the probability that a defector will continue to choose the defect strategy. In the early stage of the evolutionary game, $$P_{C \longrightarrow C}$$ decreases and $$P_{D \longrightarrow D}$$ increases, which means that defection is the mainstream strategy in the system at this stage. After a period of evolution, however, $$P_{C \longrightarrow C}$$ exceeds $$P_{D \longrightarrow D}$$. It is worth noting that the greater the memory ratio $$\beta$$, the earlier the two curves intersect. As M increases, the probability of the cooperation strategy gradually increases. This means that cooperators mostly continue to choose the cooperation strategy, while most defectors adopt the cooperation strategy by the end of the evolutionary game and thus become cooperators. Therefore, the numerical simulation results also show that the update mechanism based on historical strategy information is very effective in maintaining and promoting the cooperative behavior of the system.