July 17, 2016


Technical Supplement for Course of Action Simulator
Luke A. Maier
Technical Paper 2014-1v2


The Course of Action Simulator (CAS) was developed as an analytical and educational model of how two states adopt courses of action in response to threats, resource constraints, and complex priorities. This analytical tool was featured in several presentations and publications, and its purpose is to help explore how agent-based modeling can inform security studies. This technical paper outlines the model’s underlying calculations and algorithmic theory to make it as transparent as possible so that its limitations and strengths can be evaluated by users seeking to advance strategic modeling.

Modeling strategic decision-making as algorithmic behavior provides a disciplined, transparent, and potentially data-driven method for forecasting how threats evolve and how we can address them. Designing algorithms that articulate how states make security decisions can help U.S. strategists understand and forecast how other nations will react to changes in their security environment. Advancing algorithmic models of state behavior can provide valuable opportunities to capitalize on rapidly advancing computational power and to address the need for strategic-level simulation for national security research. The following sections: 1) introduce the model’s basis in international relations scholarship, 2) explain why a state’s selection of a course of action resembles algorithmic behavior, and 3) detail the calculations that CAS uses to represent states’ identities, power, and priorities in an agent-based model.

Citation: Maier, L. 2014. Technical Supplement for Course of Action Model. Technical Paper 2014-1v2. Fort Bragg, NC: Laboratory for Unconventional Conflict Analysis and Simulation.


States use formal and informal processes to make decisions regarding their national security. These processes can be grounded in law or tradition, or they can operate more subconsciously and without structure. In many states, these processes are relatively stable, though in others they are more sporadic and ad hoc. Over long periods, however, all states refine their decision-making processes to adapt to changing political, economic, and social circumstances and to new national security threats.

International relations scholarship suggests there are three alternative decision calculi that explain security strategy. These broad categories of prediction are conventionally grouped into the realist, liberal, and constructivist paradigms. Each offers a comprehensive set of explanations about what drives state behavior in international politics. These explanations all treat an individual nation's behavior as a reaction to the nature and conditions of the international system, which can be considered a step-by-step process in which a state senses threats in its environment and applies a procedure (either intentionally or subconsciously) to decide how to react to those threats.

This step-by-step procedure for determining action lends itself well to description as an algorithm. An algorithm is a step-by-step set of procedures applied to change inputs into outcomes. Certainly, the decision-making processes of each state vary significantly, but the three main international relations paradigms provide very useful generalizations about themes common to all states. These processes also operate at all three of Kenneth Waltz's "images"—or levels—of analysis: the drives of state behavior emanating from human nature, from individual leaders, and from the international system. Treating states' strategy-making as algorithms provides a rich opportunity to integrate insights from the many fields that have studied decision-making, including computational, philosophical, cultural, psychological, and national security studies.

Algorithmic modeling of international relations has a fundamental trade-off between creating a generalizable tool and a tool that is sufficiently specified to provide specific predictions about individual states. In theory, it is possible to use algorithms to articulate the decision-making models of each state, but that would require time-consuming research and highly accurate information on the social, economic, historical, and psychological characteristics of each state's government. Until this is feasible, it is useful to attempt to create an algorithmic model that can be generalized to many states.

CAS provides a generalized model whose algorithms were derived from the three main international relations paradigms. CAS draws heavily from agent-based modeling and simulation—a broad category of computational methods that simulate the behavior of multiple actors who follow simple sets of rules. An additional advantage of agent-based models is that they allow an analyst to observe "emergent outcomes"—outcomes that can only be observed at the aggregate level. The rules in an agent-based model can be articulated in the language of algorithms, and the remainder of this paper describes how CAS incorporates international relations concepts as rules into a simulation of two states and their adoption of courses of action based on simple priorities and resource constraints.

Algorithms and National Security Decision-making

An algorithm is a series of calculations that describes how an initial "system state" proceeds to future states. Designing an algorithm progresses through high-level, implementation, and formal descriptions, which represent progressive levels of specificity (Sipser 2012, p. 185). High-level descriptions provide broad, prosaic accounts of the state's decision-making model; these can be designed by international relations and subject matter experts who might have no background in computation. This requires identifying the variables that influence the decision (i.e., the "process variables") and the variables that indicate change has occurred (i.e., the "output variables"). It answers questions like "what influences the state's behavior?" The next, more specific level describes how the algorithm implements the process. Implementation description uses words or diagrams to portray the steps of the process in a specific order; it answers "how do those influences affect state behavior?" (In computer algorithms, these are often expressed as bulleted pseudo-code.) The most specific level is formal description, wherein the analyst writes the series of equations that specifies how the description is implemented.

The formal algorithm comprises various functions of the process variables which, together, represent the process that changes the inputs into a decision. Consider the following discussion of algorithm g. The step function g is used to change the values of the output variables (i.e., foreign policies), defined as set R. The values in R change from t to t+1 such that:

g : R_t → R_{t+1}

After an initial run at time t, the algorithm can be applied to R_{t+1} to calculate R_{t+2} and so on until R_{t+n}, where t+n is the desired end-state.
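As a minimal sketch of this iteration, the following Python fragment applies a step function g repeatedly to advance the output variables; the decay rule inside g is a hypothetical placeholder, not part of CAS:

```python
def g(R):
    """Hypothetical step function: each output variable decays toward zero."""
    return [0.9 * r for r in R]

def iterate(R0, n):
    """Apply g repeatedly to advance the system from R_t to R_t+n."""
    R = R0
    for _ in range(n):
        R = g(R)
    return R

R_end = iterate([10.0, 4.0], 2)   # R_t -> R_t+1 -> R_t+2
```

Any deterministic step function could be substituted for g; the iteration scaffold stays the same.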

These algorithmic processes can take on two forms (Holland 1995). The first is an “internal model” that states adopt. Internal models include, among other processes, decision-making processes adopted inadvertently. This could be a subconscious process (such as the various types of fallacies or miscalculations and misperceptions that systematically influence state behavior), or a process that is deterministic. A deterministic process has no alternatives; it only has one possible outcome for each set of inputs, and thus can be thought of as an “action-reaction” model.

The second and perhaps richer form is the "overt model," by which states actively compare alternatives through a specified deliberation process to optimize the decision's outcome. For example, academic policy analysis often follows this model: an analyst will identify criteria for measuring success, identify alternative actions to meet those criteria, determine how well each alternative comparatively meets the criteria, and then implement the best alternative. This second model provides many opportunities to represent goal-seeking behavior, such as is envisioned in agent-based models and simulations. Both the internal and overt models can be used to represent states' heuristic mechanisms, among other things.

Algorithmic strategic decision-making can draw heavily from concepts in artificial intelligence planning, in which a decision-maker "is given a set of possible states and is asked for a sequence of actions that will lead it to a desirable world state" (Vidal 2007). Advanced non-deterministic or randomized methods that draw from Monte Carlo decision chains and Markov processes (e.g., see Schrodt 2000) can be used to simulate all of the possible strings of moves (where a "string" is a certain series of moves) and compare their various utilities (Vidal 2007). Other potentially valuable outputs from these methods could be Poisson distributions of occurrence and, perhaps more interestingly, probabilistic forecasts in the limited instances where retrodictive validation is possible.
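A brute-force version of this idea, enumerating every possible string of moves and comparing their utilities, can be sketched as follows; the move names and their threat effects are hypothetical placeholders, not values from CAS:

```python
from itertools import product

# Hypothetical moves and their effects on a single threat level (placeholders).
moves = {"deter": -3.0, "negotiate": -1.0, "ignore": 0.0}

def string_utility(threat, string):
    """Utility of a string (sequence) of moves: the negative of the final
    threat level, so a lower remaining threat means a higher utility."""
    for move in string:
        threat = max(0.0, threat + moves[move])
    return -threat

# Enumerate all two-move strings and select the highest-utility one.
strings = list(product(moves, repeat=2))
best_string = max(strings, key=lambda s: string_utility(10.0, s))
```

Exhaustive enumeration grows exponentially with string length, which is why the randomized sampling methods cited above become attractive for longer horizons.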

These algorithms can be incorporated into computer platforms such as game theory simulators or agent-based models (e.g., see Cederman 2003), all of which have rising reliability and relevance in strategic analysis as computational power and information accuracy improve. The process of designing such an algorithm begins by scrutinizing the relevant literatures from international relations, leadership and decision psychology, history, and related fields to identify the processes at play in a state's decision-making. The next step is organizing these influences into stepped or explicitly interrelated processes, and the final steps are mathematically formalizing these steps and then translating them into a computer programming language such as Python. The next section outlines how CAS was designed following this process.

The Mechanics of CAS

CAS aims to demonstrate how international security studies can apply agent-based models to simulate how states develop courses of action based on resource constraints and broader decision-making goals. The two states' decisions influence each other, and their courses of action are adopted to achieve specified goals. These goals are dually defined by the actor's identity as a realist, liberal, or constructivist state and by its ideal deployment of eight elements of national power in each of the security issues. Following modeling convention, these security issues are referred to as "objects." The objects have locations in a three-dimensional space whose axes represent the levels of three kinds of threats: threats to fear, threats to honor, and threats to interest, which comprise set v_t = {f_t, h_t, i_t}. These dimensions of threat are differentially emphasized by the three main international relations paradigms: realists emphasize that states make international relations decisions based on threats to fear, liberals emphasize threats to interest, and constructivists emphasize threats to honor, among other broadly defined considerations.

Each object has a "location" (l_t = [f_t, h_t, i_t]), with lower values on an axis representing low threat levels in that category for that object; high values indicate the actor is experiencing high levels of threat on that issue. The actor's goals are to reduce threats to its priorities. Realists prioritize reducing threats to fear, which relate to hard power resources including military, economic, and technological resources. Liberals seek to reduce threats to interests, which involve economic, diplomatic, human, technological, and informational resources. While realists seek to increase their relative power, liberals seek to increase their absolute power. Constructivists seek to reduce threats to honor, and they also seek to fulfill certain ideal types, defined as the possession of definable levels of each power resource, which we refer to as "power configurations." The model measures each actor's performance by these metrics in graphs like the following:


Few threats fit entirely into a single category; most threats have qualities of all three. The initial threat levels for each object are set via sliders like those pictured below to the right. Similarly, few states entirely follow one of the three IR theories: each theory explains some state behavior, and certain states fit some theories better than others. To account for this, any particular state's behavior is modeled as a complex combination of these international relations theories, described as list w = [r, c, l], where r, c, and l are the relative weights of the three IR theories. Using sliders like those below to the left, the model asks the analyst to estimate the degree (on a scale of 0-100) to which each state's decisions match each of the three theories' predictions. For example, while some states prioritize the reduction of military threats, others prioritize the reduction of economic threats.


A highly realist actor, for instance, would not take moves that deplete its hard-power resources. A highly honor-driven state would not adopt courses of action that deplete its information, human capital, and legitimacy. Each type of move has trade-offs: some are good at achieving realist goals, while others only achieve liberal or constructivist goals. Some moves are more balanced, achieving limited progress along each axis. The initial settings are as follows, but the analyst can open the background pages and edit these move effects:


In the algorithm, the analyst assigns the effect of actions a_1^v through a_n^v in the set of actions A^v on each component of v as follows:

A^v = [A^f, A^h, A^i] = [[a_1^f, a_2^f, …, a_n^f], [a_1^h, a_2^h, …, a_n^h], [a_1^i, a_2^i, …, a_n^i]]
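Under the assumption that each row holds one threat axis and each column one move, the effects structure can be sketched in Python; the numbers are illustrative placeholders, not the model's shipped settings:

```python
# Illustrative move-effect matrix A^v = [A^f, A^h, A^i]: one row per threat
# axis (fear, honor, interest), one column per move. Negative values reduce
# the threat level on that axis; positive values raise it.
A_v = [
    [-4.0, -1.0,  0.0],   # A^f: effects of moves a_1..a_3 on the fear axis
    [ 1.0, -3.0,  0.0],   # A^h: effects on the honor axis
    [ 0.0, -1.0, -2.0],   # A^i: effects on the interest axis
]

n_moves = len(A_v[0])
move_2_effects = [row[1] for row in A_v]   # one move's effect across all axes
```

Reading down a column gives a single move's trade-offs across the three axes, which is how the trade-offs discussed above are encoded.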

Each move also has an effect on Actor A's resource deployment for each object. For example, if Actor B takes military action, this will deplete Actor A's military and economic resources. Each move is given a utility rating that represents how well the move achieves the actor's goals considering its identities and priorities. The utilities for the individual components (which correspond to the f, h, and i axes) are summed to yield the overall utility (U_t) of each move. Each utility component is an inverse function of the object's distance from zero along the related axis, meaning higher threat levels have lower utility. (.01 is added to the threat level so that its minimum is insignificant but non-zero.) The inverse of each threat level is multiplied by the corresponding identity weight (r, c, or l) so that the utility components for the prioritized axes have more bearing on the move's overall utility U_t:

U_t = r / (|f_t| + .01) + c / (|h_t| + .01) + l / (|i_t| + .01)
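The utility formula translates directly into code. The sketch below uses assumed weights to show how a strongly realist actor values a low fear threat far more than low honor or interest threats:

```python
def utility(location, weights):
    """Overall utility U_t of a threat location [f, h, i] under identity
    weights [r, c, l]; .01 keeps each denominator non-zero."""
    f, h, i = location
    r, c, l = weights
    return r / (abs(f) + .01) + c / (abs(h) + .01) + l / (abs(i) + .01)

# Illustrative weights for a strongly realist actor (r dominates):
u_low_fear = utility([0.0, 5.0, 5.0], [80, 10, 10])
u_high_fear = utility([5.0, 0.0, 0.0], [80, 10, 10])
```

Because each component is an inverse of the threat level, the location with the fear threat driven to zero dominates even though its other two threats remain high.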

However, the model checks to ensure that the selected move does not cause more than a specified decrease in the deployment of resources prioritized by the actor's identity. The permissible decrease is calculated as the standard deviation of all the move outcomes for the axis of interest, multiplied by one minus the weight (w) of the associated identity, as such:

v_t + α_t^x < v̄_t − σ_{v_t}(1 − w)

where α_t^x is the effect of the move, v_t is the object position on the relevant axis at time t, and v̄_t is the average object position on that axis across all of the candidate moves. So, for example, high weights for the realist identity will cause the actor to have little tolerance for losses along the fear axis, while a moderately liberal actor will tolerate moderate losses along the interest axis. If this expression evaluates as true for the best move (meaning the move would breach the constraint), the model instead selects the best move for which the expression evaluates as false.
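A sketch of this guardrail in Python, treating the expression above as the violation test and using the population standard deviation; the function name and numbers are assumptions for illustration, not the Excel model's exact code:

```python
import statistics

def guardrail_ok(v_t, alpha, outcomes, w):
    """True when applying effect alpha keeps the axis value at or above the
    permissible floor: the mean move outcome minus the outcomes' standard
    deviation scaled by (1 - w)."""
    floor = statistics.mean(outcomes) - statistics.pstdev(outcomes) * (1 - w)
    return v_t + alpha >= floor

# Candidate positions after each move on one axis (illustrative):
outcomes = [4.0, 5.0, 6.0]
ok_low_weight = guardrail_ok(5.0, -0.5, outcomes, w=0.1)    # tolerant
ok_high_weight = guardrail_ok(5.0, -0.5, outcomes, w=0.9)   # strict
```

A high identity weight shrinks the tolerance band, so the same loss that passes for a weakly identified actor is rejected for a strongly identified one.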

When each actor has selected their best move, the object’s new location for axis v at time t+1 is determined by summing the effects on v of the selected moves and adding that sum to the object’s old location at time t such that:

v_{t+1} = v_{t+1}^{actor_A} + v_{t+1}^{actor_B} − v_t

where v_{t+1}^{actor_A} is the object's location after only Actor A's move, v_{t+1}^{actor_B} is the object's location after only Actor B's move, and v_{t+1} is the location after both moves (i.e., after the moves are "resolved").

Altogether, the move selection algorithm is described below. It is written like a Python function, but the variable names retain the mathematical notation and indices so the reader can follow the variables through the loops more easily. The model iterates this function 24 times for each object to shift each location from l_0 to l_24 by the end of the simulation.

    actors = {Actor A, Actor B}
    w^actor_A = [r, c, l]
    v_t = {f_t, h_t, i_t}
    l_t = [f_t, h_t, i_t]
    guardrails = "on"    # can be turned "off" to let the actors select moves
                         # without considering their effect on Actor A's resources

    for a_t^v in A^v:
        V_t^v.append(v_t + a_t^v)    # adds each candidate position v_t + a_t^v to set V_t^v
    V_t^actor_x.append(V_t^v)        # adds V_t^v to V_t^actor_x
    U_t = []

    for n in range(len(A[1])):
        U_t^n = r / (|V_t^actor_x[0][n]| + .01) + c / (|V_t^actor_x[1][n]| + .01) + l / (|V_t^actor_x[2][n]| + .01) * ε
        U_t.append(U_t^n)
    α_t^best_x = U_t.index(max(U_t))    # finds the index of the best (highest-utility) move

    V_{t+1}^actor_x = [[V_t^actor_x[0][α_t^best_x]], [V_t^actor_x[1][α_t^best_x]], [V_t^actor_x[2][α_t^best_x]]]
    moves.append(V_{t+1}^actor_x)
    l_{t+1} = []

    for v in v_{t+1}:
        v_{t+1} = moves[0][0] + moves[1][0] − l_t
        l_{t+1}.append(v_{t+1})    # makes l_{t+1} = [f_{t+1}, h_{t+1}, i_{t+1}]
    return l_{t+1}
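The selection-and-resolution logic can be turned into runnable Python roughly as follows. This is a sketch with illustrative numbers: the ε term and the guardrail check are omitted for brevity, and the function and variable names are simplified, so it is not the Excel model's exact implementation:

```python
def select_move(location, weights, effects):
    """Pick the highest-utility move for one object.

    location: [f, h, i] current threat levels
    weights:  [r, c, l] identity weights
    effects:  effects[axis][move], per-axis effect of each move
    """
    n_moves = len(effects[0])
    # Candidate locations after each move (the set V_t in the pseudocode).
    candidates = [[location[ax] + effects[ax][m] for ax in range(3)]
                  for m in range(n_moves)]
    utilities = [sum(w / (abs(c[ax]) + .01) for ax, w in enumerate(weights))
                 for c in candidates]
    best = utilities.index(max(utilities))
    return best, candidates[best]

def resolve(location, after_a, after_b):
    """Object location after both actors' moves: v_A + v_B - v_t per axis."""
    return [a + b - v for v, a, b in zip(location, after_a, after_b)]

effects = [[-4.0, 0.0], [0.0, -3.0], [-1.0, -1.0]]    # illustrative
best, after_a = select_move([5.0, 5.0, 5.0], [60, 20, 20], effects)
new_location = resolve([5.0, 5.0, 5.0], after_a, [5.0, 2.0, 4.0])
```

Iterating select_move and resolve 24 times per object, as the model does, yields each actor's course of action over the simulation horizon.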

In the Excel model, this algorithm is written in matrix form and repeated for each time t from t=0 until t=24. This allows the actors to select 24 independent moves for each object. Together, these moves are considered the actors’ courses of action, which are the model’s fundamental outputs. They are listed in the outputs page, which also contains graphs illustrating how Actor A’s resource deployment changed over time.

Future Research

CAS represents an early attempt to integrate complexity, algorithmic agent-based models, and international relations theory into an analytical tool. Complexity theory and its expanding array of statistical and computational methods provide rich accounts of the world's interactive and emergent relationships, such as networks (e.g., see Slaughter 2009), political ecologies, and dynamic systems. Complex adaptive systems and insights from evolution and genetic adaptation could empower analysts to study how states adapt their processes to accommodate new circumstances, such as changes in national power or new domestically driven political mandates. Moreover, because states often act and react to one another's decisions in intersubjective ways, it would be valuable to simulate states autonomously adapting their specifications over time to optimize their security outcomes within a broader security environment. The lab will continue to refine and expand the model as new data and theories emerge. States' iterative and relative decision-making models still require further theoretical description and data collection for validation, but deriving a generalized model such as CAS is an important interim step toward advancing the strategic modeling enterprise.