Tag Archives: Machine Learning

Computational Military Reasoning Part 4: Learning

In my previous three posts on computational military reasoning (tactical artificial intelligence) I introduced my algorithms for detecting the absence or presence of anchored and unanchored flanks, interior lines and restricted avenues of attack (approach) and retreat. In this post I present my doctoral research, TIGER: An Unsupervised Machine Learning Tactical Inference Generator [1], which uses these algorithms, and others, in the construction of an unsupervised machine learning program that is able to classify the current tactical situation (battlefield) in the context of previously observed battles. In other words, it learns and it remembers.

‘Machine learning’ is the computer science term for learning software. (In computer science ‘machine’ often means ‘software’ or ‘program’, ever since the ‘Turing Machine’ [2], which was not a physical machine but an abstract thought experiment.)

There are two forms of machine learning: supervised and unsupervised machine learning. Supervised machine learning requires a human to ‘teach’ the software. An example of supervised machine learning is the Netflix recommendation system. Every time you watch a show on Netflix you are teaching their software what you like. Well, theoretically. Netflix recommendations are often laughably terrible (no, I do not want to see the new Bratz kids movie regardless of how many times you keep recommending it to me).

Unsupervised machine learning is a completely different animal. Without human intervention an unsupervised machine learning program tries to make sense of a series of ‘objects’ that are presented to it. For the TIGER / MATE program, these objects are battles and the program classifies them into similar clusters. In other words, every time TIGER / MATE ‘sees’ a new tactical situation it asks itself if this is something similar (and how similar) to what it’s seen before or is it something entirely new?

I use convenient terms like ‘a computer tries to make sense of’ or a ‘computer sees’ or a ‘computer thinks’ but I’m not trying to make the argument that computers are sentient or that they see or think. These are just linguistic crutches that I employ to make it easier to write about these topics.

So, a snapshot of a battle (the terrain, elevation and unit positions at a specific time) is an ‘object’ and this object is described by a number of ‘attributes’. In the case of TIGER / MATE, the attributes that describe a battle object are:

  • Interior Line Value
  • Anchored / Unanchored Flank Value
  • REDFOR (Red Forces) Choke Points Value
  • BLUEFOR (Blue Forces) Choke Points Value
  • Weighted Force Ratio
  • Attack Slope

The algorithms for calculating the metrics for the first four attributes were discussed in the three previous blog posts cited above. The algorithms for calculating the Weighted Force Ratio and Attack Slope metrics are straightforward: Weighted Force Ratio is the strength of Red over the strength of Blue weighted by unit type and the Attack Slope is just that: the slope (uphill or downhill) that the attacker is charging over.
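A minimal sketch of these last two attributes might look like this in Python. The unit-type weights and the rise-over-run definition of slope are assumptions made for illustration; the post does not give the values or formulas TIGER / MATE actually uses, and all names below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical unit-type weights, chosen only for illustration;
# the weights TIGER / MATE actually uses are not given in this post.
UNIT_WEIGHTS = {"infantry": 1.0, "cavalry": 1.5, "artillery": 2.0}


@dataclass
class Unit:
    side: str        # "RED" or "BLUE"
    unit_type: str   # key into UNIT_WEIGHTS
    strength: int    # number of men (or guns, vehicles, ...)


def weighted_strength(units, side):
    """Sum of strength * unit-type weight for one side."""
    return sum(u.strength * UNIT_WEIGHTS[u.unit_type]
               for u in units if u.side == side)


def weighted_force_ratio(units):
    """Weighted strength of Red over weighted strength of Blue."""
    blue = weighted_strength(units, "BLUE")
    if blue == 0:
        raise ValueError("BLUEFOR has no strength; ratio is undefined")
    return weighted_strength(units, "RED") / blue


def attack_slope(attacker_elevation, defender_elevation, distance):
    """Rise over run along the line of attack.

    Positive means the attacker is charging uphill, negative downhill.
    All three arguments are in the same unit (for example, meters).
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return (defender_elevation - attacker_elevation) / distance


units = [
    Unit("RED", "infantry", 3000),
    Unit("RED", "artillery", 500),
    Unit("BLUE", "infantry", 2000),
]
print(weighted_force_ratio(units))         # (3000*1.0 + 500*2.0) / 2000 = 2.0
print(attack_slope(100.0, 130.0, 600.0))   # positive, so the attack is uphill
```

A ratio above 1.0 means Red is the stronger side once unit types are taken into account.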

TIGER / MATE constructs a hierarchical tree of battlefield snapshots. This tree represents the relationship and similarity of different battlefield snapshots. For example, two battlefield situations that are very similar will appear in the same node, while two battlefield situations that are very different will appear in disparate nodes. This will be easier to follow with a number of screen shots. Unfortunately, we first have to introduce the Category Utility Function.

So, first let me apologize for all the math. It isn’t necessary for you to understand how the TIGER / MATE unsupervised machine learning process works, but if I don’t show it I’m guilty of this:

The Category Utility Function (or CU, for short) is the equation that determines how similar or dissimilar two objects (battlefields) are. This is the CU function:

‘Acuity’ is the concept of the minimum value that separates two ‘instances’ (in our case, battles). It has to have a value of 1.0 or very bad things will happen.
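For readers who prefer code to equations, here is a minimal sketch of category utility for numeric attributes, in the form usually given for the CLASSIT family of algorithms: for each cluster, compare how tightly each attribute is grouped inside the cluster (1/σ) with how tightly it is grouped in the parent, weight by the cluster's share of the instances, and average over the clusters. It is an assumption that TIGER / MATE uses exactly this form, and the function names are hypothetical. The sketch also shows one likely reading of the ‘very bad things’: a node holding a single battle has a standard deviation of zero, and without the acuity floor the 1/σ term divides by zero.

```python
import math
from statistics import pstdev

ACUITY = 1.0  # floor applied to every standard deviation


def attribute_sigmas(instances, acuity=ACUITY):
    """Per-attribute standard deviation of a group of instances,
    never allowed to drop below the acuity."""
    columns = zip(*instances)
    return [max(pstdev(col), acuity) for col in columns]


def category_utility(clusters, acuity=ACUITY):
    """CU = (1/K) * sum_k P(C_k) * sum_i (1/sigma_ik - 1/sigma_ip) / (2*sqrt(pi))

    clusters: list of K clusters, each a non-empty list of equal-length
    numeric attribute vectors. sigma_ik is the deviation of attribute i
    inside cluster k; sigma_ip is its deviation in the parent (all
    instances pooled together).
    """
    parent = [inst for cluster in clusters for inst in cluster]
    parent_sigmas = attribute_sigmas(parent, acuity)
    total = 0.0
    for cluster in clusters:
        p_k = len(cluster) / len(parent)
        sigmas = attribute_sigmas(cluster, acuity)
        gain = sum(1.0 / s - 1.0 / sp for s, sp in zip(sigmas, parent_sigmas))
        total += p_k * gain
    return total / (len(clusters) * 2.0 * math.sqrt(math.pi))


# Grouping like with like scores higher than mixing unlike objects.
tight = [[(1.0, 1.0), (2.0, 2.0)], [(20.0, 20.0), (21.0, 21.0)]]
mixed = [[(1.0, 1.0), (20.0, 20.0)], [(2.0, 2.0), (21.0, 21.0)]]
print(category_utility(tight) > category_utility(mixed))  # True
```

Calling `category_utility([[(1.0, 1.0)], [(5.0, 5.0)]], acuity=0.0)` raises `ZeroDivisionError`, which is the failure the acuity floor prevents.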

 

So, let’s recap what we’ve got:

  • A series of algorithms that analyze a battlefield and return values representing various conditions that Subject Matter Experts (SMEs) agree are significant (flanks, attack and retreat routes, unit strengths, etc.).
  • A Category Utility Function (CU) that uses the products of these algorithms to determine how similar analyzed battlefields are.

So now, we just need to put this all together. A battlefield (tactical situation) is analyzed by TIGER / MATE. It is ‘fed’ into the unsupervised machine learning function and, using the Category Utility Function, one of four things happens:

  1. All the children of the parent node are evaluated using the CU function and the object (tactical situation) is added to the existing node with the best score.
  2. The object is placed in a new node all by itself.
  3. The two top-scoring nodes are combined into a single node and the new object is added to it.
  4. A node is divided into several nodes with the new object added to one of them.
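The four options above can be sketched as a single scoring step at one parent node. This is a simplified illustration under stated assumptions, not TIGER's code: clusters are plain Python lists of attribute tuples, the CU function is the common CLASSIT-style form with an acuity of 1.0, a split is only tried on the best-scoring child, and every function name is hypothetical.

```python
import math
from statistics import pstdev

ACUITY = 1.0


def sigmas(instances):
    return [max(pstdev(col), ACUITY) for col in zip(*instances)]


def category_utility(clusters):
    parent = [inst for c in clusters for inst in c]
    sp = sigmas(parent)
    total = sum(
        len(c) / len(parent) * sum(1 / s - 1 / p for s, p in zip(sigmas(c), sp))
        for c in clusters
    )
    return total / (len(clusters) * 2 * math.sqrt(math.pi))


def with_added(clusters, index, x):
    """Copy of clusters with x appended to clusters[index]."""
    return [c + [x] if i == index else list(c) for i, c in enumerate(clusters)]


def choose_operator(children, subclusters, x):
    """Score the four options for placing x under one parent node.

    children:    non-empty list of clusters (lists of attribute tuples)
    subclusters: subclusters[i] is the list of clusters that children[i]
                 is itself divided into ([] if it is a leaf)
    Returns (operator_name, resulting_partition).
    """
    # 1. Try adding x to each existing child; remember the ranking.
    added = [(category_utility(with_added(children, i, x)), i)
             for i in range(len(children))]
    added.sort(reverse=True)
    best_cu, best = added[0]
    options = [(best_cu, "add", with_added(children, best, x))]

    # 2. Put x in a new node all by itself.
    alone = [list(c) for c in children] + [[x]]
    options.append((category_utility(alone), "new", alone))

    # 3. Merge the two top-scoring children and add x to the result.
    if len(children) > 2:
        second = added[1][1]
        rest = [list(c) for i, c in enumerate(children) if i not in (best, second)]
        merged = rest + [children[best] + children[second] + [x]]
        options.append((category_utility(merged), "merge", merged))

    # 4. Split the best child into its subclusters, then add x to
    #    whichever of them scores best.
    if len(subclusters[best]) > 1:
        rest = [list(c) for i, c in enumerate(children) if i != best]
        promoted = rest + [list(s) for s in subclusters[best]]
        candidates = [with_added(promoted, j, x)
                      for j in range(len(rest), len(promoted))]
        split = max(candidates, key=category_utility)
        options.append((category_utility(split), "split", split))

    cu, name, partition = max(options, key=lambda o: o[0])
    return name, partition


children = [[(1, 1), (2, 2)], [(20, 20), (21, 21)]]
leaves = [[], []]
print(choose_operator(children, leaves, (1.5, 1.5))[0])   # add
print(choose_operator(children, leaves, (100, 100))[0])   # new
```

An object close to an existing group joins it, while an object unlike anything seen so far gets a node of its own. In a full implementation the same step would then be repeated inside the chosen child, which is what produces the hierarchy.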

These rules (above) construct a hierarchical tree structure. TIGER was fed 20 historical tactical situations (below):

  1. Kasserine Pass February 14, 1943
  2. Kasserine Pass February 19, 1943
  3. Lake Trasimene, 217 BCE
  4. Shiloh Day 2
  5. Shiloh Day 1, 0900 hours
  6. Shiloh Day 1, 1200 hours
  7. Antietam 0600 hours
  8. Antietam 1630 hours
  9. Fredericksburg, December 10
  10. Fredericksburg, December 13
  11. Chancellorsville May 1
  12. Chancellorsville May 2
  13. Gazala
  14. Gettysburg, Day 1
  15. Gettysburg, Day 2
  16. Gettysburg, Day 3
  17. Sinai, June 5
  18. Waterloo, 1000 hours
  19. Waterloo, 1600 hours
  20. Waterloo, 1930 hours

In addition to these 20 historical tactical situations, five hypothetical situations, labeled A-E, were created. This is the resulting tree which TIGER created:

The hierarchical tree created by TIGER from 20 historical and 5 hypothetical tactical situations. The numbers in the nodes refer to the above legend. Battles placed in the same nodes are considered very similar by TIGER.

If we look at the tree that TIGER constructed we can see that it placed Shiloh Day 1 0900 hours and Shiloh Day 1 1200 hours together in cluster C35. Indeed, as we look around the tree we observe that TIGER did a remarkable job of analyzing tactical situations and placing like with like. But, that’s easy for me to say, I wrote TIGER. My opinion doesn’t count. So we asked 23 SMEs which included:

  • 7 Professional Wargame Designers
  • 14 Active duty and retired U. S. Army officers including:
  • Colonel (Ret.) USMC infantry 5 combat tours, 3 advisory tours
  • Maj. USA. (SE Core) Project Leader, TCM-Virtual Training
  • Officer at TRADOC (U. S. Army Training and Doctrine Command)
  • West Point; Warfighting Simulation Center
  • Instructor, Dept of Tactics Command & General Staff College
  • PhD student at RMIT
  • Tactics Instructor at Kingston (Canada)

And, in a blind survey, we asked them not what TIGER did but what they would do. For example:

Twenty-three SMEs were asked this question: is this hypothetical tactical situation (top) more like Kasserine Pass or Gettysburg?

And this is how they responded:

Results from 23 SMEs answering the above question. Overwhelmingly the SMEs agreed that the hypothetical tactical situation was most like the battle of Kasserine Pass.

So, 91.3% of SMEs agreed that the hypothetical tactical situation was more like Kasserine Pass than Gettysburg Day 1. Unbeknownst to the SMEs TIGER had already classified these three tactical situations like this:

How TIGER classified Kasserine Pass (1), Gettysburg Day 1 (14) and a hypothetical tactical situation (B). The cluster C1 contains two tactical situations that both have restricted avenues of attack caused by armor traveling through narrow mountainous passes. These passes also partially create restricted avenues of retreat. REDFOR does not have anchored flanks.
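As an arithmetic aside on the survey result above: with 23 respondents, 21 is the only count that rounds to 91.3%, so the count of 21 below is inferred rather than quoted. The short sketch also computes an exact one-sided binomial tail, that is, how likely such agreement would be if each SME were simply flipping a coin between the two battles. This is a different and simpler check than the Wald test used in the thesis, and is included only as an illustration.

```python
from math import comb

respondents = 23
chose_kasserine = 21  # inferred: the only count out of 23 that rounds to 91.3%

share = chose_kasserine / respondents
print(f"{share:.1%}")  # 91.3%

# Exact one-sided binomial tail: probability that 21 or more of 23
# respondents pick the same battle if each chooses by coin flip.
p_value = sum(comb(respondents, k)
              for k in range(chose_kasserine, respondents + 1)) / 2 ** respondents
print(p_value)
```

The tail works out to 277 chances in 8,388,608, so agreement at this level is very unlikely to be a coincidence.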

In conclusion: over the last four blog posts about Computational Military Reasoning we have demonstrated:

  • Algorithms for analyzing a battlefield (tactical situation).
  • Algorithms for implementing offensive maneuvers.
  • An Unsupervised Machine Learning system for classifying tactical situations and clustering like situations together. Furthermore, this system is never-ending and as it encounters new tactical situations it will continue this process which enables the AI to plan maneuvers based on previously observed and annotated situations.

This is the AI that will be used in General Staff. It is unique and revolutionary. No computer military simulation – either commercially available or any military simulation used by any of the world’s armies – employs an AI of this depth.

As always, please feel free to contact me directly with questions or comments. You can use our online email form here or write to me directly at Ezra [at] RiverviewAI.com.

References

1. TIGER: An Unsupervised Machine Learning Tactical Inference Generator which can be downloaded here
2. https://en.wikipedia.org/wiki/Turing_machine

Computational Military Reasoning (Tactical Artificial Intelligence) Part 2

In my last blog post I described how the TIGER / MATE programs classified battles (in computer science terms ‘objects’) based on attributes and that anchored or unanchored flanks was one such attribute. After demonstrating the algorithm for calculating the presence or absence of anchored flanks we saw how the envelopment and turning tactical maneuvers were implemented. In this blog post we will look at another attribute: restricted avenues of attack and restricted avenues of retreat.

The only avenue of retreat from the Battle of First Bull Run back to Washington was over a narrow Stone Bridge. When a wagon overturned panic ensued. Library of Congress.

One classic example of a restricted avenue of retreat was the narrow stone bridge crossing Cub Run Creek which was the only eastern exit from the First Bull Run battlefield. The entire Union army would have to pass over this bridge as it fell back on Washington, D.C. When artillery fire caused a wagon to overturn and block the bridge, panic ensued.

At the battle of Antietam Burnside tried to force his entire corps over a narrow bridge to attack a Confederate position on the hill directly above. The bridge was famously carried by the 51st New York Infantry and 51st Pennsylvania Infantry who demanded restoration of their whiskey rations in return for this daring charge. From the original Edwin Forbes drawing.

Burnside’s Bridge at the battle of Antietam is a famous example of a restricted avenue of attack. Burnside was unaware that Snavely Ford was only 1.4 miles south of the stone bridge and allowed a back door into the Confederate position. Consequently, he continued to throw his corps across the bridge with disastrous results.

How to determine if there is a restricted line of attack or a restricted line of retreat on a battlefield

From the perspective of computer science restricted avenues of retreat and restricted avenues of attack are basically the same problem and can be solved with a similar algorithm.

As before we must first establish that there is agreement among Subject Matter Experts (SMEs) of the existence of – and the ability to quantify – the attributes of ‘Avenue of Attack’, ‘Avenue of Retreat’ and ‘Choke Point’.

The following slides are from an unclassified briefing that I gave to DARPA (the Defense Advanced Research Projects Agency) on my MATE program (funded by DARPA research grant W911NF-11-200024):

Now that we have determined if there is a restricted avenue of attack the next blog post will discuss what to do with this information; specifically the implementation of the infiltration and penetration offensive maneuvers.

As always, if you have any questions please feel free to email me.

References:

TIGER: An Unsupervised Machine Learning Tactical Inference Generator, Sidran, D. E. Download here.

A Wargame 55 Years in the Making (Part 5)

In June, 2009 I successfully defended my thesis and was awarded a doctorate of computer science by the University of Iowa. What followed were some of the most productive years of my professional career. I designed, programmed, project managed and was principal investigator on:

MARS: Military Advanced Real-time Simulator (2009)

OneSAF is the “Semi Automated Forces” wargame / simulator used for training by the US Army. It relies on ‘pucksters’ (see pucksters in this blog) who are usually retired military officers who make all the moves for OPFOR (Opposition Forces). MARS provided an intuitive Graphical User Interface (GUI) for the modification and running of OneSAF scenarios.

Screen capture of the MARS project for the US Army. MARS was a front end to facilitate creating and managing scenarios run on the Army’s OneSAF military simulator. The ‘Magic Bomb’ option is the puckster’s term for ‘magically’ removing a unit from the simulation.

CAPTURE: Cognitive and Physiological Testing Urban Research Environment (2010)

While on the surface CAPTURE appears to be a standard ‘shooting gallery’ program it was actually designed to test and store data about how returning veterans saw targets, ‘spiraled in’ on targets and reacted. There were two parts to CAPTURE: the first allowed the tester to set up any particular scenario they wanted (top image, below) and the second part (bottom image, below) was run using a projector, a large screen, an M16 air soft gun with Wii remote and laser mounted to the barrel and an IR camera. CAPTURE was done for the Office of Naval Research (Marines).

Two screens showing the CAPTURE program. The top screen shows the interface for creating target scenarios. The bottom screen is one of the shooting ranges generated by CAPTURE.

NexGEN Behavior Composer (2011)

NexGEN Behavior Composer was another front-end project for OneSAF. Enemy units in OneSAF use scripted AI behavior written in XAML. These AI scripts often contained errors. NexGEN allowed the puckster to select a behavior from a hierarchical tree structure (top image, below) and click and drag it to a composing canvas where a series of behaviors could be joined together (bottom image, below). The artwork for the behaviors was done by my old friend, Ed Isenberg, who has worked with me on games since the ’80s.

Screen shot of NexGEN Behavior Composer which facilitated creating OneSAF behaviors by clicking and dragging behavior icons. This is the hierarchical tree structure of behavior primitives.

An example of a OneSAF behavior composed of NexGEN behavior icons.

A series of behaviors have been placed together to create a complex behavior (a unit fires, conducts reconnaissance, waits for one minute, moves and then occupies a position).

MATE: Machine Analysis of Tactical Environments (2012)

Funded by a DARPA (Defense Advanced Research Projects Agency) research grant (W911NF-11-200024), MATE added a new level of battlefield analysis to the TIGER project. Building on the previous nine years of research, MATE had the capability of generating a series of ‘predicate statements’ that described the battlefield and then using them to construct a hypothetical syllogism that resulted in a precise Course of Action (COA) for BLUEFOR (US forces). MATE then output this COA as an HTML file and automatically launched a browser to view the COA. MATE was designed to be available to commanders in the field via a small handheld device like a tablet. It was able to perform battlefield analysis in less than 10 seconds.
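To give a flavor of how predicate statements can be chained into a conclusion, here is a toy sketch of a hypothetical syllogism (from ‘if P then Q’ and ‘if Q then R’, conclude ‘if P then R’). The predicate names and rules are invented purely for illustration; MATE's real predicate vocabulary and rule base are not listed in this post.

```python
# Hypothetical predicate names and rules, invented for illustration only.
RULES = {
    "red_right_flank_unanchored": "blue_can_envelop_red_right",
    "blue_can_envelop_red_right": "coa_right_flank_envelopment",
    "red_avenue_of_retreat_restricted": "blue_can_cut_red_retreat",
}


def chain(predicate, rules):
    """Follow 'if P then Q' links from predicate until no rule applies.

    This is the hypothetical syllogism applied repeatedly:
    from P -> Q and Q -> R, conclude P -> R.
    """
    steps = [predicate]
    while steps[-1] in rules:
        nxt = rules[steps[-1]]
        if nxt in steps:          # guard against circular rules
            break
        steps.append(nxt)
    return steps


def courses_of_action(facts, rules):
    """Conclusions reachable from the observed facts that name a COA."""
    coas = []
    for fact in facts:
        conclusion = chain(fact, rules)[-1]
        if conclusion.startswith("coa_") and conclusion not in coas:
            coas.append(conclusion)
    return coas


facts = ["red_right_flank_unanchored", "red_avenue_of_retreat_restricted"]
print(chain(facts[0], RULES))
print(courses_of_action(facts, RULES))  # ['coa_right_flank_envelopment']
```

The facts would come from the battlefield analysis algorithms (flanks, choke points and so on); only chains that end in a course of action are reported.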

Consider this real-world situation from the Battle of Marjah:

Given the same data that the commander had in the above video MATE returned this COA (complete with unit paths and ETAs):

MATE’s analysis and COA for the Battle of Marjah: a right-flank envelopment maneuver with two infantry platoons while a fixing force of the mortar platoon and a third infantry platoon kept the enemy’s attention.

To see the entire MATE analysis and COA results for the Battle of Marjah click here (this will load a PDF of MATE’s HTML output in a new tab).


Unfortunately, about the time that I demonstrated MATE to DARPA a series of unfortunate events occurred that were to change my life. The United States Congress passed the Sequestration Transparency Act of 2012. This resulted in a 10% across the board cut in all federal spending. DARPA seemed especially hard hit and they stopped all funding for 4CI (Command, Control, Communications, Computers and Intelligence) research. Only a few years after receiving my doctorate, specifically so I could be the Principal Investigator on government funded 4CI research, I was out of a job.

Without any research funding, and not wanting to relocate, I returned to the University of Iowa as a Visiting Assistant Professor teaching Computer Game Design and CS1. I love teaching. And I am extraordinarily proud of receiving the highest student evaluations in the Department of Computer Science, but I didn’t have as much strength as I used to have. I found myself out of breath and exhausted after a lecture. And then my kidneys began to inexplicably fail.

In 2013, because of the efforts of superb doctors Kelly Skelly and Joel Gordon at the University of Iowa Hospital, I was diagnosed with a very rare and usually fatal blood disease, AL amyloidosis. In 2014, thanks to the Affordable Care Act, I was hospitalized for 32 days, my immune system was purposely destroyed and I received an autologous bone marrow / stem cell transplant. This was followed up by a year of chemotherapy. Being severely immunocompromised I have contracted pneumonia six times in the last two years. Now, against the odds (and I’m a guy that deals with probabilities a lot so I’m being literal) I’ve completely recovered. My kidneys and lungs are permanently damaged but I’m not going to die from this disease. But, it also means I can’t teach anymore, either.

Luckily, I can still sit at a desk and write computer code. General Staff is my return to writing a commercial computer wargame and it will be the first commercial implementation of my tactical AI algorithms that I have been developing since 2003.

I need to produce a game that you grognards want. And, that means next week I will be posting a new gameplay survey to pin down exactly what features you want to see in the new game. As always, please feel free to contact me directly (click here) if you have any questions or comments.

A Wargame 55 Years in the Making (Part 4)

It’s easy to say that you want to create an artificial intelligence that is capable of making human-level tactical decisions but that’s just not how it’s done in academia. First off, the term ‘human-level’ is vague. And then there’s the question of how do you prove your claim? I am indebted to Professor (now Dean) Joe Kearney who proposed the following hypotheses to state my doctoral thesis:

Hypothesis 1:  There is agreement among military experts that tactical situations exhibit certain features (or attributes) and that these features can be used by SMEs (Subject Matter Experts) to group tactical situations by similarity.

Hypothesis 2:  The best match by TIGER (the Tactical Inference Generator  program) of a new scenario to a scenario from its historical database predicts what the experts would choose.

In other words, a preponderance of SMEs will describe a tactical situation in the same way (like ‘Blue has a severely restricted line of retreat’ or ‘Red has anchored flanks’) and a computer program can be written that will describe the same battlefield in the same way as the human experts. And, if TIGER correctly predicts what the SMEs would choose in four out of five tests (a one-sided Wald test resulted in p = 0.0001, which is statistically significant), you get a PhD in Computer Science. By the way, I am also indebted to Dr. Joseph Lang of the Department of Statistics and Actuarial Science at the University of Iowa who calculated the p value.

An example of an online survey tactical description question asked of Subject Matter Experts. Image from Sidran’s TIGER: A Tactical Inference Generator.

The results of the above survey question: 100% of SMEs agree that RED’s left flank is anchored on the Potomac; 79% agree that RED’s right flank is anchored on the Antietam. Image from Sidran’s TIGER: A Tactical Inference Generator.

The descriptors (features or attributes) that were identified by the SMEs included Anchored Flanks, Unanchored Flanks, Restricted Avenues of Attack, Unrestricted Avenues of Attack, Restricted Avenues of Retreat, Unrestricted Avenues of Retreat and Interior Lines. If you are interested in the methodology and algorithms there are links at the end of this post.

Now that the features have been identified (and algorithms written and tested that return a value representing the extent of the attribute), the next step in separating battlefield situations into categories is creating a machine learning classifier program.

There are two kinds of machine learning programs: supervised and unsupervised. Imagine two kinds of fish coming down a conveyor belt with a human being watching this on a TV with two buttons to push. If he pushes the button on the left the fish is classified as, say, ‘tuna’. And if he pushes the button on the right the fish is classified as a ‘catfish’. (Why tuna and catfish are coming down this conveyor belt is beyond me, but please stay with the explanation.) In this way the program is taught the difference between a tuna and a catfish (tuna are bigger and longer). This is called supervised learning and is the method used by Netflix and Spotify to select movies or songs that are similar to choices you have previously made. I don’t like supervised systems because they have to be ‘trained’ and, in my opinion, have a tendency to oversimplify classification problems (for example, Netflix movie suggestions are usually awful).

Unsupervised machine learning works differently: there are a number of ‘objects’ that are described with certain ‘attributes’. These objects are presented to the ‘machine’ and the machine separates them into categories based on the likelihood (probability) that they belong together and then displays the results in a hierarchical tree structure. This is how the TIGER program works. The ‘objects’ are battlefield maps and the attributes are things like ‘anchored flanks’ and ‘restricted lines of attack’.

In the image, below, one branch of a tree structure of classified battles (both real and hypothetical) is displayed:

TIGER has classified four battlefield snapshots (Lake Trasimene 217 BC, Antietam 0600 hours, Antietam 1630 hours and a test battlefield submitted to TIGER and the SMEs) as being similar. Note how TIGER sees the two Antietam snapshots as ‘more similar’ and puts them in their own node. Image taken from TIGER: An Unsupervised Machine Learning Tactical Inference Generator.

The method that TIGER uses to classify battlefields is Gennari, Fisher and Langley’s ClassIT algorithm which is explained in detail in my thesis (link below). Basically, a number of objects are ‘fed’ to the machine (in computer science the terms ‘machine’ and ‘program’ are synonymous) and every time the machine ‘consumes’ an object it asks itself, “Does this new object belong in a previously existing category, or a new category, or should I combine two existing categories and add this new object, or should I split an existing category and add this new object to one of them?” Caveat: just as I explained in the previous blog, computers don’t actually ‘say’ things or ‘ask themselves’ anything but it’s easier to explain these processes using those terms. This is unsupervised because there is no human involvement or training. And there is no limit to the number of objects (battlefields) that can be shown to the program. TIGER is constantly learning.

Below is an example blind survey question given to more than 20 SMEs to validate TIGER’s ability to predict what the majority of SMEs would choose. My good friend, Ralph Sharp, who has worked on art for many of my games, did the hypothetical battlefield maps.

An example of the blind survey questions asked of SMEs: is the hypothetical battlefield situation on the top more like the historical battlefield in the middle (Kasserine Pass) or the historical battlefield at the bottom (Gettysburg)?

The results show a statistically significant number of SMEs are in agreement that the hypothetical battlefield situation most closely resembles Kasserine Pass.

The results from the above blind survey question. The SMEs overwhelmingly state that the battle of Kasserine Pass most resembles the hypothetical battle situation. The TIGER program also chose ‘Kasserine Pass’.

Once again this week’s post ran longer than I anticipated. It looks like this story won’t conclude at least until Part 5. It has been said that by the time a PhD dissertation is defended only five people in the world are capable of understanding it. I certainly hope that wasn’t the case with my research. Below is a link to download a PDF of my thesis. Please feel free to contact me directly if I can answer any questions.

Lastly, my good friend, Mike Morton, sent me a link to this piece just before my defense: The “Snake Fight” Portion of Your Thesis Defense. Anybody thinking of getting a PhD should probably read this first (and laugh and then cry).


Papers that were cited in this post with download links:

“Algorithms for Generating Attribute Values for the Classification of Tactical Situations.”

In PDF Format

TIGER: An Unsupervised Machine Learning Tactical Inference Generator.

In PDF Format