Conversations on Strategy Podcast



Conversations on Strategy Podcast – Ep 21 – C. Anthony Pfaff and Christopher J. Lowrance – Trusting AI: Integrating Artificial Intelligence into the Army’s Professional Expert Knowledge

June 22, 2023

Integrating artificially intelligent technologies for military purposes poses a special challenge. In previous arms races, such as the race to atomic bomb technology during World War II, expertise resided within the Department of Defense. But in the artificial intelligence (AI) arms race, expertise dwells mostly within industry and academia. Also, unlike the development of the bomb, effective employment of AI technology cannot be relegated to a few specialists; almost everyone will have to develop some level of AI and data literacy. Complicating matters is the fact that AI-driven systems can be a “black box” in that humans may not be able to explain some outputs, much less be held accountable for their consequences. This inability to explain, coupled with the cession to a machine of some functions normally performed by humans, risks the relinquishment of some jurisdiction and, consequently, autonomy to those outside the profession. Ceding jurisdiction could impact the American people’s trust in their military and, thus, its professional standing. To avoid these outcomes, creating and maintaining trust requires integrating knowledge of AI and data science into the military’s professional expertise. This knowledge covers both AI technology and how its use impacts command responsibility; talent management; governance; and the military’s relationship with the US government, the private sector, and society.


Read the monograph: https://press.armywarcollege.edu/monographs/959/


Keywords: artificial intelligence (AI), data science, lethal targeting, professional expert knowledge, talent management, ethical AI, civil-military relations


Episode transcript:


Trusting AI: Integrating Artificial Intelligence into the Army’s Professional Expert Knowledge


Stephanie Crider (Host)

You’re listening to Conversations on Strategy. The views and opinions expressed in this podcast are those of the authors and are not necessarily those of the Department of the Army, the US Army War College, or any other agency of the US government.

Joining me today are Doctor C. Anthony Pfaff and Colonel Christopher J. Lowrance, coauthors of Trusting AI: Integrating Artificial Intelligence into the Army’s Professional Expert Knowledge with Bre Washburn and Brett Carey.

Pfaff, a retired US Army colonel, is the research professor for strategy, the military profession, and ethics at the US Army War College Strategic Studies Institute and a senior nonresident fellow at the Atlantic Council.

Colonel Christopher J. Lowrance is the chief autonomous systems engineer at the US Army Artificial Intelligence Integration Center.

Your monograph notes that AI literacy is critical to future military readiness. Give us your working definition of AI literacy, please.

Dr. C. Anthony Pfaff

AI literacy is aimed at our human operators (and that means commanders and staffs, as well as, you know, the operators themselves) being able to employ these systems in a way that not only optimizes the advantage these systems promise but also lets us be accountable for their output. That requires knowing things about how data is properly curated. It will include knowing things about how algorithms work, but, of course, not everyone can become an AI engineer. So, we have to figure out, at whatever level, given whatever tasks you have, what you need to know for these kinds of operations to be intelligent.

Col. Christopher J. Lowrance

I think a big part of it is also going to be educating the workforce. And that goes all the way from senior leaders down to the users of the systems. So, a critical part of it is understanding how AI-enabled systems can best fit in, the appropriate roles they can play, and how they can best team with or augment soldiers as they complete their tasks. And that’s going to take senior leader education coupled with different levels of technical expertise within the force, especially when it comes to employing and maintaining these types of systems, as well as down to the user who’s going to have to provide some level of feedback to the system as it’s being employed.

Host

Tell me about some of the challenges of integrating AI and data technologies.

Pfaff

What we tried to do is look at it from a professional perspective. And from that perspective (I’ll talk a little bit more about this later), in many ways there are lots of aspects of the challenge that aren’t really that different. We brought on tanks, airplanes, and submarines that all required new knowledge that not only led to changes in how we fight wars and the character of war but also to corresponding changes in doctrine and organizational culture, which we’re seeing with AI.

We’ve even seen some of the issues that AI brings up before, when we introduced automated technology, which, in reducing the cognitive load on operators, introduces concerns like accountability gaps and automation bias. Those arise because humans are just trusting the machine, or they don’t understand how the machine is working or how to do the process manually, and, as a result, they’re not able to assess its output. The paradigm example of that, of course, is the USS Vincennes incident, where you had an automated system that was providing plenty of information that should have caused a human operator not to permit shooting down what ended up being a civilian airliner. So, we’ve dealt with that in the past. AI kind of puts that on steroids.

There are two challenges that I think are unique to AI. First, data-driven systems can actually change in capabilities as you use them. For instance, a system that starts off able to identify, perhaps, a few high-value targets, over time, as it collects more data, gets more questions. And as humans see patterns, or as the machine identifies patterns and humans ask the machine to test them, you’re able to start discerning properties of organizations, both friendly and enemy, you wouldn’t have seen before. And that allows for greater prediction. What that means is that the same system, used in different places by different people with different tasks, is going to become a different system and have different capabilities over time.

The other thing that I think is happening is the way it’s changing how we’re able to view the battlefield. Rather than a cycle of intel driving ops, driving intel, and so on, with the right kind of sensors in place getting us the right kind of data, we’re able to get more of a real-time picture. The intel side can make assessments based on the friendly situation, and the friendly side can make targeting decisions and assessments about its own situation based on intel. So, that’s coming together in ways that are also pretty interesting, and I don’t think we’ve fully wrestled with that yet.

Lowrance

Yeah, just to echo a couple of things that Dr. Pfaff has alluded to here: overarching, I think the challenge is gaining trust in the system. And trust is really earned. It’s earned through use, for one. But you’ve got to walk in being informed, and that’s where the data literacy and the AI literacy piece comes in.

And as Dr. Pfaff mentioned, these data-driven systems, generally speaking, will perform based on the type of data they’ve been trained against and the types of scenarios in which that data was collected. And so, one of the big challenge areas is the adaptation over time. But they are teachable, so to speak. As you collect and curate new data examples, you can better inform the systems of how they should adapt over time. And that’s going to be really key to gaining trust. That’s where the users and the commanders of these systems need to understand some of the limitations of the platforms, their strengths, and also how to retrain or reteach the systems over time using new data so that they can more quickly adapt.

There are definitely some technical barriers to gaining trust, but they certainly can be overcome with the proper approach.

Host

What else should we consider, then, when it comes to developing trustworthy AI?

Pfaff

We’ve taken this from the professional perspective, and so we’re starting with an understanding of professions: a profession entails specialized knowledge that’s in service to some social good, which allows professionals to exercise autonomy over specific jurisdictions. An example, of course, would be doctors and the medical profession. They have specialized knowledge. They are certified in it by other doctors. They’re able to make medical decisions without nonprofessionals being able to override those.

So, the military is the same thing, where we have a particular expertise. And then the question is, how does the introduction of AI affect what counts as expert knowledge? Because that is the core functional imperative of the profession: being able to provide that service. In that regard, you’re going to look at the system. We need to be able to know, as professionals, that the system is effective and that it is also predictable and understandable, meaning I am able to replicate results and understand the ones that I get.

We also have to trust the professional. That means the professional has to be certified. And the big question is, as Chris alluded to, in what? Not just certified in the knowledge, but also responsible to norms and accountable. The reason for that is clients rely on professionals because they don’t have this knowledge themselves. Generally speaking, the client is not in a position to judge whether that diagnosis, for example, is good or not. They can go out and find another opinion, but then they’re going out to seek another professional. So, clients not only need to trust that the expert knows what they’re doing but also that there is an ethic that governs them and that they are accountable.

Finally, we have to trust the profession as an institution: that it actually has what’s required to conduct the right kinds of certification, as well as the institutions required to hold professionals accountable. So that’s the big overarching framework in which we’re trying to take up the differences and challenges that AI presents.

Lowrance

Like I mentioned earlier, I think it’s also about getting the soldiers and commanders involved early during the development process and gaining that invaluable feedback. So, an incremental rollout of AI-enabled systems is one aspect, or one way of looking at it. That way you can start to gauge and get a better appreciation and understanding of the strengths of AI and how best it can team with commanders and soldiers as they employ the systems. And that teaming can be adaptive. And I think it’s really important for commanders and soldiers to feel like they have some level of control over how best to employ AI-enabled systems and some kind of mechanism, let’s say, for how much they’re willing to trust the AI system at a given moment to perform a particular function based on the conditions.

As we know as military leaders, the environment can be very dynamic, and conditions change. If you look at the scale of operations from counterinsurgency to large-scale combat operations, those are different ends of a spectrum of the types of conflict our commanders and our soldiers on the ground might face with AI-enabled systems. And so, they need to adapt and have some level of control and different levels of trust in the system based on understanding that system, its limitations, its strengths, and so on.

Host

You touched on barriers just a moment ago. Can you expand a little bit more on that piece of it?

Lowrance

Oftentimes, when you look at it from the perspective of machine-learning applications, these are algorithms where the system is able to ingest data examples: basically, historical examples of past events and conditions. Just to make this a little more tangible, think of an object-recognition algorithm that can look at imagery (maybe geospatial imagery from satellites that have taken an aerial photo of the ground) and that you could train to look for certain objects, like airplanes. Over time, the AI learns to look for these based on the features of the examples within past imagery. With that, if you take that type of example data and the conditions of the environment change (maybe it’s the backdrop, or maybe it’s a different airstrip or a different type of airplane, or something else changes), then performance can degrade to some degree. And this goes back to adaptability.
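
A minimal sketch of that degradation-and-reteaching dynamic, using a toy classifier on synthetic data. The features, the scikit-learn model, and the amount of shift are illustrative assumptions, not a description of any fielded system:

```python
# Illustrative only: a toy "object recognizer" trained under one set of
# conditions, evaluated after conditions shift, then retrained on a handful
# of newly curated examples. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, shift=0.0):
    """Two classes ("background" vs. "airplane") in a 2-D feature space.
    `shift` moves the class centers, standing in for changed conditions
    such as a new backdrop or a different airstrip."""
    background = rng.normal(loc=[0.0 + shift, 0.0], scale=0.5, size=(n, 2))
    airplane = rng.normal(loc=[2.0 + shift, 2.0], scale=0.5, size=(n, 2))
    X = np.vstack([background, airplane])
    y = np.array([0] * n + [1] * n)
    return X, y

# Train under the original conditions.
X_train, y_train = make_data(200, shift=0.0)
model = LogisticRegression().fit(X_train, y_train)

# Accuracy holds on matching conditions but degrades after a shift.
X_same, y_same = make_data(200, shift=0.0)
X_new, y_new = make_data(200, shift=3.0)
print("accuracy, original conditions:", model.score(X_same, y_same))
print("accuracy, shifted conditions: ", model.score(X_new, y_new))

# "Reteaching": curate a modest number of labeled examples from the new
# conditions and retrain so the system adapts, as the speakers describe.
X_curated, y_curated = make_data(50, shift=3.0)
model = LogisticRegression().fit(
    np.vstack([X_train, X_curated]), np.concatenate([y_train, y_curated])
)
print("accuracy after retraining:    ", model.score(X_new, y_new))
```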

How do these algorithms best adapt? This goes back to the teaming aspect of having users work with the AI and recognize when performance is starting to degrade, kind of through a checks-and-balances type of system. Then you give feedback by curating new examples and having the system adapt. Think of the old analogy of a baseball card with the performance statistics of a particular player: I think you would give the soldiers and commanders a baseball card for a particular AI-enabled system with its training statistics. For example, what kind of scenario was this system trained for? What kind of data examples? How many data examples, and so on? That would give commanders and operators a better sense of the strengths and limitations of the system and where and under what conditions it has been tested and evaluated. Therefore, when it’s employed in conditions that don’t match those, that’s an early cue to be more cautious . . . to take a more aggressive teaming stance with the system and check more rigorously, obviously, what the AI is predicting or recommending to the soldiers and operators.
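
The “baseball card” described here resembles what the machine-learning community calls a model card. Below is a minimal sketch of what such a card might carry; the field names, values, and matching check are hypothetical and are not drawn from any Army standard:

```python
# A minimal "baseball card" for an AI-enabled system, in the spirit of a
# model card. Fields and thresholds are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class SystemCard:
    system_name: str
    trained_for: list            # scenarios the training data covered
    data_sources: list           # e.g., sensor types used for collection
    num_training_examples: int
    evaluated_conditions: list   # conditions under which it was tested
    test_accuracy: float         # headline metric from test and evaluation

    def matches(self, deployment_conditions: list) -> bool:
        """True only if every deployment condition was seen in evaluation."""
        return all(c in self.evaluated_conditions for c in deployment_conditions)

card = SystemCard(
    system_name="object-recognizer-demo",
    trained_for=["fixed-wing aircraft on airstrips"],
    data_sources=["overhead electro-optical imagery"],
    num_training_examples=12000,
    evaluated_conditions=["daytime", "clear weather", "desert backdrop"],
    test_accuracy=0.93,
)

# The early cue to be more cautious: conditions on the ground fall outside
# what the system was tested against, so tighten the human checks.
if not card.matches(["nighttime", "desert backdrop"]):
    print("Conditions outside tested envelope: increase operator oversight.")
```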

And that’s one example. I think you’ve got to have the context; in most instances, the type of AI application, if you will, really drives how much control or how much of the task you’re going to give to the AI system. In some instances, as we see in the commercial sector today, there’s a high degree of autonomy given to some AI systems that are recommending, for instance, what you may want to purchase or what movie you should watch, and so on. But what’s the risk of employing that type of system, or of that system making a mistake? I think that’s what’s really important: the context, and then having the right precautions and the right level of teaming in place when you’re going into those riskier types of situations.

And I think a final point on the barriers, and how to help overcome them, is, again, going back to this notion of giving commanders and soldiers some degree of control over the system. A good analogy is a rheostat knob. Based on the conditions on the ground, their past use of the system, and the understanding they start to gain of its strengths and limitations, they can really dial up or dial down the degree of autonomy they’re willing to grant the system. And I think this is another way of overcoming barriers, as opposed to, let’s say, highly restricting the use of AI-enabled systems, especially when they’re recognizing targets or threats as part of the targeting cycle, and that’s one of the lenses we looked at in this particular study.

Pfaff

When we’re looking at expert knowledge, we break it into four components. There’s the technical part, which we’ve covered. But we also look at human development: to have that profession, professionals have to engage in human development, which means recruiting the right kinds of people, training and educating them in the right kinds of ways, and then developing them over a career to be leaders in the field. We’ve already talked about the importance of having norms that ensure the trust of the client. Then there’s the political, which stresses mostly how professions maintain legitimacy and compete for jurisdiction with other professions. These are all issues that AI brings up, and they introduce a number of other kinds of concerns that you have to take into account for any of the things Chris talked about to be possible. So, I would say growing the institution along those four avenues represents a set of barriers that need to be overcome.

Host

Let’s talk about ethics and politics in relation to AI in the military. What do we need to consider here?

Pfaff

It’s about the trust of the client, but that needs to be amplified a little bit. What’s the client trusting us to do? Not only to use this knowledge on their behalf, but also to use it in a way that reflects their values. That means systems that conform to the law of armed conflict. Systems that enable humane and humanitarian decision making, even in high-intensity combat. The big concerns there include the issues of accountability and automation bias. Accountability arises because there’s only so much you’re going to be able to understand about the system as a whole. And when we’re talking about the system, it’s not just the data and the algorithms; it’s the whole thing, from sensors to operators. So, it will always be a little bit of a black box. If you don’t understand what’s going on, or if you get rushed (and war does come with a sense of urgency), you’re going to be tempted to go with the results the machine produces.

Our recommendation is to create some kind of interface. We use the idea of fuzzy logic that allows the system and the humans interacting with it to identify specific targets in multiple target sets. The idea was . . . given any particular risk tolerance the commander has, because machines, when they produce these outputs, assign a probability to them . . . so, for example, if the system identifies a tank, it will say something to the effect of “80% tank.” So, if I have a high risk tolerance for potential collateral harms, risk to mission, or whatever, and I have very high confidence that the target I’m about to shoot is legitimate, I can let the machine do more of the work. And with a fuzzy logic controller, you can use that to determine where in the system humans need to intervene when that risk tolerance changes or that confidence changes. And this addresses accountability because it specifies what the commander, staff, and operator are accountable for: getting the risk assessment right, as well as ensuring that the data is properly curated and the algorithms properly trained.
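
A minimal sketch of the kind of fuzzy logic controller described here: it takes the commander’s risk tolerance and the machine’s reported confidence (for example, “80% tank” as 0.8) and returns how much of the work the machine may do before a human must step in. The membership functions, rules, and thresholds are illustrative assumptions, not the controller designed in the monograph:

```python
# Two inputs on [0, 1]: commander's risk tolerance and machine confidence.
# Output: a degree of machine autonomy mapped to a required level of human
# intervention. All rules and numbers are illustrative assumptions.

def low(x):  return max(0.0, 1.0 - 2.0 * x)             # strongest near 0.0
def med(x):  return max(0.0, 1.0 - 2.0 * abs(x - 0.5))   # strongest near 0.5
def high(x): return max(0.0, 2.0 * x - 1.0)              # strongest near 1.0

def autonomy(risk_tolerance: float, confidence: float) -> float:
    """Zero-order Sugeno-style inference: each rule fires with a strength
    (min of its antecedents) and proposes an autonomy level; the result is
    the firing-strength-weighted average of those levels."""
    rules = [
        (min(high(confidence), high(risk_tolerance)), 0.9),  # machine does more
        (min(high(confidence), med(risk_tolerance)),  0.6),
        (min(med(confidence),  high(risk_tolerance)), 0.5),
        (max(low(confidence),  low(risk_tolerance)),  0.1),  # human does the work
    ]
    total = sum(strength for strength, _ in rules)
    return sum(s * level for s, level in rules) / total if total else 0.0

def required_intervention(risk_tolerance: float, confidence: float) -> str:
    a = autonomy(risk_tolerance, confidence)
    if a >= 0.7:
        return "machine nominates, operator spot-checks"
    if a >= 0.4:
        return "operator confirms each nomination"
    return "full human review before any action"

# The same "80% tank" output under a permissive risk tolerance versus a
# low tolerance for error yields different human-intervention requirements.
print(required_intervention(risk_tolerance=0.9, confidence=0.8))
print(required_intervention(risk_tolerance=0.2, confidence=0.8))
```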

It helps with automation bias because the machine’s telling you what level of confidence it has. So, it’s giving you prompts to recheck should there be any kind of doubt. And one of the ways you can enhance that, which we talked about in the monograph, is, in addition to looking for things that you want to shoot, also look for things you don’t want to shoot. That’ll paint a better picture of the environment (and) overall reduce the risk of using these systems.

Now, when it comes to politics, you’ve got a couple of issues here. One is at the level of civ-mil relations. And Peter Singer brought this up 10 years ago when talking about drones. His concern was that drone operation would be better done by private-sector contractors. As we relied more on drones, much of what it meant to apply military force would be taken over by contractors and, thus, expert knowledge would leave the profession and go somewhere else. And that would undermine the credibility and legitimacy of the profession, with political implications.

That didn’t exactly happen, because military operators always retained the ability to do this; they are the only ones authorized to use these systems with lethal force. There were some contractors augmenting them. But with AI right now, as we sort through what the private-sector and government roles and expertise are going to be, we have a situation where you could end up . . . one strategy is that the military’s expert knowledge doesn’t change: all the data science and algorithms happen on the other side of an interface, the interface just presents the information the military operator needs to know, and the operator responds to that information without really understanding how it got there in the first place. I think that’s a concern because that is when expertise migrates outside the profession. It also puts operators, commanders, and staffs in a position where (a) they will not necessarily be able to assess the results well without some level of understanding, and (b) they won’t be able to optimize the system as its capabilities develop over time.

We want to be careful about that because, in the end, the big thing in this issue is expectation management. These are risk-reducing technologies . . . because they’re more precise, they lower risk to friendly soldiers, as well as to civilians, and so on. So, we want to make sure that we are able to set the right kinds of expectations regarding the effectiveness of the technology, which will be something senior military leaders have to do, so civilian leaders don’t overrely on it and the public doesn’t become frustrated by a lack of results when it doesn’t quite work out. Because a military that can’t deliver results but still imposes risk on soldiers and noncombatants alike is not one that’s probably going to be trusted.

Lowrance

Regarding ethics and politics in relation to AI and the military, I think it’s really important, obviously, that throughout the development cycle of an AI system you’re taking these types of considerations in early and, obviously, often. One guiding principle that we have here is that if you break down an AI system across a stack, all the way from the hardware to the data to the model and then to deployment in the application, ethics really wraps all of that.

So, it’s really important that the guiding principles already set forth in various documents from the DoD and the Army regarding responsible AI and its employment are followed here. Now, in terms of what we looked at in the paper through the political lens, it’s an interesting dynamic when you start looking at the employment of these systems and the sense of urgency around leveraging this technology in either a bottom-up or a top-down fashion. What I mean by that is, from a research and development perspective, there’s an S and T (science and technology) base that really leads the Army’s (and really the DoD’s, if you look at it from a joint perspective) development of new systems. But, as you know, the commercial sector is leveraging AI now, today, and sometimes there’s a sense of urgency. It’s like, hey, it’s mature enough in these respects. Let’s go ahead and start leveraging it.

And so, a more deliberate approach would be the traditional rollout through the S and T environment, where a system goes through rigorous test and evaluation processes, eventually becomes a program of record, and is then deployed and fielded. But that doesn’t necessarily prohibit a unit right now from saying, “Hey, I can take this commercial off-the-shelf AI system, start leveraging it, and go ahead and get some early experience.” So, I think there’s this interesting dynamic between the traditional program-of-record acquisition effort and this kind of bottom-up, unit-level experimentation, and how those blend together.

And it also brings up the roles, I think, that soldiers and, let’s say, contractors play in developing and eventually deploying and employing AI-enabled systems. Inherently, AI-enabled systems are complex, so who has the requisite skills to sustain, update, and adapt these systems over time? Is it the contractor, or should it be the soldiers? And where does that take place? We’ve looked at different aspects of this in the study, and the answer is probably a combination, a hybrid.

But one part of the study is that we talked about a workforce development program and how important that is, because in tactical field environments you’re not necessarily always going to be able to have contractors present at these field sites. Nor are you always going to have the luxury of high-bandwidth communications out to the tactical edge where these AI-enabled systems are being employed. Because of that, the technical knowledge of updating and adapting AI-enabled systems is going to have to reside with the soldiers. That’s one thing we definitely emphasized as part of the study of these kinds of relationships.

Host

Would you like to share any final thoughts before we go?

Lowrance

One thing I would just like to reemphasize is this idea that we can overcome some of the technical barriers we discussed throughout the paper, and we can do so deliberately, obviously, and responsibly. Part of that, we think (and this is one of the big findings from our study), is taking an adaptive teaming approach. We know that AI, inherently, and especially in a targeting cycle application, is an augmentation tool. It’s going to be paired with soldiers. It’s not going to be just running autonomously by itself. What does that teaming look like? It goes back to this notion of pushing control down to the commander level, and that’s where that trust is going to start to come in: if the commander on the ground knows that he or she can change the system’s behavior, or change the teaming that is taking place and the level of teaming, that inherently is going to grow the amount of trust that he or she has in the system during its application.

We briefly talked a little bit about that, but I just want to echo, or reinforce, it. It’s this concept of an explainable fuzzy logic controller. The two big inputs to that controller are the risk tolerance of the commander, based on the conditions on the ground (whether it’s counterinsurgency or large-scale combat operations), and what the AI system is telling them. Generally speaking, in most predictive applications, the AI has some degree of confidence score associated with its prediction or recommendation. So, leverage that, and leverage the combination of those. That should give you an indication of how much trust, or how much teaming, in other words, should take place for a given function or role between the soldier and the actual AI augmentation tool.

This can be broken down, obviously, into stages, just like the targeting cycle is. In joint doctrine, the targeting cycle for dynamic targeting is F2T2EA: find, fix, track, target, engage, and assess. And in each one of those (some more than others, obviously), AI can play a constructive role. We can employ it in a role where we’re doing so responsibly and it’s providing an advantage, in some instances augmenting the soldiers in such a way that it really exceeds the performance a human alone could achieve. That deals with speed, for example, or finding those really well-hidden types of targets, the kinds of things that would be difficult even for a human to do alone. Taking that adaptive teaming lens is going to be really important moving forward.
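
One notional way to lay out that stage-by-stage breakdown as data, with a hypothetical AI role and teaming level for each F2T2EA step. These assignments are illustrative assumptions, not doctrine or the monograph’s prescription:

```python
# Notional F2T2EA teaming plan: which tasks AI augmentation might take on at
# each stage and how tightly the human stays in the loop. Illustrative only.
F2T2EA_TEAMING = {
    "find":   {"ai_role": "cue likely objects across wide-area sensing",
               "teaming": "high AI autonomy, operator samples output"},
    "fix":    {"ai_role": "refine location of candidate objects",
               "teaming": "AI proposes, operator confirms"},
    "track":  {"ai_role": "maintain custody across sensor handoffs",
               "teaming": "AI maintains, operator monitors"},
    "target": {"ai_role": "match candidates against target criteria",
               "teaming": "operator decides, AI advises"},
    "engage": {"ai_role": "none on release authority",
               "teaming": "human decision"},
    "assess": {"ai_role": "flag change detection for damage assessment",
               "teaming": "AI proposes, analyst confirms"},
}

for stage, plan in F2T2EA_TEAMING.items():
    print(f"{stage:>7}: {plan['ai_role']}  [{plan['teaming']}]")
```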

Pfaff

When it comes to employing AI, particularly for military purposes, there’s a concern that the sense of urgency that comes with combat operations will overwhelm the human ability to control the machine. We will always want to rely on its speed. And, like Chris said, you don’t get the best performance out of the machine that way.

It really is all about teaming. And none of the barriers we talked about, none of the challenges we talked about, are even remotely insurmountable. But these are the kinds of things you have to pay attention to. There is a learning curve, and engaging in strategies that minimize the amount of adaptation members of the military are going to have to perform will, I think, be a mistake in the long term, even if it gets short-term results.

Host

Listeners, if you want to really dig into the details here, you can download the monograph at press.armywarcollege.edu/monographs/959. Dr. Pfaff, Col. Lowrance, thank you so much for your time today.

Pfaff

Thank you, Stephanie. It’s great to be here.

Host

If you enjoyed this episode and would like to hear more, you can find us on any major podcast platform.

About the Project Director

Dr. C. Anthony Pfaff (colonel, US Army retired) is the research professor for strategy, the military profession, and ethics at the US Army War College Strategic Studies Institute and a senior nonresident fellow at the Atlantic Council. He is the author of several articles on ethics and disruptive technologies, such as “The Ethics of Acquiring Disruptive Military Technologies,” published in the Texas National Security Review. Pfaff holds a bachelor’s degree in philosophy and economics from Washington and Lee University, a master’s degree in philosophy from Stanford University (with a concentration in philosophy of science), a master’s degree in national resource management from the Dwight D. Eisenhower School for National Security and Resource Strategy, and a doctorate degree in philosophy from Georgetown University.

About the Researchers

Lieutenant Colonel Christopher J. Lowrance is the chief autonomous systems engineer at the US Army Artificial Intelligence Integration Center. He holds a doctorate degree in computer science and engineering from the University of Louisville, a master’s degree in electrical engineering from The George Washington University, a master’s degree in strategic studies from the US Army War College, and a bachelor’s degree in electrical engineering from the Virginia Military Institute.

Lieutenant Colonel Bre M. Washburn is a US Army military intelligence officer with over 19 years serving in tactical, operational, and strategic units. Her interests include development and mentorship; diversity, equity, and inclusion; and the digital transformation of Army intelligence forces. Washburn is a 2003 graduate of the United States Military Academy and a Marshall and Harry S. Truman scholar. She holds master’s degrees in international security studies, national security studies, and war studies.

Lieutenant Colonel Brett A. Carey, US Army, is a nuclear and counter weapons of mass destruction (functional area 52) officer with more than 33 years of service, including 15 years as an explosive ordnance disposal technician, both enlisted and officer. He is an action officer at the Office of the Under Secretary of Defense for Policy (homeland defense integration and defense support of civil authorities). He holds a master of science degree in mechanical engineering with a specialization in explosives engineering from the New Mexico Institute of Mining and Technology.