U.S. Central Command Employs Large Language Model-based Artificial Intelligence
Amidst an intense geopolitical environment and active war in the Middle East region, the U.S. Central Command (CENTCOM) is pulling in certain artificial intelligence- (AI-) based tools, namely large language models, to gain efficiencies at the operational level. CENTCOM is already seeing value from large language model applications used for code augmentation, document disclosure processing and general enterprise use.
About a year ago, Sky Moore, the command’s chief technology officer (CTO), started looking into how large language models could help CENTCOM. Officials are “learning as they go” and are quickly discerning where applying large language model solutions works or does not work.
“We’ve discovered that there are places where we can deliver exquisite value and chip away seconds, minutes and hours of your time, and there are places where it makes absolutely no sense,” Moore noted, speaking in October at NVIDIA’s AI conference in Washington, D.C. “We have had an abundance of experience with both. And we are just dipping a toe into the opportunities that large language models can provide for us.”
The command’s first foray last year was an example of where applying a large language model did not make sense.
“On October 7 [2023], everything changed for us with the Hamas-Israel war,” Moore stated. “We saw increased attacks on our forces in Iraq and Syria. The Houthis began firing just about everything that they could into the Red Sea, disrupting maritime traffic and so on. Within a 24-hour period, the pace and operational tempo of our command changed.”
Correspondingly, the command’s communications requirements surged. The CTO’s office considered how a large language model could assist with the need to quickly summarize information, briefings and meetings and then push out information during and about critical events. Information had to move rapidly not only within CENTCOM but also outside the command, to the Joint Staff, the National Security Council and other organizations.
“And so the first question that our users came back to us with was, ‘Is there any way to make this easier? Can you help us with summaries?’” she said. “But the next piece of taking all of those summaries of different meetings and being able to push information into different templates for whoever might need to receive it turned out to be much harder than we initially thought it was.”
The CTO’s office found that the information had to be delivered to a crisis action team at a stand-up meeting, and the list of who needed to present information changed every time. For an initial large language model application, the assumption that a single defined prompt could consistently produce effective summaries proved early on not to hold, Moore said.
However, three use cases over the past nine months have been promising—in code augmentation and generation, document disclosure processes and general office use.
More gains are coming with access to a dedicated platform, called CENTGPT, designed for the Secure Internet Protocol Router Network (SIPRNet) with the needs of CENTCOM’s operating environment in mind.
Software programmers at the command are using the tool in code generation and augmentation, Moore shared. “What we discovered right off the bat was that our software programmers were overjoyed by the ability to have a large language model where they could put in a simple query in a particular message format to deliver a certain function,” she said. “By having a large language model available on a SIPR environment, we opened the complete aperture of what our programmers could do.”
CENTGPT has improved the programmers’ effectiveness and, frankly, elevated their workplace contentment. “Code generation and augmentation is a good fit for a large language model tool because it can easily catch errors. You’ll be very quick to notice if your code just doesn’t generate the output you want,” Moore explained.
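As an illustration of that generate-then-check rhythm, a hypothetical sketch follows; the endpoint address, request payload and response field are assumptions for illustration, not CENTGPT’s actual interface. The idea is simply to ask the model for a single function and immediately run it against a known answer, which is why errors surface quickly.

# Hypothetical sketch of the generate-then-verify loop described above. The
# endpoint URL, payload shape and response field are illustrative assumptions,
# not CENTGPT's real interface.
import requests

LLM_ENDPOINT = "https://centgpt.example.mil/api/generate"  # placeholder address

def generate_function(task_description: str) -> str:
    """Ask the model for a single Python function that performs the task."""
    response = requests.post(
        LLM_ENDPOINT,
        json={"prompt": "Write one Python function. " + task_description},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["text"]  # assumed response field

def passes_spot_check(generated_code: str) -> bool:
    """Run the generated function against a known input; a wrong answer is obvious at once."""
    namespace: dict = {}
    exec(generated_code, namespace)        # define the generated function
    func = namespace["format_zulu_time"]   # name requested in the prompt below
    return func("2024-10-07T14:30:00") == "071430Z OCT 24"

if __name__ == "__main__":
    code = generate_function(
        "Name it format_zulu_time. It accepts an ISO 8601 timestamp string and "
        "returns a military date-time group such as '071430Z OCT 24'."
    )
    print("Spot check passed:", passes_spot_check(code))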
CENTCOM officials are also applying CENTGPT to perform machine-assisted processing for document and information disclosure.
The command generates an “inordinate” number of documents daily, and with that activity comes a requirement for disclosure, Moore said, whether it’s a U.S. citizen asking for document disclosure through the Freedom of Information Act, for example, or the command needing to disclose certain information to a partner nation during allied operations.
“There are a variety of different requirements that come out of this,” Moore continued. “And it means a huge burden on teams to sift through the trove of documents to be able to filter out what needs to be disclosed. We think that large language models can help us a lot with that ‘first triage.’ A large language model is not going to magically ingest all of your documents and clearly spit out the ones that can be disclosed. But what it can do is say, ‘I have high confidence these documents can be disclosed.’ And so machine-assisted disclosure has breathed new life for our foreign disclosure officers who have previously been drowning in documentation.”
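A minimal sketch of what such a confidence-gated first triage could look like follows. The scoring stand-in, threshold and queue names are assumptions for illustration, and every release decision still rests with a human disclosure officer.

# Hypothetical sketch of a confidence-gated "first triage" for machine-assisted
# disclosure. The scoring stand-in, threshold and queue names are illustrative
# assumptions; nothing is released without human review.
from dataclasses import dataclass, field

HIGH_CONFIDENCE = 0.90  # assumed cutoff for the "likely releasable" queue

@dataclass
class TriageQueues:
    likely_releasable: list[str] = field(default_factory=list)  # reviewed first, quickly
    needs_full_review: list[str] = field(default_factory=list)  # everything else

def score_releasability(document_text: str) -> float:
    """Stand-in for the model call: a trivial marker count used only so the sketch
    runs; a real system would query the language model for a confidence score."""
    markers = ("secret", "noforn", "source", "collection")
    hits = sum(marker in document_text.lower() for marker in markers)
    return max(0.0, 1.0 - 0.3 * hits)

def triage(documents: dict[str, str]) -> TriageQueues:
    """Pre-sort documents by confidence; a human disclosure officer reviews both queues."""
    queues = TriageQueues()
    for doc_id, text in documents.items():
        if score_releasability(text) >= HIGH_CONFIDENCE:
            queues.likely_releasable.append(doc_id)
        else:
            queues.needs_full_review.append(doc_id)
    return queues

if __name__ == "__main__":
    sample = {
        "memo-001": "Routine logistics summary for partner-nation planning.",
        "memo-002": "Derived from sensitive source and collection reporting.",
    }
    print(triage(sample))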
The third area where CENTCOM is applying large language models is for general office tasks. “And this one, I think, will always be growing and developing in whatever ways and where our users see fit,” the CTO stated. “We now have something on the order of 500 plus folks on CENTGPT. Every hour, the number of users online at any time has increased. So it used to be like single digits, and now we’re up closer to 50.”
That regular use means the command’s staff is starting to pick apart what adds value for them, Moore continued. She expects the tool to be put to work across many different workflows. And here again, having the platform on a secret network means users can harness common Internet-like capabilities, such as a browser-style search.
“If you are on a secret network and you need to [search] something, you must quickly turn to your other server,” she clarified. “And if you are lucky enough that the screens are here, you must remember what you saw on one screen and then type it onto another one.”
Instead, CENTGPT’s query functions can act like a browser to find information quickly for users.
In addition, the tool has proven helpful for summarizing large documents. “If you needed to read through a 50-page document and another 100-page document, and you wanted to be able to get really quickly the general summary of what these documents are, [it identifies] the piece that I need to read through. That alone is value add, chipping away at some of the workflow that otherwise [would] have taken hours on that time.”
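One common pattern for that kind of long-document summarization is sketched below under the assumption of a generic model call; ask_model is a hypothetical stand-in rather than CENTGPT’s actual API. The document is split into chunks, each chunk is summarized, and the chunk summaries are combined with a note on which sections merit a full read.

# Hypothetical sketch of long-document summarization: summarize chunks first,
# then combine the chunk summaries. ask_model is an assumed stand-in so the
# sketch runs without a live model.

def ask_model(prompt: str) -> str:
    """Stand-in for a call to the language model; returns a canned string."""
    return f"[model output for a {len(prompt)}-character prompt]"

def chunk(text: str, max_chars: int = 8000) -> list[str]:
    """Split the document into roughly fixed-size pieces on paragraph breaks."""
    pieces, current = [], ""
    for paragraph in text.split("\n\n"):
        if current and len(current) + len(paragraph) > max_chars:
            pieces.append(current)
            current = ""
        current += paragraph + "\n\n"
    if current:
        pieces.append(current)
    return pieces

def summarize_document(text: str) -> str:
    """Map-reduce style summary: per-chunk summaries, then one combined pass."""
    chunk_summaries = [
        ask_model("Summarize this section in three sentences:\n" + piece)
        for piece in chunk(text)
    ]
    return ask_model(
        "Combine these section summaries into one page and flag which sections "
        "merit a full read:\n" + "\n".join(chunk_summaries)
    )

if __name__ == "__main__":
    print(summarize_document("First section...\n\nSecond section...\n\nThird section..."))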
Moore cautioned that the output of large language models cannot be followed blindly and that the command has implemented awareness processes to reinforce this practice.
“We are really clear to our users that there is risk associated with large language models,” she specified. “We force our users to sign a document before they get their own account to say, ‘I acknowledge that regardless of what this model gives me, I am responsible for the outputs that I present. I am the human who is responsible for vetting the material.’ But really, what we’ve discovered again is that our humans do that because they understand their responsibilities, and more importantly, they understand the content of what they’re asking for and can vet it faster than any of us.”
The CENTGPT platform is based on the Department of the Air Force’s (DAF’s) NIPRGPT, unveiled to reporters at the Pentagon in June. The platform is tied to applications on the U.S. Department of Defense’s Nonclassified Internet Protocol Router Network (NIPRNet). It was approved for Impact Level 4 security environments and is hosted on the DoD network through the Defense Information Systems Agency’s high-performance computing (HPC) office cloud compute platform. It requires common access card (CAC) authentication to enter the system.
The NIPRGPT platform is meant to be a “sandbox,” or safe experimental platform, for understanding, researching, testing and leveraging natural language models and other generative AI tools on a large scale. DAF Chief Information Officer (CIO) Venice Goodwine led the NIPRGPT effort along with other department officials, the Air Force Research Laboratory’s (AFRL’s) CIO Alexis Bonnell and other AFRL researchers, including Senior Computer Scientist Collen Roller, a natural language processing engineer. The genesis of NIPRGPT stems from Roller’s Dark Saber software platform, developed at the AFRL’s Information Directorate in Rome, New York.
Having a secured government capability for DoD to use a large language model was important, Roller said, speaking to reporters in June and at the October NVIDIA AI conference. “We don’t have people throwing documents or information into ChatGPT anymore, where OpenAI is collecting that information,” he said. “We need to be active in making solutions that are in our specific environments where we can both hold and present information.”
As users learn to leverage this type of AI in a secure environment, the DAF is also using the platform to observe how warfighters interact with large language models. That insight will help inform future policy, acquisition and investment decisions related to AI and large language models.
Echoing the results seen at CENTCOM, the DAF’s goal is to bring AI into operations and day-to-day tasks and to increase human-machine comfort levels and productivity. Chandra Donelson, DAF chief data and artificial intelligence officer, characterized what warfighters had already crafted with the NIPRGPT capability as “phenomenal.”
Roller sees the platform, which has been out a year, already reducing workplace toil. “For the use cases that we are seeing for NIPRGPT, we see a lot of people using this for basic toil reduction tasks, the monotonous tasks,” he noted. “For where I have to open up a blank Word document and start coming up with an outline for my presentation, or maybe it’s the fact that I have to write a bullet background paper on a topic, and I really don’t want to start from nothing. By plugging in a prompt into a large language model, you really have the opportunity to advance yourself, to put yourself more in an editing role than sitting there late at night just trying to get the paper done for whatever the requirement is that you have.”
One of his favorite use cases is pulling all of his daily meeting summaries into NIPRGPT to summarize “what did I do today,” he said. The platform offers a simple summary that is easily digested.
“With Gen AI, there are so many amazing things that are being brought to fruition, capability-wise, with this technology,” Roller noted. “This is truly going to be a game changer.”