Do AI Proposal Writers Dream of PowerPoint?
[Programming note: This is my last planned post for the year! It's been a joy to write these posts every week. More importantly, though, thank you so so much for reading them! (You too, mom!) I look forward to taking a few weeks off and then getting back into it for 2024. But if you have any suggestions about things I should write more about, or less about, send me an email; I'd love to hear any ideas. And, if nothing else, I hope you have a happy and healthy holiday season and new year!]
A minor trend in the world of government contracting is the rise of folks talking about how vendors can use Large Language Models to write proposals, and the parallel rise of folks worrying about vendors using LLMs to write proposals.
In its strongest form, the concern is that a vendor would give an LLM a solicitation and the LLM would write a proposal in a way that is most likely to win the work regardless of whether the proposal is true.
Even if you eliminate the concern about whether the proposal is true (presumably by, you know, requiring a human to sign the proposal under penalty of law), folks still seem to have concerns about AI-generated proposals.
Most of those concerns seem to center on evaluation. If LLMs are really good at writing proposals that are responsive to a solicitation, then it will be hard for the government to eliminate potential vendors. And, I guess, the fear is that the government wouldn't necessarily be able to distinguish between companies who "get it" and companies who "don't."
There sure seem to be a lot of companies who are all trying to build LLM-driven applications that promise to write proposals for companies. I dunno, maybe that's the future?
In the present though, one trend that seems to have taken off during this moment of Love in the Time of Artificial Intelligence is the increased use of oral presentations.
Oral presentations have been a tool in the evaluation-criteria toolkit for a while now. The idea is that, instead of requiring a vendor to write a whole bunch of stuff down in a proposal, the government can interview vendors just like you might interview a job candidate. You ask a bunch of questions, you hear the vendor's responses, and you make a selection. "What are your weaknesses?" the government might ask. "Caring too much for the government's needs," the vendor can answer.
A benefit of oral presentations is that, with current technology anyway, it's hard to use artificial intelligence to do them.[1] So, if you're the government and you're worried about whether or not you can differentiate vendors because of LLMs, making people do an oral presentation is a path that sidesteps that concern.
A drawback of oral presentations, though, is that people aren't always good at answering complicated questions in an interview?
Here's a recent GAO decision that highlights that problem. The case involved a procurement by the Defense Intelligence Agency (DIA) for "joint counterintelligence training activity missions support services." As part of the evaluation process, DIA required vendors to put together a slide deck and do an oral presentation about their technical approach.
Among other things not relevant to this discussion, one thing that DIA asked for as part of the slide deck and presentation was a "Staffing Plan demonstrating a feasible strategy to recruit, assimilate, and/or retain qualified individuals for all labor categories identified in the SOW and mitigation strategies to reduce risk of negatively impacting training execution schedule if staffing falls below 75 percent."
After the presentations, DIA awarded the contract to DarkStar Intelligence, LLC, and the losing bidder SecuriFense Inc. protested.[2]
As part of the protest, SecuriFense first argued that its technical approach was unfairly graded. Specifically, DIA lowered its confidence rating for SecuriFense because it "did not have a clear strategy for reducing risk to the mission if staffing fell below 75 percent." But GAO rejected that argument because, well, SecuriFense didn't have a clear strategy for reducing risk if staffing fell below 75 percent.
According to the oral-presentation transcript, "SecuriFense discussed how it would deal with gaps that resulted from paid time off, personnel assigned to temporary duty, and resignations [but did not] specifically mention the strategies it would utilize to reduce the potential negative impacts on the training execution mission if staffing fell below 75 percent."
In written form, some "red team" reviewer[3] might have read the transcript, looked at the evaluation criteria closely, seen that the proposal wasn't specific about "mitigation strategies," and left a comment in the doc asking "oh hey, what mitigation strategies do we use?" Then the proposal team would "recover" and create a text box with a drop shadow or whatever in the proposal to highlight the various strategies that SecuriFense would use.
But, when it's an oral presentation, there's no red-team review. And unless the government asks a follow-up question, when the technical evaluation panel goes to write up "what was SecuriFense's staffing shortage mitigation strategy?" the answer should be "we don't know!" And that leads to a lower confidence rating.
SecuriFense didn't include mitigation strategies and got a lower technical rating. And if that was the end of the protest, SecuriFense might be playing the sad trombone.
But that wasn't the end of the protest. Because, as we've talked about before, in addition to claiming that the agency rated the protestor too low, a vendor can also argue that the agency rated the winner too high.
And, here, SecuriFense had more luck. Here's GAO:
During its oral presentation DarkStar stated, "Although we do not expect staffing to fall below 75 percent, DarkStar employs risk mitigation techniques to reduce the risk of negatively impacting JCITA training execution schedule should you experience staffing shortages." Aside from this general mention of mitigation techniques, the record before us does not include a discussion of the mitigation strategies that DarkStar proposed should staffing fall below 75 percent. Rather, in response to the protest, the agency explains that the final decision authority determined that based on DarkStar’s recruitment strategy and its average time of [DELETED] to fill a vacant position, DarkStar’s mitigation strategy should staffing fall below 75 percent “was to never let it occur because they have efficiency in their hiring process.”
The agency’s position does not account for the solicitation specific request for mitigation strategies in the event staffing fell below 75 percent since the agency was concerned with the potential impact this could have on its mission. In this regard, the question posed to vendors was what they would do if staffing fell below 75 percent, not what they would do to prevent staffing from falling below 75 percent. Given that, it was unreasonable for the agency to conclude that SecuriFense’s failure to specifically address its mitigation strategies in the event of multiple vacancies warranted a finding of decreased confidence, while DarkStar’s lack of mitigation strategies did not warrant a similar decreased confidence finding because the agency concluded that DarkStar simply would not allow staffing to fall below 75 percent.
In other words, DarkStar didn't answer the mitigation strategy question either. During oral presentations, DarkStar said that they use mitigation strategies and DarkStar also said that they will never need mitigation strategies. But they never said what those mitigation strategies were. Protest sustained. Them's the breaks.
Again, though, this is the sort of thing that almost certainly would have been caught if the proposals were written proposals. Because they were oral presentations, everyone lost!
And you know, I kinda feel bad for SecuriFense and DarkStar... If you've ever interviewed for a job or done an oral presentation, you know there's a balance between (a) appearing natural and speaking like a human and (b) reading notes and answering questions like a robot. So, if the government is trying to use oral presentations because it's concerned about robots, you can imagine the impulse to lean toward being a human. But this is government contracting, and there are rules.
Who knows whether oral presentations are actually becoming more of a thing because of the prospect of LLMs writing proposals. But if it's true, and vendors want to win more government contracts, it sure does seem like humans will have to get better at being more like LLMs and less like, well, humans.
[1] I guess you could feed questions into ChatGPT during the interview, but it'd be weird. And maybe there'd be hallucinations. Who knows. I sure wouldn't try it.
[2] What is going on with the names SecuriFense and DarkStar? Look, I love Pascal Casing as much as the NextGuy but was this sort of naming convention a requirement of the HCATS vehicle or what?
[3] For the uninitiated, almost all government contractors who put together proposals use some version of "color team review." This means that some people, who don't write the proposal, serve as, like, "pink team" reviewers to help improve the proposal. It's a thing.