Content Operations
Futureproof your content ops for the coming knowledge collapse
What happens when AI accelerates faster than your content can keep up? In this podcast, host Sarah O’Keefe and guest Michael Iantosca break down the current state of AI in content operations and what it means for documentation teams and executives. Together, they offer a forward-thinking look at how professionals can respond, adapt, and lead in a rapidly shifting landscape.
Sarah O’Keefe: How do you talk to executives about this? How do you find that balance between the promise of what these new tool sets can do for us, what automation looks like, and the risk that is introduced by the limitations of the technology? What’s the roadmap for somebody that’s trying to navigate this with people that are all-in on just getting the AI to do it?
Michael Iantosca: We need to remind them that the current state of AI still carries with it a probabilistic nature. And no matter what we do, unless we add more deterministic structural methods to guardrail it, things are going to be wrong even when all the input is right.
Related links:
- Scriptorium: AI and content: Avoiding disaster
- Scriptorium: The cost of knowledge graphs
- Michael Iantosca: The coming collapse of corporate knowledge: How AI is eating its own brain
- Michael Iantosca: The Wild West of AI Content Management and Metadata
- MIT report: 95% of generative AI pilots at companies are failing
Transcript:
Introduction with ambient background music
Christine Cuellar: From Scriptorium, this is Content Operations, a show that delivers industry-leading insights for global organizations.
Bill Swallow: In the end, you have a unified experience so that people aren’t relearning how to engage with your content in every context you produce it.
SO: Change is perceived as being risky; you have to convince me that making the change is less risky than not making the change.
Alan Pringle: And at some point, you are going to have tools, technology, and processes that no longer support your needs, so if you think about that ahead of time, you’re going to be much better off.
End of introduction
Sarah O’Keefe: Hey everyone, I’m Sarah O’Keefe. In this episode, I’m delighted to welcome Michael Iantosca to the show. Michael is the Senior Director of Content Platforms and Content Engineering at Avalara and one of the leading voices both in content ops and understanding the importance of AI and technical content. He’s had a longish career in this space. And so today we wanted to talk about AI and content. The context for this is that a few weeks ago, Michael published an article entitled The coming collapse of corporate knowledge: How AI is eating its own brain. So perhaps that gives us the theme for the show today. Michael, welcome.
Michael Iantosca: Thank you. I’m very honored to be here. Thank you for the opportunity.
SO: Well, I appreciate you being here. I would not describe you as anti-technology, and you’ve built out a lot of complex systems, and you’re doing a lot of interesting stuff with AI components. But you have this article out here that’s basically kind of apocalyptic. So what are your concerns with AI? What’s keeping you up at night here?
MI: That’s a loaded question, but we’ll do the best we can to address it. I’m a consummate information developer, as we used to call ourselves. I just started my 45th year in the profession. I’ve been fortunate that not only have I been mentored by some of the best people in the industry over the decades, but I was very fortunate to begin with AI in the early 90s, when it was called expert systems. And then through the evolution of Watson, and when generative AI really hit the mainstream, for those of us that had been involved for a long time there was no surprise; we were already pretty well-versed. What we didn’t expect was acceleration at this speed. So what I like to say sometimes is that the thing changing fastest is the rate at which the rate of change is changing. And that couldn’t be more true than today. But content and knowledge is not a snapshot in time. It is a living, moving organism, ever evolving. And if you think about it, the large language model companies spent a fortune on chips and systems to train the big models on everything that they could possibly get their hands and fingers into. And they did that originally several years ago. And the assumption, especially for critical knowledge, is that that knowledge is static. Now, they do rescan the sources on the web, but that’s no guarantee that those sources have been updated. Or the new content conflicts with or gets confused with the old content. How do they tell the difference between one of IBM Db2’s 13 different versions and another, and how you do different tasks across those 13 versions? And can you imagine, especially in software, where a lot of us work, the thousands and thousands of changes that are made to those programs, their user interfaces, and their functionality?
MI: And unless that content is kept up to date and reconsumed, not only by the large language models but by the local vector databases on which a lot of chatbots and agentic workflows are being based, you’re basically dealing with out-of-date and incorrect content. In many doc shops, the resources are just not there to keep up with that volume and frequency of change. So we have a pending crisis, in my opinion. And the last thing we need to do is reduce the number of knowledge workers who not only create new content but also update existing content and deal with the technical debt. Otherwise, I think this house of cards collapses.
SO: Yeah, it’s interesting. And as you’re saying that, I’m thinking we’ve talked a lot about content debt and issues of automation. But for the first time, it occurs to me to think about this more in terms of pollution. It’s an ongoing battle to scrub the air, to take out all the gunk that is continually being introduced. Plus, you have this issue that information decays, right? In the sense that when I published it a month ago, it was up to date. And then a year later, it’s wrong. It evolved, entropy happened, the product changed. And now there’s this delta, this gap, between the way it was documented and the way it is. And it seems like that’s what you’re talking about: the gap that comes from not keeping up with the rate of change.
MI: Mm-hmm. Yeah. I think it’s even more immediate than that. I think you’re right, but we need to remember that development cycles have greatly accelerated. When you bring AI for product development into the equation, we’re now looking at 30- and 60-day product cycles. When I started, a product cycle was five years. Now it’s a month or two. And say we start using AI to draft brand-new content, forget about updating the old content, and we’re using AI to do that in the prototyping phase, moving that work left, upfront. We know that between then and code freeze there are going to be numerous changes to the product, to the function, to the code, to the UI. It’s always been difficult to keep up with it in the first place, but now we’re compressed even more. So we now need to start looking at how AI helps us even do that piece of it, let alone deal with a corpus that is years and years old and has never had enough technical writers to keep up with all the changes. So now we have a dual problem, including new content within this compressed development cycle.
SO: So the AI hype says we essentially don’t need people anymore, and the AI will do everything from coding the thing to documenting the thing to, I guess, buying the thing via some sort of agentic workflow. But, I mean, you’re deeper into this than nearly anybody else. What is the promise of the AI hype, and what’s the reality of what it can actually do?
MI: That’s just the question of the day. Those of us working in shops that have engineering resources are in a different position. I have direct engineers that work for me and an extended engineering team, and so do the likes of Amazon and other sizable shops with resources. But we have a lot of shops that are smaller. They don’t have access to their own dedicated content systems engineers or even to an IT team to help them. So first, I want to recognize that we’ve got a continuum out there, and the commercial providers are not providing anything to help us at this point. Today you build it yourself, and that’s happening: people are developing individual tools using AI, while the more advanced shops are looking at developing entire agentic workflows.
And what we’re doing is looking at ways to accelerate that compressed timeframe for the content creators. And I want to use “content creators” a little more loosely, because as we move the process left, we involve our engineers, our programmers, earlier in the phase, like they used to be. They used to write big specifications in my day. Boy, I want to go into a Gregorian chant: “Oh, in my day!” But they don’t do that anymore. Basically, the role of the content professional today is that of an investigative journalist. And you know what we do, right? We scrape and we claw. We test, we use, we interview. We use all of the capabilities of learning, of association, assimilation, synthesis, and of course, communication. It turns out that writing is only roughly 15% of what the typical writer does in an information developer or technical documentation professional role, which is why we have a lot of different roles, by the way. If we’re going to replace or accelerate people with AI, it has to handle all the capabilities of all those roles. So where we are today is that some of the more leading-edge shops are going ahead, and we’re looking at ways to ingest new knowledge and use that new knowledge with AI to draft new or updated content. But there are limitations to that, so I want to be very clear: I am super bullish on AI. I use it every single day. I’m using it to help me write my novel. I’m using it to learn about astrophotography. I use it for so much. But when the tasks are critical, when they’re regulatory, when they’re legal-related, when there’s liability involved, that’s the kind of content that we cannot afford to be wrong. We have to be right. We have to be 100% in many cases.
Whereas with other kinds of applications, we can very well afford to be wrong. I always say AI and large language models are great on general knowledge that’s been around for years and evolves very slowly. But some things move and change very quickly. In my business, it’s tax rates. There are thousands and thousands of jurisdictions, every tax rate is different, and they change them. So you have to be 100% accurate, or you’re going to pay a heck of a financial penalty if you’re wrong. So we are moving left. We are pulling knowledge from updated sources, things like videos that we can record and extract from, Figma designs, even code, to the limited degree that there are assets in there that can be captured, and other collateral, and we’re able to build out initial drafts. It’s pretty simple. Several companies are doing this right now, including my own team. And then the question comes: how good can it be initially? What can we do to improve it and make it as good as it can be? And then what is the downstream process for ensuring the validity and quality of that content? What are the rubrics that we’re going to use to govern that? And therein is where most of the leading edge, or bleeding edge, or even hemorrhaging edge is right now.
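As a thought experiment, the draft-then-govern loop Michael describes might look something like the sketch below. This is a minimal illustration, not his team’s implementation; the rubric names, threshold, and routing labels are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    source_ids: list[str]            # collateral that fed the draft: video transcript, Figma export, code
    body: str
    rubric_scores: dict[str, float]  # e.g. {"accuracy": 0.9, "completeness": 0.7}

# Illustrative threshold; a real shop would set rubrics per content type and risk level.
PUBLISH_THRESHOLD = 0.85

def route_draft(draft: Draft) -> str:
    """Route an AI-generated draft based on rubric scores (hypothetical rubric)."""
    if all(score >= PUBLISH_THRESHOLD for score in draft.rubric_scores.values()):
        return "light-review"   # near-publishable, like the 70-100% support articles
    if draft.rubric_scores.get("accuracy", 0.0) < 0.5:
        return "rewrite"        # too risky to patch; a writer starts over from the collateral
    return "full-review"        # usable skeleton; a human adds the why and the edge cases
```

The point of the routing step is that the rubric, not the model, decides what reaches a reader; regulated or liability-laden content would never take the light-review path.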
SO: Yeah, and I mean, this is not really a new problem, and it’s not a problem specific to AI either. We’ve had numerous projects where there was a delta between, let’s say, the product design docs, the engineering content, and the code, the as-designed documentation, and the actual reality of the product walking out the door, the as-built product. All that source material you’re talking about, right, that we claw and scrape at. And I would like to also give a shout-out to the role of the anonymous source for the investigative journalists, because I feel like there’s some important stuff in there. But you go in there and you get all this as-designed stuff, right? Here’s the spec, here’s the code, here are the code comments, whatever. Or here’s the CAD for this hardware piece that we’re walking out the door. But the thing that actually comes down the factory assembly line or through the software compiler is different from what was documented in the designs, because reality sets in and changes get made. And in many, many, many cases, the role of the technical writer was to ensure that the content they produced represented reality and not the artifacts they started from. So there’s a gap, and it was their job to close that gap so that the document goes out and it’s accurate, right? And when we talk about these AI or automated workflows, any automation that does not take into account the gap between design and reality is going to run into problems. The level of problem depends on the accuracy of your source materials. Now, I wrote an article the other day that referred to the 100% accurate product specification. I don’t know about you, but I have never seen one of those in my life.
MI: Hahaha that’s absolutely true. That’s really true.
SO: The promise we have here is that AI is going to speed things up, automate things, and make us more productive. And I think you and I both believe that’s true at a certain level. How do you talk to executives about this? How do you find that balance between the promise of what these new tool sets can do for us, what automation looks like, and the risk that is introduced by the limitations of the technology itself? What does that conversation look like? What are the points that you try to make? What’s the roadmap for somebody that’s trying to navigate this, maybe in a smaller organization, with people that are all-in on “just get the AI to do it”?
MI: That’s a great question too, because we need to remind them that the current state of AI still carries with it a probabilistic nature. And no matter what we do, unless we add more deterministic structural methods to guardrail it, things are going to be wrong even when all the input is right. AI can still take a collection of collateral and get the order of the steps wrong. It can still include things it shouldn’t, or do too much. As professional writers, we’ve been trained to write minimalistically. We can control some of that through prompting, and some of it can be done with guardrails. But when you think about writing tech docs, some people might think we’re just documenting APIs or documenting tasks, and we’ve always been heavily task-oriented. You can extract all the correct steps, and all the correct steps in the right order, but what doesn’t come along with them, all too frequently and almost universally, is the context behind them, the why of it. I always say we can extract great things from code for APIs, like endpoints and gets and puts and things like that. That’s great for creating reference documentation for programmers.
But it doesn’t tell you the why, and it doesn’t tell you the exact steps. The code doesn’t tell you that. Now maybe your Figma does. If your Figma and your design docs have been done really well and comprehensively, that can mitigate it tremendously. But what have we done in this business? We’ve actually let go more UX people than probably even writers, which is counterproductive. And then you’ve got things like the happy path and the alternate paths that can exist through the use of a product, or the edge cases, right? The what-ifs that occur. We are able to do better, and should, with the happy path, but the happy path is not the only path. These are multifunction beasts that we’ve built. When we built iPhone apps, we often didn’t need documentation because they did one thing and they did it really well. You take a piece of middleware, and it can be implemented a thousand different ways. You’re going to document it by example and maybe give some variants. You’re not going to pull that from a Figma design. You’re not going to pull that from code. There’s too much of it there. It takes human judgment to look at it all and say: this is important, this is less important, this is essential, this is non-essential, to actually deliver useful information to the end user. And we need to be able to show what we can produce and continue to iterate and try to make it better and better, because someday we may actually get pretty darn close. With support articles and completed support case payloads, we were able to develop an AI workflow that very often was 70% to 100% accurate and ready to publish.
But when you talk about user guides and complex applications, it’s another story, because somebody builds a feature for a product, and that feature boils down not into a single article but into an entire collection of articles that are typed into the kind of breakdown that we do for disclosure: concepts, tasks, references, Q&A. So AI has got to be able to do something much more complex, which is to look at content, classify it, and apply structure to separate those concerns. Because we know that when we deliver content in the electronic world, we’re no longer delivering PDF. Well, most of us are hopefully not delivering PDF books made up of long chapters that intersperse all of these different content types, because of the way content is consumed now, certainly not by AI and AI bots. So maybe the bottom line here is that we need to show what we can do. We need to show where the risks are. We need to document the risks, and then we need the owners, the business decision makers, to see those risks, understand those risks, and sign off on those risks. And if they sign off on the risks, then I, as a technology developer and an information developer, can sleep at night, because I was clear on what it can do today. And that is not a statement that says it’s not going to be able to do that tomorrow. It’s only a today statement, so that we can set expectations. And that’s the bottom line. How do we set expectations when there’s an easy button, like the one Staples put in our faces? Because that’s the mentality around AI: press a button and it’s automatic.
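For illustration only, a first pass at that classify-and-separate step might look like the sketch below. The heuristics are stand-ins; a real pipeline would lean on the shop’s taxonomy and a trained classifier or LLM rather than regexes.

```python
import re

def classify_unit(text: str) -> str:
    """Rough first-pass typing of a content unit into the concept/task/
    reference/Q&A breakdown. Heuristics here are purely illustrative."""
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if re.search(r"^\s*\d+\.\s", text, flags=re.M):  # numbered steps suggest a task
        return "task"
    if first_line.endswith("?"):                     # a leading question suggests Q&A
        return "qa"
    if re.search(r"\b(parameters?|returns?|endpoint|syntax)\b", text, re.I):
        return "reference"                           # API-shaped content
    return "concept"                                 # default: explanatory prose

# Typing each unit up front is what lets one feature write-up become a
# collection of articles instead of a long chapter that intersperses types.
```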
SO: Yeah, and I did want to briefly touch on knowledge base articles, because they’re a really, really interesting problem. In many cases you have knowledge base articles that are essentially bug fixes or edge cases: when I hold my finger just so and push the button over here, it blue screens.
MI: Mm-hmm.
SO: And that article can be very context-specific, in the sense that you’re only going to see the problem if you have these five things installed on your system. And/or it can be temporal or time-limited, in the sense that once we fixed the bug, it’s no longer an issue. Okay. So you have this knowledge base article, and you feed it into your LLM as an information source going forward, but we fixed the bug. So how do we pull it back out again?
MI: I love that question.
SO: I don’t!
MI: I love it. No, I’ve actually been working for a couple of years on this very particular problem. The first problem we have, Sarah, is that we’ve been so resource-constrained that when doc shops built an operations model, the last thing they invested in was operations and operations automation. When I’m at a conference with a big room of 300 professional technical doc folks, I love asking the simple question: how do you track your content? And inevitably, I get, yeah, well, we do it in Excel spreadsheets. When I ask who actually has a digital system of record, I get a few hands. And then I ask: does that digital system of record, for every piece of documentation you’ve ever published, span just the product doc, or does it span more than product doc, like your developer, your partner, your learning, your support, all these different things? Because the customer doesn’t look at us as those different functions. They look at us as one company, one product. And inevitably, I’m lucky if I get one hand in the audience that says, yeah, we’re actually doing that. So the first thing they don’t have is a contemporary digital system of record that lets us know, and lets us automate notifications about, when a piece of documentation should either be re-reviewed and revalidated or retired and taken out.
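A digital system of record doesn’t have to be elaborate to beat a spreadsheet. The sketch below is a minimal, hypothetical illustration of the two notifications Michael mentions, revalidate and retire, and it also speaks to Sarah’s earlier question about pulling a fixed-bug article back out of the corpus. All field names and intervals are assumptions, not any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class DocRecord:
    doc_id: str
    family: str                     # "product", "developer", "partner", "learning", "support"
    last_validated: date
    review_interval_days: int = 180
    fixed_in_version: Optional[str] = None  # for KB articles tied to a specific bug

def _ver(v: str) -> tuple[int, ...]:
    return tuple(int(p) for p in v.split("."))  # "9.7.2" -> (9, 7, 2); sketch-grade parsing

def needs_attention(rec: DocRecord, current_version: str, today: date) -> Optional[str]:
    """Flag a doc for retirement or revalidation instead of tracking it by hand."""
    if rec.fixed_in_version and _ver(current_version) >= _ver(rec.fixed_in_version):
        return "retire"      # the bug is fixed; pull the article out of every corpus and index
    if today - rec.last_validated > timedelta(days=rec.review_interval_days):
        return "revalidate"  # stale; route it back to a writer or SME for review
    return None
```

Run nightly across every content family, a check like this is what turns “how do we pull it back out again?” into an automated notification rather than an archaeology project.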
The other problem we have is that almost all of these AI implementations in companies, not completely, but most of them, were based on building vector databases. And what they did, often while completely ignoring the doc team, was just go out to the different sources they had available: Confluence, SharePoint. If you had a CCMS, they’d ask you for access to your CCMS or your content delivery platform, and they’d suck it in. They may date-stamp it, which is okay but pretty rudimentary. And they may even have methods for rereading those sources every once in a while, but unless they’re rebuilding the entire vector database, they’re not. And what did they do when they ingested the content? They shredded it up into a million different pieces, right? Because the context windows for large language models have limits on token counts and things like that. Maybe they’re bigger today, but they’re still limited. So how would they even replace a fragment of what used to be whole topics and whole collections of topics? This is why we wrote the paper, did the implementation, and shared with the world what we call the document object model knowledge graph: because we needed a way, outside of the vector database, to say go look over here, and you can retrieve the original entire topic, or a collection of topics, or related topics in their entirety to deliver to the user. And again, unless we update that content and stop treating it like a frozen snapshot in time, we’ll still have those content debt problems. But it’s becoming a much bigger problem now. It wasn’t as big a problem when we put out chatbots. And we’ve been building chatbots for, what, two, three, four years now? Everybody celebrated, they popped the corks: we can deflect some percentage of support cases, customers can self-service. And I always talk about the precision paradox: once you reach a certain ceiling, it gets really hard to increment above that 70%, 80%, 85%, 90% window. And as you get closer and better, the tolerance for being wrong drops like a rock. And now you have a real big problem.
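The core idea, as described here, is a side index that survives the shredding: every chunk in the vector store points back to the intact topic it came from, metadata and relations included. The sketch below illustrates that mapping; it is not the implementation from the paper, and all structures and names are invented for the example.

```python
# Built at ingestion time, before the content is chunked for the vector store.
CHUNK_TO_TOPIC = {
    "chunk-041": "topic:install-overview",
    "chunk-042": "topic:install-overview",
    "chunk-097": "topic:upgrade-task",
}

# Topics kept whole, with the taxonomy metadata that chunking would strip away.
TOPIC_STORE = {
    "topic:install-overview": {
        "body": "...",
        "metadata": {"product": "ExampleApp", "version": "13", "type": "concept"},
        "related": ["topic:upgrade-task"],
    },
    "topic:upgrade-task": {
        "body": "...",
        "metadata": {"product": "ExampleApp", "version": "13", "type": "task"},
        "related": [],
    },
}

def retrieve_whole_topics(matched_chunk_ids: list[str]) -> list[dict]:
    """Map vector-search hits back to intact topics, deduplicated."""
    topic_ids = {CHUNK_TO_TOPIC[c] for c in matched_chunk_ids if c in CHUNK_TO_TOPIC}
    return [TOPIC_STORE[t] for t in sorted(topic_ids) if t in TOPIC_STORE]
```

Because the topic store keeps version and type metadata intact, it is also the natural place to apply the retire/revalidate flags from the system of record above.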
So how do we make these guardrails more deterministic, to mitigate the probabilistic risk and the reality that we have? The problem is that people are still looking for fast and quick, not right. When I say right, I mean building out things like ontologies and leveraging the taxonomies we labored over, with all of that metadata that never even gets into the vector database, because they strip it all away in addition to shredding it up. So if we don’t start building things like knowledge graphs and retaining all of that knowledge, we’re compounding the problem. Now we have debt, and we have no way to fix the debt. And now we get into the new world of agentic workflows, which is the true bleeding edge right now, where you have sequences of both agentic and agentive steps. The difference between those two, by the way, is that agentic is autonomous: there’s no human doing that task, it’s just doing it. Agentive means a human in the loop, helping. When you’ve got a mix of agentive and agentic processes in a business workflow, now you’ve got to worry about what happens if something goes wrong early in the sequence of that workflow. And this doesn’t apply just to documentation, by the way. We’ll be seeing companies taking very complex workflows in finance and in marketing and in business planning and reporting, and mapping out: this is the workflow our humans do. And there are hundreds of steps, if not more, and many roles involved in those workflows. And as we map those out and ask where we can inject AI, not as individual tools, like separately using a large language model or a single agent, but stringing them together to automate a complex business workflow with dependencies upstream and downstream, how are we going to survive and make this work? I think that’s why you saw the MIT study come out saying that roughly only 5% or so of AI projects are succeeding. And I think that’s because we did the easy stuff first. We did the chatbots, and they could be lossy in terms of accuracy. But when you get to these agentic workflows that we’re building, literally coding as we speak, now you’re facing a whole different ballgame, where precision and currency really matter.
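To make the agentic/agentive distinction concrete, here is a minimal sketch of a mixed workflow in which autonomous steps run straight through and human-in-the-loop steps gate the chain, so an early error can’t propagate downstream. The step structure and approval hook are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]  # transforms the shared workflow state
    agentic: bool                # True = autonomous; False = agentive (human in the loop)

def run_workflow(steps: list[Step], state: dict,
                 approve: Callable[[str, dict], bool]) -> dict:
    """Execute a mixed agentic/agentive chain, halting on human rejection."""
    for step in steps:
        state = step.run(state)
        if not step.agentic and not approve(step.name, state):
            # Stop the chain rather than let a bad intermediate result
            # feed every downstream step in the sequence.
            raise RuntimeError(f"Output rejected at step {step.name!r}")
    return state
```

The gate is exactly the worry Michael raises: in a long chain with upstream and downstream dependencies, the cost of an early mistake compounds with every step that consumes it.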
SO: Yeah, and I think we’ve really only scratched the surface of this. Both of the articles you’ve mentioned, the one that I started with and the one that you brought up in this context, we’ll make sure we get those into the show notes. I believe they’re on your, is it Medium? On your website. So we’ll get those links in there. Any final parting words in the last, I don’t know, fifteen seconds or so?
MI: No, that’s good. I want to tell you the good news and the bad news for tech doc professionals. What I’m seeing in the industry hurts me. I think there’s a lot of excuse-making right now, not just in the tech doc space but in all jobs, where we’re seeing AI being used as an excuse to make business decisions to scale back. It may take some time until the impact of some of the poor business decisions being made shows itself, but reality is going to hit. And the question is, how do we navigate the interim? I’m confident that we will. Those of us that are building the AI, I feel like I’m evil and a savior at the same time. I’m evil because I’m building automation that can speed things up and make people much more productive, meaning you potentially need fewer people. At the same time, when we do it ourselves, rather than an engineer that doesn’t even know the documentation space, we get to redefine our space ourselves and not leave it to the whims of people that don’t understand the incredible intricacy and dependencies of creating what we know as high-quality content. So we’re in this tumult right now, but I think we’re going to come out of it. I can’t tell you what that window looks like, and there will be challenges along the way, but I would rather see this community redefine its own future in this transformation that is unavoidable. It’s not going away. It’s going to accelerate and get more serious. But if we don’t define ourselves, others will. And I think that’s the message I want our community to take away. So when we go to conferences and show what we’re doing, and we’re open and we’re sharing all the stuff that we’re doing, that’s not “hey, look at us.” That’s: come back to the next conference and the next webinar, show us what you took from us and made better, and help shape and mold this transformative industry that we know as knowledge and content. I’m excited, because I want to celebrate every single advance that I see as we share. I think it’s incumbent upon us to share and be vocal. And when I write my articles, they’re aimed not only at our own community but at the executives and the technologists themselves, to educate them. Because if we don’t do it, who will? It falls on all of us to do that.
SO: I think I’m going to leave it there, with a call for the executives to pay attention to what you are saying and to what many others in this community are saying. So, Michael, thank you very much for taking the time. I look forward to seeing you at the next conference and seeing what more you’ve come up with. And we will see you soon.
MI: Thank you very much.
SO: Thank you.
Conclusion with ambient background music
CC: Thank you for listening to Content Operations by Scriptorium. For more information, visit Scriptorium.com or check the show notes for relevant links.
Want more content ops insights? Download our book, Content Transformation.