Keeping our network infrastructure healthy at Microsoft with an employee-built AI agent


As at many global companies, the network engineering environment here at Microsoft is gigantic.
It spans 88 countries, more than 700 buildings, 64,000 devices, 7,500 Microsoft Azure Virtual Networks, and nearly 150 lab sites. It’s a system that serves more than 220,000 employees and generates its fair share of service tickets, more than 170,000 per year.
How do you keep something of that size healthy?
Joshua Green and Soundarya Tekkalakota wondered if their team in Microsoft Digital, the company’s IT organization, could build an AI agent with Microsoft 365 Copilot to help accomplish this goal. Green, an Infrastructure and Engineering Services (IES) principal software engineering manager, and Tekkalakota, an IES product manager, quickly realized that the answer was a resounding yes, if they sprinkled in a helping of artificial intelligence and machine learning.
“We essentially put an AI lens on the network engineering challenges that already existed, and that our engineering teams have been dealing with for years,” says Tekkalakota, who served as lead product manager on the effort. “We decided to use AI to enable faster gathering of information and data insights, and to identify network problems more quickly and efficiently—this would give our network engineers more time to take the human actions needed to resolve issues.”
That kicked off their AI journey, in which they and their team built a custom engine agent before eventually using the extensibility capabilities of Microsoft 365 Copilot to create a declarative AI agent. The result is Network Copilot (also known as Network Infrastructure Copilot, or “NiC”), a powerful tool that provides support for various networking and infrastructure-management tasks and helps us work toward our goal of operating the industry’s most secure and reliable enterprise network.
Importantly, Network Copilot is another proof point in our ongoing journey to show how we’re benefitting from Microsoft 365 Copilot internally here at Microsoft. Read this story to learn how we’re thinking about AI agents internally at Microsoft and to get our guidance on how to get started with them at your company.
The spark of inspiration

Network Copilot originated in a hackathon project in early 2023, inspired by the excitement at that time around generative AI and ChatGPT. Tekkalakota pulled together a small group of AI enthusiasts and launched the effort to develop a tool that would be able to simplify network management tasks.
“These were network engineers who were at that intersection of new tech enthusiasts and experts in their particular job,” Tekkalakota says. “We leaned on them heavily in the first few iterations of the project, collecting their feedback manually on what the right queries were. And as time went on, we kept adding more and more of these enthusiastic users to help us build the community, and to test the tool and gather feedback.”
On the engineering side, the project started out with a custom-agent approach, reflecting the available technology at that point in time.
“We went with a conversational agent built on Semantic Kernel and Azure OpenAI, because that was the only option at the time,” Green says. “Over time, we switched to a declarative-agent model based on the Microsoft 365 Copilot capabilities that were being released. In a sense, Network Copilot is the story of how fast AI technology is progressing, and how it’s becoming faster and easier to develop these kinds of tools.”
Improving network services with Network Copilot
Generative AI tools excel at one of the biggest challenges that network engineers face in their day-to-day work: how to quickly track down the specific information needed to resolve a network issue.
“There’s something like five to eight different steps in the network management workflow, and many of them have a manual component,” Tekkalakota says. “Network engineers drill through siloed documents like wikis and troubleshooting guides, data sources such as infrastructure data lake and incident management (IcM), and more to define data insights and documentation. We wanted to make this search faster and easier for these engineers.”
The answer was Network Copilot, an AI chat interface in which engineers can use natural-language queries to gain insights and determine recommended actions without leaving the flow of their work process.
“It’s a great solution because it keeps them in the context of their current work,” Tekkalakota says. “They don’t have to step out of the network lifecycle management task that they’re currently in to find answers. It gives them the next step in a concise, summarized manner—something that they would have to spend multiple hours tracking down outside of their context.”
The use of natural language to access network telemetry in real time is one example that Green cites when talking about how Network Copilot is transforming the way that engineers do their job.
“I can ask NiC, ‘What’s the network health of Building 32?’ and it will run a query against the network telemetry data,” he says. “Then it summarizes the results in a nice, clean report for the user, including details on risks and recommendations for that building’s network. Then the engineer can take the appropriate action.”
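To make that flow concrete, here is a minimal Python sketch of the summarization step: filter telemetry for one building, flag risks, and return a short report. It is not Network Copilot's actual implementation. The `DeviceReading` schema, the packet-loss threshold, the function name, and the sample readings are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceReading:
    """One telemetry sample for a device (hypothetical schema)."""
    building: str
    device: str
    packet_loss_pct: float
    online: bool


def summarize_building_health(readings, building, loss_threshold_pct=2.0):
    """Build a short plain-text health report for one building."""
    rows = [r for r in readings if r.building == building]
    if not rows:
        return f"No telemetry found for {building}."

    # Flag devices that are offline or losing too many packets.
    offline = [r.device for r in rows if not r.online]
    lossy = [r.device for r in rows
             if r.online and r.packet_loss_pct > loss_threshold_pct]

    lines = [f"Network health for {building}: {len(rows)} devices checked."]
    if offline:
        lines.append(f"Risk: offline devices: {', '.join(offline)}.")
    if lossy:
        lines.append(f"Risk: high packet loss on: {', '.join(lossy)}.")
    if offline or lossy:
        lines.append("Recommendation: inspect the devices listed above.")
    else:
        lines.append("No risks detected.")
    return "\n".join(lines)


# Made-up sample data, not real telemetry.
SAMPLE_READINGS = [
    DeviceReading("Building 32", "ap-01", 0.1, True),
    DeviceReading("Building 32", "ap-02", 5.4, True),
    DeviceReading("Building 32", "sw-01", 0.0, False),
    DeviceReading("Building 40", "ap-09", 0.0, True),
]

print(summarize_building_health(SAMPLE_READINGS, "Building 32"))
```

In a real agent, the language model would translate the user's question into the query and phrase the final report. The sketch covers only the deterministic part in between.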
Transforming network engineering with a Copilot agent

Network Copilot development journey
The initial development of Network Copilot as a custom agent meant it relied on plug-ins to give it more flexibility.
“We first built NiC in a very modular way, and all its capabilities were done with plug-ins and APIs,” Green says. “For example, we provided a library of more than 1,000 queries, which were written by the teams that know the data best (like the wireless team, which wrote queries to check the health of wireless access points). So, when Copilot is able to access that data, it can stand toe-to-toe with the network engineers because it’s able to draw on that same knowledge base.”
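One way to picture this modular design is a registry in which each team contributes its own named query builder. The Python sketch below is an assumption about how such a library could be organized, not a description of the team's code. The registry, the decorator, the `wireless_ap_health` plug-in, and the table and column names in the query text are all hypothetical.

```python
from typing import Callable, Dict

# Registry mapping a capability name to the function that builds its query.
QUERY_PLUGINS: Dict[str, Callable[[str], str]] = {}


def register_query(name: str):
    """Decorator that adds a team-authored query builder to the registry."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        QUERY_PLUGINS[name] = func
        return func
    return decorator


@register_query("wireless_ap_health")
def wireless_ap_health(building: str) -> str:
    # Placeholder query text; the table and column names are invented.
    # Production code should pass the building as a query parameter
    # instead of interpolating it into the query string.
    return (f"AccessPointStatus | where Building == '{building}' "
            "| summarize count() by State")


def run_plugin(name: str, building: str) -> str:
    """Look up a registered query builder and return the query it produces."""
    if name not in QUERY_PLUGINS:
        raise KeyError(f"No query plug-in registered as '{name}'")
    return QUERY_PLUGINS[name](building)


print(run_plugin("wireless_ap_health", "Building 32"))
```

Because each query lives behind a name, the team that owns the data can add or update a plug-in without touching the rest of the agent.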
Then, when declarative agents were released in 2024, the development strategy shifted to take advantage of these faster, less code-heavy solutions.
“One of the things we’re always trying to do at Microsoft is provide low-code and no-code options,” Green says. “That’s what Microsoft 365 Copilot is focused on. Or you can go with full-code development, do it all yourself and have ultimate control and customization. Our journey with NiC was kind of a hybrid approach. We’re still on the journey from full code to low code; we’re not there yet.”
Overcoming the challenges of AI tool adoption
As Green, Tekkalakota, and the team began rolling out Network Copilot to larger and larger groups of network engineers, they began running into some of the challenges inherent in widespread AI tool adoption.
“The first thing was just the cultural change of our engineers building the daily habit of using the tool, because it’s not always top of mind for them,” Tekkalakota says. “It’s the stickiness factor, and that’s something we’re still working on. The other challenge was what we came to call ‘prompter’s block,’ where the engineers weren’t sure what to ask in the NiC chat, or they wouldn’t keep querying to get better results. So, we put out newsletters and did road shows to educate them on the tool and how to use it. It’s more about a larger cultural shift.”
One major takeaway from this process was that users wanted more integrated and one-click solutions for interacting with Network Copilot.
“Some of it might be contextual, where we’re able to integrate NiC on a specific tab or page or in a specific web application,” Green says. “In some cases, it could be in the form of a button they click that sends a pre-created prompt to the back end. It’s a more simplified approach, rather than just giving people a free-range chat interface where they can ask anything.”
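The button idea can be sketched as a lookup from a button ID to a pre-written prompt template, filled in with context from the page the engineer is on. The button IDs, the templates, and the `build_prompt` helper below are hypothetical, and the step that sends the prompt to the back end is left out.

```python
from string import Template

# Pre-created prompts keyed by a button ID (all names are illustrative).
PROMPT_BUTTONS = {
    "building_health": Template("What's the network health of $building?"),
    "explain_error": Template("Help me with this error: $error_text"),
}


def build_prompt(button_id: str, **context: str) -> str:
    """Fill the template behind a button with values from the current page."""
    template = PROMPT_BUTTONS.get(button_id)
    if template is None:
        raise ValueError(f"Unknown button: {button_id}")
    # substitute() raises KeyError if a required field is missing,
    # so an incomplete prompt is never sent.
    return template.substitute(**context)


print(build_prompt("building_health", building="Building 32"))
```

A design like this also helps with "prompter's block," since the engineer never has to compose the question from scratch.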
The impact of Network Copilot
Today, Network Copilot is available to our company’s network professionals through an internal preview and is used by more than 200 network engineers. By surveying users, Tekkalakota has already been able to show that NiC has made a significant difference in terms of employee time and effort.
“We’ve found that NiC can cut the amount of time engineers take searching for documentation and insights by 20 to 25 minutes for each successful prompt,” she says. “It also drastically reduces documentation time and has cut live incidents down by 10%.”
This finding is backed up by employees such as Brandon Hughes, a senior service engineer who played an important role in developing Network Copilot.
“Being able to extract data through natural-language questions is a huge departure from having to manually write a Kusto query, which could take you a few hours to refine in order to get the exact output that you want,” Hughes says. “Whereas in NiC, I can spend five minutes questioning it like a human and get a response that includes specific data points from the actual databases. We get a huge amount of value from Network Copilot on a day-to-day basis.”
Hughes and others are also working on extending the capabilities of Network Copilot to handle tasks such as generating customer update emails, suggesting troubleshooting steps based on service ticket details, and generating postmortem reports. They even hope to add the ability for NiC to analyze images of network environments and provide feedback and optimization suggestions.
Taking a wider view, agents like Network Copilot offer the ability to manage complexity and empower users to accomplish more, no matter their role.
“In general, these agents are going to make our lives easier,” says Abhishek Kumar, a software engineer who also assisted in the development of Network Copilot. “We’re always working to reduce complexity, and agents take that a step further—decreasing complexity where it’s needed but allowing the full breadth of complexity when required. They’re enabling users to do things they normally wouldn’t be able to do.”
Network Copilot and AI agents: The journey continues

Tekkalakota and Green know that, for all that Network Copilot can do now, the team has only scratched the surface of the potential that AI agents have to change the way IT, and the world, works.
“I think we’re one of the earlier efforts at Microsoft to build an AI agent, figuring out what skills it needs to have and then building them,” Tekkalakota says. “The next steps are to build on the agent capabilities that it already has, adding things like monitoring or predictive alerting. Then, eventually be able to connect to other agents; having a connected experience between Copilot agents is the uber goal.”
Green emphasizes that when it comes to AI, the pace of change is remarkable.
“It’s still early days for AI agents, and things are moving and changing extremely quickly,” Green says. “What we did with Network Copilot was kind of like building a foundation. Now we’re working on adding more capabilities. The potential is great—we’re just seeing the tip of the iceberg.”

We learned some important lessons while developing Network Copilot that you can draw on when creating your own AI agent solutions, including:
- The team found it most effective to slowly build a community of enthusiastic users, continually soliciting feedback and ideas for improvements from these early adopters.
- Users expect an AI agent to “just work” with one prompt. Query debugging features (“Help me with this error”) and contextual prompts encourage users to engage in a conversation to generate the information they need.
- Users want the AI agent to know everything that their team knows. The Network Copilot team continues to expand the tool’s knowledge base with additional troubleshooting documents, network config files, and data sources.
- It’s helpful if the agent is accessible from the UI the users are already in, so the team is working on an embedded Network Copilot experience in their custom web apps that offers buttons for commonly used functions.
- Frequently requested use cases for Network Copilot include network device deployment failure remediation, network health and inventory, troubleshooting, and log monitoring for anomalies.
- Technology moves fast. The team built Network Copilot in a modularized way (using plug-ins and APIs) so that they could adjust to the latest AI capabilities as they were released.
- Follow best practices for accessing data from external sources, ensuring that your data is secure and sensitive information isn’t exposed.

