Figure 1: Semantic Parser Architecture
We started by passing the request through an NLP engine, which uses a pre-trained, language-specific model to analyze a given sentence, tag each word with its part of speech (POS), and then build a graph of the dependencies between the words based on their POS. The NLP engine can also identify and tag words as people, places, numbers, etc. For example, the parsing results for "All of Dave’s mail going to salesforce.com should go via security broker Lima" can be seen in figure 2.
Figure 2: NLP Engine Analysis
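To make this concrete, here is a minimal sketch of that kind of analysis using spaCy as a stand-in (we are not claiming this is the engine used in the actual system); it prints each word's POS tag, its dependency edge, and the named entities found in the example sentence:

```python
# Minimal sketch of the analysis an NLP engine performs, using spaCy as a
# stand-in for whichever engine is actually used.
import spacy

nlp = spacy.load("en_core_web_sm")  # pre-trained, language-specific model
doc = nlp("All of Dave's mail going to salesforce.com should go via security broker Lima")

# Part-of-speech tags and the dependency edge to each token's head
for token in doc:
    print(f"{token.text:<12} POS={token.pos_:<6} dep={token.dep_:<10} head={token.head.text}")

# Named entities (people, places, numbers, etc.)
for ent in doc.ents:
    print(ent.text, ent.label_)
```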
In order to add network semantics, we built what we call "lexical lists" (LLs). An LL is a collection of keywords, regular expressions, code snippets, and application rules that, combined, can identify words and map them into SNR fields. Each LL targets a specific field. We built our LLs to be hierarchical so that common sections could be inherited between LLs. We soon found, however, that standard POSIX regular expressions were not enough to capture the complexity of a language. So we enhanced the regular expressions with the ability to run validation rules on capture groups, replace the content of capture groups according to predefined rules and checks, and even run snippets of code on capture groups to take advantage of all the data provided by the NLP engine and perform more complicated checks and substitutions.
Validation rules tell the LL engine how to evaluate the LL and let us apply Boolean conditions to the results. By default, all the regular expressions and keyword lists are used, but validation rules can override them.
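As a hedged illustration, an LL entry might look something like the sketch below; the field name, keywords, regex, and rules are purely illustrative and not the actual LLs:

```python
import re

# Hypothetical sketch of a lexical list (LL) entry targeting a single SNR field.
BROKER_LL = {
    "field": "security_broker",
    "keywords": ["broker", "security broker"],
    "patterns": [re.compile(r"\bvia security broker (?P<broker>\w+)", re.I)],
    # Validation rule on capture groups: a Boolean condition that can override
    # the default keyword/regex result.
    "validate": lambda groups: groups["broker"].lower() not in {"the", "a"},
    # Substitution/normalization run on the captured value.
    "normalize": lambda value: value.capitalize(),
}

def apply_ll(ll, sentence):
    """Map a sentence into the LL's target SNR field, if the rules allow it."""
    for pattern in ll["patterns"]:
        match = pattern.search(sentence)
        if match and ll["validate"](match.groupdict()):
            value = next(iter(match.groupdict().values()))
            return {ll["field"]: ll["normalize"](value)}
    return {}

print(apply_ll(BROKER_LL, "All of Dave's mail should go via security broker Lima"))
# {'security_broker': 'Lima'}
```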
The LLs worked, but they were not very robust, since they required us to think of every possible way someone might phrase a request, which wasn't feasible. They remained valuable for special and edge cases, where we found we needed to define precedence rules for parsing and mappings. So we turned to deep learning (DL) to make our engine more robust. We set out to train our own named entity recognizer (NER), based on a deep neural network (DNN), to map a request into SNR fields.
The first problem we faced was getting training data. As any data scientist will tell you, getting good, tagged data is the most important and difficult part of building a DL model. Since no one had ever tried using NL for networking requests, we had no pre-existing data to use. As our entire team consisted of only three people, it would have taken us a very long time to create and tag enough data for decent training. We're talking about millions of input samples! We could have tried to recruit other people to help, but again, we were looking to prove the feasibility of our system within a short timeframe.
Eventually, we decided to build our own data generator. We studied the typical structure of network requests in the chosen sub-domains and extracted several thousand templates that define all possible request types. Then we created lists of words for each field in the templates, while also drawing on our LLs’ keyword lists. This let us synthetically produce more than 15 quadrillion (15×10¹⁵, or 15,000,000,000,000,000) different tagged requests. We used the generator to randomly produce 3 million sentences per sub-domain to train our NER on.
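A toy version of such a generator might look like the following; the templates and word lists here are made up for illustration (the real templates numbered in the thousands), but the idea is the same: filled slot values double as the tags needed for supervised training:

```python
import random

# Illustrative sketch of a template-based training-data generator.
TEMPLATES = [
    "All of {user}'s {traffic} going to {destination} should go via {service}",
    "Block {traffic} from {user} to {destination}",
]
WORD_LISTS = {
    "user": ["Dave", "Alice", "the finance team"],
    "traffic": ["mail", "web traffic", "DNS queries"],
    "destination": ["salesforce.com", "10.0.0.0/8"],
    "service": ["security broker Lima", "the proxy in site 3"],
}

def generate(n, seed=0):
    """Randomly fill templates with words, yielding (sentence, tags) pairs."""
    rng = random.Random(seed)
    for _ in range(n):
        template = rng.choice(TEMPLATES)
        fills = {k: rng.choice(v) for k, v in WORD_LISTS.items()}
        # The slot values serve as the tagged fields for NER training.
        yield template.format(**fills), fills

for sentence, tags in generate(3):
    print(sentence, "->", tags)
```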
As it turned out, this method worked very nicely for a single sub-domain, and our trained NER correctly tagged many words it had never seen during training. However, when trained on multiple sub-domains at once, the NER got confused and couldn't cope. We needed to first identify the specific sub-domain and then run the NER using a domain-specific model.
Thus, we built a sub-domain classifier, again using a DNN. We augmented the generator to also include the sub-domain classification and generated 9 million new sentences, which we used to train the classifier and the domain-specific NER models. Our classifier achieved 99% accuracy across various sets of 9 million sentences. We also realized that running the classifier before the LLs would eliminate false positives, give better SNR mappings, and reduce the runtime of our semantic parser. Finally, we put everything together by first running the classifier and then the domain-specific LLs and NER model, applying our scoring metric to the results of both.
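Structurally, that two-stage parse can be sketched as follows; the classifier, LLs, and NER models below are trivial stand-ins for the trained components, just to show the call shape:

```python
# Sketch: classify the sub-domain first, then run only that domain's LLs and
# NER model, and rank the candidate SNR mappings from both with a score.
def parse_request(sentence, classify, lls_by_domain, ner_by_domain, score):
    domain = classify(sentence)
    candidates = []
    candidates += lls_by_domain[domain](sentence)   # LL-based SNR mappings
    candidates += ner_by_domain[domain](sentence)   # NER-based SNR mappings
    return domain, sorted(candidates, key=score, reverse=True)

# Toy stand-ins for the trained models.
domain, ranked = parse_request(
    "All of Dave's mail going to salesforce.com should go via security broker Lima",
    classify=lambda s: "security_routing",
    lls_by_domain={"security_routing": lambda s: [{"user": "Dave", "service": "Lima"}]},
    ner_by_domain={"security_routing": lambda s: [{"user": "Dave", "destination": "salesforce.com"}]},
    score=lambda snr: len(snr),
)
print(domain, ranked)
```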
Next, we added a phase where we weeded out invalid mappings (where some fields were missing or doubly mapped, etc.) and filled in default values for empty fields, such as "all" when no source or destination was specified. Lastly, an evaluation step used our scoring metric to choose the "best" SNR mapping for the given query.
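A simplified sketch of that resolution step is shown below; the required fields and defaults are illustrative and would be defined per sub-domain:

```python
REQUIRED = {"service"}                              # illustrative, per sub-domain
DEFAULTS = {"source": "all", "destination": "all"}  # filled when unspecified

def resolve(candidates):
    """candidates: lists of (field, value) pairs produced by the parser."""
    valid = []
    for pairs in candidates:
        fields = [f for f, _ in pairs]
        if len(fields) != len(set(fields)):         # a field was doubly mapped
            continue
        snr = dict(pairs)
        if not REQUIRED.issubset(snr):              # a required field is missing
            continue
        valid.append({**DEFAULTS, **snr})           # fill defaults for empty fields
    return valid

print(resolve([
    [("user", "Dave"), ("service", "Lima")],        # kept
    [("service", "Lima"), ("service", "broker")],   # dropped: doubly mapped
    [("user", "Dave")],                             # dropped: no service
]))
```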
It should be noted that the semantic parser itself was built to be generic. We used sub-domain-specific LLs, as well as generator templates for the classifier and NER training (all in the form of JSON files), to teach it the chosen networking sub-domains.
We would start by parsing the NL request while applying network semantics, extracting the various bits of information, and mapping them into an SNR. To make sure we got everything right, we would show the user what we understood and get their approval.
Next, we would pass the SNR through a verification mechanism to confirm that all the required fields had been filled in and to translate the human-understandable terms into machine terms (such as addresses, ports, etc.). Once everything was ready, we would pass the SNR through an NMS-specific module that builds the actual network request from the supplied fields. In our case, we used an NMS configured via a REST API. In a bit more detail, figure 4 shows the actual flow we built.
Figure 4: SNR Processing Flow
The user speaks their request, and the speech-to-text mechanism converts it to text. The request passes through the semantic parser to give us the set of SNR candidates. Just as humans might understand the same sentence in several different ways, so might our parser, and we need to examine each option to find the best one. We pass the candidates through a resolver that eliminates bad candidates according to predefined rules on required fields (or sets of fields) for the specific sub-domain, and fills in default values for empty fields. The candidates are ranked using our custom scoring metric and passed through an evaluator that chooses the best match according to the given scores. Once we have our "best" SNR in hand, we pass it through a standardizer to translate all the data into machine/network terms.
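The standardizer can be pictured as a set of lookups from human terms to network terms; the tables below are illustrative only, and in the real system much of this data comes from the NMS and the knowledge base described further down:

```python
# Illustrative standardizer: translate human-readable SNR values into
# machine/network terms (addresses, ports). Lookup tables are hypothetical.
HOSTS = {"Dave": "10.1.2.34"}
SERVICES = {"mail": [("tcp", 25), ("tcp", 587)]}

def standardize(snr):
    out = dict(snr)
    if snr.get("user") in HOSTS:
        out["src_ip"] = HOSTS[snr["user"]]        # user -> address
    if snr.get("traffic") in SERVICES:
        out["ports"] = SERVICES[snr["traffic"]]   # traffic type -> protocol/ports
    return out

print(standardize({"user": "Dave", "traffic": "mail", "service": "Lima"}))
```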
At this point, we might have several SNRs, since some requests translate into several network operations, such as setting both directions of a network flow, handling both TCP and UDP, or handling multiple port/address ranges. We then pass those SNRs through a dedicated NMS translator to convert everything into what the specific NMS expects, and then use an NMS-specific API runner to build the corresponding REST API calls and carry them out.
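A hedged sketch of that final API-runner step is shown below; the endpoint URL and payload shape are placeholders, since the real call structure depends entirely on the NMS's REST API:

```python
import requests

# Illustrative API runner: each standardized SNR is translated into the
# payload the NMS expects and sent over its REST API.
NMS_URL = "https://nms.example.com/api/v1/policies"   # placeholder endpoint

def run_snrs(snrs, token):
    for snr in snrs:
        payload = {                                    # hypothetical payload shape
            "source": snr.get("src_ip", "all"),
            "destination": snr.get("destination", "all"),
            "ports": snr.get("ports", "any"),
            "action": snr.get("action", "redirect"),
            "service": snr["service"],
        }
        resp = requests.post(
            NMS_URL,
            json=payload,
            headers={"Authorization": f"Bearer {token}"},
            timeout=10,
        )
        resp.raise_for_status()                        # fail loudly on NMS errors
```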
This all looked good, but when trying some new real-world examples, we found that we were still missing some knowledge. The system didn’t know anything outside of the language and semantics we incorporated into it, but humans know more and have a context in which they make their requests. For example, a human technician would know what "the company site in San Jose" is, but the machine didn’t. How could we add such capabilities to our system?
Our solution was two-fold. Firstly, we decided to harvest the chosen NMS for data it already had regarding networks, VPNs, VLANs, hosts, etc. Secondly, we added an adaptive learning knowledge base (ALKB). The first time the system encounters something it can't translate into network/NMS terms, we ask the user to do this for us. We add that knowledge to our ALKB so we can retrieve it the next time we encounter the same term. We use an iterative process in which we review all the unknowns and let the user fill them in. After a couple of iterations, we have the full mapping and translation and can build the REST calls used to configure the system.
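A minimal sketch of how such an ALKB could work, assuming a simple JSON file as the backing store (the file name and prompt are illustrative):

```python
import json
from pathlib import Path

# Sketch of an adaptive learning knowledge base: unknown terms are asked of
# the user once, then persisted so later requests resolve automatically.
ALKB_PATH = Path("alkb.json")  # hypothetical backing store

def load_alkb():
    return json.loads(ALKB_PATH.read_text()) if ALKB_PATH.exists() else {}

def resolve_term(term, alkb):
    if term not in alkb:
        alkb[term] = input(f"What does '{term}' map to in network terms? ")
        ALKB_PATH.write_text(json.dumps(alkb, indent=2))  # learn for next time
    return alkb[term]

# e.g. resolve_term("the company site in San Jose", load_alkb())
# asks the user once, then returns the stored mapping on later requests
```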
Figure 6: The system is running the request through the semantic parser
Figure 7: The request has been parsed and the SNR mapping is shown after translation to NMS terms. Some information is still missing, so the user is asked to make a choice.
Figure 8: The user makes a choice
Figure 9: Now the SNR is fully resolved
Figure 10: We can also see the list of API calls that will be carried out to fulfill the request; in this case, a single call is needed
Figure 11: The NMS is being configured…
Figure 12: The request was successfully carried out!
Figure 13: This is the NMS dashboard showing the new configuration that was added to fulfill the request
