This document explores how existing IETF YANG models, specifically the Attachment Circuit (AC) and Traffic Engineering (TE) topology models, can be applied to support a representative Neotec use case involving dynamic AI model placement at Edge Cloud sites. The use case, derived from China Telecom's Neotec side meeting at IETF 122, involves selecting optimal Edge locations for real-time AI inference based on end-to-end network performance between street-level cameras and Edge Cloud compute nodes. By mapping the use case across multiple network segments and applying relevant YANG models to query bandwidth, latency, and reliability, this document serves as a practical exercise to evaluate model suitability and identify potential gaps. The goal is to inform future work under the proposed Neotec Working Group.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 25 October 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document explores using the Attachment Circuits YANG model [opsawg-teas-attachment-circuit] and the TE topology YANG model to support a simplified version of the use case described in China Telecom's Neotec side meeting at IETF 122 (Cloud-Aware Network Operation for AI Services). It also specifies the APIs needed to support this simplified use case.¶
Simplified Use Case:¶
Let's assume there are 10 Edge Cloud sites. A City Surveillance AI model (e.g., detecting traffic congestion or classifying garbage) needs to be deployed dynamically to some of these sites in response to real-time events.¶
High-level steps for selecting Edge sites to instantiate the City Surveillance AI model:¶
- Step 1: The Cloud Manager needs to query the network connectivity characteristics (bandwidth, latency, topology constraints, etc.) between street cameras (or the gateways and eNBs that connect those street cameras) and candidate Edge Cloud sites in order to determine the optimal locations for deploying the City Surveillance AI model.¶
- Step 2: Based on the information gathered, the Cloud Manager decides to deploy the City Surveillance AI module in 4 of the 10 Edge Cloud sites.¶
High-level steps to support the following desired outcome:¶
- Suppose the City Surveillance AI modules in the 4 Edge Cloud sites need to exchange large volumes of data with strict performance constraints (e.g., XX Gbps bandwidth and YY ms latency). The goal is to have the network controller dynamically adjust UCMP (Unequal Cost Multipath) load-balancing algorithms on all the nodes along the paths interconnecting those 4 sites.¶
Disclaimer¶
The use of specific YANG models (e.g., Attachment Circuit and TE topology) in this section is intended as a provisional exercise to explore how existing IETF models might address aspects of the Neotec use case. These examples are not exclusive or exhaustive. Other models, such as network slicing YANG modules or service function chaining models, could also be relevant depending on the network architecture and service requirements. The intent is to identify potential applicability and gaps, not to pre-define the final solution set.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
In this use case, multiple Edge Cloud sites are located in close geographic proximity, allowing any of them to potentially host AI inference model instances for city surveillance tasks such as traffic congestion detection or garbage classification. The Cloud Manager must evaluate the end-to-end data paths between the street-level cameras and each Edge Cloud site to determine which sites offer sufficient compute resources and the required network performance (e.g., low latency and high throughput). This assessment enables dynamic placement of AI inference models in response to real-time events while ensuring reliable data delivery from the field.¶
This path typically spans three segments. The first is the access segment from the camera's local access node, such as a WiFi Access Point or an eNB, to a PE router. The second segment traverses the provider transport network between that Access PE and the PE router serving the candidate Edge Cloud site. The third segment connects that Edge PE to the Edge Cloud Gateway or compute node where the AI workload is deployed.¶
There are two primary types of access connectivity for the cameras: via cellular networks (e.g., through eNBs/gNBs and UPFs) or via WiFi Access Points. In cellular-based access, cameras connect to the network through eNBs, which forward user data to User Plane Functions (UPFs) via GTP-U tunnels. The UPFs serve as termination points for GTP-U tunnels and are often co-located with or adjacent to Edge Cloud sites. When the Cloud Manager selects Edge Cloud sites for hosting AI inference modules, it must ensure that the corresponding nearby UPFs are assigned as the serving UPFs for the associated eNBs. This enables the establishment of GTP-U tunnels from the eNBs directly to the selected Edge Cloud locations, minimizing latency and improving efficiency for real-time AI processing.¶
For cameras connected through WiFi Access Points, no GTP tunneling is involved. These access points typically connect to PE routers through Layer 2 or Layer 3 links. In this case, the Attachment Circuit (AC) YANG model can be directly used to represent the logical connectivity between the WiFi AP and the Access PE. The Cloud Manager or orchestrator can query the AC model to evaluate operational status, available bandwidth, latency, and packet loss, and determine whether the path is capable of supporting the target AI workload.¶
In both access scenarios, after determining viable Edge Cloud sites, the Cloud Manager must also evaluate the transport network (Segment 2) between the access-side PE (or the PE adjacent to the UPF) and the Edge-side PE. This segment is typically traffic-engineered and can be modeled using the IETF TE topology model [RFC8795] and the TE tunnel model [I-D.ietf-teas-yang-te]. These models expose metrics such as available bandwidth, TE delay, and administrative attributes, which the orchestrator can use to assess whether the transport paths meet the end-to-end SLA constraints.¶
Finally, the last segment (Segment 3: from the Edge PE to the Edge Cloud Gateway or compute node) is again modeled using the AC YANG model. By combining insights from the AC and TE models across all three segments, the Cloud Manager and orchestrator can select Edge Cloud sites that not only have available compute capacity but also meet the network performance requirements to support low-latency, high-bandwidth AI model inference.¶
             +---------------------+
             |   Street Cameras    |
             +---------------------+
                /             \
       (Cellular Access)  (WiFi Access)
              /                 \
        +--------+        +-------------+
        |  eNB   |        |   WiFi AP   |
        +--------+        +-------------+
             |                   |
  +----------------------+ +----------------------+
  |         UPF          | |   Access PE Router   |
  +----------------------+ +----------------------+
             |                   |
             |     Segment 1     |
             |<----------------->| (Access Segment)
              \                 /
               \               /
            +----------------------+
            |  Provider Transport  |
            |     Network (TE)     |
            +----------------------+
                       | Segment 2
                       | (PE-to-PE)
                       v
          +---------------------------+
          | Edge PE Router (per site) |
          +---------------------------+
                       |
                       | Segment 3
                       v
             +--------------------+
             | Edge Cloud Gateway |
             | + AI Model Module  |
             +--------------------+
For the first and third segments, the Attachment Circuit (AC) YANG model, defined in [opsawg-teas-attachment-circuit], provides a standardized abstraction of the logical link between a service access point and the provider's routing infrastructure. This model allows the Cloud Manager to retrieve configuration and operational status of these links, including encapsulation type, bandwidth, and link status. By querying this information, the orchestrator can determine whether each access circuit is operational and capable of delivering real-time video streams to or from the AI inference point.¶
The second segment, the PE-to-PE transport path across the provider network, requires querying a different set of YANG models. The base network and topology models defined in [RFC8345] provide the foundation, while the Traffic Engineering Topology model [RFC8795] adds details such as TE metrics, link delay, bandwidth availability, and administrative constraints. These models are typically maintained by network controllers and made accessible via NETCONF or RESTCONF interfaces. The Cloud Manager can use this information to assess the performance characteristics of available transport paths between PEs. In addition, the orchestrator may use the TE tunnel model [I-D.ietf-teas-yang-te] to discover existing LSPs or establish new traffic-engineered paths. In architectures using a PCE, path queries can also be issued via PCEP or a REST API to identify optimal transport paths that meet performance policies.¶
By combining the Attachment Circuit and TE topology models, the Cloud Manager can construct an end-to-end view of network connectivity from each camera access point to every potential Edge Cloud site. It can then select the subset of Edge Cloud locations that not only have sufficient computing resources but also meet the required network performance criteria for serving the selected set of street cameras. This allows dynamic, network-aware placement of AI inference models, ensuring efficient use of edge infrastructure while meeting service quality requirements.¶
The Attachment Circuit (AC) YANG model provides a standardized abstraction to describe and manage the logical connections between customer-facing interfaces and the provider's edge infrastructure. In the context of dynamic AI model placement at Edge Cloud sites, this model is particularly useful for querying the health and performance of the two edge segments of the data path: Segment 1, which connects Access Points or eNBs to Access PE routers, and Segment 3, which links Edge PE routers to Edge Cloud Gateways or compute nodes.¶
Each attachment circuit is modeled with attributes such as administrative state, operational status, encapsulation type, and configured bandwidth. When augmented with telemetry or traffic engineering extensions, the model also supports operational performance data such as current utilization, available bandwidth, latency, and even packet loss. This enables the Cloud Manager or orchestrator to assess whether a given access or egress segment can support the performance requirements of the AI inference workload, typically low latency (e.g., less than 10 ms) and guaranteed throughput (e.g., greater than 500 Mbps).¶
For example, if the Cloud Manager knows the PE router address at the access or edge side, it can issue a RESTCONF GET query to retrieve the state of attachment circuits connected to that PE. Consider the following RESTCONF request:¶
GET https://<controller>/restconf/data/ietf-ac:attachment-circuits
    /ac[pe-address='192.0.2.11']
{ "ietf-ac:ac": [ { "name": "ac-to-eNB-001", "pe-address": "192.0.2.11", "ce-interface": "ge-0/0/1", "oper-status": "up", "admin-status": "enabled", "encapsulation-type": "dot1q", "bandwidth": { "configured": 1000, "available": 850, "unit": "Mbps" }, "performance-metrics": { "latency": { "average": 5, "max": 10, "unit": "ms" }, "packet-loss": { "percentage": 0.01 } } } ] }
In this example, the attachment circuit is operational (oper-status: up), with 850 Mbps of available bandwidth and an average latency of 5 ms, making it a strong candidate for supporting a real-time AI application. The Cloud Manager may apply similar queries to all PE addresses associated with candidate Edge Cloud sites and their corresponding access points, filtering the results to identify which circuits meet both the latency and throughput thresholds.¶
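Since only a handful of leaves feed the placement decision, the Cloud Manager can also narrow each query with the RESTCONF "fields" query parameter [RFC8040] rather than retrieving the full subtree. A minimal sketch, reusing the hypothetical controller and PE address from the example above:¶
GET https://<controller>/restconf/data/ietf-ac:attachment-circuits
    /ac[pe-address='192.0.2.11']?fields=name;oper-status;
    bandwidth(available);performance-metrics(latency(average))
This returns only the circuit name, operational status, available bandwidth, and average latency, keeping the per-PE polling cost low when many candidate sites are screened.¶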
Segment 2 of the end-to-end path connects the Access PE, adjacent to the street-level access points, to the Edge PE router serving a candidate Edge Cloud site. This segment traverses the provider's core or aggregation transport network and plays a critical role in determining whether the latency and bandwidth requirements of real-time AI inference can be satisfied. To evaluate this segment, the Cloud Manager can leverage IETF YANG models that expose both the topological and performance characteristics of traffic-engineered (TE) or IGP-routed networks.¶
The foundational models used for this purpose are the IETF Network Topology YANG model [RFC8345] and the Traffic Engineering Topology YANG model [RFC8795]. These models represent network nodes (e.g., PE routers), TE links (interconnecting core routers or PE-PE paths), and associated attributes such as available bandwidth, latency, SRLGs, and link metrics. In addition, TE tunnels modeled via the TE tunnel YANG module [I-D.ietf-teas-yang-te] can be queried to retrieve the current state of established Label Switched Paths (LSPs) or Segment Routing tunnels between PEs.¶
For example, to query available transport links from an Access PE at PE1 to a set of Edge PEs, the Cloud Manager may retrieve the list of TE links and their attributes using the following RESTCONF-style request:¶
GET https://<controller>/restconf/data/ietf-te-topology:te-topologies
    /topology=default/te-node=PE1
{ "te-links": [ { "name": "PE1-to-PE5", "link-id": "link-001", "oper-status": "up", "te-default-metric": 30, "te-bandwidth": { "max-link-bandwidth": 10000, "available-bandwidth": 7000, "unit": "Mbps" }, "delay": { "unidirectional-delay": 8, "unit": "ms" }, "adjacent-te-node-id": "PE5" } ] }
In this example, the link from PE1 to PE5 offers 7 Gbps of available bandwidth with 8 ms unidirectional delay, which may satisfy a 500 Mbps, sub-10 ms latency requirement for AI data ingestion. The Cloud Manager can evaluate similar TE links or tunnels to other Edge PEs (e.g., PE6, PE7) and compare their performance characteristics.¶
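For contrast, a comparable entry for a link toward another Edge PE (all values below are illustrative, not drawn from a real deployment) shows how a candidate would be rejected:¶
{
  "name": "PE1-to-PE6",
  "link-id": "link-002",
  "oper-status": "up",
  "te-default-metric": 50,
  "te-bandwidth": {
    "max-link-bandwidth": 10000,
    "available-bandwidth": 400,
    "unit": "Mbps"
  },
  "delay": {
    "unidirectional-delay": 14,
    "unit": "ms"
  },
  "adjacent-te-node-id": "PE6"
}
With only 400 Mbps available and 14 ms unidirectional delay, this link fails both the 500 Mbps and sub-10 ms thresholds, so Edge Cloud candidates reached via PE6 would be deprioritized.¶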
Alternatively, if the network has a PCE deployed, the Cloud Manager can issue a path computation request, using PCEP, to query the end-to-end path metrics and validate whether a PE-to-PE segment can meet the specified SLA constraints, as sketched below. If Segment Routing (SR-MPLS or SRv6) is deployed, these models can also include the SR label stacks or SID lists needed for forwarding decisions.¶
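Neither PCEP nor a specific REST binding is mandated here; purely for illustration, a path computation request in the same style as the Neotec API endpoint described later in this document (the endpoint and field names are hypothetical) might look like:¶
POST /network-advisor/compute-path

{
  "source-pe": "PE1",
  "destination-pe": "PE5",
  "constraints": {
    "min-bandwidth-mbps": 500,
    "max-latency-ms": 10
  }
}
and return whether a compliant path exists, e.g.:¶
{
  "feasible": true,
  "computed-path": ["PE1", "P3", "PE5"],
  "path-latency-ms": 8,
  "available-bandwidth-mbps": 7000
}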
By querying the TE topology and tunnel models, the orchestrator can build a filtered set of feasible transport segments that support the expected latency and bandwidth for the AI workload. This insight, combined with data from the Attachment Circuit models for Segments 1 and 3, allows the orchestrator to make holistic decisions about AI workload placement and optimal traffic steering across the network.¶
The previous sections assume that the transport network between Access PEs and Edge PEs is traffic-engineered and that the orchestrator has access to detailed models, such as ietf-te-topology, ietf-te, or SR policy models. However, in some deployments, particularly those relying on Internet transit or a best-effort IP core, traffic engineering is not explicitly available.¶
In these cases, performance visibility and decision-making must rely on more general network telemetry. Useful data can still be obtained through IETF defined YANG models, including:¶
- The Interface Management YANG model [RFC8343] and the IP Management YANG model [RFC8344], for retrieving interface status and link-level attributes¶
- Operational telemetry streamed in real time via YANG-Push subscriptions [RFC8639] [RFC8641], to gather metrics such as interface utilization, packet loss, round-trip time (RTT), and jitter, especially when these are measured using active probes or synthetic monitoring (see the subscription sketch after this list).¶
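As an illustration of the YANG-Push option, the minimal sketch below establishes a periodic subscription via the establish-subscription RPC [RFC8639], with the update period expressed in centiseconds as defined in [RFC8641] (500 centiseconds = five seconds); the controller URL and interface name are placeholders:¶
POST https://<controller>/restconf/operations/
     ietf-subscribed-notifications:establish-subscription

{
  "ietf-subscribed-notifications:input": {
    "ietf-yang-push:datastore": "ietf-datastores:operational",
    "ietf-yang-push:datastore-xpath-filter":
      "/ietf-interfaces:interfaces/interface[name='ge-0/0/1']/statistics",
    "ietf-yang-push:periodic": {
      "period": 500
    }
  }
}
The resulting notification stream gives the Cloud Manager a rolling view of utilization and loss without repeated polling.¶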
The Cloud Manager's query can trigger the network orchestrator to collect and aggregate per-hop or per-segment metrics using these models or active measurements (similar to IP SLA). The resulting data allows the Cloud Manager to estimate end-to-end path performance and make workload placement decisions accordingly, even if deterministic path selection or policy enforcement is not possible.¶
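As a simple worked example, assume (as a simplification) that delay composes additively and bandwidth by the minimum across segments, and take the sample measurements from earlier sections (5 ms and 850 Mbps on the Segment 1 attachment circuit, 8 ms and 7000 Mbps on the Segment 2 TE link) plus an assumed 2 ms, 1000 Mbps Segment 3 circuit. The estimated end-to-end path then offers roughly 5 + 8 + 2 = 15 ms latency and min(850, 7000, 1000) = 850 Mbps of available bandwidth, which the Cloud Manager can compare against the workload's thresholds.¶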
While this approach lacks the fine-grained path control of TE-based networks, it still enables adaptive, network-aware service placement in less structured or loosely-managed environments.¶
This section outlines a potential Neotec API interface to enable Kubernetes (or its external workload orchestrators) to make network-aware workload placement decisions, such as selecting appropriate Edge Cloud sites for deploying latency-sensitive AI inference modules.¶
Cloud-native platforms such as Kubernetes do not consume YANG-modeled data directly. Instead, they rely on REST-style APIs or gRPC interfaces with JSON or protobuf-encoded payloads. To support Neotec use cases, the network infrastructure can expose an abstracted API that translates YANG-modeled topology and performance data into a form consumable by Kubernetes or its scheduling extensions.¶
A representative Neotec API endpoint could take the form:¶
GET /network-advisor/query-path-performance?
    source-node=<access-node-id>&target-node=<edge-site-id>
This API allows Kubernetes to query the performance of the network path between a street camera's access node (e.g., a WiFi AP or eNB/UPF) and a candidate Edge Cloud site. The API response includes metrics such as available bandwidth, average path latency, and operational status. A sample response might be:¶
{ "source-node": "AP-235", "target-node": "EdgeSite-4", "path-latency-ms": 6.5, "available-bandwidth-mbps": 920, "path-status": "healthy" }
This API provides a read-only, low-latency interface for workload schedulers to assess whether a given path meets predefined service-level thresholds (e.g., latency less than 10 ms, bandwidth greater than 500 Mbps). The source-node and target-node identifiers correspond to access nodes (e.g., PE routers adjacent to UPFs or WiFi APs) and Edge Cloud PE routers, respectively. These identifiers are assumed to be mapped within the operator domain.¶
The semantics of the fields are as follows:¶
- path-latency-ms: Derived from YANG-modeled metrics in ietf-te-topology, representing end-to-end unidirectional delay.¶
- available-bandwidth-mbps: Aggregated from te-bandwidth in the same topology model.¶
- path-status: Reflects operational state derived from oper-status fields in the TE and AC models.¶
The backend implementation of this API is expected to query IETF defined YANG models using RESTCONF or NETCONF. Specifically:¶
- Topology and path delay metrics are sourced from the ietf-te-topology and ietf-network-topology models [RFC8795], [RFC8345].¶
- Access circuit status and available bandwidth can be derived from the AC model [opsawg-teas-attachment-circuit].¶
This API represents a shim layer between the cloud-native environment and the YANG-driven network management domain, allowing real-time path evaluations without requiring Kubernetes to consume YANG directly.¶
In the Neotec use case described in Section 1, AI inference modules are deployed across four Edge Cloud sites to support distributed city surveillance. These modules periodically exchange large volumes of data, for instance during result aggregation or synchronized event analysis. The exchanges are not continuous but periodic and event-driven, requiring guaranteed bandwidth and low latency for short time windows.¶
The underlying network connecting these Edge Cloud sites typically includes multiple paths between nodes. These paths are pre-established during network provisioning and may be realized using technologies such as SRv6, SR-MPLS, MPLS-TE, or enhanced IP forwarding. However, by default, routers forward traffic across these paths using Equal-Cost Multipath (ECMP), which spreads traffic evenly without regard for service-specific requirements.¶
When AI data exchange events are triggered, it is critical that the AI flows receive prioritized and efficient use of the available transport capacity. This cannot be guaranteed under ECMP. Instead, the network must support UCMP (Unequal Cost Multipath), which allows traffic to be distributed unevenly across multiple paths based on their real-time bandwidth and latency characteristics.¶
To enable this behavior, the Cloud Manager or AI Service Controller must be able to dynamically trigger a change in the traffic distribution policy, activating UCMP across all routers involved in the paths connecting the Edge sites. This UCMP behavior should be time-bound, applying only during the data exchange window, after which the network should revert to its default behavior.¶
A simplified example of a cloud-initiated API call to the network controller might look like:¶
POST /network-policy/ucmp-activation

{
  "source-sites": ["EdgeSite-A", "EdgeSite-B"],
  "dest-sites": ["EdgeSite-C", "EdgeSite-D"],
  "start-time": "2025-05-01T10:00:00Z",
  "duration-sec": 300,
  "min-bandwidth-mbps": 5000,
  "max-latency-ms": 10
}
This request informs the network controller that a high-volume, low-latency data exchange will occur and that UCMP forwarding policies should be applied to optimize transport between the specified sites for the specified duration.¶
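No IETF model currently defines a confirmation for such a request (see the gap analysis later in this document); purely as an illustration, an acknowledgment could look like the following, with every field hypothetical:¶
{
  "policy-id": "ucmp-2025-0501-001",
  "status": "scheduled",
  "start-time": "2025-05-01T10:00:00Z",
  "expires": "2025-05-01T10:05:00Z"
}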
Assuming an SRv6 underlay, the network controller can use the ietf-sr-policy YANG model to update the traffic distribution weights across pre-established paths. For example, if three SRv6 paths exist between EdgeSite-A and EdgeSite-C, the controller can push the following configuration to the ingress node:¶
sr-policy {
  color 4001;
  endpoint "2001:db8:100::1";
  candidate-paths {
    preference 100;
    path {
      weight 70;
      sid-list [2001:db8:10::1, 2001:db8:11::2];
    }
    path {
      weight 20;
      sid-list [2001:db8:20::1, 2001:db8:21::2];
    }
    path {
      weight 10;
      sid-list [2001:db8:30::1, 2001:db8:31::2];
    }
  }
}
This UCMP configuration tells the network to distribute traffic unequally across the three paths in proportion to the configured weights (here roughly 70%, 20%, and 10% of flows), reflecting each path's capability. The underlying topology and metrics are derived from the ietf-te-topology and ietf-te models, which expose bandwidth, latency, and available resources for each link.¶
Similar UCMP behavior can also be implemented over SR-MPLS, MPLS-TE, or enhanced IP networks, using the corresponding IETF YANG models (ietf-te, ietf-routing, etc.). The key point is that the network paths are preexisting; the only dynamic action is adjusting how traffic is forwarded among them in response to a cloud service request.¶
The Neotec use case, supporting real-time, event-driven placement and coordination of AI inference workloads across Edge Cloud sites, requires close interaction between cloud orchestration platforms and programmable transport networks. While the IETF has standardized robust YANG models for traffic engineering (e.g., ietf-te-topology, ietf-sr-policy, and ietf-ac), these models are network-internal and fall short in addressing cloud-driven, time-scoped network adaptation requirements.¶
This document evaluates two core capabilities against existing IETF YANG models:¶
- Network-aware workload placement at Edge sites¶
- Dynamic UCMP policy activation for inter-site AI data exchange¶
From these exercises, the following gaps have been identified:¶
Most IETF YANG models are accessed via NETCONF/RESTCONF, and are designed for network operator tools. In contrast, cloud-native environments rely on REST/gRPC APIs, JSON payloads, and declarative interfaces (e.g., Kubernetes CRDs). There is no standardized translation layer that exposes network topology or path performance in a form consumable by external cloud orchestrators or AI services.¶
In the UCMP use case, the cloud controller must be able to request changes in traffic distribution policy across existing network paths for a specific time window, such as when AI modules begin inter-site synchronization. Current IETF models (e.g., ietf-sr-policy) allow weighted path configuration but do not support time-scoping, activation triggers, or scheduling. These functions are essential for on-demand, just-in-time optimization and must be added.¶
There is no YANG model or API that allows the cloud controller to associate a workload (e.g., "AI inference service X") with network traffic that should receive enhanced treatment. While SR policies can be assigned to colors or DSCPs, there is no abstracted intent interface for service-aware flow classification or network SLA expression based on application context.¶
Even for read-only functions like path selection, there is no IETF-standardized API to answer high-level queries such as:¶
- "Which Edge sites have less than 10ms latency from these access nodes?"¶
- "What is the bandwidth-latency profile of the path from Access PE A to Edge PE B?"¶
Operators must instead manually interpret TE link state and construct custom tooling. A standardized API to query the aggregated path metrics between logical service endpoints is missing.¶
There is no defined mechanism for the network controller to confirm whether a policy request (e.g., UCMP activation) was accepted or enforced, nor to notify the cloud when SLA targets are not being met during the activation window. A bi-directional feedback channel is required to support closed-loop coordination between the cloud and the network.¶
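Purely for illustration, such feedback could take the form of a notification like the one below, reusing the hypothetical policy-id from the earlier acknowledgment sketch; no existing IETF model defines these fields:¶
{
  "notification": "sla-violation",
  "policy-id": "ucmp-2025-0501-001",
  "observed-latency-ms": 14.2,
  "target-latency-ms": 10,
  "timestamp": "2025-05-01T10:02:30Z"
}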
While the IETF has defined robust identity and access control mechanisms (e.g., OAuth2, RPKI, TLS), integrating these mechanisms with cloud-native identity systems (such as Kubernetes RBAC, SPIFFE/SPIRE, or cloud IAM services) is still ad hoc. A standard framework for mutual trust establishment and token exchange between cloud and network domains is needed to support secure Neotec APIs under Zero Trust principles.¶
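One plausible building block, shown here only as a sketch, is OAuth 2.0 Token Exchange [RFC8693]: a cloud workload could exchange its platform-issued token (e.g., a SPIFFE/SPIRE JWT) for a network-domain access token before calling a Neotec API. The token endpoint and values below are illustrative:¶
POST /as/token HTTP/1.1
Host: auth.network.example.net
Content-Type: application/x-www-form-urlencoded

grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&subject_token=eyJhbGciOiJFUzI1NiIs...
&subject_token_type=urn:ietf:params:oauth:token-type:jwt
&audience=network-controller.example.net
What remains unstandardized is how the two domains agree on trust anchors, token semantics, and authorization scopes for network policy requests.¶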
The modeling challenges discussed in Section 6 highlight the need for standardized mechanisms that enable dynamic coordination between cloud service platforms and transport networks. These requirements are not specific to any one transport technology (e.g., SRv6, SR-MPLS, MPLS-TE, or native IP), but instead call for common, technology-agnostic abstractions that can be applied across diverse deployments.¶
Existing IETF working groups, such as TEAS, SPRING, IDR, and OPSAWG, are focused on protocol-specific data modeling and operational behavior. While these groups have developed rich YANG models for their respective technologies, they are not scoped to define cross-domain control inputs or service-layer integration mechanisms that enable the network to adapt dynamically based on evolving service demands.¶
The proposed Neotec Working Group provides a focused venue to address this gap. Neotec's scope includes:¶
- Defining transport independent policy triggers and control inputs that allow applications and orchestration platforms to request network behavior aligned with workload demands (e.g., bandwidth guarantees, latency constraints, traffic prioritization).¶
- Developing modular YANG model extensions to allow temporary, per-service adjustments to traffic treatment across existing underlay technologies.¶
- Specifying interoperable API interfaces consistent with the approach outlined in the Neotec API strategy [neotec-api-strategy], where IETF documents define the API behavior and semantics, and developer-facing details (e.g., OpenAPI specifications) are maintained via open-source collaboration.¶
- Establishing a framework for policy lifecycle management, including activation, expiration, and real-time status feedback, to support cloud-network coordination at runtime.¶
By concentrating on service-driven, SLA-sensitive network behavior, Neotec aims to bridge the operational divide between cloud-native service orchestration and IETF-modeled transport infrastructure. Its output will support reusable, vendor-neutral interfaces applicable across technologies and adaptable to dynamic service changes.¶
A dedicated WG will prevent fragmented development across protocol-specific groups and promote architectural consistency. It will also provide a platform for broader engagement, enabling collaboration with the Kubernetes ecosystem, open-source orchestration projects, and network operator communities. Neotec documents may explore solution frameworks and prototype workflows that extend the applicability of existing IETF models rather than replacing them.¶
Ultimately, Neotec seeks to define complementary APIs and interaction patterns that strengthen the cloud-network interface, enabling effective, scalable coordination of service placement, network telemetry, and policy enforcement across domains.¶
To be added¶
None¶
The authors would like to thank the following for discussions and for providing input to this document: xxx.¶