[Music]
Well, next up, please join me in
welcoming Murali Venkat to the NANOG
stage. Murali is a systems architect and
principal engineer at Cisco. He leads the
SONiC SmartSwitch engineering
architecture and delivery, and has
previously worked on multiple
generations of NPU ASICs and platforms
across service provider access, PreAgg,
and mobility. And by the way, this is his
second presentation today, which is
amazing. Please welcome him.
>> Hello. Am I audible? Yeah. Okay, cool.
Hi everyone. I know it's two talks
before beer, so bear with us for 30
minutes. There's a lot of light on my
face, so I can't really see you. I want
to introduce Fred. Fred Gao is my
co-partner and a principal engineer at
Cisco. He has worked on various
platforms, XR, Juno, you name it, he's
done all the stuff. So together we're
going to talk a little bit today about
SONiC-VPP. I don't know how many of you
here are familiar with SONiC. Maybe a
quick show of hands?
I see a few. VPP, any takers? Quick show
of hands. Okay. So this is a
developer-geared topic, but I'm going to
simplify it down to what we are trying
to do here. Primarily, we run a work
group in SONiC, the Virtual Dataplane
Workgroup, and Fred and I co-chair it.
We are doing this project under the
ambit of SONiC. We want to show what is
possible with SONiC today and leave you
with a few takeaways: things that you
can do in your environments and things
you may want to incorporate with SONiC.
So, with that said, I think the clicker
is here. Okay.
Okay. So, we want to give a quick view
of why SONiC and what SONiC does for
you, and give you a little background on
VPP, if you haven't heard of it or
played with it. It's also another
open-source project. We will also talk
about some of the features that we have
built into this framework and how they
may be useful for some of the work you
may be doing. We'll also leave you with
a few variants of what SONiC-VPP can do
for you: do more testing, use it as a
sandbox, or build out your IP/Ethernet
fabrics. You can do a lot of things with
SONiC or SONiC-VPP. We'll also share a
little of what we are doing in the
community with Azure, where SONiC is
housed as a Linux Foundation project.
We'll also talk about a few other
community use cases: people who are
using SONiC-VPP and what they're
actually doing. Some of them are SPs,
and you may be familiar with one or more
of them. And we'll also have a little
bit on how you can join hands with us.
Quick view. I borrowed this slide from
this morning, where we talked a little
about SONiC and DASH under the umbrella
of the SmartSwitch. SONiC is your NOS
for the NPU, and it can run on multiple
vendors' switches or ASICs, if you will.
It's a layered operating system: a more
modernized, containerized, modular
system. It builds off an infrastructure
software layer with base BSPs and a
bunch of routing protocols, all of that
glommed together, and it has a
manageability layer built on top. It's
no different from any other NOS, but
it's a modernized version of an
open-source NOS. On the other side you
have DASH, which is the Disaggregated
API for SONiC Hosts. Think of it as
SONiC-lite, built for things that can
run on DPUs, FPGAs, SmartNICs, what have
you. Here you have an API that programs
an overlay services application in a
manner similar to what the NPU would do,
and that runs primarily on DPUs. So
together you have SONiC on the NPU and
DASH on the DPU. You could use either,
depending on your model of operation.
Why SONiC-VPP? We are doing open source
on open source. VPP is an open-source
project, also a Linux Foundation effort,
under the umbrella of FD.io. SONiC is
run as a Linux Foundation project.
Really, SONiC does not have a compelling
software data plane. That's the bottom
line. It has something called VS, which
has an almost nonexistent feature set.
It's a very lightweight, testable
infrastructure, not really useful beyond
base functionality. VPP fills that gap.
VPP is a high-performing data plane with
a fairly heavy feature set, fully
compliant for various interoperability
testing, also open source and
community-driven, and it can run on any
commodity hardware, whether it's x86 or
an ARM-based CPU. You can compile it and
run it in any environment.
What this does is give you the tools to
do anything. It empowers everybody and
democratizes the equation for
testability, with an environment that is
completely built on open source:
open-source SONiC on an open-source
software data plane. There's no vendor
dependency. You have a real platform,
and you can extend both SONiC and VPP
all by yourself, with no help. That's
the beauty of having SONiC-VPP in your
service provider environments, or in any
environment where you're doing testing.
With a small set of nodes you can build
out your network and run all of the
tests, the automation, and the SONiC CLI
provisioning mechanisms, all in your
home network. That's pretty much what it
gives you. So today we are trying to
build the story of what this engine can
do as a development vehicle for data
center testing. Beyond that, I see
multiple sets of users, and VPP has
played in all those areas on the bottom
right side, in various ways, with
different NOSes and control planes. VPP
has already seen the light of day in
production networks. You really can't do
all of these with SONiC, obviously, and
I'm not expecting SONiC to do all of
these from the northbound. But there are
ways you can enable yourself to do any
of this functionality. Let's keep the
focus today on the sandbox for data
center testing.
A quick background. VPP stands for
Vector Packet Processing, and it's an
FD.io project, like I mentioned. It's
fully extensible and very highly
optimized, primarily for performance. It
builds on the x86 architecture's caching
and memory infrastructure to make packet
processing extremely efficient. It has
all the vanilla bread-and-butter
features already wired into the current
version of community VPP, and you can
always extend that. All the Layer 2 and
Layer 3 protocols, everything is ready
for you in terms of VPP's capabilities
today. It works on an architecture of
directed graph nodes: packets move from
one node to the next and traverse a
whole pipeline of functionality, and it
can be programmed through any of these
models.
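The directed-graph idea can be sketched in a few lines. This is only a toy model, not VPP code: real VPP nodes are written in C, and the node names and the forwarding rule below (forward anything in 10.0.0.0/8) are illustrative assumptions. It shows a whole batch ("vector") of packets being processed by one node before the survivors move on to the next node.

```python
from collections import defaultdict

# Toy node functions: each takes one packet and returns the name of the next node.
def ethernet_input(pkt):
    return "ip4-lookup" if pkt["ethertype"] == 0x0800 else "error-drop"

def ip4_lookup(pkt):
    # Hypothetical forwarding rule for the sketch: only 10.0.0.0/8 is routable.
    return "ip4-rewrite" if pkt["dst"].startswith("10.") else "error-drop"

def ip4_rewrite(pkt):
    pkt["ttl"] -= 1
    return "interface-output"

GRAPH = {
    "ethernet-input": ethernet_input,
    "ip4-lookup": ip4_lookup,
    "ip4-rewrite": ip4_rewrite,
}
TERMINAL = {"interface-output", "error-drop"}

def run_graph(vector, start="ethernet-input"):
    """Push a vector of packets through the graph, one node at a time."""
    pending = {start: list(vector)}
    done = defaultdict(list)
    while pending:
        node, batch = pending.popitem()
        next_batches = defaultdict(list)
        for pkt in batch:                      # the whole batch visits this node
            next_batches[GRAPH[node](pkt)].append(pkt)
        for nxt, pkts in next_batches.items():
            if nxt in TERMINAL:
                done[nxt].extend(pkts)
            else:
                pending.setdefault(nxt, []).extend(pkts)
    return dict(done)
```

Processing per node rather than per packet is what lets the real implementation keep instruction and data caches warm across a batch.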
Why is it so good? It's so good because
it can pump 20 gigs of bandwidth on a
single x86 core. Today, if you're doing
a 64-byte partial drop rate test with,
say, 0.5% drop, you can get up to 20
gigs or more just for V4 processing. And
if you're doing IPsec, you can get up to
40 gigs with an IMIX traffic flow. It's
really scalable, it has low latency, and
it uses the x86 cache very efficiently.
It uses AVX instructions, and it does
batch processing with no context
switches, so it can perform as a
real-world router would. So VPP is your
engine here, and we are building our
work on top of this engine. Clearly we
have the capability, with our skill set,
to extend VPP, and it's actually not
very hard.
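To put the 64-byte figure in perspective, here is a small worked calculation. It assumes the 20 Gbps is measured at Ethernet line rate (each frame carries 20 extra bytes on the wire for preamble, start-of-frame delimiter, and inter-frame gap) and assumes a 3 GHz core; neither assumption comes from the talk.

```python
# Per-frame wire overhead on Ethernet: 7 B preamble + 1 B SFD + 12 B inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20

def packets_per_second(bits_per_second, frame_bytes):
    """Packet rate needed to fill a link with frames of the given size."""
    wire_bits = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return bits_per_second / wire_bits

def cycles_per_packet(core_hz, pps):
    """CPU cycle budget per packet when one core handles the whole rate."""
    return core_hz / pps

pps = packets_per_second(20e9, 64)        # about 29.76 million packets per second
budget = cycles_per_packet(3e9, pps)      # about 100 cycles per packet at 3 GHz
```

Under these assumptions a single core has on the order of a hundred cycles per packet, which is why batching and cache efficiency matter so much.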
So today, like I mentioned, VPP has all
these features, and we bring in SONiC on
top of it. It does vanilla Layer 2 and
Layer 3 switching and routing. Len here,
who is like the forefather of all of
these capabilities, would appreciate
that this is still running and is what
glues all our networks together today.
In this version of SONiC-VPP we're
adding VXLAN capabilities and bringing
SRv6 capabilities into the network, and
my colleague Fred is going to talk a
little more about those. Essentially,
this completes the base feature set of a
vanilla router that you want for testing
and building out your infrastructure,
and for validating your SONiC work, your
data plane, or your automation
infrastructure on top of it. Both of
these projects run on their own trains
and have their own release schedules.
SONiC comes out twice a year, VPP comes
out twice a year, and we are constantly
working with the leading-edge versions
of both. It's all upstreamed, and we
have Git repos where you can download
the packages and run with them.
In terms of maturity, build footprint,
and things like that, all of it is
streamlined, packaged, and documented in
the community on GitHub. You can go
ahead and read it for yourself. In terms
of footprint for SONiC-VPP, on a typical
six-core Intel CPU you can take two
cores for SONiC and four cores for VPP,
and you will have a fully functioning
router. You can optimize that depending
on the number of ports you want to
expose. So there are ways to slice and
dice your compute infrastructure and
build out a series of these routers on
the infrastructure you want to test
with. All of this is very well
documented. We'll show you pointers
before we get off the stage. This is
basically where we are today, and I will
hand it off to Fred for the rest of the
conversation. Thank you.
>> Hello everyone. Next we are going to
deep dive into the SONiC-VPP design and
implementation. SONiC expects the data
plane, the ASIC SDK of the platform, to
implement the SAI API. SAI stands for
Switch Abstraction Interface. It
provides a fixed set of operations on a
set of object types. The fixed
operations are basically create, update,
delete, and get, and it has about 100
object types. The VPP data plane side
uses a binary API, which is a
message-passing API. It has three main
message types. Request/reply has one
request and one reply. Dump/details has
one request and multiple replies, for
example to get all the interfaces in the
system.
And there are also events. An event has
one request for the event registration
and one reply for the registration
result, followed by multiple
notifications, one notification for each
occurrence of the event. It has a large
number of APIs.
Here we use IP route as an example to
show how we translate between the SAI
API and the VPP API. In this table, the
first column on the left shows the SONiC
operation: add, modify, delete, and get.
The center column is the SAI API, and on
the right is the corresponding VPP API.
We can see that VPP does not support
route modify, so for modify we have to
use delete and add.
The translation from the SAI API to the
VPP API is sometimes not trivial. It
requires intelligent glue logic. For
example, we need to keep state, or
sometimes we need to map multiple SAI
API calls to one VPP API.
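A minimal sketch of that glue logic for routes, assuming a single next hop per prefix. The class and method names are hypothetical, and `vpp_call` is a placeholder for whatever actually sends the binary API message; the message name `ip_route_add_del` is used here only as a label. The point is the two behaviors just described: modify becomes delete plus add, and the glue layer has to remember the old state to do it.

```python
class RouteTranslator:
    """Hypothetical glue layer mapping SAI-style route operations to VPP-style calls."""

    def __init__(self, vpp_call):
        self.vpp_call = vpp_call   # callable(name, **args); placeholder transport
        self.routes = {}           # prefix -> next hop, state kept by the glue layer

    def create(self, prefix, next_hop):
        self.vpp_call("ip_route_add_del", is_add=True, prefix=prefix, next_hop=next_hop)
        self.routes[prefix] = next_hop

    def set_next_hop(self, prefix, next_hop):
        # No modify on the data plane side: delete the old route, then add the new one.
        old = self.routes[prefix]
        self.vpp_call("ip_route_add_del", is_add=False, prefix=prefix, next_hop=old)
        self.vpp_call("ip_route_add_del", is_add=True, prefix=prefix, next_hop=next_hop)
        self.routes[prefix] = next_hop

    def remove(self, prefix):
        old = self.routes.pop(prefix)
        self.vpp_call("ip_route_add_del", is_add=False, prefix=prefix, next_hop=old)

    def get(self, prefix):
        # Answered from cached state in this sketch; a real layer might query the data plane.
        return self.routes.get(prefix)
```

Without the cached `routes` table, the translator would not know which next hop to delete when a modify arrives.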
Here's a view of the high-level
components in SONiC-VPP. At the top are
the control plane components. The
control plane programs the VPP data
plane through the SONiC-VPP SAI
implementation of the SAI API, and
internally the SAI implementation uses
the VPP binary API to talk to VPP. VPP
runs in a separate process, and VPP uses
DPDK to send and receive packets on the
interfaces. SONiC-VPP can be packaged as
a container or as a virtual machine. We
have streamlined the SONiC-VPP build for
x86. Now it can be built directly from
the sonic-buildimage repo, just like any
other platform.
SONiC supports gNMI and REST for
configuration. We also introduced a VPP
data plane API into the existing SONiC
API for data plane manageability and
debuggability.
We also use the sonic-mgmt test
framework to run SONiC-VPP validation
and sanity tests.
Here's another view of the SONiC-VPP
architecture. At the top, the blue box
has the SONiC control plane. It has CLI,
gNMI, and REST for configuration. It
uses a Redis database to store the
application tables and the config table.
SWSS is like the RIB/FIB in SONiC. It
programs the data plane through syncd,
and syncd programs the VPP data plane
using libsaivpp.so, which is the SAI
implementation for VPP.
Internally, it makes an IPC call to VPP
through the binary API, because VPP runs
in its own process.
Inside the VPP process there is
typically one master thread for the
control plane and multiple worker
threads. The worker threads are
responsible for data packet processing.
Depending on the number of interfaces
and the throughput required, there can
be multiple worker threads. For example,
in the SONiC Azure T1-LAG topology we
run SONiC-VPP with 32 interfaces, and
there we use three worker threads.
VPP uses the Linux CP (Linux control
plane) plugin to punt and inject packets
to and from the SONiC control plane, and
the netlink plugin listens for netlink
events for interface up and down. We
also added a VPP platform service to the
SONiC control plane. It is responsible
for generating the startup configuration
consumed by VPP. The startup
configuration has the memory
configuration, the CPU core
configuration, and, most importantly,
the mapping from VM interfaces to the
data ports in VPP.
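As a rough illustration of what such a generator might produce, the sketch below renders the three pieces just mentioned (memory, CPU cores, interface mapping) into VPP startup-configuration stanzas. The function and its parameters are hypothetical, the PCI addresses and port names are made up, and the file the SONiC-VPP platform service actually writes may look different.

```python
def render_startup_conf(main_core, worker_cores, heap_size, port_map):
    """Render a minimal VPP-style startup configuration as text.

    port_map maps a PCI address of a VM interface to the data port name
    it should get inside VPP (both hypothetical in this sketch).
    """
    lines = [
        "cpu {",
        f"  main-core {main_core}",
        f"  corelist-workers {','.join(str(c) for c in worker_cores)}",
        "}",
        "memory {",
        f"  main-heap-size {heap_size}",
        "}",
        "dpdk {",
    ]
    for pci_addr, port_name in port_map.items():
        lines.append(f"  dev {pci_addr} {{ name {port_name} }}")
    lines.append("}")
    return "\n".join(lines) + "\n"

conf = render_startup_conf(
    main_core=1,
    worker_cores=[2, 3, 4],
    heap_size="1G",
    port_map={"0000:00:04.0": "port0", "0000:00:05.0": "port1"},
)
```

Here one main core and three worker cores mirror the thread layout described for the 32-interface topology; the heap size is an arbitrary example value.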
Here's the VXLAN feature we recently
introduced to SONiC-VPP. It supports V4
and V6 VXLAN encap and decap. It's
primarily for VNET peering. It supports
V4 in V4, V4 in V6, V6 in V4, and V6 in
V6. It supports multihop BFD for route
protection. It also supports VXLAN ECMP
paths and primary and backup paths.
We also recently added SRv6 support to
SONiC-VPP. It supports uN, uA, SRv6
policy, traffic steering over SRv6
policy, and uDT4 and uDT6.
Because the VPP data plane already
supports the SRv6 functionality,
introducing this feature into SONiC-VPP
took one developer about three weeks to
implement. That's roughly how long it
takes to add a feature if the data plane
is ready.
We use the sonic-mgmt test framework for
SONiC-VPP sanity and validation. We have
created a CI pipeline for image build,
and we also created a CI pipeline to run
sanity tests that cover interfaces,
routing, routing protocols like BGP,
VXLAN, and SRv6.
It uses a PTF-driven topology for T0 and
T1, and currently we are primarily
focusing on the T1-LAG topology.
The DUT and all the supporting peers,
like T0, T2, and PTF, can run in one VM.
>> And you can say that it probably
takes about 16 cores, in Azure at least,
for any of these topologies to
>> Right. Yeah.
>> Right. In the Azure environment it
uses about 128 GB of memory and 16
cores. VPP uses four cores, the SONiC
control plane uses two cores, and the
remaining cores are for the T0 and T2
simulation.
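The core budget is simple arithmetic, sketched below. The function name and the choice to hand out consecutive core IDs are assumptions for illustration; the counts (16 total, 2 for SONiC, 4 for VPP) are the ones quoted above.

```python
def partition_cores(total, sonic, vpp):
    """Split core IDs 0..total-1 between SONiC, VPP, and the simulated peers."""
    if sonic + vpp > total:
        raise ValueError("not enough cores for SONiC and VPP")
    ids = list(range(total))
    return {
        "sonic": ids[:sonic],
        "vpp": ids[sonic:sonic + vpp],
        "peers": ids[sonic + vpp:],   # whatever is left for the T0/T2 simulation
    }

azure_vm = partition_cores(total=16, sonic=2, vpp=4)   # leaves 10 cores for peers
six_core = partition_cores(total=6, sonic=2, vpp=4)    # standalone router, no peers
```

The second call matches the six-core standalone footprint mentioned earlier in the talk, where nothing is left over for simulated neighbors.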
Here's another example of using
SONiC-VPP in a topology. Last year at
OCP we gave a demo of the virtual
SmartSwitch. We built the virtual
SmartSwitch using SONiC-VPP as the NPU
and BMv2 to implement the DPU data
plane. It is a sandbox for SmartSwitch
development and testing with all
open-source components.
Verification traffic can be sent from
PTF and sniffed from PTF. On top of
that, we can add T0 and T2 peers to
create a larger topology and run
sonic-mgmt tests. So we will use this
topology to run SmartSwitch sanity tests
in sonic-mgmt.
Here's the status of SONiC-VPP running
in Azure. Our goal is to enable
SONiC-VPP as a PR checker to complement
SONiC VS. SONiC VS does not have a real
data plane, so many tests that require
data plane features are either skipped
or run without traffic.
That means there are code coverage gaps.
Actually, recently, when we worked on a
SONiC-VPP test failure, we discovered a
bug in a sonic-mgmt test that had gone
unnoticed because of this lack of code
coverage. If SONiC-VPP had been enabled
as a PR checker, such an issue would
have been caught earlier.
Currently SONiC-VPP still runs nightly.
We have two build pipelines, one for the
202505 branch and the other for the
master branch. We also have a test
pipeline that runs sonic-mgmt tests on
the T1-LAG topology, and the most recent
result is 460 passed and one failure.
The failure we found is debatable,
because it is caused by a memory check
failure in one of the components. The
memory usage is calculated as the memory
used by the process divided by the total
system memory. But because SONiC-VPP has
a lower memory footprint, the system has
less memory, so the same memory usage
appears as a higher usage percentage. We
will work with the SONiC team to improve
the test case.
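A small worked example of why that check is sensitive to system size. The numbers and the 10% threshold are invented for illustration and are not taken from the actual test.

```python
GIB = 1024 ** 3

def memory_usage_percent(process_bytes, total_system_bytes):
    """Usage as the check computes it: process memory over total system memory."""
    return 100.0 * process_bytes / total_system_bytes

THRESHOLD_PERCENT = 10.0          # hypothetical pass/fail limit
process_memory = 2 * GIB          # same process, same absolute usage in both cases

on_large_system = memory_usage_percent(process_memory, 32 * GIB)   # 6.25
on_small_system = memory_usage_percent(process_memory, 8 * GIB)    # 25.0

passes_on_large = on_large_system <= THRESHOLD_PERCENT   # True
passes_on_small = on_small_system <= THRESHOLD_PERCENT   # False
```

The process behaves identically in both cases; only the denominator changed. A check based on absolute usage, or a threshold scaled to the platform, would avoid this kind of false failure.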
As next steps, we are working to promote
the build and test pipelines to PR
checkers for both the sonic-buildimage
repo and sonic-mgmt, and we are going to
expand to other topologies, T0,
multi-ASIC, and SmartSwitch, with more
feature coverage.
Next, I will hand it back.
>> Yeah, thank you, Fred.
>> Sorry, thank you.
>> So yeah, primarily we are trying to
encourage people who are looking at
SONiC with some level of seriousness. I
know SPs may or may not be on it yet,
but the large enterprises are looking at
SONiC. Having this environment would
speed up PR validation and get people
onboarded to SONiC a lot more easily,
and that's one of the key goals for any
data center roles that you may have in
your networks. This slide showcases the
current users of SONiC-VPP. Like Fred
described, Microsoft is using SONiC-VPP
as part of the Azure CI. So are we
within Cisco, eating our own dog food,
for quick setup, validation, and
teardown of various topologies. When we
say T1 and T0, these are leaf and spine
topologies in a typical IP fabric. The
sonic-mgmt test suite that we talked
about is geared towards that, and it is
definitely extendable to other
topologies. You can add your own cases
and scenarios as you see fit. All of it
is flexible: whether you want to enhance
the tests, enhance SONiC, or enhance the
data plane through VPP, you have full
power to do that.
Then there are our other friends:
Arista, guys like Steve, who was here,
from Nexthop, and others such as Bell
Canada and Pantheon. They're all using
SONiC-VPP for various test scenarios in
their environments. Arista is trying to
make some enhancements for SAI test
coverage, which is another gap in the
community, and SONiC-VPP provides a
sandbox for validating that. They're
adding more capabilities in addition to
VXLAN and SRv6, with IP-in-IP, again
another bread-and-butter data center
feature. Bell Canada is also using
SONiC-VPP for various infrastructure
testing roles in their network.
Pantheon, a startup out in Eastern
Europe, is also using it and demoing BGP
multipath capabilities using IP VXLAN or
EVPN multihoming with SONiC-VPP. One of
the power users of SONiC-VPP that we
have come across is Equinix. They are
using it in a hybrid multi-cloud
connectivity platform. Essentially,
they're building a gateway between all
the cloud providers, so SONiC-VPP is
playing a good role for them. They have
a laundry list of features they want
enhancements on, and we are going to be
working closely with them. It enhances
the infrastructure, with Equinix running
infrastructure as a service.
These are some of the current users out
there with SONiC-VPP. If any of you are
interested, I would encourage you to
come touch base with us. We run this
work group every week; we meet on
Tuesdays, online. It's part of the SONiC
work groups. If you go to the SONiC
wiki, you will see a pointer to this
particular work group, the Virtual
Dataplane Workgroup. Our current focus
is primarily on infrastructure
enhancements and features, and we want
to complete the test coverage. We are
not completely done validating
everything in the T0 and T1 roles that
we mentioned as part of sonic-mgmt. We
have some more to go. We solicit help,
and you can jump in. It's a quick way
for you to learn: make sure all those
tests are passing in SONiC-VPP, and then
we have a really productive engagement
after that to work on features and other
enhancements that we want to build into
the ecosystem. Like I said, we meet
every Tuesday at 9:00 a.m. Just sign up
on the mailing list, or reach out to me
or Fred, offline or today.
Open to any questions, collaborations,
ideas. I did meet with, I think, Naveen;
somebody on his team wanted to look at
SONiC, so this is a good starting point
for somebody there. SONiC-VPP will get
your feet wet and get you on the SONiC
bandwagon. Sooner or later you will have
to deal with SONiC, so this is one way
to get engaged. Thanks to the NANOG
committee for taking this talk in. I
really appreciate that. It's maybe
slightly outside the comfort zone of
many of the other technologies in
discussion here, but SONiC is going to
be spreading fast and wide fairly soon.
So thank you, NANOG team.
Any questions?
>> Sure.
>> Yeah. Do we support the warm reboot
feature with VPP?
Not yet, I would say.
>> Sorry?
>> Warm reboot.
>> No, no. Warm reboot, not
>> Right. Not yet.
>> Yeah.
>> Yeah. Warm reboot requires the data
plane to run separately from the control
plane. Exactly, right?
>> Yeah.
>> So I'm from LinkedIn. So we use 20K box.
So we use warm reboot with F4.
>> So I'm considering this VPP. So, very
good. But warm reboot is the main
feature.
>> But let's talk about that. Yeah,
>> maybe there is a way forward for that.
Yes,
>> with the Google Alpine, it's possible
going forward, because they run the data
plane and the control plane in different
containers. Right?
>> Right. So there's another project as
part of the Virtual Dataplane Workgroup,
which is the Alpine project that's run
by Google. That's a split model of
functionality between the control and
data planes, and there a warm reboot
test with VPP is potentially possible.
>> Makes sense. Thank you.
>> Any other questions?
>> Thanks everyone. 30 minutes to beer.
>> Thank you.
[Music]
This talk highlights SONiC-VPP architecture improvements, new features, and validation results using standard SONiC-Mgmt test frameworks. SONiC-VPP, capable of functioning as a switch-router, has already been adopted by cloud providers for various gateway functions. We share enhancements in infrastructure, new features, and tools fostering faster development. New capabilities can now be validated in a platform-agnostic environment, reinforcing SONiC-VPP's alignment with the SONiC community.

Key topics include:
- New features like VxLAN, SRv6, LAG
- Infrastructure enhancements with combined lib-SAI support
- Improved tooling for build processes and VPP upgrades
- SONiC-Mgmt T1/T0 test suites with results dashboards
- Azure CI integration for Nightly Builds/PR checkers
- Enabling new use cases for SONiC-VPP

Additionally, we introduce a Virtual SmartSwitch combining SONiC-VPP with DASH-enabled BMV2 (representing DPU), facilitating the creation of advanced DPU functions. Enhancements to SONiC test topologies and suites for the Virtual SmartSwitch will also be discussed.

Murali Venkat: Murali is a Systems Architect and Principal Engineer at Cisco, leading the engineering architecture and product development for Cisco’s SONiC SmartSwitch. He also chairs the SONiC Virtual Dataplane Workgroup, advancing innovation in open networking software. Murali brings expertise across IP Host Services, Carrier Ethernet, and Data Center Networking technologies. He has worked on multiple generations of NPU ASICs and platforms for Data Center and Service Provider Access/PreAgg markets. His background also includes software dataplanes for Mobility and Broadband Network Gateway (BnG) domains, with contributions to 3GPP/SA3LI standards.