[j-nsp] EX-series automation, NETCONF woes

Ross Vandegrift ross at kallisti.us
Tue Jan 27 16:23:28 EST 2009


Hello everyone,

I've spent the past few weeks developing automation software for the
JUNOS EX-series switches.  During this time, I have come to miss IOS
for its SNMP-based configuration.  In case anyone from Juniper is
reading, I'd like to describe why I have found NETCONF to be such a
painful experience.

If anyone has come up with practical solutions to these problems, or
if I have overlooked something, I'd be very interested in hearing
about it.  I'd love to hear that I'm barking up the wrong tree!

I'll also note that we're using the EX-4200 as a top-of-cabinet switch
for datacenter deployments.  Config is relatively simple layer 2
switching with LACP and MSTP.  This strongly shapes how we interact
with the devices.


----- Four central problems -----

1) NETCONF performs very poorly.  Running over SSH means that I always
have to wait for TCP session setup and SSH key exchange.  I can't
start a NETCONF session in less than 3-5 seconds.  SNMP latency is a
fraction of this, which makes it very easy to build interactive
applications that make realtime changes.

Furthermore, I have to manage connection lifecycles.  My only real
optimization recourse is to try and reuse open connections, which is
tricky to do without leaking connections.
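
For what it's worth, the reuse-without-leaking idea can be sketched as
a small pool.  This is only a sketch under assumptions: SessionPool
and FakeSession are made-up names, and FakeSession stands in for
whatever object really wraps a NETCONF-over-SSH session.  The only
thing the pool requires of a session is a close() method.

```python
import contextlib
import queue


class SessionPool:
    """Keep a bounded set of idle sessions and hand them out for reuse."""

    def __init__(self, opener, size=2):
        self._opener = opener                    # callable returning a new session
        self._idle = queue.LifoQueue(maxsize=size)

    @contextlib.contextmanager
    def session(self):
        try:
            sess = self._idle.get_nowait()       # reuse an idle session if there is one
        except queue.Empty:
            sess = self._opener()                # otherwise pay the full setup cost
        try:
            yield sess
        except Exception:
            sess.close()                         # never pool a possibly broken session
            raise
        else:
            try:
                self._idle.put_nowait(sess)      # keep it for the next caller
            except queue.Full:
                sess.close()                     # pool is full: close rather than leak

    def close(self):
        """Close every idle session, e.g. at program exit."""
        while True:
            try:
                self._idle.get_nowait().close()
            except queue.Empty:
                return


class FakeSession:
    """Hypothetical stand-in for a real NETCONF-over-SSH session."""
    opened = 0

    def __init__(self):
        FakeSession.opened += 1
        self.closed = False

    def close(self):
        self.closed = True


pool = SessionPool(FakeSession)
with pool.session() as a:
    pass                                         # first use opens a session
with pool.session() as b:
    pass                                         # second use gets the same one back
```

Every session either goes back into the pool or gets closed, so
nothing is left dangling as long as pool.close() runs on the way out.
A real version would also need to notice sessions that died while
idle.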


2) NETCONF is inefficient.  Want to enumerate the switch ports and
their assigned VLANs for the server guys?  You'll need
<get-interface-information/> as well as <get-vlan-information/>.  In
my lab setup, this is over 150K of data so I can produce a table like:

Interface	VLAN ID
ge-0/0/0	5
ge-0/0/1	6
...

In a production VC of a few members and a hundred or so customers,
this is like half a meg.  What's that going to look like when we have
400 servers on a VC of ten switches?

Almost all of that data is waste - tags and namespaces have no
operational value.  I only wanted a fraction of the info I got back,
but there's no way to make my queries more specific.

On other platforms, I can produce the above info with SNMP by walking
IF-MIB::ifName and either Q-BRIDGE-MIB::dot1qPvid or
CISCO-VLAN-MEMBERSHIP-MIB::vmVlan.  That doesn't work here: Juniper
doesn't expose dot1q tags via SNMP - only the internal VLAN index that
JUNOS uses.
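
To make the comparison concrete, here is roughly what the SNMP side
looks like once the walks are done.  It's a sketch that assumes the
walk results have already been collected into plain dicts (the sample
values are made up), and that dot1qPvid actually holds the 802.1Q ID,
which per the above is not the case on the EX.  The function name
port_vlan_table is mine.  Note that dot1qPvid is indexed by bridge
port, so BRIDGE-MIB::dot1dBasePortIfIndex is needed to get back to an
ifIndex.

```python
def port_vlan_table(if_names, base_port_ifindex, pvids):
    """Join three SNMP walks into (interface name, VLAN ID) rows.

    if_names:          ifIndex       -> IF-MIB::ifName
    base_port_ifindex: dot1dBasePort -> ifIndex (BRIDGE-MIB::dot1dBasePortIfIndex)
    pvids:             dot1dBasePort -> Q-BRIDGE-MIB::dot1qPvid
    """
    rows = []
    for port, pvid in sorted(pvids.items()):
        ifindex = base_port_ifindex.get(port)
        name = if_names.get(ifindex)
        if name is not None:                 # skip bridge ports with no known interface
            rows.append((name, pvid))
    return rows


# Made-up sample data standing in for real walk output.
if_names = {501: "ge-0/0/0", 502: "ge-0/0/1"}
base_port_ifindex = {1: 501, 2: 502}
pvids = {1: 5, 2: 6}

rows = port_vlan_table(if_names, base_port_ifindex, pvids)
for name, vlan in rows:
    print(f"{name}\t{vlan}")
```

Three small integer-keyed tables and a join, instead of two large XML
documents that have to be parsed and mostly thrown away.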


3) XML is far more complicated than SNMP with marginal benefits to a
switching environment.  If you're automating MPLS VPN creation on a
router, XML makes sense - you need to load a fairly decent amount of
structured config, and you want to do so atomically.

But in a datacenter switching environment, the vast majority of the
changes are single attribute changes: changing an interface
description or updating an access vlan.  Automating these two things
means 99% of our manual switch work goes away.  Want to update the
description?  Here ya go:

<rpc>
  <lock>
    <target>
      <candidate/>
    </target>
  </lock>
</rpc>
]]>]]>
<!-- wait for acknowledgement of the lock -->
<rpc>
  <edit-config>
    <target>
      <candidate/>
    </target>
    <config>
      <configuration>
        <interfaces>
          <interface>
            <name>ge-0/0/10</name>
            <description>test</description>
          </interface>
        </interfaces>
      </configuration>
    </config>
  </edit-config>
</rpc>
]]>]]>
<rpc>
  <commit/>
</rpc>
]]>]]>
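
In practice nobody types those documents by hand, so the real cost is
code like the following.  This is a minimal sketch using Python's
standard xml.etree.ElementTree; the helper names (rpc, lock_candidate,
set_description, commit) are my own, it builds exactly the three
messages shown above, and it leaves out sending them and checking the
replies.

```python
import xml.etree.ElementTree as ET

DELIM = "]]>]]>"  # end-of-message marker, as in the messages above


def rpc(child):
    """Wrap one operation element in <rpc> and append the delimiter."""
    root = ET.Element("rpc")
    root.append(child)
    return ET.tostring(root, encoding="unicode") + "\n" + DELIM


def lock_candidate():
    lock = ET.Element("lock")
    ET.SubElement(ET.SubElement(lock, "target"), "candidate")
    return rpc(lock)


def set_description(ifname, text):
    edit = ET.Element("edit-config")
    ET.SubElement(ET.SubElement(edit, "target"), "candidate")
    conf = ET.SubElement(ET.SubElement(edit, "config"), "configuration")
    iface = ET.SubElement(ET.SubElement(conf, "interfaces"), "interface")
    ET.SubElement(iface, "name").text = ifname
    ET.SubElement(iface, "description").text = text  # escaped on serialization
    return rpc(edit)


def commit():
    return rpc(ET.Element("commit"))


messages = [lock_candidate(), set_description("ge-0/0/10", "test"), commit()]
```

Each message still has to be written to the session, and each reply
read and checked for errors before the next one goes out.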

With SNMP, all I need to do is set ifAlias.  One action, it's atomic.
No need to generate documents describing minor operational work in
horrifying detail.  Same deal with changing an access vlan.


4) Juniper seems to change the XML namespaces with every release.
This means that the XML responses from two different versions of JUNOS
require different parsing rules.  Most widely implemented XML tooling
assumes that the namespace is a given.  Mucking about with it at
runtime, based on responses from a switch, has required a very ugly
stack of hacks.
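
One of the less ugly hacks is to throw the namespaces away before
parsing anything.  The sketch below uses Python's standard
xml.etree.ElementTree; strip_namespaces and interface_names are
made-up names, and the two replies (element names and namespace URIs
included) are illustrative samples rather than captured switch output.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Rewrite '{uri}tag' to 'tag' on every element, in place."""
    for elem in root.iter():
        if isinstance(elem.tag, str) and "}" in elem.tag:
            elem.tag = elem.tag.split("}", 1)[1]
    return root


def interface_names(xml_text):
    root = strip_namespaces(ET.fromstring(xml_text))
    return [e.text for e in root.iter("name")]


# Illustrative replies that differ only in the release-specific namespace.
REPLY_A = (
    '<interface-information xmlns="http://xml.juniper.net/junos/9.2R1/junos-interface">'
    "<physical-interface><name>ge-0/0/0</name></physical-interface>"
    "</interface-information>"
)
REPLY_B = REPLY_A.replace("9.2R1", "9.3R2")

names_a = interface_names(REPLY_A)
names_b = interface_names(REPLY_B)
```

With the namespaces gone, the same lookup works on both replies.  This
only handles element tags, not namespaced attributes, and it gives up
whatever the namespaces were supposed to be protecting against.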

Now I'm stuck with very complicated software to do two very basic
things.  If I had to start over, I'd use expect.


----- How I would address these -----

1) The Q-BRIDGE-MIB on the EX-series is great except for one crucial
oversight: the dot1q VLAN ID (arguably the most important piece of
data) is not exposed anywhere.  This is easy to fix - just add a table
mapping the JUNOS VLAN Index to the dot1q VLAN ID.

2) Permit read-write access to the IF-MIB and Q-BRIDGE-MIB.  Single
attribute changes become trivial.  This would probably be a harder
change, since JUNOS has a candidate config model.


If you've made it this far, thanks for reading,
Ross Vandegrift

-- 
Ross Vandegrift
ross at kallisti.us

"If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher."
	--Woody Guthrie

