[j-nsp] Network automation vs. manual config

Sun Aug 19 11:17:28 EDT 2018

Hi,

Thank you all for your comments, both on and off the list. A lot of food for thought. I can see that many of you have evidently been thinking about the same dilemmas. Niall, yes, it is Funet we are talking about, although it is not directly connected to GÉANT but to NORDUnet instead. I might contact you unicast later.

There have been some comments about using (or not using) apply-groups for managing and organizing the configuration. As discussed earlier, they can cause quite some confusion when used incorrectly. For example, if a group containing common inet/inet6-specific configuration is applied at the physical interface level, one might be confused when commit check fails after adding e.g. a CCC interface under the same port, because the CCC unit inherits the L3-relevant statements from the group. One could justifiably say that this is not a fault of the apply-groups concept itself but of its incorrect application, but these kinds of situations can still cause unnecessary confusion. Further, heavy use of apply-groups with wildcards used to increase the commit time quite a lot, at least before persist-groups-inheritance was available.
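To make the CCC case above concrete, here is a minimal Python sketch of how group inheritance can end up adding L3 families to a unit that was never meant to have them. It is only a toy model built on nested dicts, not how Junos implements inheritance; the inherit() function, the group contents, and the unit numbers and address are all made up for illustration.

```python
from copy import deepcopy
from fnmatch import fnmatch


def inherit(explicit, group):
    """Merge group statements into explicit config; explicit values win."""
    result = deepcopy(explicit)
    for g_key, g_val in group.items():
        if g_key.startswith("<") and g_key.endswith(">"):
            # Wildcard names such as <*> only match objects that already exist.
            targets = [k for k in result if fnmatch(k, g_key[1:-1])]
        else:
            targets = [g_key]
        for key in targets:
            if key not in result:
                # Statement only present in the group: inherit it.
                result[key] = inherit({}, g_val) if isinstance(g_val, dict) else g_val
            elif isinstance(result[key], dict) and isinstance(g_val, dict):
                result[key] = inherit(result[key], g_val)
            # else: the explicitly configured leaf is kept as-is
    return result


# Hypothetical group applied at the physical interface level.
l3_group = {"unit": {"<*>": {"family": {"inet": {"mtu": 1500},
                                         "inet6": {"mtu": 1500}}}}}

# Explicit config on the port: one L3 unit and one CCC unit.
port = {"unit": {
    "100": {"family": {"inet": {"address": "192.0.2.1/31"}}},
    "200": {"encapsulation": "vlan-ccc", "family": {"ccc": {}}},
}}

effective = inherit(port, l3_group)
print(sorted(effective["unit"]["200"]["family"]))  # ['ccc', 'inet', 'inet6']
```

In this model unit 200 ends up with inet and inet6 next to ccc, which is the kind of combination that makes commit check fail, even though nobody typed those statements under that unit.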

One more comment. Depending on which tools are being used, it might be rather easy to add new configuration to the network, but the system should also support removing unnecessary elements from the configuration. This is where replacing the entire configuration, or at least an entire configuration hierarchy, helps: anything that is no longer part of the generated configuration is simply absent after the replace, so no separate cleaning task is needed.
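The difference between merging and replacing can be sketched in a few lines of Python. Again this is just a toy model using nested dicts; load_merge() and load_replace() are hypothetical helpers named after the corresponding Junos load modes, and the policy names are invented.

```python
from copy import deepcopy


def load_merge(running, generated):
    """Model of a merge: generated statements are added, nothing is removed."""
    result = deepcopy(running)
    for key, value in generated.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = load_merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def load_replace(running, generated, hierarchy):
    """Model of replacing one top-level hierarchy wholesale."""
    result = deepcopy(running)
    result[hierarchy] = deepcopy(generated[hierarchy])
    return result


running = {
    "system": {"host-name": "r1"},
    "policy-options": {"policy-statement": {"EXPORT-A": {}, "OLD-PEER": {}}},
}
# What the automation currently generates: OLD-PEER is no longer needed.
generated = {
    "policy-options": {"policy-statement": {"EXPORT-A": {}, "EXPORT-B": {}}},
}

merged = load_merge(running, generated)
replaced = load_replace(running, generated, "policy-options")

print(sorted(merged["policy-options"]["policy-statement"]))
print(sorted(replaced["policy-options"]["policy-statement"]))
```

With the merge, the stale OLD-PEER policy survives and someone has to remember to delete it; with the replace, it is gone as a side effect of pushing the generated hierarchy, while the untouched system hierarchy stays as it was.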

Antti

----- On 17 Aug, 2018, at 17:15, Niall Donaghy niall.donaghy at geant.org wrote:

> Hi Antti, folks,
> 
> @Antti: Feel free to reach out directly if we can be of assistance. I understand
> you are in CSC behind FUNET, connected to GÉANT?
> 
> Here in GÉANT we have 31 x MX480/960 routers, all acting as PE devices (no P
> devices), spanning Europe.
> 
> We run a large set of protocols and services (e.g. some logical systems, many
> routing-instances, carrier-of-carriers, dual-stack, LDP, RSVP-TE, MSDP, PIM,
> etc. etc.).
> We shift over 1 Tbps and though our number of 'customers' is few - maybe 5-10
> homed per box - we're running up to 50,000 lines of config on some boxes.
> 
> So where do we stand on config automation?
> 
> Whilst we do use configuration templates, our ['customers'] requirements
> necessitate some exceptions in places.
> Given our central position in connecting EU Research and Education networks
> together, and to the world, we are running quite a mix of services -
> production, pilot, experimental - and manual configuration direct on the CLI is
> the only game in town; why automate disposable config?
>	** Not to be confused with pure lab work - we have several labs, too, and
>	appropriate divisions between lab and production.
> 
> We are moving toward Ansible/git/Napalm/Bash glue scripts for chunks of
> configuration which seldom change, eg: chassis, routing-options, snmp, standard
> policies and filters, etc.
> IE: We're going to automate the low-hanging fruit first, and expand from there.
> 
> RE: manual overwrites - What I'm going to POC is using the Junos 'protect'
> feature to block CLI users from futzing with what lives in git: when git repo
> is pushed to the routers, we'll unprotect and re-protect those stanzas. So, in
> an out of hours emergency, our NOC can still unprotect and overwrite anything
> they need to.
> Alternatively, fixes they may wish to implement can be updated in git. The key
> thing at the outset is choice - you can do it the way you're used to, while you
> learn the new procedures, and there is no negative impact.
> 
> To ease the migration, learning, training, we plan to start slow and have the
> git push triggered by hand, rather than, say, cron.
> We will have quasi-realtime automated diff reports so deviations are spotted
> same-day and can be addressed.
> The idea is anyone making a change updates git then does a push (which also
> verifies).
> 
> Until we have that, we continue with our partial automation:
> 
> I've authored numerous scripts - the most commonly used have a web frontend -
> which take user input, populate templates, and offer to push to the chosen
> router(s).
> For instance, all public and private peerings are 100% automated and populate
> data from peeringdb.com.
> 
> NB: 	The above is our current position and plan of action. Please consider our
> needs are different from most SP networks.
>	In a more commercial operation with large scale cookie-cutter customers (who
>	don't get special treatment - just a service catalogue), database-is-master is
>	the way to go.
> 
> I'll finish by saying that the FOSS tools out there do a fantastic job - pick
> the toolchain you like/can understand, and don't be afraid to use it!
> 
> Off-topic:	Almost all change management is performed by automation - config
> application and verification checks.
>		This means during the PM window, we need only concentrate on verification and
>		'what went wrong', should something happen.
>		IE: We don't burn brainpower or time making the changes - all that is done weeks
>		in advance, and peer-reviewed if appropriate.
>		
> 
> Br,
> Niall
> 
> Niall Donaghy
> Senior Network Engineer
> GÉANT
> T: +44 (0)1223 371393
> M: +44 (0) 7557770303
> Skype: niall.donaghy-dante
> PGP Key ID: 0x77680027
> nic-hdl: NGD-RIPE
> 
> Networks • Services • People
> Learn more at www.geant.org
> 
> GÉANT Vereniging (Association) is registered with the Chamber of Commerce in
> Amsterdam with registration number 40535155 and operates in the UK as a branch
> of GÉANT Vereniging. Registered office: Hoekenrode 3, 1102BR Amsterdam, The
> Netherlands. UK branch address: City House, 126-130 Hills Road, Cambridge CB2
> 1PQ, UK.
> 
> 
> 
> 
> -----Original Message-----
> From: juniper-nsp [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of
> Michael Still
> Sent: 17 August 2018 14:06
> To: juniper-nsp at puck.nether.net
> Subject: Re: [j-nsp] Network automation vs. manual config
> 
> Side note on apply groups and display inheritance. I've submitted a Juniper ER
> for an enhancement to make ' | display inheritance' a default CLI behavior
> (configurable via a proposed 'set cli display-inheritance' option that
> defaults to off). I've also asked for a login-class option to enable this for
> specific user roles, such as front-line NOC users who may benefit from having
> it on by default. This is ER-077163 if you want to poke your Juniper SE about
> it.
> 
> The reason I've asked for this is specifically because I've seen NOC personnel
> spend many cycles investigating an issue not realizing that particular hidden
> apply-group config was affecting their investigation.
> 
> I have a couple of other semi-related (to automation / configuration
> enhancement) ERs going if folks are interested and would like to chat about
> those directly.
> 
> 
> On Fri, Aug 17, 2018 at 8:20 AM Nathan Ward <juniper-nsp at daork.net> wrote:
> 
>>
>> > On 17/08/2018, at 10:54 PM, Antti Ristimäki <antti.ristimaki at csc.fi> wrote:
>> >
>> > Another option is to apply the auto-generated configuration via
>> > apply-groups and apply all manual configurations explicitly so that
>> > the automatic and manual configurations merge with each other. The
>> > positive side of this approach is that it makes it easy to develop the
>> > automation tools so that manual configs are not overridden by
>> > auto-generated config, but I personally find it somewhat inconvenient
>> > that one doesn't really see the effective running-config when using
>> > apply-groups, unless one remembers to display inheritance.
>>
>> We’ve implemented this at a network I support, seems to be going well.
>> We approach it slightly differently though, in a way which may help
>> solve your usability problem, in a bit of a roundabout way. In short,
>> we build groups in to almost everything so people are used to doing
>> display inheritance if they need to look deeper at things. It’s not
>> perfect, but it’s the best way I’ve found to manage large bits of JunOS config.
>>
>> We have 3 types of groups:
>> Global* - common on every router they exist on, applied at top level
>> only
>> Local* - unique to this router, applied at any level
>> * - common on every router they exist on, applied at any level
>>
>> All our groups have apply-flags omit;
>>
>> Local* groups are only used when something is re-used several times on
>> the one router - for example on our BNGs, a list of DHCP interfaces in
>> each of the routing-instances we might push a subscriber in to.
>>
>> So, for example:
>>  - GlobalDualREMX sets up whatever our standard things are for an MX with
>> 2 REs, applied at top level.
>>  - “MPLS” is applied at `interfaces blah` and `protocols rsvp
>> interface blah`, etc and includes our per-interface MPLS config.
>>  - VRFCustomers includes our import/export policies for our Customers
>> VRF (applied inside a routing-instance), and the loopback filter
>> config for the Customers VRF loopback (applied inside an interface).
>>
>> The only config that’s outside groups is config unique to that router
>> - so, IP addressing, routing-instance names and RDs, interfaces
>> (though they have apply-groups within them for many settings), hostname, etc.
>>
>> This means:
>> 1) Config is short because of apply-flags omit. Seeing things unique
>> to this router is easy. It’s easy to spot differences as apply-groups
>> are different - and that’s all you generally need to look for. I just
>> looked, our BNGs are all about 500 lines of config, and all have
>> identical group config on them. Most of the config is rsvp-te tunnels,
>> and access network interfaces.
>> 2) When we want to look deeper, we know to do `| display inheritance |
>> except #` and it becomes muscle memory - this really is the bit that
>> helps your use case, haha.
>> 3) We can copy our groups from a git repository, load replace (in our
>> git repo they all have replace tags) and commit. Keeping the common
>> config consistent is super easy. Automating this is one “leg” of
>> automation and solves almost all of our automation requirements.
>> 4) We can do bespoke mucking about outside the groups, and it’s
>> obvious what those things are, and what things need to be tidied up in
>> to groups, or what is junk temp config that needs to be thrown out.
>>
>> I think where this could work for you, is to have your automation
>> apply any router-specific config just like a human would - outside the
>> groups, but leveraging the groups as much as possible. If you want to
>> keep your manual/automated config separate, stick the automated config
>> in a big single group - that way, manual config will override it, and
>> it’ll be very clear that it’s there and where it’s come from.
>>
>> --
>> Nathan Ward
>>
>> _______________________________________________
>> juniper-nsp mailing list juniper-nsp at puck.nether.net
>> https://puck.nether.net/mailman/listinfo/juniper-nsp
>>
> 
> 
> --
> [stillwaxin at gmail.com ~]$ cat .signature
> cat: .signature: No such file or directory
> [stillwaxin at gmail.com ~]$