Continuing from Networking Part 3:
I forgot to include some text on the very important Spanning Tree Protocol, and also some basics on routing.
So let us make this another small essay. In the next part, we will delve into the configuration steps for the relevant networking on Linux.
I would also like to add to the earlier discussion of service provider datacenters, where Leaf and Spine switches are linked via L3 on the underlay. When you buy a VM or instance at Microsoft Azure or Amazon AWS and then add one or several more, they will generally be in the same private subnet. But these machines can sit on a physical server (or a blade in a chassis server) in any rack in the large datacenter. So we bridge the same L2 / VLAN subnet throughout the datacenter over the L3 links. I also forgot to mention that the Vnet from the previous part (the mapping of VLANs to a virtual network so they can travel over L3 links) is also called an L2VNI, L2 Virtual Network ID. An L3VNI is for the virtual tunnels that link various VTEPs using an L3 VRF (virtual routing and forwarding). Such a VRF could be assigned to a large organization (to carry its multiple L2 VLANs between the many VTEPs), or there could be generic VRFs based on VPS / VM / instance types. Traffic between VTEPs (or local to the datacenter) is called East-West traffic, and traffic entering or leaving the datacenter towards others (including the Internet and other clouds) is called North-South traffic.
L3 or routed links are added in large networks to avoid the Spanning Tree Protocol (STP), which, if not correctly configured, can bring down the datacenter or any business network in seconds. STP was devised to make resilient networks possible. Consider a downstream edge switch in a network closet (or, for that matter, several of these switches in a stack) that is uplinked via a single cable (it could be 1 Gig, 10 Gig or even higher speed). If that sole link were to fail or get unplugged by mistake, or if the Ethernet port at either end (or a fiber transceiver, called an SFP) were to fail, then that network closet would be disconnected from the rest of the network and the Internet. Naturally we would like to have two or more uplinks, connected to two separate core (or distribution) switches. These core switches can be set up as a VRRP (virtual router redundancy protocol) pair, so that devices connected to the various edge switches see both core switches as a single gateway IP. This way, with all downstream switches connected to both upstream switches, users won't see any issues with their network and Internet communications even if one of the core switches fails.
On its own, the above scheme could not be set up, because the moment we connect a switch to an upper switch (or pair of switches) via additional cable(s), we create a physical loop, and within seconds the whole network shuts down with CPU and memory usage at almost 100%. The reason is that many protocols need broadcast to do their job, as we discussed before, like ARP and DHCP. When a broadcast packet is received on a switch port, it is sent out of all other ports except the receiving port. With a loop, this packet comes back to the same switch through the second set of ports and cycles round again, which quickly multiplies the number of broadcast packets trying to go through the looped ports / cabling. A broadcast in itself is not a heavy CPU or memory user, but it is a high priority interrupt, so every device looks at it and processes it as needed. Multiplying one packet into thousands and maybe millions does not take long, and depending upon the capacity of the network switches, they will stop working.
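To make the multiplication visible, here is a minimal Python sketch of the flooding rule described above (send a broadcast out of every port except the one it arrived on). The switch names and both topologies are made up for illustration, and real switches also deal with timing, queues and MAC learning, which this toy model ignores.

```python
from collections import defaultdict


def flood(links, source, hops):
    """Return how many copies of one broadcast are in flight after each hop."""
    # ports[switch] -> list of (link_id, switch at the far end of that link)
    ports = defaultdict(list)
    for link_id, (a, b) in enumerate(links):
        ports[a].append((link_id, b))
        ports[b].append((link_id, a))

    # The source switch floods the broadcast out of every inter-switch port.
    # Each frame in flight is (link it travels on, switch it will arrive at).
    in_flight = list(ports[source])
    history = [len(in_flight)]
    for _ in range(hops):
        arriving, in_flight = in_flight, []
        for arrived_on, switch in arriving:
            # Flood out of every port except the one the frame came in on.
            in_flight.extend(p for p in ports[switch] if p[0] != arrived_on)
        history.append(len(in_flight))
    return history


# Hypothetical topologies: two edge switches and two core switches.
TREE = [("edge1", "core1"), ("edge2", "core1"), ("core1", "core2")]
LOOPED = [("edge1", "core1"), ("edge1", "core2"),
          ("edge2", "core1"), ("edge2", "core2"), ("core1", "core2")]

print("loop-free:", flood(TREE, "edge1", 8))
print("looped:   ", flood(LOOPED, "edge1", 8))
```

In the loop-free case the copies die out once they reach the end of each branch. With the redundant uplinks in place and nothing to break the loop, the count never drops and keeps growing hop after hop.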
So STP was introduced to auto-magically detect and resolve a loop through the exchange of STP BPDU packets. As soon as we connect cabling that would cause a loop, STP blocks one port to break the physical loop. BPDU stands for Bridge Protocol Data Unit. Bridge is the technical name for a network switch, which acts like a large highway bridge interconnecting various on-ramps to the highway or bypassing it. You can research STP further if you need to. The recommendation is to turn on STP on any network switch (or even a Linux appliance like a firewall that has a multi-port Ethernet switch built into it). This way, if someone accidentally short circuits two ports on any switch (directly or through face plates that terminate multiple Ethernet cable drops), it does not bring down the whole network. STP will simply block the offending port(s) to protect the rest of the network. There is more detail in the configuration of STP, for things like BPDU protection, and some lower end switches that try to hide the configuration complexity of STP offer features like Loop protection instead. So use whatever loop prevention features are available on your switches.
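The idea of breaking a loop can be sketched in a few lines of Python. This is a simplified illustration and not the real protocol: it elects the switch with the lowest bridge ID as root, grows a tree outwards from it, and treats every link left off the tree as blocked. Real STP also compares path costs and port IDs and does all of this by exchanging BPDUs. The switch names and bridge ID values are invented, and the sketch assumes at most one link between any pair of switches.

```python
from collections import deque


def spanning_tree(bridge_ids, links):
    """Return (root switch, list of links that would be blocked)."""
    # The switch with the lowest bridge ID becomes the root bridge.
    root = min(bridge_ids, key=bridge_ids.get)

    neighbours = {name: [] for name in bridge_ids}
    for a, b in links:
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Breadth-first walk from the root: the first link that reaches a switch
    # stays forwarding, any later link to the same switch is redundant.
    parent = {root: None}
    queue = deque([root])
    while queue:
        switch = queue.popleft()
        for nbr in sorted(neighbours[switch], key=bridge_ids.get):
            if nbr not in parent:
                parent[nbr] = switch
                queue.append(nbr)

    forwarding = {frozenset((child, up))
                  for child, up in parent.items() if up is not None}
    blocked = [link for link in links if frozenset(link) not in forwarding]
    return root, blocked


# Hypothetical bridge IDs (lower wins) and a triangle of three switches.
BRIDGE_IDS = {"core1": 4096, "core2": 8192, "edge1": 32768}
LINKS = [("edge1", "core1"), ("edge1", "core2"), ("core1", "core2")]

root, blocked = spanning_tree(BRIDGE_IDS, LINKS)
print("root bridge:", root)
print("blocked links:", blocked)
```

With these made-up values, core1 becomes the root and the second uplink from edge1 to core2 is the one left blocked, which is exactly the idle uplink discussed next.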
STP results in blocked uplinks, and thus wasted equipment and cabling, because of its active / passive design. Newer solutions were introduced to make use of both uplinks by splitting the traffic across them on a per-VLAN basis. Then other technologies were introduced to actually team or bundle the ports together in an active / active design, where any and all traffic can be load balanced using some hashing mechanism based on source / destination IP addresses, TCP ports etc. This technology is called Port-Channel, Ether-Channel, LACP (Link Aggregation Control Protocol) or LAG (Link Aggregation Group). We discussed this in the previous part as well.
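The hashing mechanism mentioned above can be illustrated roughly as follows. This is only a sketch: real switches use their own vendor-specific hash functions and configurable field choices, and SHA-256 is used here purely as a convenient stand-in. The uplink names, IP addresses and ports are made up.

```python
import hashlib


def pick_member(members, src_ip, dst_ip, src_port, dst_port):
    """Pick one member link of the bundle for a given flow."""
    flow = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(flow).digest()
    # Same flow -> same digest -> same link, so packets stay in order.
    index = int.from_bytes(digest[:4], "big") % len(members)
    return members[index]


UPLINKS = ["uplink1", "uplink2"]  # hypothetical member ports of one LAG

# One flow always lands on the same member link.
first = pick_member(UPLINKS, "10.0.0.5", "10.0.1.9", 40000, 443)
again = pick_member(UPLINKS, "10.0.0.5", "10.0.1.9", 40000, 443)
print(first, again)

# Many different flows spread across the members.
used = {pick_member(UPLINKS, "10.0.0.5", "10.0.1.9", port, 443)
        for port in range(40000, 40200)}
print(sorted(used))
```

Note that the balancing is per flow, not per packet, so a single large transfer still uses only one member link.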
So modern recommendations normally say to aggregate / team the uplinks, which avoids losing the links that STP would otherwise block, and to still use STP to protect against ports being short circuited at the edge switch level. This can happen directly, or through an unmanaged dumb mini 5 / 8 port switch: connecting two of its ports with a patch cable also results in a loop.
In large networks, which can afford expensive network switches, the uplinks can be routed, so VLANs / L2 stay at the edge switch level.
L3 or routing has been explained briefly in earlier parts, for things like Inter-VLAN routing or routing traffic from the private network / LAN to the Internet by NAT masquerading the private LAN IPs behind the ISP allocated public IP, using a firewall or a NAT router. Routing essentially means sending traffic to a destination IP address that does not belong to the sender's IP network. We simply send such out-of-our-locality traffic to our gateway, which then knows how to route it to the Internet, or over the WAN to other sites etc.
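The decision a host makes for every packet can be sketched with Python's standard ipaddress module. The addresses below are example values only, and the function name is made up for this illustration.

```python
import ipaddress


def next_hop(own_address, gateway, destination):
    """Return the IP a host hands the packet to: the destination or the gateway."""
    iface = ipaddress.ip_interface(own_address)  # e.g. 192.168.1.10/24
    dst = ipaddress.ip_address(destination)
    if dst in iface.network:
        # Same subnet: deliver directly over L2 (ARP for the destination).
        return str(dst)
    # Different subnet: hand the packet to the gateway and let it route.
    return gateway


# Hypothetical host 192.168.1.10/24 with gateway 192.168.1.1.
print(next_hop("192.168.1.10/24", "192.168.1.1", "192.168.1.50"))
print(next_hop("192.168.1.10/24", "192.168.1.1", "8.8.8.8"))
```

The first call stays on the local subnet, the second is handed to the gateway.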
On the firewall or NAT router, if the Internet or WAN port is set up as a DHCP client (the usual default), it gets its IP address along with a default route that sends traffic to the ISP gateway, which sits at the next higher level (the next hop). That next-level ISP router could be at the business site or in the neighborhood ISP node / cabinet. After that comes the ISP / SP POP (point of presence, or CO in the older naming, meaning the central office where phone lines terminate, also called the Telephone Exchange outside North America), and then the ISP core network. The ISP core then links into higher-tier ISPs and / or meets other ISPs' clouds at peering points, which are large datacenters where ISPs meet. So the Internet as we know it is fully distributed and not owned by any single country or company. It is simply the interconnection of various ISPs / SPs / cloud service providers / universities, research institutions and other large and small companies that have public facing web servers / FTP servers / email servers etc. on site.
All this hop-by-hop transmission of traffic from sender to target, and of the response back, can be set up with static routes, wherein we specify each destination network as reachable via a gateway, such as the entry router into that ISP / SP network. But such static configuration is not at all manageable in anything beyond a small / mid size network.
Coming back to our small firewall or NAT router, we normally don’t have to do anything other than plug the Internet / WAN port into the ISP hand-off Ethernet cable from their modem or router. Most of the time we don’t even have a firewall at home, as the ISP modem comes with its own small built-in NAT router. But for better control and security, and for things like VPN access, we would like to have our own firewall behind the ISP router / modem.
In mid to large business networks and in ISP / SP networking, static routes are generally not used. The simplest form of static route, in a small / mid size network, is a default (catch-all) route that sends any traffic not destined for the local network or other branches over to the ISP, as that destination is then assumed to be somewhere on the Internet. This default route is installed automatically when the IP is obtained from the ISP via DHCP, or is configured statically on the firewall or NAT router if we have a static public IP (or a block of public IPs) from the ISP. Do note that static routes need static configuration at both ends.
Small and mid size businesses may also have static routes for some subnets that are not linked via an L2 VLAN on the core switch. This could be third party equipment, or another site reached over MPLS or private leased lines from a telco / SP.
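How a router picks between a specific static route and the catch-all default route can be sketched as a longest-prefix match. The route table below is entirely hypothetical (a connected LAN, one branch subnet reached via an MPLS router, and a default route towards the ISP), and the function is an illustration rather than how any particular router implements its lookup.

```python
import ipaddress

# Hypothetical static route table: (destination prefix, gateway).
ROUTES = [
    ("192.168.1.0/24", None),            # directly connected LAN, no gateway
    ("10.20.0.0/16", "192.168.1.254"),   # branch site via an MPLS router
    ("0.0.0.0/0", "203.0.113.1"),        # default route towards the ISP
]


def lookup(routes, destination):
    """Return (prefix, gateway) of the most specific matching route, or None."""
    dst = ipaddress.ip_address(destination)
    best = None
    for prefix, gateway in routes:
        net = ipaddress.ip_network(prefix)
        # Longest prefix wins; the /0 default route only wins if nothing else matches.
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return None if best is None else (str(best[0]), best[1])


for target in ("192.168.1.77", "10.20.3.4", "8.8.8.8"):
    print(target, "->", lookup(ROUTES, target))
```

Every destination matches 0.0.0.0/0, so the default route really is a catch-all. It is used only when no more specific route exists.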
Large networks use dynamic routing protocols, which automatically learn the subnets / IP prefixes from the other end and advertise your own subnets to the other end (this is called a neighbor relationship). So we can freely add a new subnet or change a subnet with minimal change at our end, and the neighbor does not need to do anything. Dynamic routing also affords multipathing for resiliency: when an intermediate link or hop / router fails, traffic reroutes around the failure and fails back when the failed link / cable or equipment is restored. Dynamic routing also allows for things like preferential use of certain links for certain applications, for reasons of cost and reliability. There is also the concept of ECMP, equal cost multipathing, wherein multiple circuits are used together for increased bandwidth and resiliency.
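Rerouting and ECMP can be illustrated with a small shortest-path calculation, similar in spirit to what a link-state protocol does once it knows the topology. This is only a sketch: the four routers, the link costs and the function names are invented, and a real protocol learns the topology from its neighbors rather than from a hard-coded dictionary.

```python
import heapq


def shortest_costs(links, source):
    """Dijkstra: return ({router: cost from source}, adjacency map)."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, {}).items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist, graph


def ecmp_next_hops(links, source, target):
    """Return every neighbor of source that lies on a lowest-cost path to target."""
    to_target, graph = shortest_costs(links, target)
    if source not in to_target:
        return []  # target unreachable
    best = to_target[source]
    return sorted(nbr for nbr, cost in graph.get(source, {}).items()
                  if nbr in to_target and cost + to_target[nbr] == best)


# Hypothetical square of routers: A reaches D via B or via C at equal cost.
LINKS = {("A", "B"): 10, ("A", "C"): 10, ("B", "D"): 10, ("C", "D"): 10}
print("both paths up:", ecmp_next_hops(LINKS, "A", "D"))

# Simulate the B-D link failing: traffic reroutes via C only.
failed = {link: cost for link, cost in LINKS.items() if link != ("B", "D")}
print("B-D link down:", ecmp_next_hops(failed, "A", "D"))
```

With both paths up, A has two equal cost next hops towards D and can share load across them. When one link is removed the calculation leaves only the surviving path, and restoring the link brings the second next hop back.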
Typical dynamic routing protocols are OSPF and EIGRP in mid to large networks, and RIP in very small networks using low cost equipment that only supports RIP. ISP / SP networks run BGP and IS-IS in the core and in datacenters. All this is well beyond the scope of this series, but armed with this knowledge, you can do your own research for your needs. Linux implements all these protocols and is used by many switch and router vendors as their network operating system.
With all these basics, and rather more than basics, behind us, the next part will cover the networking configuration of our Linux machine for IP addressing, DNS, DHCP, teaming etc.