networking:bonding

Differences

This shows you the differences between two versions of the page.

networking:bonding [2016/08/11 12:09]
AliceSmith [Resources and Links]
networking:bonding [2018/11/14 05:47] (current)
AliceSmith [Resources and Links]
Line 13: Line 13:
  
 For new versions of the driver, updated userspace tools, and For new versions of the driver, updated userspace tools, and
-who to ask for help, please follow the links at the end of this file. +who to ask for help, please follow the  links at the end of this file.
-=====Contents===== +
- +
-  * [[https://​www.linuxfoundation.org/#​Installation|1 Installation]] +
-    * [[https://​www.linuxfoundation.org/#​Configure_and_build_the_kernel_with_bonding|1.1 Configure and build the kernel with bonding]] +
-    * [[https://​www.linuxfoundation.org/#​Install_ifenslave_Control_Utility|1.2 Install ifenslave Control Utility]] +
-  * [[https://​www.linuxfoundation.org/#​Bonding_Driver_Options|2 Bonding Driver Options]] +
-  * [[https://​www.linuxfoundation.org/#​Configuring_Bonding_Devices|3 Configuring Bonding Devices]] +
-    * [[https://​www.linuxfoundation.org/#​Configuration_with_sysconfig_support|3.1 Configuration with sysconfig support]] +
-      * [[https://​www.linuxfoundation.org/#​Using_DHCP_with_sysconfig|3.1.1 Using DHCP with sysconfig]] +
-      * [[https://​www.linuxfoundation.org/#​Configuring_Multiple_Bonds_with_sysconfig|3.1.2 Configuring Multiple Bonds with sysconfig]] +
-    * [[https://​www.linuxfoundation.org/#​Configuration_with_initscripts_support|3.2 Configuration with initscripts support]] +
-      * [[https://​www.linuxfoundation.org/#​Using_DHCP_with_initscripts|3.2.1 Using DHCP with initscripts]] +
-      * [[https://​www.linuxfoundation.org/#​Configuring_Multiple_Bonds_with_initscripts|3.2.2 Configuring Multiple Bonds with initscripts]] +
-    * [[https://​www.linuxfoundation.org/#​Configuring_Bonding_with_.2Fetc.2Fnet|3.3 Configuring Bonding with /​etc/​net]] +
-    * [[https://​www.linuxfoundation.org/#​Configuring_Bonding_Manually|3.4 Configuring Bonding Manually]] +
-      * [[https://​www.linuxfoundation.org/#​Configuring_Multiple_Bonds_Manually|3.4.1 Configuring Multiple Bonds Manually]] +
-  * [[https://​www.linuxfoundation.org/#​Querying_Bonding_Configuration|4 Querying Bonding Configuration]] +
-    * [[https://​www.linuxfoundation.org/#​Bonding_Configuration|4.1 Bonding Configuration]] +
-    * [[https://​www.linuxfoundation.org/#​Network_configuration|4.2 Network configuration]] +
-  * [[https://​www.linuxfoundation.org/#​Switch_Configuration|5 Switch Configuration]] +
-  * [[https://​www.linuxfoundation.org/#​802.1q_VLAN_Support|6 802.1q VLAN Support]] +
-  * [[https://​www.linuxfoundation.org/#​Link_Monitoring|7 Link Monitoring]] +
-    * [[https://​www.linuxfoundation.org/#​ARP_Monitor_Operation|7.1 ARP Monitor Operation]] +
-    * [[https://​www.linuxfoundation.org/#​Configuring_Multiple_ARP_Targets|7.2 Configuring Multiple ARP Targets]] +
-    * [[https://​www.linuxfoundation.org/#​MII_Monitor_Operation|7.3 MII Monitor Operation]] +
-  * [[https://​www.linuxfoundation.org/#​Potential_Sources_of_Trouble|8 Potential Sources of Trouble]] +
-    * [[https://​www.linuxfoundation.org/#​Adventures_in_Routing|8.1 Adventures in Routing]] +
-    * [[https://​www.linuxfoundation.org/#​Ethernet_Device_Renaming|8.2 Ethernet Device Renaming]] +
-    * [[https://​www.linuxfoundation.org/#​Painfully_Slow_Or_No_Failed_Link_Detection_By_Miimon|8.3 Painfully Slow Or No Failed Link Detection By Miimon]] +
-  * [[https://​www.linuxfoundation.org/#​SNMP_agents|9 SNMP agents]] +
-    * [[https://​www.linuxfoundation.org/#​Promiscuous_mode|9.1 Promiscuous mode]] +
-  * [[https://​www.linuxfoundation.org/#​Configuring_Bonding_for_High_Availability|10 Configuring Bonding for High Availability]] +
-    * [[https://​www.linuxfoundation.org/#​High_Availability_in_a_Single_Switch_Topology|10.1 High Availability in a Single Switch Topology]] +
-    * [[https://​www.linuxfoundation.org/#​High_Availability_in_a_Multiple_Switch_Topology|10.2 High Availability in a Multiple Switch Topology]] +
-      * [[https://​www.linuxfoundation.org/#​HA_Bonding_Mode_Selection_for_Multiple_Switch_Topology|10.2.1 HA Bonding Mode Selection for Multiple Switch Topology]] +
-      * [[https://​www.linuxfoundation.org/#​HA_Link_Monitoring_Selection_for_Multiple_Switch_Topology|10.2.2 HA Link Monitoring Selection for Multiple Switch Topology]] +
-  * [[https://​www.linuxfoundation.org/#​Configuring_Bonding_for_Maximum_Throughput|11 Configuring Bonding for Maximum Throughput]] +
-    * [[https://​www.linuxfoundation.org/#​Maximizing_Throughput_in_a_Single_Switch_Topology|11.1 Maximizing Throughput in a Single Switch Topology]] +
-      * [[https://​www.linuxfoundation.org/#​MT_Bonding_Mode_Selection_for_Single_Switch_Topology|11.1.1 MT Bonding Mode Selection for Single Switch Topology]] +
-      * [[https://​www.linuxfoundation.org/#​MT_Link_Monitoring_for_Single_Switch_Topology|11.1.2 MT Link Monitoring for Single Switch Topology]] +
-    * [[https://​www.linuxfoundation.org/#​Maximum_Throughput_in_a_Multiple_Switch_Topology|11.2 Maximum Throughput in a Multiple Switch Topology]] +
-      * [[https://​www.linuxfoundation.org/#​MT_Bonding_Mode_Selection_for_Multiple_Switch_Topology|11.2.1 MT Bonding Mode Selection for Multiple Switch Topology]] +
-      * [[https://​www.linuxfoundation.org/#​MT_Link_Monitoring_for_Multiple_Switch_Topology|11.2.2 MT Link Monitoring for Multiple Switch Topology]] +
-  * [[https://​www.linuxfoundation.org/#​Switch_Behavior_Issues|12 Switch Behavior Issues]] +
-    * [[https://​www.linuxfoundation.org/#​Link_Establishment_and_Failover_Delays|12.1 Link Establishment and Failover Delays]] +
-    * [[https://​www.linuxfoundation.org/#​Duplicated_Incoming_Packets|12.2 Duplicated Incoming Packets]] +
-  * [[https://​www.linuxfoundation.org/#​Hardware_Specific_Considerations|13 Hardware Specific Considerations]] +
-    * [[https://​www.linuxfoundation.org/#​IBM_BladeCenter|13.1 IBM BladeCenter]] +
-    * [[https://​www.linuxfoundation.org/#​JS20_network_adapter_information|13.2 JS20 network adapter information]] +
-    * [[https://​www.linuxfoundation.org/#​BladeCenter_networking_configuration|13.3 BladeCenter networking configuration]] +
-    * [[https://​www.linuxfoundation.org/#​Requirements_for_specific_modes|13.4 Requirements for specific modes]] +
-    * [[https://​www.linuxfoundation.org/#​Link_monitoring_issues|13.5 Link monitoring issues]] +
-    * [[https://​www.linuxfoundation.org/#​Other_concerns|13.6 Other concerns]] +
-  * [[https://​www.linuxfoundation.org/#​Frequently_Asked_Questions|14 Frequently Asked Questions]] +
-    * [[https://​www.linuxfoundation.org/#​Is_it_SMP_safe.3F|14.1 Is it SMP safe?]] +
-    * [[https://​www.linuxfoundation.org/#​What_type_of_cards_will_work_with_it.3F|14.2 What type of cards will work with it?]] +
-    * [[https://​www.linuxfoundation.org/#​How_many_bonding_devices_can_I_have.3F|14.3 How many bonding devices can I have?]] +
-    * [[https://​www.linuxfoundation.org/#​How_many_slaves_can_a_bonding_device_have.3F|14.4 How many slaves can a bonding device have?]] +
-    * [[https://​www.linuxfoundation.org/#​What_happens_when_a_slave_link_dies.3F|14.5 What happens when a slave link dies?]] +
-  * [[https://​www.linuxfoundation.org/#​Can_bonding_be_used_for_High_Availability.3F|15 Can bonding be used for High Availability?​]] +
-  * [[https://​www.linuxfoundation.org/#​Which_switches.2Fsystems_does_it_work_with.3F|16 Which switches/​systems does it work with?]] +
-    * [[https://​www.linuxfoundation.org/#​Where_does_a_bonding_device_get_its_MAC_address_from.3F|16.1 Where does a bonding device get its MAC address from?]] +
-  * [[https://​www.linuxfoundation.org/#​Resources_and_Links|17 Resources and Links]] +
-  * [[https://​www.linuxfoundation.org/#​History|18 History]]+
  
 ===== Installation===== ===== Installation=====
Line 97: Line 33:
 "make config"​),​ then select "​Bonding driver support"​ in the "​Network "make config"​),​ then select "​Bonding driver support"​ in the "​Network
 device support"​ section. ​ It is recommended that you configure the device support"​ section. ​ It is recommended that you configure the
-driver as module since it is currently the only way to pass parameters+driver as module since that is currently the only way to pass parameters
 to the driver or configure more than one bonding device. to the driver or configure more than one bonding device.
  
Line 157: Line 93:
 The parameters are as follows: The parameters are as follows:
  
-  * ** arp_interval ** +  * ** arp_interval **\\ Specifies the ARP link monitoring frequency in milliseconds. If ARP monitoring is used in an etherchannel compatible mode (modes 0 and 2), the switch should be configured in a mode that evenly distributes packets across all links. If the switch is configured to distribute the packets in an XOR fashion, all replies from the ARP targets will be received on the same link which could cause the other team members to fail.  ARP monitoring should not be used in conjunction with miimon. ​ A value of 0 disables ARP monitoring. ​ The default value is 0. 
-  *  ​Specifies the ARP link monitoring frequency in milliseconds. If ARP monitoring is used in an etherchannel compatible mode (modes 0 and 2), the switch should be configured in a mode that evenly distributes packets across all links. If the switch is configured to distribute the packets in an XOR fashion, all replies from the ARP targets will be received on the same link which could cause the other team members to fail.  ARP monitoring should not be used in conjunction with miimon. ​ A value of 0 disables ARP monitoring. ​ The default value is 0. +  * **arp_ip_target **\\ Specifies the IP addresses to use as ARP monitoring peers when arp_interval is > 0.  These are the targets of the ARP request sent to determine the health of the link to the targets. Specify these values in //​ddd.ddd.ddd.ddd//​ format. ​ Multiple IP addresses must be separated by a comma. ​ At least one IP address must be given for ARP monitoring to function. ​ The maximum number of targets that can be specified is 16.  The default value is no IP addresses. 
-  * **arp_ip_target ** +  * ** downdelay **\\ Specifies the time, in milliseconds,​ to wait before disabling a slave after a link failure has been detected. ​ This option is only valid for the miimon link monitor. ​ The downdelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. ​ The default value is 0. 
-  *  ​Specifies the IP addresses to use as ARP monitoring peers when arp_interval is > 0.  These are the targets of the ARP request sent to determine the health of the link to the targets. Specify these values in //​ddd.ddd.ddd.ddd//​ format. ​ Multiple IP addresses must be separated by a comma. ​ At least one IP address must be given for ARP monitoring to function. ​ The maximum number of targets that can be specified is 16.  The default value is no IP addresses. +  * ** lacp_rate **\\ Option specifying the rate in which we'll ask our link partner to transmit LACPDU packets in 802.3ad mode.  Possible values are: 
-  * ** downdelay ** +    * ** slow or 0 **\\ Request partner to transmit LACPDUs every 30 seconds. 
-  *  Specifies the time, in milliseconds, to wait before disabling a slave after a link failure has been detected.  This option is only valid for the miimon link monitor.  The downdelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple.  The default value is 0. +    * ** fast or 1 **\\ Request partner to transmit LACPDUs every 1 second.\\ The default is slow.
-  * ** lacp_rate ** +  * ** max_bonds **\\ Specifies the number of bonding devices to create for this instance of the bonding driver. E.g., if max_bonds is 3, and the bonding driver is not already loaded, then bond0, bond1 and bond2 will be created.  The default value is 1.
-  *  ​Option specifying the rate in which we'll ask our link partner to transmit LACPDU packets in 802.3ad mode.  Possible values are: +  * ** miimon **\\ Specifies the MII link monitoring frequency in milliseconds. ​ This determines how often the link state of each slave is inspected for link failures. ​ A value of zero disables MII link monitoring. ​ A value of 100 is a good starting point. ​ The use_carrier option, below, affects how the link state is determined. ​ See the High Availability section for additional information. ​ The default value is 0. 
- +  * ** mode **\\ Specifies one of the bonding policies. The default is balance-rr (round robin). Possible values are: 
-  *  +    * ** balance-rr or 0 **\\ Round-robin policy: Transmit packets in sequential order from the first available slave through the last.  This mode provides load balancing and fault tolerance. 
-  * ** slow or 0 ** +    * ** active-backup or 1 **\\ Active-backup policy: Only one slave in the bond is active.  A different slave becomes active if, and only if, the active slave fails.  The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch.\\ In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave. One gratuitous ARP is issued for the bonding master interface and for each VLAN interface configured above it, provided that the interface has at least one IP address configured.\\ Gratuitous ARPs issued for VLAN interfaces are tagged with the appropriate VLAN ID. This mode provides fault tolerance. The primary option, documented below, affects the behavior of this mode.
-  *  ​Request partner to transmit LACPDUs every 30 seconds. +    * ** balance-xor or 2 **\\ XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple {{https://​wiki.linuxfoundation.org/​images/​math/​c/​b/​5/​cb5b8cfaf5d544efdc37d12c461083e6.png?​nolink|}}\\ Alternate transmit policies may be selected via the **xmit_hash_policy** option.\\ This mode provides load balancing and fault tolerance. 
- +    * **broadcast or 3**\\ Broadcast policy: transmits everything on all slave interfaces. ​ This mode provides fault tolerance. 
-  *  +    * **802.3ad or 4**\\ IEEE 802.3ad Dynamic link aggregation. ​ Creates aggregation groups that share the same speed and duplex settings. ​ Utilizes all slaves in the active aggregator according to the 802.3ad specification.\\ Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the **xmit_hash_policy** option, documented below. ​ Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the [[http://​en.wikipedia.com/​wiki/​802.3|802.3ad]] standard. ​ Differing peer implementations will have varying tolerances for noncompliance. 
-  ​* ** fast or 1 ** +      *  Prerequisites:​ 
-  *  Request partner to transmit LACPDUs every 1 second. The default is slow. +        -  Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
- +        -  A switch that supports IEEE 802.3ad Dynamic link aggregation. 
-  ​* ** max_bonds ** +      * Most switches will require some type of configuration to enable 802.3ad mode. 
-  *  ​Specifies the number of bonding devices to create for this instance of the bonding driver. E.g., if max_bonds is 3, and the bonding driver is not already loaded, then bond0, bond1 and bond2 will be created. ​ The default value is 1. +    * ** balance-tlb or 5**\\ Adaptive transmit load balancing: channel bonding that does not require any special switch support. ​ The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. ​ Incoming traffic is received by the current slave. ​ If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave. 
-  * ** miimon ** +      *  Prerequisite:​ 
-  *  ​Specifies the MII link monitoring frequency in milliseconds. ​ This determines how often the link state of each slave is inspected for link failures. ​ A value of zero disables MII link monitoring. ​ A value of 100 is a good starting point. +        -  Ethtool support in the base drivers for retrieving the speed of each slave. 
-  *  The use_carrier option, below, affects how the link state is determined. ​ See the High Availability section for additional information. ​ The default value is 0. +    * ** balance-alb or 6 **\\ Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPV4 traffic, and does not require any special switch support. ​ The receive load balancing is achieved by ARP negotiation. 
-  * ** mode ** +      *  The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server. 
-  *  ​Specifies one of the bonding policies. The default is balance-rr (round robin). +      *  Receive traffic from connections created by the server is also balanced. ​ When the local system sends an ARP Request the bonding driver copies and saves the peer's IP information from the ARP packet. 
- +      *  When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP reply to this peer assigning it to one of the slaves in the bond. 
-Possible values are: +      *  A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond.  Hence, peers learn the hardware address of the bond and the balancing of receive traffic collapses to the current slave. ​ This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that the traffic is redistributed. ​ Receive traffic is also redistributed when a new slave is added to the bond and when an inactive slave is re-activated. ​ The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond. 
- +      *  When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected mac address to each of the clients. The updelay parameter (detailed below) must be set to a value equal or greater than the switch'​s forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch. 
-  *  +      *  Prerequisites:​ 
-  ​* ** balance-rr or 0 ** +        -  Ethtool support in the base drivers for retrieving the speed of each slave. 
-  *  ​Round-robin policy: Transmit packets in sequential order from the first available slave through the last.  This mode provides load balancing and fault tolerance. +        -  Base driver support for setting the hardware address of a device while it is open.  This is required so that there will always be one slave in the team using the bond hardware address (the curr_active_slave) while having a unique hardware address for each slave in the bond.  If the curr_active_slave fails its hardware address is swapped with the new curr_active_slave that was chosen. 
- +  * ** primary **\\ A string (eth0, eth2, etc) specifying which slave is the primary device. ​ The specified device will always be the active slave while it is available. ​ Only when the primary is off-line will alternate devices be used.  This is useful when one slave is preferred over another, e.g., when one slave has higher throughput than another. The primary option is only valid for active-backup mode. 
-  *  +  * ** updelay **\\ Specifies the time, in milliseconds,​ to wait before enabling a slave after a link recovery has been detected. ​ This option is only valid for the miimon link monitor. ​ The updelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. ​ The default value is 0. 
-  ​* ** active-backup or 1 ** +  * ** use_carrier **\\ Specifies whether or not miimon should use MII or ETHTOOL ioctls vs. netif_carrier_ok() to determine the link status. The MII or ETHTOOL ioctls are less efficient and utilize a deprecated calling sequence within the kernel. ​ The netif_carrier_ok() relies on the device driver to maintain its state with netif_carrier_on/​off;​ at this writing, most, but not all, device drivers support this facility. 
-  *  ​Active-backup policy: Only one slave in the bond is active. ​ A different slave becomes active if, and only if, the active slave fails. ​ The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. +    *  If bonding insists that the link is up when it should not be, it may be that your network device driver does not support netif_carrier_on/​off. ​ The default state for netif_carrier is "​carrier on," so if a driver does not support netif_carrier,​ it will appear as if the link is always up.  In this case, setting use_carrier to 0 will cause bonding to revert to the MII / ETHTOOL ioctl method to determine the link state. 
- +    *  A value of 1 enables the use of netif_carrier_ok(),​ a value of 0 will use the deprecated MII / ETHTOOL ioctls. ​ The default value is 1. 
-  *  In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave. One gratuitous ARP is issued for the bonding master interface and for each VLAN interface configured above it, provided that the interface has at least one IP address configured. +  * ** xmit_hash_policy **\\ Selects the transmit hash policy to use for slave selection in balance-xor and 802.3ad modes.  Possible values are:
-  *  ​Gratuitous ARPs issued for VLAN interfaces are tagged with the appropriate VLAN id. This mode provides fault tolerance. The primary option, documented below, affects the behavior of this mode. +    * ** layer2 **\\ Uses XOR of hardware MAC addresses to generate the hash.  The formula is {{https://​wiki.linuxfoundation.org/​images/​math/​8/​f/​6/​8f6eed397d6bee56f08b1fe20aadfee6.png?​nolink|}}\\ This algorithm will place all traffic to a particular network peer on the same slave.\\ This algorithm is 802.3ad compliant. 
- +    * ** layer3+4**\\ This policy uses upper layer protocol information,​ when available, to generate the hash.  This allows for traffic to a particular network peer to span multiple slaves, although a single connection will not span multiple slaves.\\ The formula for unfragmented TCP and UDP packets is {{https://​wiki.linuxfoundation.org/​images/​math/​8/​b/​d/​8bdca0ddf73e1cb38c4195ce511616d7.png?​nolink|}}\\ For fragmented TCP or UDP packets and all other IP protocol traffic, the source and destination port information is omitted. ​ For non-IP traffic, the formula is the same as for the layer2 transmit hash policy.\\ This policy is intended to mimic the behavior of certain switches, notably Cisco switches with PFC2 as well as some Foundry and IBM products.\\ This algorithm is not fully 802.3ad compliant. ​ A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets striped across two interfaces. ​ This may result in out of order delivery. ​ Most traffic types will not meet this criteria, as TCP rarely fragments traffic, and most UDP traffic is not involved in extended conversations. ​ Other implementations of 802.3ad may or may not tolerate this noncompliance.\\ The default value is layer2. ​ This option was added in bonding version 2.6.3. ​ In earlier versions of bonding, this parameter does not exist, and the layer2 policy is the only policy.
-  *  +
-  ​* ** balance-xor or 2 ** +
-  *  ​XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple +
- +
-  *  ​{{https://​wiki.linuxfoundation.org/​images/​math/​c/​b/​5/​cb5b8cfaf5d544efdc37d12c461083e6.png?​nolink|}} +
- +
-  *  ​Alternate transmit policies may be selected via the **xmit_hash_policy** option. +
-  *  ​This mode provides load balancing and fault tolerance. +
- +
-  *  +
-  ​* **broadcast or 3** +
-  *  ​Broadcast policy: transmits everything on all slave interfaces. ​ This mode provides fault tolerance. +
- +
-  *  +
-  ​* **802.3ad or 4** +
-  *  ​IEEE 802.3ad Dynamic link aggregation. ​ Creates aggregation groups that share the same speed and duplex settings. ​ Utilizes all slaves in the active aggregator according to the 802.3ad specification. +
- +
-  *  ​Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the **xmit_hash_policy** option, documented below. ​ Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the [[http://​en.wikipedia.com/​wiki/​802.3|802.3ad]] standard. ​ Differing peer implementations will have varying tolerances for noncompliance. +
- +
-    ​*  Prerequisites:​ +
-      -  Ethtool support in the base drivers for retrieving the speed and duplex of each slave. +
-      -  A switch that supports IEEE 802.3ad Dynamic link aggregation. +
- +
-  ​* Most switches will require some type of configuration to enable 802.3ad mode. +
- +
-  *  +
-  ​* ** balance-tlb or 5** +
-  *  ​Adaptive transmit load balancing: channel bonding that does not require any special switch support. ​ The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. ​ Incoming traffic is received by the current slave. ​ If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave. +
- +
-    ​*  Prerequisite:​ +
-      -  Ethtool support in the base drivers for retrieving the speed of each slave. +
- +
-\\  +
- +
-  *  +
-  ​* ** balance-alb or 6 ** +
-  *  ​Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPV4 traffic, and does not require any special switch support. ​ The receive load balancing is achieved by ARP negotiation. +
- +
-  ​*  The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server. +
-  *  Receive traffic from connections created by the server is also balanced. ​ When the local system sends an ARP Request the bonding driver copies and saves the peer's IP information from the ARP packet. +
-  *  When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP reply to this peer assigning it to one of the slaves in the bond. +
-  *  A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond.  Hence, peers learn the hardware address of the bond and the balancing of receive traffic collapses to the current slave. ​ This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that the traffic is redistributed. ​ Receive traffic is also redistributed when a new slave is added to the bond and when an inactive slave is re-activated. ​ The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond. +
-  *  When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected mac address to each of the clients. The updelay parameter (detailed below) must be set to a value equal or greater than the switch'​s forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch. +
- +
-    ​*  Prerequisites:​ +
-      -  Ethtool support in the base drivers for retrieving the speed of each slave. +
-      -  Base driver support for setting the hardware address of a device while it is open.  This is required so that there will always be one slave in the team using the bond hardware address (the curr_active_slave) while having a unique hardware address for each slave in the bond.  If the curr_active_slave fails its hardware address is swapped with the new curr_active_slave that was chosen. +
- +
-  * ** primary ** +
-  *   A string (eth0, eth2, etc) specifying which slave is the primary device. ​ The specified device will always be the active slave while it is available. ​ Only when the primary is off-line will alternate devices be used.  This is useful when one slave is preferred over another, e.g., when one slave has higher throughput than another. The primary option is only valid for active-backup mode. +
-  * ** updelay ** +
-  *  ​Specifies the time, in milliseconds,​ to wait before enabling a slave after a link recovery has been detected. ​ This option is only valid for the miimon link monitor. ​ The updelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. ​ The default value is 0. +
-  * ** use_carrier ** +
-  *  ​Specifies whether or not miimon should use MII or ETHTOOL ioctls vs. netif_carrier_ok() to determine the link status. The MII or ETHTOOL ioctls are less efficient and utilize a deprecated calling sequence within the kernel. ​ The netif_carrier_ok() relies on the device driver to maintain its state with netif_carrier_on/​off;​ at this writing, most, but not all, device drivers support this facility. +
- +
-  ​*  If bonding insists that the link is up when it should not be, it may be that your network device driver does not support netif_carrier_on/​off. ​ The default state for netif_carrier is "​carrier on," so if a driver does not support netif_carrier,​ it will appear as if the link is always up.  In this case, setting use_carrier to 0 will cause bonding to revert to the MII / ETHTOOL ioctl method to determine the link state. +
- +
-  ​*  A value of 1 enables the use of netif_carrier_ok(),​ a value of 0 will use the deprecated MII / ETHTOOL ioctls. ​ The default value is 1. +
- +
-  * ** xmit_hash_policy ** +
-  *  ​Selects the transmit hash policy to use for slave selection in balance-xor and 802.3ad modes. ​ Possible values are: +
- +
-  *  +
-  ​* ** layer2 ** +
-  *  ​Uses XOR of hardware MAC addresses to generate the hash.  The formula is +
-<​code> ​{{https://​wiki.linuxfoundation.org/​images/​math/​8/​f/​6/​8f6eed397d6bee56f08b1fe20aadfee6.png?​nolink|}}</​code>​ +
- +
-This algorithm will place all traffic to a particular network peer on the same slave. +
-This algorithm is 802.3ad compliant. +
- +
-  *  +
-  ​* ** layer3+4** +
-  *  ​This policy uses upper layer protocol information,​ when available, to generate the hash.  This allows for traffic to a particular network peer to span multiple slaves, although a single connection will not span multiple slaves. +
- +
-The formula for unfragmented TCP and UDP packets is +
-<​code> ​{{https://​wiki.linuxfoundation.org/​images/​math/​8/​b/​d/​8bdca0ddf73e1cb38c4195ce511616d7.png?​nolink|}}</​code>​ +
- +
-For fragmented TCP or UDP packets and all other IP protocol traffic, the source and destination port information is omitted. ​ For non-IP traffic, the formula is the same as for the layer2 transmit hash policy. +
- +
-This policy is intended to mimic the behavior of certain switches, notably Cisco switches with PFC2 as well as some Foundry and IBM products. +
- +
-This algorithm is not fully 802.3ad compliant. ​ A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets striped across two interfaces. ​ This may result in out of order delivery. ​ Most traffic types will not meet this criteria, as TCP rarely fragments traffic, and most UDP traffic is not involved in extended conversations. ​ Other implementations of 802.3ad may or may not tolerate this noncompliance.+
  
-The default value is layer2. ​ This option was added in bonding version 2.6.3. ​ In earlier versions of bonding, this parameter does not exist, and the layer2 policy is the only policy. 
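The two transmit hash formulas appear on this page only as images, so a rough sketch may help show how slave selection behaves. It assumes the formulas as given in the kernel's bonding documentation of that era: layer2 is (source MAC XOR destination MAC) modulo slave count, and layer3+4 is ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff)) modulo slave count. The function names and the treatment of a MAC address as one 48-bit integer are illustrative assumptions; the driver's byte-level handling may differ.

```python
import ipaddress


def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer (illustrative helper)."""
    return int(mac.replace(":", ""), 16)


def layer2_hash(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Sketch of the layer2 policy: (source MAC XOR destination MAC) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


def layer34_hash(src_ip: str, dst_ip: str,
                 src_port: int, dst_port: int, slave_count: int) -> int:
    """Sketch of the layer3+4 policy for unfragmented TCP/UDP over IPv4:
    ((src port XOR dst port) XOR ((src IP XOR dst IP) AND 0xffff)) modulo slave count.
    """
    ip_bits = (int(ipaddress.IPv4Address(src_ip))
               ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_bits) % slave_count


if __name__ == "__main__":
    # layer2: every frame between the same pair of MACs maps to one slave index.
    print(layer2_hash("00:00:00:00:00:01", "00:00:00:00:00:02", 2))
    # layer3+4: two connections to the same peer may map to different slave indexes.
    print(layer34_hash("10.0.0.1", "10.0.0.2", 40000, 80, 2))
    print(layer34_hash("10.0.0.1", "10.0.0.2", 40001, 80, 2))
```

Under these assumptions, layer2 pins all traffic to a given peer onto one slave, while layer3+4 can spread separate connections to the same peer across slaves, because the port numbers enter the hash.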
 ===== Configuring Bonding Devices===== ===== Configuring Bonding Devices=====
  
Line 1551: Line 1404:
  
 Donald Becker'​s Ethernet Drivers and diag programs may be found at : [[http://​www.scyld.com/​network/​|http://​www.scyld.com/​network/​]] Donald Becker'​s Ethernet Drivers and diag programs may be found at : [[http://​www.scyld.com/​network/​|http://​www.scyld.com/​network/​]]
-You will also find a lot of [[https://​www.linkedin.com/​company/​redgage-llc |information]] regarding Ethernet, NWay, MII,+You will also find a lot of information regarding Ethernet, NWay, MII,
 etc. at www.scyld.com. etc. at www.scyld.com.
 ===== History===== ===== History=====
networking/bonding.1470917350.txt.gz · Last modified: 2016/08/11 12:09 by AliceSmith