White Rabbit High-availability Seamless Redundancy (HSR)
WR-HSR is a research project to implement the High-availability Seamless
Redundancy (HSR) protocol (IEC 62439-3 Clause 5) on White Rabbit switches and
dual-port end nodes.
The implementation is not part of the roadmap of the White Rabbit
project.
Introduction
HSR guarantees zero-time recovery in case of a single point of failure. By including the protocol in WR elements, HSR features could be extended to time and frequency distribution in WR ring networks.
The nodes (devices) in an HSR network are attached by two Ethernet ports. A source node sends the same frame over both ports. In the fault-free state, a destination node receives two identical frames within a certain time skew; it forwards the first frame to the application and discards the second when (and if) it arrives. A sequence number is used to recognize such duplicates.
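The duplicate-discard rule described above can be sketched as follows. This is a minimal illustration, not the WR-HSR implementation: the class name `DuplicateFilter`, the `(src_mac, seq_nr)` key and the bounded table size are assumptions made for the example.

```python
from collections import OrderedDict


class DuplicateFilter:
    """Accept the first copy of a frame and reject later copies.

    Hypothetical sketch: frames are identified by (source MAC, sequence
    number), and only the most recent `max_entries` identifiers are kept.
    """

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries  # assumed table size
        self._seen = OrderedDict()

    def accept(self, src_mac, seq_nr):
        key = (src_mac, seq_nr)
        if key in self._seen:
            # Second copy, typically the one that took the other path.
            return False
        self._seen[key] = True
        if len(self._seen) > self.max_entries:
            # Forget the oldest identifier to keep the table bounded.
            self._seen.popitem(last=False)
        return True
```

A node would call `accept()` once per received frame and pass the frame to the application only when it returns `True`. Bounding the table is a design choice of the sketch; a real node also has to age entries so that a wrapped sequence number is not mistaken for a duplicate.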
Fig1: Typical HSR network topology (non-WR).
HSR nodes are arranged into a ring, which allows the network to operate
without dedicated switches, since every node is able to forward frames
from port to port. HSR originally meant "High-availability Seamless
Ring", but HSR is not limited to a simple ring topology. Redundant
connections to other HSR rings and to PRP networks are possible.
Since the forwarding delay of every node in an HSR ring adds to the total
network latency, it is important that frames are forwarded quickly. In
practice, special hardware support is required to bring down the per-hop
latency to a reasonable value, often using cut-through switching.
Another property of an HSR ring is that only about half of the network
bandwidth is available to applications (compared to RSTP). This is
because all frames are sent twice over the same network, even when there
is no failure.
WR-HSR Implementation
The implementation of WR-HSR relies on developing a peer-to-peer mechanism, instead of the current end-to-end method, to measure the link delay between the master node of the ring and the end nodes. The nodes that form the ring must implement a Transparent Clock (TC) able to forward and consume sync and follow_up messages carrying an HSR tag. When an untagged PTP message reaches a TC, the TC assumes it comes from a master clock: it tags the frame, duplicates it, and sends the copies out through the two HSR ports so that they follow different paths around the ring.
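The tagging and duplication step can be sketched as below. This is only an illustration of the behaviour described above: the `Frame` and `RingNode` types, the port names "A" and "B", and the 16-bit sequence counter are assumptions for the example and are not taken from the WR-HSR code.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Frame:
    src: str
    payload: bytes
    hsr_seq: Optional[int] = None  # None: the frame carries no HSR tag


class RingNode:
    """Hypothetical TC forwarding logic for a node with ring ports A and B."""

    def __init__(self):
        self.next_seq = 0

    def ingress(self, frame, in_port):
        """Return a list of (out_port, frame) pairs to transmit."""
        if frame.hsr_seq is None:
            # Untagged PTP frame: assumed to come from a master clock.
            # Tag it once and send the same tagged copy out of both ports.
            tagged = replace(frame, hsr_seq=self.next_seq)
            self.next_seq = (self.next_seq + 1) % 65536  # assumed 16-bit counter
            return [("A", tagged), ("B", tagged)]
        # Already tagged: forward along the ring through the opposite port.
        out_port = "B" if in_port == "A" else "A"
        return [(out_port, frame)]
```

Both copies carry the same sequence number, which is what lets a receiver recognize the second one as a duplicate.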
To sum up, this development implies:
- the development of peer delay message exchange to measure the delay between two adjacent nodes (Fig.2 & Fig.3).
- the development of peer-to-peer for sync and follow_up messages over TC (Fig.4).
- computing the residence time of sync messages on each node to be added in the correction field of follow_up messages. This correction field, accumulated from all nodes the frame passes by, together with the link delay measured, will be used by the end node to get synchronized with the master (Fig.5).
- each PTP message includes an HSR tag, used to drop duplicated messages from the network and to check for possible transmission errors.
- PTP is computed per port, applying a Best Master Clock algorithm on each.
- in case of node failure, a switchover to the other path must be performed to keep synchronization to the master node active.
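The last two points (per-port PTP state and switchover on failure) could look roughly like the sketch below. It is a simplified illustration under stated assumptions: the class `RedundantSlave`, the port names "A" and "B" and the sync timeout criterion are hypothetical, and it does not implement the Best Master Clock algorithm.

```python
class RedundantSlave:
    """Hypothetical end node that tracks sync reception on two ring ports."""

    def __init__(self, timeout):
        self.timeout = timeout  # assumed: max time without sync before a port is considered dead
        self.last_sync = {"A": None, "B": None}
        self.active = "A"

    def on_sync(self, port, now):
        # Record when the latest sync message arrived on this port.
        self.last_sync[port] = now

    def _alive(self, port, now):
        t = self.last_sync[port]
        return t is not None and now - t <= self.timeout

    def select(self, now):
        """Return the port to synchronize from, switching over if needed."""
        if not self._alive(self.active, now):
            other = "B" if self.active == "A" else "A"
            if self._alive(other, now):
                self.active = other
        return self.active
```

Because both ports keep receiving sync messages in the fault-free state, the standby path already has fresh timestamps when the active path fails, so the node only has to change which port it follows.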
Fig2: Link delay measurement using the peer delay mechanism
Fig3: Peer Delay message exchange diagram
Fig4: Peer-to-Peer Sync/Follow_up message exchange
Fig5: Residence Time Measurement for PTP Correction Field
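The arithmetic behind the peer delay measurement and the correction field can be sketched as follows. The peer delay formula is the usual PTP one; the function names and the example timestamps are assumptions for illustration. How upstream link delays are accounted for is not specified above, so `link_delay` simply stands for whatever link delay the end node still has to remove.

```python
def peer_delay(t1, t2, t3, t4):
    """Mean one-way link delay from a peer delay exchange.

    t1: requester sends the request      (requester clock)
    t2: responder receives the request   (responder clock)
    t3: responder sends the response     (responder clock)
    t4: requester receives the response  (requester clock)
    """
    return ((t4 - t1) - (t3 - t2)) / 2


def add_residence_time(correction, ingress_ts, egress_ts):
    """Accumulate the time a sync message spent inside one node."""
    return correction + (egress_ts - ingress_ts)


def offset_from_master(origin_ts, sync_rx_ts, correction, link_delay):
    """Offset of the end node clock relative to the master."""
    return sync_rx_ts - origin_ts - correction - link_delay
```

For example, with hypothetical timestamps in nanoseconds, two nodes that hold a sync message for 250 and 200 give a correction of 450; with a measured link delay of 1000, an origin timestamp of 10000 and a reception timestamp of 11950, the end node computes an offset of 500.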
Project information
- Official production documentation: EDMS EDA-02408
- Software
- Frequently Asked Questions
- Users
Contacts
Project
- José Luis Gutiérrez - University of Granada - General questions about the project.
- Javier Díaz - University of Granada - General questions about the project.
Project Status
Date | Event |
01-07-2014 | Start of brainstorming to increase reliability in WR networks. |
01-10-2014 | HSR as first candidate for adding reliability in WR networks. |
12-10-2014 | Starting to study the HSR protocol and how to implement it for WR |
08-04-2015 | Starting implementation on the WRS |
17-04-2015 | Problem TBD: what about WR syntonization using pure TC? Possible solution: fake TC = BC + forwarding sync/follow_up. Problem: switchover will not be zero-time recovery. |
20-04-2015 | Porting César's PeerDelay implementation to current version of PPSi |
27-04-2015 | HSR: Adding HSR tag to PTP frames on PPSi |
06-05-2015 | P2P: Starting "TC" implementation (forwarding) |
18-05-2015 | P2P: Residence time for follow_up messages in 2-step clocks implementation |
8 April 2015