ptp4l runs clockcheck on an incoming PTP message before checking its
domain number. If the time on another domain is different, then
clockcheck will trigger spurious synchronization faults.
This patch reorders the logic so that clockcheck only runs on messages
in the same time domain.
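The reordering can be pictured with a minimal sketch. The structures and function names below are hypothetical stand-ins, not the actual ptp4l code; the only point is that the domain comparison happens before the sanity check is reached.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real ptp4l structures. */
struct msg_sketch {
	uint8_t domain_number;
	int64_t rx_time_ns;
};

struct port_sketch {
	uint8_t domain_number;
	int clockcheck_runs; /* counts how often the sanity check ran */
};

/* Placeholder for the clock sanity check; it only counts invocations. */
static void clockcheck_sketch(struct port_sketch *p, int64_t rx_time_ns)
{
	(void) rx_time_ns;
	p->clockcheck_runs++;
}

/* Returns true if the message was accepted for further processing. */
static bool port_accept_sketch(struct port_sketch *p, const struct msg_sketch *m)
{
	/* Check the domain first, so foreign domains never reach clockcheck. */
	if (m->domain_number != p->domain_number)
		return false;
	clockcheck_sketch(p, m->rx_time_ns);
	return true;
}
```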
Reported-By: Filip Perich <perich@google.com>
Signed-off-by: Cliff Spradlin <cspradlin@google.com>
With increasing unicast support, the code needs to identify unicast
messages more often. This patch replaces the open coded bit field
tests with a more readable inline helper function call.
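A helper of this kind might look like the following sketch. The header layout and the names are simplified assumptions; the flag position (bit 2 of the first flagField octet) follows the PTP common header.

```c
#include <stdbool.h>
#include <stdint.h>

/* unicastFlag is bit 2 of the first flagField octet in the PTP header. */
#define UNICAST_SKETCH (1 << 2)

/* Hypothetical, reduced header holding only the flag field. */
struct hdr_sketch {
	uint8_t flagField[2];
};

/* Replaces an open-coded "hdr->flagField[0] & (1 << 2)" test. */
static inline bool msg_unicast_sketch(const struct hdr_sketch *h)
{
	return (h->flagField[0] & UNICAST_SKETCH) != 0;
}
```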
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
According to the standard, unicast Sync messages are to be sent with
the interval field set to 127. This patch adds a test to avoid
incorrectly adopting that value as a new interval.
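The test could be sketched roughly as follows. The function and macro names are hypothetical; the sketch only shows that the reserved unicast value is filtered out before the stored interval is updated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interval value carried by unicast Sync messages, per the text above. */
#define UNICAST_INTERVAL_SKETCH 127

/*
 * Adopts 'received' as the new log sync interval unless it is the
 * unicast placeholder or unchanged. Returns true when an update happened.
 */
static bool sync_interval_update_sketch(int8_t *log_sync_interval,
					int8_t received)
{
	if (received == UNICAST_INTERVAL_SKETCH)
		return false;
	if (received == *log_sync_interval)
		return false;
	*log_sync_interval = received;
	return true;
}
```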
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Richard Hill reported an occasional NULL pointer dereference in
port_delay_request() when in hybrid mode.
	if (p->hybrid_e2e) {
		struct ptp_message *dst = TAILQ_FIRST(&p->best->messages);
		msg->address = dst->address;
		...
	}
The code assumes that the p->best->messages list can't be empty
because:
  The function, port_delay_request(), is called only when
  FD_DELAY_TIMER expires. That timer is only set by the function,
  port_set_delay_tmo(), which is called:
  1. from process_delay_resp(), but only when state is UNCALIBRATED
     or SLAVE.
  2. from port_e2e_transition(), but only when state is UNCALIBRATED
     or SLAVE.
  Looking at handle_state_decision_event(), a port can only enter
  UNCALIBRATED or SLAVE when it has a valid foreign master record,
  i.e. p->best->messages is not null.
A port also only clears p->best->messages when it leaves
UNCALIBRATED or SLAVE, at which point the FD_DELAY_TIMER is also
cleared.
*However* the p->best->messages list *can* be empty if the
FD_ANNOUNCE_TIMER and the FD_DELAY_TIMER expire at the same time. In
this case, the poll() call indicates events on both file descriptors.
The announce timeout is handled like this:
	case FD_ANNOUNCE_TIMER:
	case FD_SYNC_RX_TIMER:
		if (p->best)
			fc_clear(p->best);
So then the port_delay_request() call de-references the null
TAILQ_FIRST message pointer.
This patch fixes the issue by re-ordering the timer file descriptors
within the polling list.
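The effect of the ordering can be illustrated with a small simulation. Everything here is a hypothetical model (the enum, the counters, the dispatch loop); it assumes only that ready descriptors are handled in index order, so placing the delay timer ahead of the announce timer lets the delay request run before the message list is cleared.

```c
#include <stdbool.h>

/* Hypothetical descriptor order: delay timer ahead of the announce timer. */
enum fd_sketch { FD_DELAY_TIMER_SK, FD_ANNOUNCE_TIMER_SK, N_FD_SK };

struct port_sk {
	int best_messages;   /* messages queued on the best foreign master */
	int empty_list_hits; /* times the delay handler found an empty list */
	int delay_reqs_sent;
};

static void handle_fd_sk(struct port_sk *p, enum fd_sketch fd)
{
	switch (fd) {
	case FD_DELAY_TIMER_SK:
		if (p->best_messages == 0)
			p->empty_list_hits++; /* would be the NULL dereference */
		else
			p->delay_reqs_sent++;
		break;
	case FD_ANNOUNCE_TIMER_SK:
		p->best_messages = 0; /* models clearing the message list */
		break;
	default:
		break;
	}
}

/* Handles every ready descriptor in index order, like a poll() loop. */
static void dispatch_sk(struct port_sk *p, const bool ready[N_FD_SK])
{
	for (int i = 0; i < N_FD_SK; i++) {
		if (ready[i])
			handle_fd_sk(p, (enum fd_sketch) i);
	}
}
```

With both timers ready in the same pass, the delay request is sent first and the list is cleared afterwards.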
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reported-by: Richard Hill <plonta@gmx.de>
According to 1588, PTP message loops are simply someone else's problem
with respect to transparent clocks. Since we are running the BMCA for
syntonization anyway, we might as well go ahead and implement the spanning
tree for PTP messages.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The E2E TC forwards Announce, Delay_Req, Delay_Resp, Management,
Signaling, and Sync messages, and drops P2P Delay messages.
This implementation tracks the GM using the BMCA in order
to syntonize (or possibly even synchronize) with it.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The P2P TC forwards Announce, Management, Signaling, and Sync
messages, consumes P2P Delay messages, and drops E2E Delay messages.
This implementation tracks the GM using the BMCA in order
to syntonize (or possibly even synchronize) with it.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
This patch adds code that sends an event message received on one port out
all the other ports and calculates the residence time. The correction,
ingress port, and the original message are remembered in a TC transmit
descriptor. These descriptors are recycled in a memory pool in a similar
way to the message buffers.
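The residence time arithmetic can be sketched as below. The function name is hypothetical, and the sketch assumes plain nanosecond time stamps; the correction field itself is expressed in nanoseconds scaled by 2^16.

```c
#include <stdint.h>

/*
 * Returns the updated correction (nanoseconds scaled by 2^16) after
 * adding the residence time, i.e. egress minus ingress time stamp.
 */
static int64_t tc_correction_sk(int64_t correction, int64_t ingress_ns,
				int64_t egress_ns)
{
	int64_t residence_ns = egress_ns - ingress_ns;

	return correction + residence_ns * 65536;
}
```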
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The transparent clock code will want to set qualification timeouts and
perform end to end delay measurements. This patch exposes the needed
methods.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
When masterOnly is true, the port always returns NULL when computing
its best foreign master. As a result, the port will never enter the
SLAVE state, and the clock will ignore Announce messages received on
that port.
This attribute is specifically called out in G.8275.1 and G.8275.2,
and it is implied by the "master only" mode of G.8265.1. In addition,
this option will probably appear in the next revision of IEEE 1588.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The Telecom Profiles G.8275.1 and G.8275.2 have invented a new
per-port and per-clock attribute, not in 1588, called "localPriority".
The use of this attribute is a distinguishing feature of the telecom
data set comparison algorithm.
This patch adds the attribute, hard coded to its default value.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
There is no need to keep two copies of the data set comparison
function. This patch adds a method that allows the port code to
obtain the function from the clock code.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The majority of the callers of transport_send() use hard coded magic
numbers. This patch fixes them to use the corresponding enumerated
values instead.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The PortAddress structure has no space for the actual address and should
be used only as a pointer to a larger buffer.
The issue was reported by gcc with source fortification enabled.
[ RC: Replace magic number with sizeof() macro. ]
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
This patch makes a number of subroutines into global functions in order
to share code with the TC implementations to come.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
This patch places the internal port data structure into a common header
for use by the original BC and the new TC code.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
When computing the next port state based on a FSM event, much of the logic
will stay the same for OC, BC, and TC nodes.
- handling a fault ASAP
- INITIALIZING state handling
- showing the transition in the log
- sending notifications
This patch moves this common code into a global port method, making it
available to future TC implementations.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
This paves the way to allow different implementations for the upcoming
Transparent Clock code.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The 1588 standard defines one step operation for both Sync and
PDelay_Resp messages. Up until now, hardware with P2P one step has
been rare, and kernel support was lacking. This patch adds support for
the mode in anticipation of new kernel and hardware developments.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
On the transmit path, the port-level code calls msg_sots_missing()
directly, but on receive this check is buried in the message layer.
With the coming addition of peer to peer one step, the ingress check
will need knowledge of the configured time stamping option. This
patch moves the check in order to accommodate the exceptional case.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
In a ptp unaware network (like the telecom profile for frequency sync
G.8265.1), both the RTD and the PDV can be substantially higher than
in a ptp aware network. To achieve more accurate measurements, the
rate may need to be configured higher to get more data and increase
the chance of lucky packets.
With a high configured delay_req rate combined with high RTD/PDV in the
network, the risk also becomes high that the response to the previously
sent delay_req has not been received before a new delay_req is sent.
In that case, more than just the latest sent delay_req needs to be
stored.
This patch adds a queue for sent delay requests so that several requests
can be outstanding in parallel. When a delay response is received, the
queue is searched for a matching request, which is removed from the
queue after processing.
A stored delay_req is removed if it is older than 5 seconds. The check
is made before a new delay_req is sent or when the announce receipt
timeout expires.
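A queue with these properties might be sketched using the TAILQ macros from queue(3). All names are hypothetical, and matching on the sequence ID alone is an assumption made to keep the example short.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

#define MAX_AGE_NS_SK (5LL * 1000000000LL) /* five seconds */

struct req_sk {
	TAILQ_ENTRY(req_sk) list;
	uint16_t seq_id;
	int64_t sent_ns;
};

TAILQ_HEAD(req_queue_sk, req_sk);

/* Remembers a sent request. Returns 0 on success, -1 on allocation failure. */
static int req_add_sk(struct req_queue_sk *q, uint16_t seq_id, int64_t now_ns)
{
	struct req_sk *r = calloc(1, sizeof(*r));

	if (!r)
		return -1;
	r->seq_id = seq_id;
	r->sent_ns = now_ns;
	TAILQ_INSERT_HEAD(q, r, list);
	return 0;
}

/* Removes the request matching a response. Returns 1 if found, else 0. */
static int req_match_sk(struct req_queue_sk *q, uint16_t seq_id)
{
	struct req_sk *r;

	TAILQ_FOREACH(r, q, list) {
		if (r->seq_id == seq_id) {
			TAILQ_REMOVE(q, r, list);
			free(r);
			return 1;
		}
	}
	return 0;
}

/* Drops requests older than five seconds. Returns the number dropped. */
static int req_prune_sk(struct req_queue_sk *q, int64_t now_ns)
{
	struct req_sk *r, *next;
	int dropped = 0;

	for (r = TAILQ_FIRST(q); r; r = next) {
		next = TAILQ_NEXT(r, list);
		if (now_ns - r->sent_ns > MAX_AGE_NS_SK) {
			TAILQ_REMOVE(q, r, list);
			free(r);
			dropped++;
		}
	}
	return dropped;
}
```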
Signed-off-by: Anders Selhammer <anders.selhammer@est.tech>
The function ts_to_Timestamp() is now just a wrapper around
tmv_to_Timestamp(). Simplify code by using tmv_to_Timestamp()
directly.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Convert a software timestamp to the internal tmv_t representation at
the earliest possible opportunity, to match the behaviour for hardware
timestamps.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Convert a hardware timestamp to the internal tmv_t representation at
the earliest possible opportunity. This allows us to:
- eliminate multiple redundant calls to timespec_to_tmv()
- use tmv_add() instead of open-coded manipulation of a struct
timespec in ts_add()
- use tmv_to_Timestamp() instead of open-coded manipulation of a
struct timespec and struct Timestamp in ts_to_Timestamp()
- use tmv_is_zero() instead of open-coded manipulation of a struct
timespec in msg_sots_valid()
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
The function ts_to_timestamp() currently performs open-coded
manipulation of a struct timespec and struct Timestamp instead of
using the tmv_t abstractions.
Prepare for the removal of this code by matching the calling
convention for tmv_to_Timestamp(): returning a struct Timestamp rather
than accepting a pointer to a struct Timestamp.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
The function ts_add() currently performs open-coded manipulation of a
struct timespec instead of using the tmv_t abstractions.
Prepare for the removal of this code by storing ingressLatency and
egressLatency as corrections (matching the behaviour for
delayAsymmetry).
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
The function clock_check_ts() performs open-coded manipulation of a
struct timespec instead of using the tmv_t abstractions.
Use the existing tmv_t abstractions to convert from struct timespec to
nanoseconds, and modify the prototype of clock_check_ts() to match
that of the underlying clockcheck_sample().
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
When NSM is enabled on a given port, that port always replies to a NSM
delay request with a delay response, sync, and follow up, regardless
of the current state of the port.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The port will need to send unicast Sync messages in order to support
the NSM protocol. Besides that, we will need this ability anyhow if
we ever want to implement unicast operation.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Path trace TLVs and Follow-Up info TLVs might be mixed in among other
random TLVs. This patch fixes the parsing code to find these TLVs even
when multiple other TLVs are present.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The current code uses an ad hoc method of appending TLVs. When
constructing a message, the code computes the total PDU length by adding
the message size to the TLV size. By using the new API, this patch
simplifies message construction, letting each TLV add its own length
to the total.
As a result of this change, the return value for the helper
functions, follow_up_info_append() and path_trace_append(), has
changed meaning. Instead of returning the TLV length, these functions
now provide an error code.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
This patch changes the receive message parsing code to place each TLV
into the list. A method is introduced that allows attaching TLVs to
the end of the list.
In addition, msg.last_tlv is converted into a pointer to the last item
in the list. Because of this change, the transmit code that uses this
field now allocates a TLV before using it.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Time values are compared using an inequality test in mmedian.c.
Generalise tmv_eq() to tmv_cmp() (by analogy with memcmp()) and
replace existing uses of tmv_eq().
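A three-way comparison in the spirit of memcmp() could look like this sketch. The type and function names are hypothetical, and a plain nanosecond counter is assumed as the internal representation.

```c
#include <stdint.h>

/* Hypothetical time value type holding nanoseconds. */
typedef struct {
	int64_t ns;
} tmv_sk_t;

/* Returns 0 if equal, +1 if a is later than b, -1 if a is earlier. */
static inline int tmv_cmp_sk(tmv_sk_t a, tmv_sk_t b)
{
	return a.ns == b.ns ? 0 : a.ns > b.ns ? +1 : -1;
}
```

An equality test then becomes `tmv_cmp_sk(a, b) == 0`, and an ordering test such as the one needed for a median becomes `tmv_cmp_sk(a, b) < 0`.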
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
The code uses a local variable for program flow control in a silly way.
This patch simplifies the logic by using the common switch/case/default
pattern instead.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Up until now the transportSpecific field has been treated according to
802.1AS, namely as a field that must match exactly on receive.
However, 1588 mandates ignoring this field for some transports, and
there is equipment in the wild that does in fact set the reserved
bits.
This patch adds an option to ignore the field on receive completely.
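The receive check with such an option might be sketched as follows. The function name and the boolean parameter are hypothetical; the sketch assumes the transportSpecific value occupies the upper nibble of the first header octet.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decides whether a received message passes the transportSpecific test.
 * 'tsmt' is the first header octet, 'expected' is the locally configured
 * value in the same upper-nibble position, and 'ignore' models the option.
 */
static bool transport_specific_ok_sk(uint8_t tsmt, uint8_t expected,
				     bool ignore)
{
	if (ignore)
		return true;
	return (tsmt & 0xf0) == (expected & 0xf0);
}
```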
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reported-by: Petr Kulhavy <brain@jikos.cz>
When the minimum delay request interval is changed after processing a
delay response, update the current timeout to immediately follow the new
interval.
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Now the ts label will be either the bond's active slave or the interface
name, which is exactly the interface we need in order to get the ts info.
When the link goes down or up, or there is a failover and the ts_label
changes, the phc index may also change. So we need to get the new ts info
and check clock_required_modes. We force the link to LINK_DOWN if the
time stamping of the new ts_label does not support the required modes.
If all is good, then we set the phc index to the new one. We also sync the
clock interval after switching the phc.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Besides link up and down, we may also receive other rtnl messages, like
bond slave change notifications, for which the link state stays the same.
So we should return EV_FAULT_CLEARED only when both LINK_UP and
LINK_STATE_CHANGED are set.
When the link state stays the same, we should return EV_NONE.
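The decision can be sketched as a small function over status bits. The names and bit values are hypothetical, and the link-down path is deliberately left out since it is not described here.

```c
/* Hypothetical link status bits. */
enum {
	LINK_DOWN_SK          = 1 << 0,
	LINK_UP_SK            = 1 << 1,
	LINK_STATE_CHANGED_SK = 1 << 2,
};

enum ev_sk { EV_NONE_SK, EV_FAULT_CLEARED_SK };

/* Clears the fault only for a real transition to link up. */
static enum ev_sk link_event_sk(int link_status)
{
	if ((link_status & LINK_UP_SK) &&
	    (link_status & LINK_STATE_CHANGED_SK))
		return EV_FAULT_CLEARED_SK;
	/* Link up without a state change, e.g. a bond slave change. */
	return EV_NONE_SK;
}
```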
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Update the function rtnl_link_status() to get bond slave info. Pass the
slave index to the callback functions, i.e. port_link_status().
Also check the interface index of the rtnl message in rtnl_link_status().
Then we don't need to check it in port_link_status().
Add an ifndef for IFLA_BOND_MAX in case we build linuxptp on a kernel
before v3.13-rc1.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
The previous function used a general message and dumped all interfaces'
information. Now use ifinfomsg so that we can get a specific interface's
information.
We can still get all interfaces' info by setting the device to NULL.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
With an rtnl socket we can track the link status per port (except the UDS port).
We can make sure we get the correct interface and latest status with function
port_link_status().
At the same time we need to set clock sde after link down. But we return
EV_FAULT_DETECTED in port_event(), which will not set clock sde. So we need
to set it in port_link_status().
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
The sequence of port_nrate_calculate() and tsproc_update_delay()
in port_peer_delay() is mixed up.
The peer delay depends on the nrate ratio so the nrate ratio
shall be updated before peer delay is calculated.
Signed-off-by: Burkhard Ilsen <burkhardilsen@gmail.com>
This global function used to return an error code, but now it always
returns zero. This patch converts the function signature to return void
and simplifies the main clock loop by removing the useless test.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The state machines in 1588 do not specify an event that causes a transition
out of the initializing state. This was left as a local issue. For this
transition, the current code assigns the next state outside of the FSM. But
doing so prevents an alternative FSM from handling this transition differently.
By introducing a new event, this patch places this transition where it
belongs, namely under the control of the FSM code.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Although leaving the INITIALIZING state and clearing the FAULTY state
ASAP both result in a port entering the LISTENING state, still there
is no benefit from conflating the two. In the FAULTY case, the
current code actually skips the INITIALIZING state altogether.
This patch separates the two cases resulting in two benefits. First,
the check for ASAP fault status is only made when a fault is actually
present, unlike the present unconditional check. Second, this change
will allow us to cleanly support alternative state machines later on.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The code that decides whether a fault qualifies for ASAP treatment is
a tangle of logical operators. This patch replaces the open coded
logic with a helper function whose name makes the intent clear. This
is a cosmetic change only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Looking at the fault logic in port_dispatch(), you might think that
the function, fault_interval(), checks whether a fault is active, but
you would be wrong, since that function always returns zero.
This patch removes the superfluous input error checking inside of
fault_interval() and changes the return type to void, making the
actual behavior explicit. Dropping the input check is safe because
that function has exactly two callers, both of whom always provide
valid inputs.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
If a non-slave port on a boundary clock sees an announce message, then it
must decide whether it should take on the MASTER or the PASSIVE role. When
the GM fields from the local clock are identical to those in the announce,
then the sender/receiver ports are used as a tie breaker.
Following a typographical error in 1588, the code wrongly uses the port
identity of the upstream parent as the "receiver" id. As a result, a port
that should be PASSIVE may choose MASTER instead. This patch fixes the
code to use the local port id.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
During the configuration rework, the announce span was wrongly converted
into a hard coded macro. In addition, the announceReceiptTimeout option
inadvertently became non-zero for the UDS port. As a result, the UDS port
sets a useless announce message timer, causing the code to close and reopen
the UDS port every few seconds.
This bug has an interesting history. It was first reported and fixed in
commit f36af8e0 ("uds: disable the accidentally enabled announce timer.").
That very fix was wrongly removed in commit 54f45063 ("port: change
'announce_span' into a macro."). Because of various code changes, this
bad commit cannot be simply reverted now.
This patch re-introduces the 'announce_span' variable and clears both it
and 'announceReceiptTimeout' for the UDS port, effectively disabling the
announce message timer.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The port code is not interested in the number of ports but rather the
clock type. Since the polymorphic clock object will be able to report
its own type, this patch changes the clock interface accordingly.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The message lists are implemented using a TAILQ from queue(3). The heads
of the list must be initialized using the provided macros, since the field
called 'tqh_last' is non-zero in the initial state. This patch fixes a
potential null pointer dereference by properly initializing the queues.
Note that there is no actual bug in the current code, because it uses the
lists in such a way as to initialize 'tqh_last' before any dereference.
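The difference between a zeroed head and a properly initialized one can be seen in a short sketch using queue(3). The structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct item_sk {
	TAILQ_ENTRY(item_sk) list;
	int value;
};

TAILQ_HEAD(item_list_sk, item_sk);

/*
 * An empty, initialized TAILQ head has tqh_first == NULL and tqh_last
 * pointing back at tqh_first, so tqh_last is never NULL.
 */
static bool head_is_initialized_sk(struct item_list_sk *h)
{
	return h->tqh_first == NULL && h->tqh_last == &h->tqh_first;
}
```

A head that was merely zeroed fails this check, and TAILQ_INSERT_TAIL() on it would write through a NULL 'tqh_last'.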
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Upgrade the message level to info so the user can see it, but print it
at most once per 5 minutes to not spam the syslog too much.
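A rate limit of this kind might be sketched as follows. The structure and function are hypothetical, and the current time is passed in by the caller so that the logic can be exercised without a real clock.

```c
#include <stdbool.h>
#include <time.h>

#define LOG_PERIOD_SK (5 * 60) /* seconds */

struct ratelimit_sk {
	time_t last;  /* time of the last printed message */
	bool printed; /* false until the first message was printed */
};

/* Returns true when the caller may print the message at time 'now'. */
static bool ratelimit_ok_sk(struct ratelimit_sk *r, time_t now)
{
	if (r->printed && now - r->last < LOG_PERIOD_SK)
		return false;
	r->last = now;
	r->printed = true;
	return true;
}
```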
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
The draft Enterprise Profile [1] specifies a hybrid E2E delay mechanism,
where the delay response message is sent "in kind". That is, if the
request is unicast, then the response is also unicast. Apparently this
scheme is already in widespread use in some industries. Also, it makes
sense, because those messages are of no interest to the other slaves in
the PTP network.
Because of the address work already in place, it turns out that adding
this mode is almost trivial. This patch introduces a "hybrid_e2e" option
that enables the new mode.
1. https://datatracker.ietf.org/doc/draft-ietf-tictoc-ptp-enterprise-profile
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Buggy or mis-configured masters can place bogus logMessageInterval values
in their delay response messages. This patch places reasonable limits on
the range of values that we will accept.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
The logMessageInterval field has an improbable range from 2^-128 to 2^127
seconds. The extreme ends cause an integer overflow in the calculation
of the "foreign master time window". Buggy or mis-configured foreign
masters advertising extreme values will cause incorrect announce message
aging.
This patch fixes the issue by adding thresholds for the bogus extremes.
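Thresholding of this sort could be sketched as below. The limits of -10 and 31 are assumptions picked for illustration, not necessarily the values used by the patch, and the function name is hypothetical. The clamp keeps the shift count small enough that the 64-bit result cannot overflow.

```c
#include <stdint.h>

#define NS_PER_SEC_SK 1000000000ULL
#define LOG_MIN_SK (-10) /* assumed lower threshold */
#define LOG_MAX_SK 31    /* assumed upper threshold */

/* Converts a logMessageInterval into nanoseconds, clamping the extremes. */
static uint64_t log_interval_to_ns_sk(int8_t log_interval)
{
	int shift = log_interval;

	if (shift < LOG_MIN_SK)
		shift = LOG_MIN_SK;
	if (shift > LOG_MAX_SK)
		shift = LOG_MAX_SK;
	if (shift < 0)
		return NS_PER_SEC_SK >> -shift;
	return NS_PER_SEC_SK << shift;
}
```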
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
This conversion is not straightforward due to the fact that these options
can take a value of "ASAP" or a number. We check for the special ASAP
case in a helper function and leave the numbers to the generic code.
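The split between the special case and the generic number parsing might be sketched like this. The structure and both function names are hypothetical, and the generic parser is reduced to a strtol() wrapper.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct fault_opt_sk {
	int asap;  /* nonzero when the option was "ASAP" */
	int value; /* numeric setting, meaningful only when asap is zero */
};

/* Stand-in for the generic integer option parser. */
static int parse_number_sk(const char *s, int *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || end == s || *end != '\0' || v < INT_MIN || v > INT_MAX)
		return -1;
	*out = (int) v;
	return 0;
}

/* Handles "ASAP" here and leaves numbers to the generic code. */
static int parse_fault_opt_sk(const char *s, struct fault_opt_sk *opt)
{
	if (!strcmp(s, "ASAP")) {
		opt->asap = 1;
		opt->value = 0;
		return 0;
	}
	opt->asap = 0;
	return parse_number_sk(s, &opt->value);
}
```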
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Add new time stamp processing modes to return raw delay and offset based
on the raw delay instead of the long-term filtered delay, and to return
also a weight of the sample. The weight is set to the ratio between the
two delays. This gives smaller weight to samples where the sync and/or
delay messages were delayed significantly in the network and possibly
include a large error.
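The weight computation can be sketched as below. The function name is hypothetical, and two details are assumptions: the ratio is taken as filtered delay over raw delay (so an unusually large raw delay lowers the weight), and the result is capped at 1.0 with non-positive inputs falling back to full weight.

```c
/*
 * Returns a sample weight in (0, 1]. A raw delay well above the
 * long-term filtered delay suggests extra queuing in the network,
 * so such samples get a smaller weight.
 */
static double sample_weight_sk(double filtered_delay, double raw_delay)
{
	double w;

	if (filtered_delay <= 0.0 || raw_delay <= 0.0)
		return 1.0;
	w = filtered_delay / raw_delay;
	return w > 1.0 ? 1.0 : w;
}
```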
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Introduce a time stamp processor for offset/delay calculations and use
it in the clock and port modules.
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Convert time stamps to tmv_t and apply all corrections before passing
them to clock/port functions to reduce the number of parameters.
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
If the user has configured the appropriate option, then simply warn
about the clock device mismatch, and then go on in "JBOD" mode.
Whenever the port enters the uncalibrated state, it tells the clock
to switch to the new PHC device.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>