From 5cca24bc07ff9ff1a5eb2665d62334ec70de1a1d Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Mon, 15 Jun 2020 18:23:19 +0300 Subject: [PATCH] ptp4l: call recvmsg() with the MSG_DONTWAIT flag The application's main event loop (clock_poll) is woken up by poll() and dispatches the socket receive queue events to the corresponding ports as needed. So it is a bug if poll() wakes up the process for data availability on a socket's receive queue, and then recvmsg(), called immediately afterwards, goes to sleep trying to retrieve it. This patch will generate an error that will be propagated to the user if this condition happens. Can it happen? As of this patch, ptp4l uses the SO_SELECT_ERR_QUEUE socket option, which means that poll() will wake the process up, with revents == (POLLIN | POLLERR), if data is available in the error queue. But clock_poll() does not check POLLERR, just POLLIN, and draws the wrong conclusion that there is data available in the receive queue (when it is in fact available in the error queue). When the above condition happens, recvmsg() will sleep typically for a whole sync interval waiting for data on the event socket, and will be woken up when the new real frame arrives. It will not dequeue follow-up messages during this time (which are sent to the general message socket) and when it does, it will already be late for them (their seqid will be out of order). So it will drop them and everything that comes after. The synchronization process will fail. The above condition shouldn't typically happen, but exceptional kernel events will trigger it. It helps to be strict in ptp4l in order for those events to not blow up in even stranger symptoms unrelated to the root cause of the problem. Signed-off-by: Vladimir Oltean --- raw.c | 2 +- udp.c | 2 +- udp6.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/raw.c b/raw.c index 15c9756..0bd15b0 100644 --- a/raw.c +++ b/raw.c @@ -279,7 +279,7 @@ static int raw_recv(struct transport *t, int fd, void *buf, int buflen, buflen += hlen; hdr = (struct eth_hdr *) ptr; - cnt = sk_receive(fd, ptr, buflen, addr, hwts, 0); + cnt = sk_receive(fd, ptr, buflen, addr, hwts, MSG_DONTWAIT); if (cnt >= 0) cnt -= hlen; diff --git a/udp.c b/udp.c index 36802fb..826bd12 100644 --- a/udp.c +++ b/udp.c @@ -210,7 +210,7 @@ no_event: static int udp_recv(struct transport *t, int fd, void *buf, int buflen, struct address *addr, struct hw_timestamp *hwts) { - return sk_receive(fd, buf, buflen, addr, hwts, 0); + return sk_receive(fd, buf, buflen, addr, hwts, MSG_DONTWAIT); } static int udp_send(struct transport *t, struct fdarray *fda, diff --git a/udp6.c b/udp6.c index 744a5bc..ba5482e 100644 --- a/udp6.c +++ b/udp6.c @@ -227,7 +227,7 @@ no_event: static int udp6_recv(struct transport *t, int fd, void *buf, int buflen, struct address *addr, struct hw_timestamp *hwts) { - return sk_receive(fd, buf, buflen, addr, hwts, 0); + return sk_receive(fd, buf, buflen, addr, hwts, MSG_DONTWAIT); } static int udp6_send(struct transport *t, struct fdarray *fda,