For When You Can't Have The Real Thing

VM Interface Bouncing

Created by dave. Last edited by dave, 4 years and 188 days ago. Viewed 1,462 times. #1
(2019-09-13)

Problem

VM transplanted between two VMware clusters. Ran fine in the old cluster for two years. In the new cluster, the VM becomes unreachable for seemingly random periods of time. Inside the VM, syslog shows:

Sep  9 13:46:10 n2 NetworkManager[695]: <info>  [1568051170.0695] device (ens160): state change: activated -> deactivating (reason 'connection-removed', sys-iface-state: 'managed')
Sep  9 13:46:10 n2 NetworkManager[695]: <info>  [1568051170.0698] manager: NetworkManager state is now DISCONNECTING
Sep  9 13:46:10 n2 NetworkManager[695]: <info>  [1568051170.0732] device (ens160): state change: deactivating -> disconnected (reason 'connection-removed', sys-iface-state: 'managed')
Sep  9 13:46:10 n2 dbus[689]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Sep  9 13:46:10 n2 systemd: Starting Network Manager Script Dispatcher Service…
Sep  9 13:46:10 n2 NetworkManager[695]: <info>  [1568051170.0805] manager: NetworkManager state is now DISCONNECTED
Sep  9 13:46:10 n2 dbus[689]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Sep  9 13:46:10 n2 systemd: Started Network Manager Script Dispatcher Service.
Sep  9 13:46:10 n2 nm-dispatcher: req:1 'connectivity-change': new request (4 scripts)
Sep  9 13:46:10 n2 nm-dispatcher: req:1 'connectivity-change': start running ordered scripts…
Sep  9 13:46:10 n2 nm-dispatcher: req:2 'down' [ens160]: new request (4 scripts)

[...]

Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0809] policy: auto-activating connection 'ens160' (6ea5c518-e427-4665-8c6d-eae95fe5f3a7)
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0815] device (ens160): Activation: starting connection 'ens160' (6ea5c518-e427-4665-8c6d-eae95fe5f3a7)
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0816] device (ens160): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0819] manager: NetworkManager state is now CONNECTING
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0821] device (ens160): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0826] device (ens160): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0837] device (ens160): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0844] device (ens160): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0846] device (ens160): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0848] manager: NetworkManager state is now CONNECTED_LOCAL
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0973] manager: NetworkManager state is now CONNECTED_SITE
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.0974] policy: set 'ens160' (ens160) as default for IPv4 routing and DNS
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.1002] device (ens160): Activation: successful, device activated.
Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.1007] manager: NetworkManager state is now CONNECTED_GLOBAL

Frequency and length of outages appear to be random. Can be seconds to hours; can happen every 5 minutes or not for 18 hours.
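To put numbers on "random", something like the following could pair each deactivation with the next successful activation and report how long the interface was down. This is only a sketch: the function name `outages`, the regular expression, and the choice of marker strings are my own, taken from the log lines quoted on this page, and other NetworkManager versions may word these messages differently.

```python
import re

# Captures the bracketed epoch timestamp NetworkManager logs, plus the message after it.
# \s+ tolerates both single and double spaces after "<info>".
LINE_RE = re.compile(r"NetworkManager\[\d+\]: <info>\s+\[(\d+\.\d+)\] (.*)")

# Two lines copied from the excerpt on this page, used as sample input.
SAMPLE = [
    "Sep  9 13:46:10 n2 NetworkManager[695]: <info>  [1568051170.0695] device (ens160): "
    "state change: activated -> deactivating (reason 'connection-removed', sys-iface-state: 'managed')",
    "Sep  9 14:01:01 n2 NetworkManager[695]: <info>  [1568052061.1002] device (ens160): "
    "Activation: successful, device activated.",
]


def outages(lines, device="ens160"):
    """Return (down_epoch, up_epoch, seconds) tuples for one device."""
    # Assumed markers: the first and last messages of an outage as seen in this log.
    down_marker = "device (%s): state change: activated -> deactivating" % device
    up_marker = "device (%s): Activation: successful" % device
    result = []
    down_at = None
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if message.startswith(down_marker) and down_at is None:
            down_at = stamp
        elif message.startswith(up_marker) and down_at is not None:
            result.append((down_at, stamp, stamp - down_at))
            down_at = None
    return result


if __name__ == "__main__":
    for down, up, seconds in outages(SAMPLE):
        print("down %.4f  up %.4f  outage %.1f s" % (down, up, seconds))
```

For the one outage quoted above this gives roughly 891 seconds, a little under 15 minutes. To run it over a whole log, pass an open file instead of `SAMPLE`; on CentOS 7 that would normally be /var/log/messages, though that path is an assumption about this VM's syslog configuration.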

There are no indications of any trouble in the VMware logs or consoles. The problem follows the VM from host to host within the new cluster.

Solution

Disable NetworkManager.
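The exact commands used are not recorded here. On a stock CentOS 7 install, one common way is to stop and disable the `NetworkManager` unit and let the legacy `network` service bring the interface up from its ifcfg file. The sketch below assumes those unit names and that the interface already has a working ifcfg file; the `COMMANDS` list and the function `disable_network_manager` are illustrative. It defaults to a dry run that only prints what it would do.

```python
import subprocess

# Assumption: stock CentOS 7 unit names. The legacy "network" service takes over
# interface configuration once NetworkManager is stopped and disabled.
COMMANDS = [
    ["systemctl", "stop", "NetworkManager"],
    ["systemctl", "disable", "NetworkManager"],
    ["systemctl", "enable", "network"],
    ["systemctl", "restart", "network"],
]


def disable_network_manager(dry_run=True):
    """Print the commands, or run them as root when dry_run is False."""
    for cmd in COMMANDS:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if systemctl exits non-zero.
            subprocess.run(cmd, check=True)
    return COMMANDS


if __name__ == "__main__":
    disable_network_manager(dry_run=True)
```

Run it for real from the VM console rather than over SSH, since restarting networking may drop the session.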

Commentary

The VM worked fine for two years. There's obviously some inconsistency between the two clusters. I'd speculate on some unfortunately-concurrent change to CentOS, but there are twins of the VM still running in the old cluster with no issues.

I also tried disabling the vmtoolsd and vgauthd daemons on the assumption that NetworkManager was getting bad info from VMware somehow; this didn't work.
