For many years (since CentOS 5) we at ITEG (also the main force behind Clazzes.org) have used DRBD to mirror the partition of each virtual machine (then OpenVZ, now mostly LXC).
Kernel 4.6 hiccups caused by DRBD 8.4.6 to 8.4.9-1
After shredding some data last year (probably caused by a combination of aggressive settings and unfortunate module versions), DRBD (module versions 8.4.6 up to upstream version 8.4.9-1) started to make big trouble again when a VLAN/bridging bug forced us to upgrade our Debian jessie machines to kernels from jessie-backports (kernels 4.4 to now 4.7).
For details see TBL Ever-rising load on Debian jessie + DRBD8 + LXC host pairs.
Debianizing the most recent upstream release of the kernel module was our way to divide (as in divide and conquer).
At first we were as unsure as ever about whether DRBD was a cause, a trigger, or just the first victim that manages to "cry out when hit". Nevertheless, maintaining DKMS debs providing the most recent upstream kernel module seemed a good idea in general. We now know that a wrong kernel 4.0+ adaptation in DRBD8 was the root cause of that problem, and we had a kernel expert fix it for us (and hence for everybody); see this post in the drbd-dev mailing list and its follow-ups.
As of this commit in Linus' linux.git, the patch we triggered (and financed) made it into the kernel mainstream, i.e. 4.9. It will also be part of all older 4.x kernels still maintained. The next LTS kernel to be, 4.10, also chosen by Debian for stretch, will get a rewrite of the whole section and hence also be free of that bug.
But as long as the fix does not make it into Debian jessie-backports (via upstream kernel upgrades) and as long as stretch does not become stable, we will continue to provide the DKMS-debianized DRBD module as described below. #OpenSourceRules
Up-to-date & FIXED DRBD 8 packages in Clazzes.org's Debian repository
As of 2016-09-09 we were using the packages below on 5 nodes, with 3 more nodes due to pick them up after their next reboot. At that point they ran as well, but also as badly, as the module that comes with the Debian kernel 4.6.
The packages below now contain the most recent DRBD8 module plus the "kernel_sendmsg() usage" patch, and are in permanent and successful use on all 10 nodes we maintain.
Packages
Clazzes.org's Deb server deb.clazzes.org contains a repository "jessie-drbdpkg-8" providing 2 DRBD packages:
drbd8-dkms 8.4.9-1clazzes6
This package contains the most recently available sources of the DRBD8 module, along with the "kernel_sendmsg() usage" patch and DKMS integration for Debian jessie (probably usable for jessie-based derivatives too).
On installation of drbd8-dkms_8.4.9-1clazzes6 alongside matching pairs of linux-headers-*-amd64 and linux-image-*-amd64 packages (or on later installation of a new such pair), the up-to-date DRBD module is automatically built and installed.
Remark 1: On some nodes dkms triggered the installation of a linux-image-3.2.0-4-rt-amd64 package, which can be removed afterwards.
Remark 2: DKMS sucks at building modules for kernel versions that are installed but not yet active, and we haven't got it under perfect control yet. See bottom of this page.
We'll update this package as long as stretch does not become stable and jessie-backports does not include the patch (or go to 4.10).
drbd-utils 8.9.9-1
We 'cheated' a bit here: drbd-utils_8.9.9-1 is the package from Debian's unstable/experimental repositories (i.e. here), re-integrated into our DRBD8 repository.
This way any Debian jessie installation has access to up-to-date DRBD8 packages without the need to compile manually or to add Debian unstable to sources.list.
We'll update this package once the mailing list settles down about the next version, as long as stretch does not become stable.
DKMS Problems, Solutions, Hints
Problem
DKMS somehow fails to build the DRBD8 module correctly for newer kernels that are installed but not yet active, so after a reboot the DRBD module can't be loaded. We first saw this when 1 of the 5 hosts rebooted so far, going from 4.4 to 4.6 through the reboot, failed to load the DRBD module, claiming:
No Format |
---|
modprobe: ERROR: could not insert 'drbd': Exec format error |
Possible root cause
This could be due to ABI changes in the kernel from 4.4 to 4.6. Our other reboots so far did not change the kernel version (4.6 before and after).
However, 2 more of our 8 nodes are running 4.5, and their /var/lib/dkms/drbd/8.4.8-1clazzes1/4.6.0-0.bpo.1-amd64/x86_64/module/drbd.ko has a different size from the one on the nodes running 4.6, although /lib/modules/4.6.0-0.bpo.1-amd64/kernel/drivers/block/drbd/drbd.ko has the same size and MD5 sum on machines running 4.5 or 4.6.
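Besides comparing sizes and MD5 sums, one way to narrow this down is to look at the vermagic string of a built drbd.ko, since "Exec format error" is commonly seen when a module was built for a different kernel than the one trying to load it. The following is only a sketch under that assumption, not something we have verified as the cause here; the function names vermagic_matches and check_module_file are made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether a module file was built for a given kernel release.
# Assumption: the first word of the vermagic string is the kernel release
# the module was built for (e.g. "4.6.0-0.bpo.1-amd64 SMP mod_unload ...").

# Return 0 if the vermagic string belongs to the given kernel release.
vermagic_matches() {
    kernel_release=$1
    vermagic=$2
    built_for=${vermagic%% *}   # strip everything after the first space
    [ "$built_for" = "$kernel_release" ]
}

# Check one .ko file against a kernel release (default: the running kernel).
check_module_file() {
    ko_file=$1
    kernel_release=${2:-$(uname -r)}
    vermagic=$(modinfo -F vermagic "$ko_file") || return 2
    if vermagic_matches "$kernel_release" "$vermagic"; then
        echo "$ko_file: built for $kernel_release"
    else
        echo "$ko_file: built for ${vermagic%% *}, not $kernel_release"
        return 1
    fi
}

# Example call, using the DKMS build path mentioned above:
#   check_module_file \
#     /var/lib/dkms/drbd/8.4.8-1clazzes1/4.6.0-0.bpo.1-amd64/x86_64/module/drbd.ko \
#     4.6.0-0.bpo.1-amd64
```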
Solution used
We solved it with ...
No Format |
---|
dpkg-reconfigure drbd8-dkms |
# ... or ... |
apt-get remove drbd8-dkms |
apt-get install drbd8-dkms |
# ... and finally ... |
/etc/init.d/drbd start |
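To avoid finding out only after the reboot, the module could also be built ahead of time for every installed kernel that has its headers in place. The sketch below is untested on our nodes and rests on assumptions: that the DKMS module is registered as "drbd" (as the /var/lib/dkms/drbd/... path suggests), that the version string passed in matches what dkms status reports, and that /lib/modules/<release>/build exists only when the matching headers are installed. The function names and the DKMS override variable are hypothetical; dkms install may report an error for kernels where the module is already installed.

```shell
#!/bin/sh
# Sketch: build the DRBD DKMS module for all installed kernels, not only the
# running one.

# List kernel releases under a modules directory that have a build tree.
kernels_with_headers() {
    modules_dir=${1:-/lib/modules}
    for dir in "$modules_dir"/*/; do
        [ -e "${dir}build" ] || continue   # skip kernels without headers
        basename "$dir"
    done
}

# Run "dkms install" for each such kernel. Set DKMS=echo for a dry run.
build_drbd_for_installed_kernels() {
    module_version=$1
    modules_dir=${2:-/lib/modules}
    for kernel in $(kernels_with_headers "$modules_dir"); do
        ${DKMS:-dkms} install -m drbd -v "$module_version" -k "$kernel" \
            || echo "dkms install failed for $kernel" >&2
    done
}

# Example (dry run first, then for real):
#   DKMS=echo build_drbd_for_installed_kernels 8.4.9-1clazzes6
#   build_drbd_for_installed_kernels 8.4.9-1clazzes6
```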
Suspected faster solution
No Format |
---|
dkms autoinstall |
/etc/init.d/drbd start |
Proposed check command
It might be a good idea to perform this every now and then:
No Format |
---|
dkms status |
Version checks
To check which drbd8 module version is loaded:
No Format |
---|
cat /proc/drbd |head -1 |
To check which version the kernel will load next time it loads the drbd8 module:
No Format |
---|
modinfo drbd |egrep -i "^version:" |
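The two version checks can be combined into one comparison, so that a mismatch between the loaded module and the one on disk shows up at a glance. This is a minimal sketch, assuming the first line of /proc/drbd starts with "version: <number>" and that modinfo prints a "version:" field; the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the DRBD module version that is loaded right now with the
# version modprobe would load next time.

# Print the version from the first line of /proc/drbd-style input on stdin,
# e.g. "version: 8.4.9-1 (api:1/proto:86-101)" gives "8.4.9-1".
loaded_drbd_version() {
    head -n 1 | awk '$1 == "version:" { print $2 }'
}

# Print the version from "modinfo drbd"-style input on stdin.
# Only the exact "version:" field matches, not "srcversion:".
ondisk_drbd_version() {
    awk 'tolower($1) == "version:" { print $2; exit }'
}

# Report whether the two version strings agree; return 1 on a mismatch.
compare_drbd_versions() {
    if [ "$1" = "$2" ]; then
        echo "OK: loaded and on-disk module are both $1"
    else
        echo "MISMATCH: loaded=$1 on-disk=$2"
        return 1
    fi
}

# Example:
#   compare_drbd_versions "$(loaded_drbd_version < /proc/drbd)" \
#                         "$(modinfo drbd | ondisk_drbd_version)"
```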