Comment: Major update with links to fix and the like

...

For many years (since CentOS 5) we at ITEG (also the main force behind Clazzes.org) have used DRBD to mirror the partition of each virtual machine (then OpenVZ, now mostly LXC).

Kernel 4.6 hiccups caused by DRBD 8.4.6 to 8.4.9-1

After shredding some data last year (probably caused by a combination of aggressive settings and unfortunate module versions), DRBD started to cause serious trouble again when a VLAN/bridging bug forced us to upgrade our Debian jessie machines to kernels from jessie-backports (kernels 4.4 to now 4.7), with DRBD module versions from 8.4.6 up to upstream version 8.4.9-1.

For details see TBL Ever-rising load on Debian jessie + DRBD8 + LXC host pairs.

Debianizing the most recent upstream release of the kernel module was our way to divide (as in divide and conquer).

We now know that an incorrect adaptation to kernel 4.0+ in DRBD8 was the root cause of that problem, and we had a kernel expert fix it for us (and hence for everybody); see this post in the drbd-dev mailing list and its follow-ups.

As of this commit in Linus' linux.git the patch we triggered (and financed) made it into the kernel mainstream, i.e. 4.9. It will also be part of all older 4.x kernels still maintained. The next LTS kernel to be, 4.10, also chosen by Debian for stretch, will get a rewrite of the whole section and hence also be free of that bug.

But as long as the fix does not make it into Debian jessie-backports (via upstream kernel upgrades) and as long as stretch does not become stable, we will continue to provide the DKMS-debianized DRBD module as described below, #OpenSourceRules.

Up-to-date & FIXED DRBD 8 packages in Clazzes.org's Debian repository

The packages below contain the most recent DRBD8 module plus the "kernel_sendmsg() usage" patch. After a first rollout on 5 nodes (as of 2016-09-09), they are now in permanent and successful use on all 10 nodes we maintain.

Packages

Clazzes.org's Deb server deb.clazzes.org contains a repository "jessie-drbdpkg-8" providing 2 DRBD packages:

drbd8-dkms 8.4.9-1clazzes6

This package contains the most recently available sources of the DRBD8 module, the "kernel_sendmsg() usage" patch, and DKMS integration for Debian jessie (probably usable for jessie-based derivatives too).

On installation of drbd8-dkms_8.4.9-1clazzes6 (or later installation of new matching pairs of linux-headers-*-amd64 and linux-image-*-amd64 packages) the up-to-date DRBD module is automatically built and installed.

Remark 1: On some nodes dkms triggered the installation of a linux-image-3.2.0-4-rt-amd64 which can be removed afterwards.

Remark 2: DKMS sucks at building modules for kernel versions that are installed but not yet active, and we haven't got it under perfect control yet. See bottom of this page.

We'll update this package as long as stretch does not become stable and jessie-backports does not include the patch (or move to 4.10).

drbd-utils 8.9.9-1

We 'cheated' a bit here: drbd-utils_8.9.9-1 is the package from Debian's unstable/experimental repositories (i.e. here), re-integrated into our DRBD8 repository.

This way any Debian jessie installation has access to up-to-date DRBD8 packages without the need to care about compiling manually or adding Debian unstable to sources.list.

We'll update this package as long as stretch does not become stable.

DKMS

...

Problems, Solutions, hints

Problem

DKMS somehow fails to build the DRBD8 module correctly for newer kernels that are installed but not yet active, so after a reboot the DRBD module can't be loaded. We first saw this on 1 of 5 hosts rebooted, the one that went from 4.4 to 4.6 through the reboot; modprobe claimed:

No Format
modprobe: ERROR: could not insert 'drbd': Exec format error
Possible root cause

This could be due to ABI changes in the kernel from 4.4 to 4.6. Our other reboots so far did not change the kernel version (4.6 before and after).
However, 2 more of our 8 nodes are running 4.5, and their /var/lib/dkms/drbd/8.4.8-1clazzes1/4.6.0-0.bpo.1-amd64/x86_64/module/drbd.ko differs in size from the one on the nodes running 4.6, although /lib/modules/4.6.0-0.bpo.1-amd64/kernel/drivers/block/drbd/drbd.ko has the same size and MD5 sum on machines running 4.5 or 4.6.
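One way to probe the ABI hypothesis is to compare the module's "vermagic" string with the kernel release it is supposed to load into. The following is only a sketch: the helper name vermagic_matches is made up for illustration, and it assumes that the first word of the vermagic field is the kernel release the module was built for. Note that modinfo resolves the module name for the running kernel, so it may report the in-tree module rather than the DKMS-built one.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the first word of a vermagic string
# equals the given kernel release.
#   $1 = vermagic string as printed by "modinfo -F vermagic"
#   $2 = kernel release as printed by "uname -r"
vermagic_matches() {
    [ "${1%% *}" = "$2" ]
}

# Live check against the running kernel; skipped quietly if modinfo or
# the drbd module is not available on this machine.
if command -v modinfo >/dev/null 2>&1; then
    vm=$(modinfo -F vermagic drbd 2>/dev/null) || vm=""
    if [ -n "$vm" ]; then
        if vermagic_matches "$vm" "$(uname -r)"; then
            echo "drbd vermagic matches running kernel $(uname -r)"
        else
            echo "drbd vermagic '$vm' does NOT match running kernel $(uname -r)"
        fi
    fi
fi
```

To inspect a module built for a kernel that is installed but not yet booted, modinfo can be pointed at the .ko file directly instead of the module name.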

...

Solution used

We solved it with ...

No Format
dpkg-reconfigure drbd8-dkms
 
# ... or ...
apt-get remove drbd8-dkms
apt-get install drbd8-dkms
 
# ... and finally ...
/etc/init.d/drbd start
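To catch the problem before a reboot rather than after, one could list the installed kernels that have no DKMS-built drbd.ko yet. This is a minimal sketch under two assumptions: the function name kernels_missing_dkms_module is hypothetical, and DKMS on Debian is assumed to place its modules under /lib/modules/KERNEL/updates/dkms/ (which keeps them apart from the in-tree kernel/drivers/block/drbd/drbd.ko).

```shell
#!/bin/sh
# Hypothetical helper: print every kernel directory below a modules root
# that has no DKMS-built copy of the given module file.
#   $1 = modules root (normally /lib/modules)
#   $2 = module file name (e.g. drbd.ko)
kernels_missing_dkms_module() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        # Assumption: DKMS installs into <kernel>/updates/dkms/
        [ -e "${dir}updates/dkms/$2" ] || basename "$dir"
    done
}

# Report affected kernels; nothing is rebuilt here, the message only
# points back to the fix already used on this page.
kernels_missing_dkms_module /lib/modules drbd.ko | while read -r kernel; do
    echo "no DKMS-built drbd.ko for $kernel; consider: dpkg-reconfigure drbd8-dkms"
done
```

Running this after every kernel package upgrade would show whether the not-yet-active kernel got its module.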
Suspected faster solution

...

No Format
dkms autoinstall
/etc/init.d/drbd start
Proposed check command

It might be a good idea to perform this every now and then:

No Format
dkms status

...

Version checks

To check which drbd8 module version is loaded:

No Format
cat /proc/drbd |head -1

To check which version the kernel will load next time it loads the drbd8 module:

No Format
modinfo drbd |egrep -i "^version:"
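The two checks can be combined to see whether the loaded module differs from the one the kernel would load next. The sketch below assumes that the first line of /proc/drbd starts with "version: X" and that modinfo prints a "version:" field with a comparable string; the function names are made up for illustration.

```shell
#!/bin/sh
# Extract the version from /proc/drbd-style text on stdin
# (assumption: the first line looks like "version: 8.4.9-1 (api:...)").
loaded_drbd_version() {
    head -n 1 | awk '$1 == "version:" { print $2 }'
}

# Extract the version from modinfo-style text on stdin.
ondisk_drbd_version() {
    awk 'tolower($1) == "version:" { print $2; exit }'
}

# Compare two version strings and print a one-line verdict.
#   $1 = loaded version, $2 = on-disk version
compare_drbd_versions() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "MISMATCH: loaded=$1 on-disk=$2"
    fi
}

# Live comparison; skipped quietly when DRBD is not loaded or modinfo is missing.
if [ -r /proc/drbd ] && command -v modinfo >/dev/null 2>&1; then
    compare_drbd_versions \
        "$(loaded_drbd_version < /proc/drbd)" \
        "$(modinfo drbd 2>/dev/null | ondisk_drbd_version)"
fi
```

A mismatch right after a package upgrade is expected and only means that the new module takes effect at the next module reload or reboot.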