Comment: Case 6

The Problem

...

Kernel problem case 2
Aug  6 05:00:14 host18 kernel: [730487.320583] RIP: 0010:[<ffffffff813201e6>]  [<ffffffff813201e6>] memcpy_erms+0x6/0x10
Aug  6 05:00:14 host18 kernel: [730487.339579] FS:  0000000000000000(0000) GS:ffff88103fb00000(0000) knlGS:0000000000000000
Aug  6 05:00:14 host18 kernel: [730487.353038]  000000000000faf0 ffff88203650fc48 0000000000000000 ffffffff81517a12
Aug  6 05:00:14 host18 kernel: [730487.369167]  [<ffffffffc063e119>] ? drbd_send+0xc9/0x1e0 [drbd]
Aug  6 05:00:14 host18 kernel: [730487.387607]  [<ffffffffc063c2f0>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
Aug  6 05:00:14 host18 kernel: [730487.403314]  RSP <ffff88203650fb40>

...

Kernel problem case 3
Aug 27 05:25:43 host19 kernel: [2547757.533648] RSP: 0018:ffff8801c93b3b40  EFLAGS: 00010292
Aug 27 05:25:43 host19 kernel: [2547757.579013]  00004000000005b4 00000000000005b4 00000000000008a0 0000000000000800
Aug 27 05:25:43 host19 kernel: [2547757.623329]  [<ffffffffc060a2f0>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]

Case 4

Debian kernel 4.6, RSP in or around DRBD module:

Kernel problem case 4
Aug 27 04:26:29 host14 kernel: [478357.442244] Modules linked in: pci_stub(E) vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) tcp_diag(E) inet_diag(E) ipt_REJECT(E) nf_reject_ipv4(E) nf_log_ipv6(E) ip6t_rt(E) veth(E) drbd(E) ipmi_devintf(E) xt_multiport(E) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) xt_limit(E) xt_tcpudp(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) ip6table_filter(E) xt_conntrack(E) xt_state(E) iptable_filter(E) ip_tables(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) nf_conntrack(E) ip6_tables(E) x_tables(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) 8021q(E) garp(E) mrp(E) bridge(E) stp(E) llc(E) lru_cache(E) libcrc32c(E) crc32c_generic(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E)<4>[478357.455870] Hardware name: Thomas-Krenn.AG X9DR3-F/X9DR3-F, BIOS 3.0a 07/31/2013
Aug 27 04:26:29 host14 kernel: [478357.460567] RSP: 0018:ffff8808534f7b40  EFLAGS: 00010292
Aug 27 04:26:29 host14 kernel: [478357.465530] RBP: ffff8808534f7c58 R08: ffff880858d28af0 R09: 0000000000000000
Aug 27 04:26:29 host14 kernel: [478357.470727] FS:  0000000000000000(0000) GS:ffff88085fa00000(0000) knlGS:0000000000000000
Aug 27 04:26:29 host14 kernel: [478357.476167] Stack:
Aug 27 04:26:29 host14 kernel: [478357.483832] Call Trace:
Aug 27 04:26:29 host14 kernel: [478357.489826]  [<ffffffff814acf80>] ? sock_sendmsg+0x30/0x40
Aug 27 04:26:29 host14 kernel: [478357.498117]  [<ffffffffc07b9efd>] ? w_send_dblock+0x9d/0x1c0 [drbd]
Aug 27 04:26:29 host14 kernel: [478357.506710]  [<ffffffffc07d02f0>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
Aug 27 04:26:29 host14 kernel: [478357.515569] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
Aug 27 04:26:29 host14 kernel: [478357.538514] ---[ end trace 0d23089f3d6f0d23 ]---
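
A note on reading the Code: line in the trace above: the kernel wraps the byte at the faulting instruction pointer in angle brackets. The sketch below is only an illustration (the helper name faulting_bytes is made up for this page); it locates that marker and returns the bytes at the fault. On x86-64, f3 a4 encodes rep movsb, and the six bytes before it (48 89 f8, 48 89 d1) are the two register moves that open the function, which is consistent with the memcpy_erms+0x6 offset reported in the RIP lines of the other cases.

```python
# Code: line copied verbatim from the case 4 trace.
CODE_LINE = (
    "Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 "
    "83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 "
    "<f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38"
)


def faulting_bytes(code_line, length=2):
    """Return (index of the marked byte, `length` bytes starting at it)."""
    tokens = code_line.split("Code:", 1)[1].split()
    # The byte at the faulting instruction pointer is printed as <xx>.
    idx = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    cleaned = [tok.strip("<>") for tok in tokens]
    return idx, cleaned[idx:idx + length]


idx, insn = faulting_bytes(CODE_LINE)
print(idx, insn)
```

This only finds the marked bytes; turning them into mnemonics still needs a disassembler.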

...

Debian kernel 4.6 with DRBD module 4.8.4-1. DRBD again, or RAM ("unable to handle kernel paging request")?

Kernel problem case 5
Sep  9 04:26:34 host14 kernel: [1056355.854359] BUG: unable to handle kernel paging request at 0000000000001000
Sep  9 04:26:34 host14 kernel: [1056355.854480] PGD 0
Sep  9 04:26:34 host14 kernel: [1056355.854536] Modules linked in: drbd(OE) ipt_REJECT(E) nf_reject_ipv4(E) tcp_diag(E) inet_diag(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) ip6t_rt(E) veth(E) ipmi_devintf(E) xt_multiport(E) pci_stub(E) vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) xt_limit(E) vboxdrv(OE) xt_tcpudp(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) xt_state(E) ip6table_filter(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) nf_conntrack(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) 8021q(E) garp(E) mrp(E) bridge(E) stp(E) llc(E) lru_cache(E) libcrc32c(E) crc32c_generic(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E)<4>[1056355.855837] CPU: 0 PID: 21195 Comm: drbd_w_bs Tainted: G           OE   4.6.0-0.bpo.1-amd64 #1 Debian 4.6.4-1~bpo8+1
Sep  9 04:26:34 host14 kernel: [1056355.864470] RIP: 0010:[<ffffffff81320246>]  [<ffffffff81320246>] memcpy_erms+0x6/0x10
Sep  9 04:26:34 host14 kernel: [1056355.873051] RDX: 00000000000004d8 RSI: 0000000000001000 RDI: ffff88084f40b9e8
Sep  9 04:26:34 host14 kernel: [1056355.881789] R13: 00000000000004d8 R14: ffff88084f40bec0 R15: ffff880859cdfc60
Sep  9 04:26:34 host14 kernel: [1056355.890683] CR2: 0000000000001000 CR3: 0000000001a06000 CR4: 00000000001426f0
Sep  9 04:26:34 host14 kernel: [1056355.899824]  000000000000faf0 ffff880859cdfc40 0000000000000000 ffffffff81517ae2
Sep  9 04:26:34 host14 kernel: [1056355.909042]  [<ffffffff81324dcf>] ? copy_from_iter+0x22f/0x250
Sep  9 04:26:34 host14 kernel: [1056355.918209]  [<ffffffffc0aa3e49>] ? drbd_send+0xc9/0x1e0 [drbd]
Sep  9 04:26:34 host14 kernel: [1056355.927262]  [<ffffffffc0aa4027>] ? __send_command.isra.42+0xc7/0x1b0 [drbd]
Sep  9 04:26:34 host14 kernel: [1056355.936213]  [<ffffffffc0aa1f50>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
Sep  9 04:26:34 host14 kernel: [1056355.944993]  [<ffffffff81099ecf>] ? kthread+0xdf/0x100
Sep  9 04:26:34 host14 kernel: [1056355.953598] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
Sep  9 04:26:34 host14 kernel: [1056355.965431] CR2: 0000000000001000
[...]
Sep  9 04:27:52 host14 kernel: [1056433.917513] block drbd60: Remote failed to finish a request within 84108ms > ko-count (7) * timeout (60 * 0.1s)
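
The last message above spells out its own arithmetic: the threshold is ko-count multiplied by timeout, with timeout given in tenths of a second. A minimal sketch that pulls the numbers out of the message and does the multiplication (the regular expression and the function name ko_threshold_ms are illustrative, not part of DRBD):

```python
import re

# Message text copied from the case 5 log.
LINE = ("block drbd60: Remote failed to finish a request within 84108ms "
        "> ko-count (7) * timeout (60 * 0.1s)")

PATTERN = re.compile(
    r"within (\d+)ms > ko-count \((\d+)\) \* timeout \((\d+) \* 0\.1s\)"
)


def ko_threshold_ms(line):
    """Return (elapsed_ms, threshold_ms), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    elapsed_ms, ko_count, timeout_tenths = (int(g) for g in match.groups())
    # timeout is in units of 0.1 s, i.e. 100 ms per unit.
    return elapsed_ms, ko_count * timeout_tenths * 100


print(ko_threshold_ms(LINE))  # (84108, 42000)
```

So the peer was given 7 * 60 * 0.1 s = 42 s, and the request had been outstanding for about twice that long when the message was logged, roughly 78 s after the oops.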

...

Case 6

Debian kernel 4.6 with DRBD module 4.8.4-1. DRBD again:

Sep 28 04:27:25 host14 kernel: [1628982.714636] task: ffff881058992fc0 ti: ffff881057e6c000 task.ti: ffff881057e6c000
Sep 28 04:27:25 host14 kernel: [1628982.721567] RAX: ffff880667953600 RBX: 0000000000000590 RCX: 00000000000000c0
Sep 28 04:27:25 host14 kernel: [1628982.730672] R13: 00000000000000c0 R14: ffff8806679536c0 R15: ffff881057e6fc60
Sep 28 04:27:25 host14 kernel: [1628982.742133]  ffffffff81324dcf ffff88042083a080 ffff880775c7bc00 0000000000000590
Sep 28 04:27:25 host14 kernel: [1628982.753478]  [<ffffffff81517ae2>] ? tcp_sendmsg+0x5f2/0xb00
Sep 28 04:27:25 host14 kernel: [1628982.764619]  [<ffffffffc05b5027>] ? __send_command.isra.42+0xc7/0x1b0 [drbd]
Sep 28 04:27:25 host14 kernel: [1628982.775427]  [<ffffffffc05b2f50>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
Sep 28 04:27:25 host14 kernel: [1628982.788215] RIP  [<ffffffff81320246>] memcpy_erms+0x6/0x10
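
To see what cases 5 and 6 have in common, the symbol+offset frames can be compared mechanically. The sketch below is one possible way to do that, not an established tool: the regular expression and the names frames, CASE5 and CASE6 are assumptions made for this page, and the inputs are only the excerpts quoted above with the syslog prefix stripped.

```python
import re

# Excerpts from the case 5 and case 6 logs, syslog prefix removed.
CASE5 = """
RIP: 0010:[<ffffffff81320246>]  [<ffffffff81320246>] memcpy_erms+0x6/0x10
 [<ffffffff81324dcf>] ? copy_from_iter+0x22f/0x250
 [<ffffffffc0aa3e49>] ? drbd_send+0xc9/0x1e0 [drbd]
 [<ffffffffc0aa4027>] ? __send_command.isra.42+0xc7/0x1b0 [drbd]
 [<ffffffffc0aa1f50>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
 [<ffffffff81099ecf>] ? kthread+0xdf/0x100
"""
CASE6 = """
 [<ffffffff81517ae2>] ? tcp_sendmsg+0x5f2/0xb00
 [<ffffffffc05b5027>] ? __send_command.isra.42+0xc7/0x1b0 [drbd]
 [<ffffffffc05b2f50>] ? drbd_destroy_connection+0xf0/0xf0 [drbd]
RIP  [<ffffffff81320246>] memcpy_erms+0x6/0x10
"""

# [<address>] optional "?" symbol+0xoffset/0xsize optional [module]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\][ \t]+(?:\?[ \t]+)?([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
    r"(?:[ \t]+\[(\w+)\])?"
)


def frames(log_text):
    """Return the set of (symbol, offset, module) tuples found in log_text."""
    return {
        (m.group(1), int(m.group(2), 16), m.group(4))
        for m in FRAME.finditer(log_text)
    }


common = frames(CASE5) & frames(CASE6)
for symbol, offset, module in sorted(common, key=lambda f: f[0]):
    print(f"{symbol}+{offset:#x} [{module or 'kernel'}]")
```

Addresses are deliberately ignored because the module is loaded at a different base in each case. Frames printed with a "?" are ones the kernel found on the stack but could not verify, so a match on them is a hint rather than proof.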

This goes to the DRBD and LXC mailing lists once I'm awake again.

Dismissed solution ideas (after case 4): DRBD9? Commercial support?

...