Discussion:
[Bug 213896] when starting vimage jails the kernel crashes
b***@freebsd.org
2016-10-30 00:57:27 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213896

Mark Linimon <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC|freebsd-***@FreeBSD.org |
Assignee|freebsd-***@FreeBSD.org |freebsd-***@FreeBSD.org
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2016-10-30 14:57:44 UTC

Joe Barbish <***@a1poweruser.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@a1poweruser.com

--- Comment #1 from Joe Barbish <***@a1poweruser.com> ---
I am running FreeBSD 11-RELEASE-p1 installed from scratch using cdrom.iso.
I have tested ipfw on the host and in a vimage jail without any problems. My
custom kernel only has vimage compiled in. The host is running ipfw without
using DUMMYNET, IPDIVERT or IPFIREWALL_NAT. The vimage jail is also running
ipfw without using those same functions.

The only problem with ipfw is that the vimage jail's ipfw log messages get
intermingled into the host's ipfw log file.

I also tested with
options VIMAGE
options IPFIREWALL
options IPFIREWALL_NAT # ipfw kernel nat support
options IPDIVERT # divert sockets
options LIBALIAS # required by IPFIREWALL_NAT

compiled into the kernel, and the host system booted fine. ipfw on the host
and in the vimage jail worked the same as when ipfw was NOT compiled in. I did
not test ipfw using the functions listed above on the host or in the vimage jail.

The only reason to compile ipfw into the kernel is if the host is not running
ipfw. A vimage jail does not kldload modules on first reference like the host
does, so you have to compile them into the kernel. An alternative is to
configure your vimage jail's jail.conf with an exec.prestart option to kldload
the ipfw modules used by the vimage jail.
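
As a rough sketch of the exec.prestart alternative, the helper below prints a jail.conf entry that loads the ipfw module on the host before the jail is created. The jail name, path and hostname are placeholders and not values from this report. `kldload -n` is used so that the command does not try to load a module that is already loaded.

```sh
#!/bin/sh
# Print a jail.conf entry for a vimage jail that loads ipfw on the host
# before the jail is created. All names and paths are illustrative only.
print_jail_conf() {
    name="$1"   # hypothetical jail name
    path="$2"   # hypothetical jail root directory
    cat <<EOF
$name {
    path = "$path";
    host.hostname = "$name.example.org";
    vnet;
    # exec.prestart runs on the host, where kldload is permitted
    exec.prestart = "kldload -n ipfw";
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown";
}
EOF
}
```

After reviewing the output, it could be appended to the configuration with something like `print_jail_conf vnetjail /usr/jails/vnetjail >> /etc/jail.conf`.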

I didn't get any error messages from the installkernel task during the vimage
kernel compile. My guess is ***@ofloo.net has a problem with his upgrade to
11.0, or had existing kernel compile problems before the upgrade which left his
updated system messed up.

I suggest that a fresh install of 11.0 to a blank disk will correct this problem.
--
b***@freebsd.org
2016-11-01 13:18:40 UTC

--- Comment #2 from ***@ofloo.net ---
No, I reinstalled FreeBSD 10.3 and I had the exact same error. I'll try to
install FreeBSD 11 and see what happens; however, I'd rather work on helping
to find the problem than going around it. When I read your answer about
reinstalling, I got Windows flashbacks, where you reboot if something doesn't
work and when you think your system is a little slow you just reinstall. But I
guess that's just me.

So nothing previous is going on.
--
b***@freebsd.org
2016-11-01 22:45:04 UTC

--- Comment #3 from ***@ofloo.net ---
The issue also happens when I compile vimage under FreeBSD 11; however, this
time there were no compile errors.

If you like, I have a video of the boot process, if it's useful. Let me know.
--
b***@freebsd.org
2016-11-01 23:26:40 UTC

--- Comment #4 from ***@ofloo.net ---
% kldstat
Id Refs Address            Size     Name
 1   47 0xffffffff80200000 200eb88  kernel
 2    1 0xffffffff82210000 30aec0   zfs.ko
 3    2 0xffffffff8251b000 adc0     opensolaris.ko
 4    2 0xffffffff82526000 9d50     bridgestp.ko
 5    1 0xffffffff82530000 127b0    if_bridge.ko
 6    1 0xffffffff82543000 15af8    if_lagg.ko
 7    1 0xffffffff82559000 1620     accf_data.ko
 8    1 0xffffffff8255b000 2710     accf_http.ko
 9    1 0xffffffff8255e000 4c60     coretemp.ko
10    1 0xffffffff82563000 b3e8     aesni.ko
11    3 0xffffffff8256f000 2e20     smbus.ko
12    1 0xffffffff82572000 6688     ichsmb.ko
13    1 0xffffffff82579000 115b8    ipmi.ko
14    1 0xffffffff82621000 10582    geom_eli.ko
15    1 0xffffffff82632000 587b     fdescfs.ko
16    1 0xffffffff82638000 3710     ums.ko
17    1 0xffffffff8263c000 4485     if_epair.ko


Also, it appears that only one jail in particular has issues; I'm not entirely
sure why, though.

I don't see much difference between the jails. The only difference is that the
one that is crashing has 2 VLANs running rather than one; I'm not sure how
this can be an issue, though.
--
b***@freebsd.org
2016-11-01 23:37:47 UTC

--- Comment #5 from ***@ofloo.net ---
I disabled all the daemons except sshd and it still crashed.
--
b***@freebsd.org
2017-07-16 14:33:18 UTC

Heinz N. Gies <***@project-fifo.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@project-fifo.net

--- Comment #6 from Heinz N. Gies <***@project-fifo.net> ---
I'm having the same problem and can reliably reproduce it.

I noticed it when testing vmadm (https://github.com/project-fifo/r-vmadm) and
starting and stopping a jail a few times.

The basic steps for start are:

rctl -a jail:d0f4fea3-e368-4346-b44c-50cfbcffa287:memoryuse:deny=1024M \
    jail:d0f4fea3-e368-4346-b44c-50cfbcffa287:memorylocked:deny=1024M \
    jail:d0f4fea3-e368-4346-b44c-50cfbcffa287:shmsize:deny=1024M \
    jail:d0f4fea3-e368-4346-b44c-50cfbcffa287:pcpu:deny=100 \
    jail:d0f4fea3-e368-4346-b44c-50cfbcffa287:maxproc:deny=2000

mount -t devfs devfs \
    /zroot/jails/d0f4fea3-e368-4346-b44c-50cfbcffa287/root/dev

mount -t devfs devfs \
    /zroot/jails/d0f4fea3-e368-4346-b44c-50cfbcffa287/root/jail/dev

jail -i -c persist name=d0f4fea3-e368-4346-b44c-50cfbcffa287 \
    path=/zroot/jails/d0f4fea3-e368-4346-b44c-50cfbcffa287/root \
    host.hostuuid=d0f4fea3-e368-4346-b44c-50cfbcffa287 host.hostname=test \
    devfs_ruleset=4 securelevel=2 sysvmsg=new sysvsem=new sysvshm=new \
    allow.raw_sockets children.max=1 vnet=new vnet.interface=epair0b \
    exec.start="/sbin/ifconfig epair0b name net0p; \
/sbin/ifconfig net0p.5 create vlan 5 vlandev net0p; \
/sbin/ifconfig net0p.5 name net0; \
/sbin/ifconfig net0 inet 192.168.1.234 255.255.255.0; \
/sbin/route add default -gateway 192.168.1.1; \
/sbin/ifconfig lo0 127.0.0.1 up; \
jail -c persist name=d0f4fea3-e368-4346-b44c-50cfbcffa287 \
host.hostname=test path=/jail ip4=inherit devfs_ruleset=4 securelevel=2 \
sysvmsg=new sysvsem=new sysvshm=new allow.raw_sockets \
exec.start='sh /etc/rc'"

ifconfig epair0a name j1:net0



and destroying the jail the same way in reverse (stop, unmount, remove rctl
entries, destroy j1:net0).
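
The reverse order might look roughly like the sketch below. It only prints the commands instead of running them. The `jail -r`, `umount`, `rctl -r` and `ifconfig ... destroy` invocations are an assumed reading of "stop, unmount, remove rctl entries, destroy" and are not taken from what vmadm actually executes; the function name `teardown_cmds` is made up for illustration.

```sh
#!/bin/sh
# Print the teardown commands for one jail, in reverse order of the start steps.
# Assumptions: the jail root is /zroot/jails/<uuid>/root as in the start steps,
# and the host side of the epair was renamed (here passed in as $2).
teardown_cmds() {
    uuid="$1"      # jail name / UUID
    hostif="$2"    # host-side epair interface, e.g. j1:net0
    root="/zroot/jails/$uuid/root"
    echo "jail -r $uuid"              # stop and remove the jail
    echo "umount $root/jail/dev"      # unmount the inner devfs first
    echo "umount $root/dev"           # then the outer devfs
    echo "rctl -r jail:$uuid"         # remove the rctl rules for this jail
    echo "ifconfig $hostif destroy"   # remove the host-side epair interface
}
```

Once the printed commands have been checked, they could be executed with something like `teardown_cmds d0f4fea3-e368-4346-b44c-50cfbcffa287 j1:net0 | sh -x`.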


kernel is FreeBSD fifo-bsd 11.0-RELEASE-p1 with


which is the standard kernel config plus

nooptions SCTP # Stream Control Transmission Protocol
options VIMAGE # VNET/Vimage support
options RACCT # Resource containers
options RCTL # same as above


I've uploaded the kernel dump here
https://www.dropbox.com/s/73mb8e64cb7zwbe/crash.tar.xz?dl=0 (it's too big to
attach)
--
b***@freebsd.org
2017-07-16 14:38:32 UTC

--- Comment #7 from Heinz N. Gies <***@project-fifo.net> ---
Created attachment 184394
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=184394&action=edit
core.txt from the kernel panic

I'll attach the core.txt from one of those crashes directly.
--
b***@freebsd.org
2017-07-16 15:03:41 UTC

--- Comment #8 from Heinz N. Gies <***@project-fifo.net> ---
Adding more context, the bug seems to be the same as discussed here:
http://mpc.lists.freebsd.current.narkive.com/Wotl1Q0o/panic-possibly-on-on-bridge-member-removal
--
b***@freebsd.org
2017-07-16 16:06:28 UTC

--- Comment #9 from Heinz N. Gies <***@project-fifo.net> ---
Created attachment 184403
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=184403&action=edit
Test case to introduce this bug.

Adding a test case; it triggers the crash nearly 100% reliably for me when run
as one of the first commands on the system.
--
b***@freebsd.org
2017-07-24 17:05:04 UTC

--- Comment #10 from gronke <***@gronke.net> ---
I used a similar script to reproduce the bug and noticed it only occurs when
the host's epair NIC was brought up before the jail is destroyed.

This snippet manually attaches the NIC to the jail after it was started and
takes "yes" as its first argument to change the host's NIC state.

$ ./crash-demo.sh no
...
done
$ ./crash-demo.sh yes
crash
--

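#!/bin/sh
# Pass "yes" as the first argument to bring the host-side epair up and down.

UPDOWNIF="$1"
BRIDGE_IF=bridge1

ifconfig "$BRIDGE_IF" create
set -x

for i in $(seq 0 200); do

    # Create an empty persistent vnet jail.
    #jail -c vnet persist path=$RELEASE_FOLDER name=jail-vnet
    jail -c vnet persist name=jail-vnet

    # "ifconfig epair create" prints the a-side name; derive the b-side
    # name by replacing the last character with "b".
    epair_a="$(ifconfig epair create)"
    epair_b="${epair_a%?}b"

    # Random MAC address in aa:bb:cc:dd:ee:ff form.
    mac_a=$(openssl rand -hex 6 | sed 's/\(..\)/\1:/g; s/.$//')

    ifconfig "$epair_a" name "a-$i"
    ifconfig "a-$i" ether "$mac_a"

    # Optionally bring the host-side interface up before the jail is destroyed.
    if [ "$UPDOWNIF" = "yes" ]; then
        ifconfig "a-$i" up
    fi

    # Host side joins the bridge; the b side moves into the jail.
    ifconfig "$BRIDGE_IF" addm "a-$i"
    ifconfig "$epair_b" vnet jail-vnet

    jexec jail-vnet /sbin/ifconfig "$epair_b" name vnet0
    jexec jail-vnet /sbin/ifconfig vnet0 up
    jexec jail-vnet /sbin/ifconfig

    jail -r jail-vnet

    if [ "$UPDOWNIF" = "yes" ]; then
        ifconfig "a-$i" down
    fi
    ifconfig "$BRIDGE_IF" deletem "a-$i"
    ifconfig "a-$i" destroy

done

echo "done"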
--
b***@freebsd.org
2017-07-24 21:23:45 UTC

--- Comment #11 from Heinz N. Gies <***@project-fifo.net> ---
Given how easy this is to reproduce, and that we have 3 people in here now, is
there any chance of changing the importance from 'affects only me' to 'affects
some people'?
--
b***@freebsd.org
2017-07-24 21:30:56 UTC

Mark Linimon <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Severity|Affects Only Me |Affects Some People

--- Comment #12 from Mark Linimon <***@FreeBSD.org> ---
Flip the "affects some people" switch.

IIUC FreeBSD really doesn't pay much attention to that field; it's a default
field in Bugzilla. E.g., there's no formal triage procedure.
--
b***@freebsd.org
2017-07-25 06:23:38 UTC

--- Comment #13 from Ofloo <***@ofloo.net> ---
I'm not sure if this is the cause, but I can't upgrade 2 other machines with 10
jails because of it.

So to me it is important as well.
--
b***@freebsd.org
2017-07-25 06:26:39 UTC

--- Comment #14 from Ofloo <***@ofloo.net> ---
I should have paid more attention to what was said; never mind. Disregard that
last comment.
--
b***@freebsd.org
2017-07-30 15:12:20 UTC

Mark Linimon <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Keywords| |patch
--
b***@freebsd.org
2017-07-30 16:11:02 UTC

Bjoern A. Zeeb <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org
Assignee|freebsd-***@FreeBSD.org |***@freebsd.org
--
b***@freebsd.org
2017-07-30 12:03:24 UTC

Kristof Provost <***@freebsd.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@freebsd.org

--- Comment #16 from Kristof Provost <***@freebsd.org> ---
I've posted a proposed patch in https://reviews.freebsd.org/D11782

The panic in the last comment happens because ifp->if_bpf is NULL, which
happens due to a race in bpf_if cleanup (as described in the patch).
With this patch the script in Comment #10 no longer panics.
--