
[Openstack] 99.5% of packets are disappearing somewhere between the Linux Bridge (brqxxxxzzzz-yy) and the tap (tapxxxxzzzz-yy).

0 votes

Guys,

What can cause packets to just disappear after arriving at the bridge
"brqxxxxzzzz-yy"?

I'm using "VLAN Provider Networks", Juno on top of Trusty.

With Neutron ML2 + LinuxBridges, setup "all-in-one".

Where:

  • eth0 is the default - api, etc;
  • eth1 is the "external" - floating ip, etc;
  • eth2 is the physical vlan mapped into ML2;
  • eth3 is another physical vlan mapped into ML2;
  • dummy0 is being used by ML2 for VXLAN.

Explaining:

  • I can see the tagged packets arriving at "eth3", by using "tcpdump -eni
    eth3 | grep 'vlan 666'";

  • I can see the untagged packets arriving at "brq50b13311-fa", by using
    "tcpdump -eni brq50b13311-fa";

  • I CAN NOT see the untagged packets arriving at "tap9a546be0-d6", by
    using "tcpdump -eni tap9a546be0-d6"!

    "tcpdump -eni tap9a546be0-d6" only shows "alien" packets for this "tap",
    like this:

    http://paste.openstack.org/show/356838/ - While what is arriving at
    "brq50b13311-fa" looks completely different!

    For example, I can not see the string "Cisco" while running "tcpdump -eni
    brq50b13311-fa | grep -i cisco", so where do those packets (the ones I'm
    seeing on tap9a546be0-d6 and within its instance - pastebin above) come from?

Instance details:


...







...

"brctl show" returns:


bridge name       bridge id           STP enabled   interfaces
....
brq50b13311-fa    8000.ecf4bbd0417b   no            eth3.666
                                                    tap9a546be0-d6

....


"neutron net-show XXX" returns:

http://paste.openstack.org/show/356845/

-

ML2 configuration contains:

http://paste.openstack.org/show/356860/

-

Can someone please tell me why ~99.5% of the packets are disappearing
into thin air?

What is driving me crazy is that, on top of this very same setup
(including e1000 driver), but with different vlan tag, it works!

I already disabled "rp_filter", ebtables, arptables, iptables, also, all
files under "/proc/sys/net/bridge" have "0"...
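A quick way to double-check the toggles mentioned above is to print every file in that directory next to its value. This is only a minimal sketch for a POSIX shell: the function name dump_toggles is made up for illustration, and the optional directory argument exists purely so the helper can be pointed at a scratch directory.

```shell
# dump_toggles [DIR]: print "name=value" for every readable regular file
# in DIR. Hypothetical helper; DIR defaults to the bridge netfilter
# directory mentioned in the post.
dump_toggles() {
    dir="${1:-/proc/sys/net/bridge}"
    for f in "$dir"/*; do
        # Skip anything that is not a readable regular file
        # (this also covers an empty directory, where the glob stays literal).
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Calling "dump_toggles" with no argument lists the bridge toggles, and "sysctl net.ipv4.conf.all.rp_filter" shows the reverse-path filter setting alongside them.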

I really appreciate any help! I've been working on this for about 16 hours
straight...

Thanks,
Thiago


Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
asked Jul 9, 2015 in openstack by Martinx_-_ジェームズ (9,600 points)   6 23 34

22 Responses

0 votes

Since you are using a VXLAN tunnel, have you increased the MTU on all
compute nodes and the network node to accommodate the VXLAN tunnel packets?

http://docs.openstack.org/juno/config-reference/content/networking-options-plugins-ml2.html
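For reference, the arithmetic behind the MTU question can be sketched as follows. It assumes an IPv4 underlay without IP options and no VLAN tag on the encapsulated frame; the function name vxlan_inner_mtu is hypothetical. It only concerns VXLAN tenant networks, not VLAN provider networks.

```shell
# vxlan_inner_mtu UNDERLAY_MTU: largest inner IP MTU that fits into a
# single VXLAN-over-IPv4 packet on an underlay with the given IP MTU.
vxlan_inner_mtu() {
    outer_ip=20    # outer IPv4 header, no options
    outer_udp=8    # outer UDP header
    vxlan_hdr=8    # VXLAN header
    inner_eth=14   # encapsulated Ethernet header, no inner VLAN tag
    echo $(( $1 - outer_ip - outer_udp - vxlan_hdr - inner_eth ))
}
```

With a standard 1500-byte underlay this gives 1450 for the instances, so either the underlay MTU goes up by 50 bytes or the instance MTU comes down by 50.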

responded Jul 9, 2015 by Jerry_Xinyu_Zhao (560 points)   2
0 votes

On 8 July 2015 at 22:50, Jerry Zhao xyzjerry@gmail.com wrote:

Since you are using vxlan tunnel, have you increased the MTU on all
compute nodes and network node to accommodate the vxlan tunnel packets?

http://docs.openstack.org/juno/config-reference/content/networking-options-plugins-ml2.html

Yes, all my VXLAN networks have MTU=1450.

I'm not seeing any problems with my VXLAN networks.

My problem lies exclusively within the "VLAN Provider Networks" / related
Neutron LinuxBridges.

I'm attaching my Instances directly to a physical network of the Compute
Node itself (1 tagged VLAN for each Instance). But, it doesn't work as
expected.

Thanks!


responded Jul 9, 2015 by Martinx_-_ジェームズ (9,600 points)   6 23 34
0 votes

I wonder if l2 population is interfering. Can you try disabling l2
population on the agent side so forwarding entries aren't being set up and
see if the issue persists?
responded Jul 9, 2015 by Kevin_Benton (24,800 points)   3 5 5
0 votes

Hi,

We had something similar once, but the cause was a misconfiguration in our LACP
bonding... From your config you don't seem to be using bonding, but maybe
something external to Neutron is causing the problem.

If you manually launch a VM connected to the bridge brq50b13311-fa, can you
see the packets?


Cynthia Lopes do Sacramento


responded Jul 9, 2015 by Cynthia_Lopes (800 points)   1
0 votes

Hi Thiago,

  • I can see the untagged packets arriving at "brq50b13311-fa", by using "tcpdump -eni brq50b13311-fa";

Do you mind posting the packet capture from eth3 and the bridge on pastebin?

For example, I can not see the string "Cisco" while running "tcpdump -eni brq50b13311-fa | grep -i cisco", so, where those packets come from (that I'm seeing on tap9a546be0-d6 and within its instance - pastebin above) ???

Those are multicast packets for PVST and OSPF from the switch and router, respectively. You might try filtering by MAC on the bridge instead of using grep to isolate those packets:

tcpdump -eni brq50b13311-fa ether dst 01:00:0c:cc:cc:cd

I would expect to see those packets on eth3 as well.
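The destination address in that filter is a group (multicast) address, which is why the bridge floods it to every port. Whether a MAC is a group address depends on the lowest bit of its first octet. A minimal sketch of that check for a POSIX shell, with a made-up function name:

```shell
# is_group_mac MAC: succeed if MAC (colon-separated hex) is a multicast
# or broadcast address, i.e. the least significant bit of the first
# octet (the I/G bit) is set. Hypothetical helper for illustration.
is_group_mac() {
    first_octet=${1%%:*}                  # text before the first colon
    [ $(( 0x$first_octet & 1 )) -eq 1 ]   # test the I/G bit
}
```

tcpdump can apply the same test directly: "tcpdump -eni brq50b13311-fa ether multicast" shows only group-addressed frames, while "tcpdump -eni tap9a546be0-d6 not ether multicast" shows only unicast frames.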

  • I CAN NOT see the untagged packets arriving at "tap9a546be0-d6", by using "tcpdump -eni tap9a546be0-d6"!

What do your security group rules look like?

What is driving me crazy is that, on top of this very same setup (including e1000 driver), but with different vlan tag, it works!

Is it the same eth3 interface? You may want to avoid vlan 666, anyway. Never known those numbers to be lucky.

James


responded Jul 9, 2015 by James_Denton (3,860 points)   2 2
0 votes

Hello James!

On 9 July 2015 at 11:17, James Denton james.denton@rackspace.com wrote:

Hi Thiago,

  • I can see the untagged packets arriving at "brq50b13311-fa", by using
    "tcpdump -eni brq50b13311-fa";

Do you mind posting the packet capture from eth3 and the bridge on
pastebin?

I don't mind, I'll just replace the public IPs before posting (and possibly
MAC)...

  • Actual traffic hitting physical "eth3" with VLAN tag (OK):

http://paste.openstack.org/show/360214/

  • Actual traffic hitting "brq50b13311-fa" without tag (OK):

http://paste.openstack.org/show/360249/

  • Actual traffic hitting "tap9a546be0-d6" without tag (BUGGED - missing
    packets):

http://paste.openstack.org/show/360274/

  • Actual traffic hitting vNIC "eth3" without tag (BUGGED - missing packets):

http://paste.openstack.org/show/360275/

*** Only PVST, OSPF and ICMP are appearing inside the Instance (and its
tap, of course) ***

For example, I can not see the string "Cisco" while running "tcpdump -eni
brq50b13311-fa | grep -i cisco", so, where those packets come from (that
I'm seeing on tap9a546be0-d6 and within its instance - pastebin above) ???

Those are multicast packets for PVST and OSPF from the switch and router,
respectively. You might try filtering by MAC on the bridge instead of using
grep to isolate those packets:

tcpdump -eni brq50b13311-fa ether dst 01:00:0c:cc:cc:cd

I would expect to see those packets on eth3 as well.

You're absolutely right!

The PVST, OSPF (and very rare ICMP) packets are appearing at eth3 too (tagged
with "vlan XXXX"), my bad (filtering with "ether dst" is much better than that
grep, thanks).

Look, inside the Instance - vNIC eth3:

tcpdump -eni eth3

http://paste.openstack.org/show/360127/

Only the PVST, OSPF and ICMP packets are hitting the tapxxxxzzzz-yy
interface! As expected, I can see those packets inside of the Instance as
well (Pastebin above).

Why isn't TCP/UDP passing?
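To put a number on how much goes missing between the bridge and the tap, the frame counts of two captures taken over the same time window can be compared. A small sketch, assuming awk is available; loss_percent is a hypothetical helper, and the counts are whatever the two captures yield.

```shell
# loss_percent UPSTREAM DOWNSTREAM: percentage of frames counted upstream
# (e.g. on the bridge) that did not show up downstream (e.g. on the tap).
# Prints "n/a" when the upstream count is zero or negative.
loss_percent() {
    awk -v up="$1" -v down="$2" 'BEGIN {
        if (up <= 0) { print "n/a"; exit }
        printf "%.1f\n", (up - down) * 100 / up
    }'
}
```

The counts could come from saved captures, for example "tcpdump -n -r bridge.pcap | wc -l" and "tcpdump -n -r tap.pcap | wc -l" (file names are placeholders).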

  • I CAN NOT see the untagged packets arriving at "tap9a546be0-d6", by
    using "tcpdump -eni tap9a546be0-d6"!

What do your security group rules look like?

I have no Security Groups, no Firewall, no ipset...

ML2 configuration contains:

http://paste.openstack.org/show/356860/

What is driving me crazy is that, on top of this very same setup
(including e1000 driver), but with different vlan tag, it works!

Is it the same eth3 interface? You may want to avoid vlan 666, anyway.
Never known those numbers to be lucky.

Yes, very same eth3.

LOL... I just posted this number here, to not publish private data, actual
VLAN ID is different. :-P

Why it works for "VLAN X", but not for "VLAN Y", is a mystery to me.

Thank you so much for your help!

I'm seeing some debugging progress here...

Hoping to get this fixed! It is very important for the project that I'm
working on.

James

Thiago


responded Jul 10, 2015 by Martinx_-_ジェームズ (9,600 points)   6 23 34
0 votes

Kevin, do you think the L2 Population can interfere with non-VXLAN networks
as well?

I'll try to disable it and enable vxlan_group instead, what do you think?

I'm not seeing "l2 pop" having any effect on my "VLAN Provider Networks", but I
could be wrong, I'm not sure...

Thanks for the tip!

responded Jul 10, 2015 by Martinx_-_ジェームズ (9,600 points)   6 23 34
0 votes

Hi Cynthia,

That's actually a very good idea! I was thinking about this as well, but I
haven't tried it yet.

Basically I need to stop "nova-compute" and edit the Instance configuration
with "virsh edit instance-id"... Sounds very easy to try...

Attaching an "alien" instance directly to "brq50b13311-fa" sounds even
easier.

I'll definitely give it a try!

Thanks!

responded Jul 10, 2015 by Martinx_-_ジェームズ (9,600 points)   6 23 34
0 votes

Thanks, Thiago.

Do you mind running a capture across the 3 interfaces (eth, bridge, tap) simultaneously? In particular, traffic generated outside of the node that demonstrates connection attempts to your instance. It will be helpful to see if there are continuous ARP requests without replies, or a reply followed by continuous TCP SYN packets and whatnot. On the tap interface you should only expect to see broadcast, multicast, and unicast traffic to the MAC address of the instance. Because the MAC addresses are masked in those captures, and the captures aren't related to each other, it's hard to tell what you're seeing. Do you mind not masking them this time around?

Also, what is the IP address of the instance? Seeing that this is an all-in-one, I'm guessing you aren't having issues with DHCP?
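A simultaneous capture like the one requested above could be scripted roughly as follows. This is a sketch only: it assumes root privileges, an existing output directory, and the coreutils "timeout" command; the function names capture_file and start_captures are invented for illustration.

```shell
# capture_file OUTDIR IFACE: path of the capture file for one interface.
capture_file() {
    printf '%s/%s.pcap\n' "$1" "$2"
}

# start_captures SECONDS OUTDIR IFACE...: run one tcpdump per interface in
# the background so all files cover the same time window. Each capture is
# ended by "timeout" after SECONDS; "wait" returns once all have exited.
start_captures() {
    secs=$1
    outdir=$2
    shift 2
    for iface in "$@"; do
        timeout "$secs" tcpdump -n -i "$iface" \
            -w "$(capture_file "$outdir" "$iface")" &
    done
    wait
}
```

For this thread the call would be something like "start_captures 60 /tmp/cap eth3 brq50b13311-fa tap9a546be0-d6", after which each file can be read back with "tcpdump -e -n -r FILE" to compare MAC addresses across the three points.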

Thanks,

James


responded Jul 10, 2015 by James_Denton (3,860 points)   2 2
0 votes

Thank you James!

Listen, to summarize my topology for you: my Instance has two
interfaces.

1- eth0 - its default gateway - vxlan with dhcp (okay);
2- eth1 - vlan provider network (the one that doesn't work) (eth3 if the
instance has 4 vNICs).

I can access the Instance via ssh through its namespace router normally (or
VNC Console).

For "eth1", the physical switch port sends mirrored, tagged VLAN traffic
to it. My instance just (wants to) consume that traffic (untagged)... My
instance is a kind of NFV (in "offline" mode)...

Do you still think it makes sense to capture the traffic simultaneously?

You're right, "eth1" has no IP, it is just UP, reading packets...

Thanks!
Thiago

responded Jul 10, 2015 by Martinx_-_ジェームズ (9,600 points)   6 23 34
...