
[Openstack] VM can receive traffic, but not send it

0 votes

Our info:

Openstack version: Mitaka (using OVS 2.5)
Firewall driver: Openvswitch

Does anyone know why VMs that are directly on a Flat Provider Network (so the
VM has a public IP directly assigned to it) can download data just
fine, but when we try to upload anything (iperf where the VM is the client,
or even something like the upload portion of speedtest.net) the VM simply can't
get data out to the intended destination? Again, download works great;
upload doesn't.

If I take that VM and change its interface to a tenant network one that
has an OpenStack HA virtual router, everything (upload and download) works
perfectly. The problem only seems to appear when the VM is directly on
the external network.

It seems like an MTU issue, but I don't see how... Here are the MTUs of
the parts at play:

VM: 1500
br-int (specific interface connecting to VM): 9216
br-ex: unknown (can't tell what that MTU is set to)

Any help would be GREATLY appreciated.

Steve


Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
asked Mar 24, 2017 in openstack by Sterdnot_Shaken (900 points)   2 4 10

9 Responses

0 votes

Wow! Thanks for answering both of my questions!

So, I did some of the things you suggested, including setting the MSS in iperf to
something small (1000 bytes), and tested with no improvement. I then changed
the VM running on OpenStack to have an MTU of 1000 and retested with no
improvement. I noticed that the node I was testing against was reporting
back to the VM on OpenStack that it had an MSS of 8960, so just for the
heck of it, I also changed the MTU of the remote node (a server outside of
OpenStack) to 1000 bytes and retested with no improvement. (The effects of all of
these tests were validated by checking the MSS options in the TCP header
via tcpdump.)
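
For reference, the MSS values mentioned here follow from the MTU by simple arithmetic. The sketch below assumes plain IPv4 with a 20-byte IP header (no options) and a 20-byte base TCP header; the function names are made up for illustration.

```shell
# Minimal sketch: relate interface MTU and advertised TCP MSS.
# Assumption: IPv4, 20-byte IP header without options, 20-byte TCP header.
# Function names are hypothetical helpers, not part of any tool.

mss_for_mtu() {
    # Largest TCP payload a host would normally advertise for this MTU.
    echo $(( $1 - 40 ))
}

mtu_for_mss() {
    # Interface MTU implied by an advertised MSS.
    echo $(( $1 + 40 ))
}
```

Under those assumptions, `mss_for_mtu 1000` prints 960, which matches the `mss 960` option in the SYN packets captured later in this message, and `mtu_for_mss 8960` prints 9000, which would be consistent with the remote node having used a 9000-byte MTU before it was lowered.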

To simplify the equation, I ditched iperf for the time being and just
did a simple "telnet 'remote server' 8080" test from the remote server to
the VM in OpenStack, while capturing packets all along the way (at 4 different
points along the network path). Every point saw the same packets, including
the VM's tap interface, as expected. I then reversed the test by initiating
the TCP session from the VM in OpenStack to the remote server while running
the packet captures at those same points, having set the remote server to
respond with a TCP Reset. Traffic from the VM to the remote server looked
correct, with the expected TCP SYN. The TCP Reset that the remote server responded with
passed all 4 points of the network, including the external interface on the
compute node where the VM resides, but the tap interface that connects to
the VM NEVER sees the Reset. I can recreate this condition over and over.

So, thanks to your ideas, Richard, I'm no longer convinced this is an MTU
issue. What would prevent a TCP-related response from being forwarded from
the external interface to the intended VM? The security group we have
applied to this VM is wide open, so I can't imagine that is the cause...

Here are 2 packet captures where I initiated a telnet to the remote server
from the VM in OpenStack. As said above, I set the remote server to respond
with a reset. The first one is from the physical interface on the compute
node where the VM resides, and the second is from the tap interface of that VM:

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni eth0 host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 3131027442, win 0, length 0
19:10:16.156520 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.157693 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0
19:10:22.157407 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
19:10:22.158682 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni tap3bbe0f9d-6b host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: tap3bbe0f9d-6b: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap3bbe0f9d-6b, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143739 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.156499 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:22.157384 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
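
Comparing two captures like these by eye gets tedious when there are more packets. A rough sketch of automating it follows. It assumes each capture was saved as tcpdump text output with one packet per line (output pasted into email may be wrapped and would need rejoining first); the function name and file names are hypothetical.

```shell
# Minimal sketch: list packets present in the first capture but absent
# from the second. Packets are matched on everything after the timestamp,
# because timestamps differ slightly between capture points.
# Assumption: one packet per line, default tcpdump text format.

missing_on_second() {
    # $1: capture taken at the first point (e.g. the physical interface)
    # $2: capture taken at the second point (e.g. the tap interface)
    awk '
        # Only consider packet lines, which begin with HH:MM:SS.usec.
        /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\./ {
            key = $0
            sub(/^[^ ]+ /, "", key)          # strip the timestamp
            if (FILENAME == ARGV[1])
                seen[key] = 1                # remember packets from $2
            else if (!(key in seen))
                print                        # in $1 but never seen in $2
        }
    ' "$2" "$1"
}
```

Called as `missing_on_second eth0.txt tap.txt` on the two captures above, it should print only the three `Flags [R.]` lines, since every SYN appears at both points.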

Any ideas? Thanks in advance for your help!!

Steve

On Mon, Mar 20, 2017 at 4:17 PM, Richard Jones rjones@suse.com wrote:

You might consider taking a packet trace of the start of an upload to see
what the TCP MSS (Maximum Segment Size) options look like and perhaps
compare between the different configs. Also, you could consider either
using netperf and having it tweak the MSS to a smaller value (test-specific
-G option if I recall correctly), or just try dropping the MTU of your VM
before you try the upload.

Another way to use netperf to "probe" without tweaking MSS or MTU settings
would be to use the TCP_RR test with increasing request/response sizes. If
there is indeed an MTU issue somewhere along the way, as you walk the
request/response size up to the local MTU, you should see the test
performance drop off a cliff if not go fully to zero.
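
The TCP_RR probing idea can be sketched as a small loop. This is only an illustration: it assumes netserver is already running on the remote host, the host address and step size are placeholders, and the helper name is made up. It prints the netperf commands rather than running them, so they can be reviewed first.

```shell
# Minimal sketch: generate netperf TCP_RR runs with growing
# request/response sizes, up to the local MTU.
# Assumptions: netserver is reachable on the remote host; the 5-second
# test length and the default step are arbitrary choices.

rr_probe_commands() {
    # $1: remote host, $2: local MTU in bytes, $3: size step (default 100)
    host=$1; mtu=$2; step=${3:-100}
    size=$step
    while [ "$size" -le "$mtu" ]; do
        # -t TCP_RR selects the request/response test; after "--",
        # the test-specific -r option sets request,response sizes.
        echo "netperf -H $host -t TCP_RR -l 5 -- -r $size,$size"
        size=$(( size + step ))
    done
}
```

For example, `rr_probe_commands 192.0.2.10 1500 500 | sh` would run three tests with 500, 1000 and 1500 byte requests and responses. A sharp drop in transactions per second at some size would point towards an MTU problem on the path.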

Does the port for the VM have a security group rule permitting ICMP
traffic in? Offhand I wouldn't expect that to be different between the two
network setups you've described because I'd not have expected the virtual
router to pay attention to an arriving ICMP Destination Unreachable,
Datagram Too Big message to have the routed version work, but it seemed a
reasonable straw at which to grasp.

rick jones

PS perhaps iperf has a similar option to set the TCP MSS, I've not looked.

responded Mar 21, 2017 by Sterdnot_Shaken (900 points)   2 4 10
0 votes

On Tue, Mar 21, 2017 at 1:22 AM Sterdnot Shaken sterdnotshaken@gmail.com
wrote:

So, thanks to your ideas Richard, I'm no longer convinced this is an MTU
issue. What would prevent a TCP related response from being forwarded from
the external interface to the intended VM? The security group we have
applied to this VM is wide open, so I can't imagine that is the cause...

I was fighting with something quite similar to this yesterday afternoon...

Could it be RPF (reverse path filtering)? If the TCP response is received on
an interface that is different from the interface that would be used to route
to that response's source IP, RPF could drop it.
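
On a Linux host, one quick way to look at this is the rp_filter sysctl. The sketch below is a hedged illustration, not a diagnosis of this particular setup: the helper names are made up, and whether rp_filter applies at all depends on whether the host is routing the traffic rather than bridging it.

```shell
# Minimal sketch: show and interpret Linux reverse path filter settings.
# Helper names are hypothetical. The kernel uses the higher of the "all"
# value and the per-interface value when validating a packet's source.

describe_rp_filter() {
    # Meaning of a net.ipv4.conf.<iface>.rp_filter value.
    case "$1" in
        0) echo "off: no source validation" ;;
        1) echo "strict: drop unless the route back to the source uses the receiving interface" ;;
        2) echo "loose: drop only if the source is not reachable via any interface" ;;
        *) echo "unknown value: $1" ;;
    esac
}

show_rp_filter() {
    # Print the setting for "all" plus each interface name given.
    for ifc in all "$@"; do
        val=$(sysctl -n "net.ipv4.conf.$ifc.rp_filter" 2>/dev/null) || continue
        echo "$ifc: $val ($(describe_rp_filter "$val"))"
    done
}
```

For example, `show_rp_filter eth0` lists the values for `all` and `eth0`. Strict mode on an interface that receives asymmetric return traffic would be the case to look for.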

Another thing I wonder about - but note that this one is just speculation -
is TTL. Suppose the return path for some reason traverses more routers
than the forward path (but without triggering an RPF drop) - could the
response be dropped because it had less TTL than was needed for the return
path?

Regards,
Neil


responded Mar 21, 2017 by neil_at_tigera.io (3,740 points)   2 3
0 votes

You can narrow down the point where the packets are being dropped by mirroring and tracing packets on OVS bridge ports. I use a script that does the following (as root):

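# Create a dummy interface to receive the mirrored packets.
ip link add name sniff0 type dummy
ip link set dev sniff0 up
# Attach it to the bridge and mirror eth0's traffic to it.
# Note: select_all=1 selects every packet on the bridge, not only eth0's.
ovs-vsctl add-port br1 sniff0
ovs-vsctl -- set Bridge br1 mirrors=@m \
    -- --id=@sniff0 get Port sniff0 \
    -- --id=@eth0 get Port eth0 \
    -- --id=@m create Mirror name=mirror0 \
       select-dst-port=@eth0 select-src-port=@eth0 \
       output-port=@sniff0 select_all=1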

and to delete,
ovs-vsctl clear Bridge br1 mirrors
ovs-vsctl del-port br1 sniff0
ip link del dev sniff0

where eth0 is the point of packet capture and br1 is the bridge that eth0 resides in. Then you can run tcpdump on sniff0.
Create such mirror ports on:
1) phy-br-ex on the external OVS bridge
2) int-br-ex on the integration bridge
3) qvo-xxx on the integration bridge
Also capture packets on qvb-xxx on the Linux bridge that holds the VM's tap interface. Hopefully, this will give us more clues.
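
Since the same steps are repeated for each capture point, they can be wrapped in a small helper. The sketch below is one possible parameterization, not the script referred to above: the function name and the mirror naming are made up, and it deliberately omits select_all=1 on the assumption that only the named port should be mirrored. It prints the commands instead of running them.

```shell
# Minimal sketch: emit the commands that mirror one OVS port to a dummy
# interface. Hypothetical helper; review the output before piping it to sh.
# Assumption: only the named port is wanted, so select_all is left out.

mirror_commands() {
    # $1: OVS bridge, $2: port on that bridge, $3: dummy device (default sniff0)
    br=$1; port=$2; sniff=${3:-sniff0}
    cat <<EOF
ip link add name $sniff type dummy
ip link set dev $sniff up
ovs-vsctl add-port $br $sniff
ovs-vsctl -- set Bridge $br mirrors=@m -- --id=@out get Port $sniff -- --id=@p get Port $port -- --id=@m create Mirror name=mirror-$port select-dst-port=@p select-src-port=@p output-port=@out
EOF
}
```

For example, `mirror_commands br-ex phy-br-ex` prints the four commands for the first capture point, after which tcpdump can be run on sniff0. Because `set Bridge ... mirrors=@m` replaces whatever mirror the bridge already has, this form suits capturing one port per bridge at a time.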

-Kaustubh

responded Mar 21, 2017 by Kaustubh_Kelkar (1,780 points)   2 2 3
0 votes

Thanks for everyone's kind help!

Steve: I will try and turn off the offload features and see if that helps.
Thanks!

Neil: I will also check and make sure neither RPF nor TTL is posing any
issues.

Kaustubh: Is there a reason the mirror approach only seems to work on some
of the OVS bridges, but not others? If I follow your instructions, I can
see traffic when I set up a mirror on some bridges, but not others... Do I
need to put these OVS bridges into promiscuous mode before the mirror will
work?

Thanks!!

responded Mar 22, 2017 by Sterdnot_Shaken (900 points)   2 4 10
0 votes

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Tuesday, March 21, 2017 8:54 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Cc: Richard Jones rjones@suse.com; openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Kaustubh: Is there a reason the mirror approach only seems to work on some of the OVS bridges, but not others? If I follow your instructions, I can see traffic when I set up a mirror on some bridges, but not others... Do I need to put these OVS bridges into promiscuous mode before the mirror will work?
[Kaustubh] I don’t recall putting the bridge in promiscuous mode, but it has been a while since I looked at this. How are you setting up the mirrors? You need to mirror a specific port of the bridge, not the bridge itself.
Thanks!!

On Tue, Mar 21, 2017 at 9:42 AM, Kaustubh Kelkar kaustubh.kelkar@casa-systems.com wrote:
You can narrow down the point where the packets are being dropped by mirroring and tracing packets on OVS bridge ports. I use a script that does the following (as root):

ip link add name sniff0 type dummy
ip link set dev sniff0 up
ovs-vsctl add-port br1 sniff0
ovs-vsctl -- set Bridge br1 mirrors=@m \
-- --id=@sniff0 get Port sniff0 \
-- --id=@eth0 get Port eth0 \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@eth0 select-src-port=@eth0 \
output-port=@sniff0 select_all=1

and to delete,
ovs-vsctl clear Bridge br1 mirrors
ovs-vsctl del-port br1 sniff0
ip link del dev sniff0

where eth0 is the point of packet capture and br1 is the bridge eth0 resides in. Then, you can run tcpdump on sniff0.
Create such mirror ports on
1) phy-br-ex on external OVS bridge
2) int-br-ex on integration bridge
3) qvo-xxx on integration bridge
Also capture packets on qvb-xxx on the linux bridge having the tap interface of the VM. Hopefully, this will provide us more clues.

-Kaustubh

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Monday, March 20, 2017 9:17 PM
To: Richard Jones rjones@suse.com
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Wow! Thanks for answering both of my questions!
So, I did some things you suggested, including setting the MSS in iperf to something small (1000 bytes) and tested with no improvement. I then changed the VM running on Openstack to have an MTU of 1000 and retested with no improvement. I noticed that the node I was testing against was reporting back to the VM on Openstack that it had an MSS of 8960, so just for the heck of it, I changed the remote node's (server outside of Openstack) MTU also to 1000 bytes and retested with no improvement. (The effects of all of these tests were also validated by checking mss settings in the tcp header via tcpdump).
To simplify the equation, I ditched the iperf for the time being and just did a simple "telnet 'remote server' 8080" test from the remote server to the VM in Openstack, while capturing packets all along the way (4 different points along the network path). Every point saw the same packets, including the VM's tap interface as expected. I then reversed the test by initiating the tcp session on the VM in Openstack to the remote server while running the packet captures at those same points having set the remote server to respond with a TCP Reset. From VM to Remote server traffic looked correct with expected TCP SYN. The TCP Reset that the remote server responded with passed all 4 points of the network, including the external interface on the Compute node where the VM resides, but the TAP interface that connects to the VM NEVER sees the Reset. I can recreate this condition over and over.
So, thanks to your ideas Richard, I'm no longer convinced this is an MTU issue. What would prevent a TCP related response from being forwarded from the external interface to the intended VM? The security group we have applied to this VM is wide open, so I can't imagine that is the cause...
Here are 2 packet captures where I initiated a telnet to the remote server from the VM in Openstack. As said above, I set the remote server to respond with a reset. The top one is from the physical interface on the Compute node where the VM resides and the other, the tap interface to that VM:

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni eth0 host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 3131027442, win 0, length 0
19:10:16.156520 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.157693 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0
19:10:22.157407 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
19:10:22.158682 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni tap3bbe0f9d-6b host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: tap3bbe0f9d-6b: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap3bbe0f9d-6b, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143739 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.156499 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:22.157384 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
Any ideas? Thanks in advance for your help!!
Steve
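
The comparison of the two captures above can also be done mechanically. What follows is only a minimal sketch, assuming each tcpdump text output has been saved to a file; the function name packets_missing_on_tap and the file names are hypothetical, not part of tcpdump. It ignores timestamps and reports each source, destination and TCP flag combination that shows up in the physical-interface capture but not in the tap capture.

```shell
# Hypothetical helper: list packets seen on the physical interface but not on the tap.
# Usage: packets_missing_on_tap PHYS_CAPTURE TAP_CAPTURE
packets_missing_on_tap() {
    awk '
        $2 != "IP" { next }                  # skip tcpdump banner and warning lines
        {
            flags = $7
            sub(/,$/, "", flags)             # "[S]," becomes "[S]"
            key = $3 " > " $5 " " flags      # source, destination, TCP flags
        }
        # The tap capture is read first and only recorded.
        FILENAME == ARGV[1] { on_tap[key] = 1; next }
        # Anything in the physical capture that the tap never saw is printed once.
        !(key in on_tap) && !(key in reported) { reported[key] = 1; print key }
    ' "$2" "$1"
}
```

Run against the two captures shown above (saved, for example, as eth0.txt and tap.txt), this should print only the [R.] packets travelling from x.y.224.45.8080 to x.y.120.23.53877.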

On Mon, Mar 20, 2017 at 4:17 PM, Richard Jones rjones@suse.com wrote:
You might consider taking a packet trace of the start of an upload to see what the TCP MSS (Maximum Segment Size) options look like and perhaps compare between the different configs. Also, you could consider either using netperf and having it tweak the MSS to a smaller value (test-specific -G option if I recall correctly), or just try dropping the MTU of your VM before you try the upload.

Another way to use netperf to "probe" without tweaking MSS or MTU settings would be to use the TCP_RR test with increasing request/response sizes. If there is indeed an MTU issue somewhere along the way, as you walk the request/response size up to the local MTU, you should see the test performance drop off a cliff if not go fully to zero.

Does the port for the VM have a security group rule permitting ICMP traffic in? Offhand I wouldn't expect that to be different between the two network setups you've described because I'd not have expected the virtual router to pay attention to an arriving ICMP Destination Unreachable, Datagram Too Big message to have the routed version work, but it seemed a reasonable straw at which to grasp.

rick jones

PS perhaps iperf has a similar option to set the TCP MSS, I've not looked.
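
The TCP_RR probe described above could be scripted roughly as follows. This is a sketch under assumptions: the helper name rr_probe_commands, the 5 second test length and the default step of 256 bytes are all made up for illustration. It only prints the netperf command lines, using the TCP_RR test with the test-specific -r request,response size option, so they can be reviewed before being run.

```shell
# Hypothetical helper: print one netperf TCP_RR command per request/response size.
# Usage: rr_probe_commands HOST MAX_SIZE [STEP]
rr_probe_commands() {
    host=$1 max=$2 step=${3:-256}
    size=$step
    while [ "$size" -le "$max" ]; do
        # Same size for request and response; walk it up toward the local MTU.
        echo "netperf -H $host -t TCP_RR -l 5 -- -r $size,$size"
        size=$((size + step))
    done
}
```

Piping the output to sh (for example rr_probe_commands x.y.224.45 1500 | sh) would run the probes in order; a transaction rate that collapses at one size step would point at the size where the path breaks.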

responded Mar 22, 2017 by Kaustubh_Kelkar (1,780 points)   2 2 3
0 votes

The select_all = 1 is supposed to mirror all the packets.

Referring to the documentation (http://openvswitch.org/support/dist-docs/ovs-vswitchd.conf.db.5.html),

"select_all: boolean
If true, every packet arriving or departing on any port is selected for mirroring."

And for OVS 2.5,

"In Open vSwitch 2.5 and later, mirroring occurs just after a packet first becomes eligible, using the packet as it exists at that point; … in Open vSwitch 2.4, the modifications are never visible to mirrors, whereas in Open vSwitch 2.5 and later modifications made before the first output that makes it eligible for mirroring to a particular destination are visible."

I believe, if the very first flow is dropping unicast packets, you might not be able to mirror them.

Maybe you can monitor the flow-tables on each OVS bridge while sending traffic and see which flows' count increases. Something like,
watch -n 2 "ovs-ofctl dump-flows "

-Kaustubh
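
Watching counters by eye gets hard on a bridge with many flows, so one option is to save two snapshots of the dump-flows output and diff the packet counters. The sketch below is only an illustration: the function name flows_with_growing_counters is hypothetical, and it assumes the usual dump-flows line layout with duration, n_packets, n_bytes and idle_age fields, which are stripped so the same flow matches across snapshots.

```shell
# Hypothetical helper: show flows whose n_packets grew between two saved snapshots.
# Usage: flows_with_growing_counters BEFORE_FILE AFTER_FILE
flows_with_growing_counters() {
    awk '
        !/n_packets=/ { next }               # skip the reply header line
        {
            count = $0
            sub(/.*n_packets=/, "", count)
            sub(/,.*/, "", count)            # keep only the packet counter
            key = $0
            # Drop the fields that change on every dump so the flow text is stable.
            gsub(/(duration|n_packets|n_bytes|idle_age|hard_age)=[^,]*, */, "", key)
            sub(/^ +/, "", key)
        }
        FILENAME == ARGV[1] { before[key] = count; next }
        (key in before) && count + 0 > before[key] + 0 {
            printf "+%d %s\n", count - before[key], key
        }
    ' "$1" "$2"
}
```

A possible way to use it: dump the flows of one bridge to a file, start the failing upload, dump again to a second file, then pass both files to the function. Drop flows that appear in the output are the first candidates to look at.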

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Wednesday, March 22, 2017 12:24 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Subject: Re: [Openstack] VM can receive traffic, but not send it

Here was my first mirror setup:
ip link add name dummy3 type dummy
ip link set dev dummy3 up

ovs-vsctl add-port br-ex3 dummy3

ovs-vsctl -- set bridge br-ex3 mirrors=@m \
-- --id=@src get port pat-ex3-bss \
-- --id=@mir get port dummy3 \
-- --id=@m create mirror name=ovs_mirror3 select-dst-port=@src select-src-port=@src output-port=@mir select-all=true

And here's the one I did by copying your example:
ip link add name dummy3 type dummy
ip link set dev dummy3 up

ovs-vsctl add-port br-ex3 dummy3

ovs-vsctl -- set Bridge br-ex3 mirrors=@m \
-- --id=@dummy3 get Port dummy3 \
-- --id=@pat-ex3-bss get Port pat-ex3-bss \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@pat-ex3-bss select-src-port=@pat-ex3-bss \
output-port=@dummy3 select_all=1

Both yield the same results. When I tcpdump the respective dummy interface attached to br-ex3, I only see broadcast traffic for the VM in question; I never see unicast traffic (case in point: if I ping the broadcast address on the VM, that traffic shows up in the tcpdump). I can tcpdump the external interface and see the unicast traffic, but I need to see where it's breaking in the OVS bridges.
Is there some trick to mirror unicast dataplane traffic?
Thanks in advance!

On Wed, Mar 22, 2017 at 10:07 AM, Kaustubh Kelkar kaustubh.kelkar@casa-systems.com wrote:

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Tuesday, March 21, 2017 8:54 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Cc: Richard Jones rjones@suse.com; openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Thanks for everyone's kind help!
Steve: I will try and turn off the offload features and see if that helps. Thanks!
Neil: I will also check and make sure neither RPF nor TTL are posing any issues.

Kaustubh: Is there a reason the mirror approach only seems to work on some of the OVS bridges, but not others? If I follow your instructions, I can see traffic when I set up a mirror on some bridges, but not others... Do I need to put these OVS bridges into promiscuous mode before the mirror will work?
[Kaustubh] I don’t recall putting the bridge in promiscuous mode, but it has been a while since I had looked at this. How are you setting up the mirrors? You would need to mirror a specific port of the bridge, not the bridge itself.
Thanks!!

On Tue, Mar 21, 2017 at 9:42 AM, Kaustubh Kelkar kaustubh.kelkar@casa-systems.com wrote:
You can narrow down the point where the packets are being dropped by mirroring and tracing packets on OVS bridge ports. I use a script that does the following (as root):

ip link add name sniff0 type dummy
ip link set dev sniff0 up
ovs-vsctl add-port br1 sniff0
ovs-vsctl -- set Bridge br1 mirrors=@m \
-- --id=@sniff0 get Port sniff0 \
-- --id=@eth0 get Port eth0 \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@eth0 select-src-port=@eth0 \
output-port=@sniff0 select_all=1

and to delete,
ovs-vsctl clear Bridge br1 mirrors
ovs-vsctl del-port br1 sniff0
ip link del dev sniff0

where eth0 is the point of packet capture and br1 is the bridge eth0 resides in. Then, you can run tcpdump on sniff0.
Create such mirror ports on
1) phy-br-ex on external OVS bridge
2) int-br-ex on integration bridge
3) qvo-xxx on integration bridge
Also capture packets on qvb-xxx on the linux bridge having the tap interface of the VM. Hopefully, this will provide us more clues.

-Kaustubh
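
Since the same mirror has to be created on several ports, the script above can be wrapped in a pair of functions. This is a sketch, not the original script: the names run, mirror_port and unmirror_port and the DRY_RUN variable are hypothetical, while the ip and ovs-vsctl commands are the ones quoted above with the bridge, port and dummy interface turned into parameters. With DRY_RUN set, the commands are printed instead of executed.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# mirror_port BRIDGE PORT SNIFF_IF: mirror PORT on BRIDGE to a new dummy interface.
mirror_port() {
    bridge=$1 port=$2 sniff=$3
    run ip link add name "$sniff" type dummy
    run ip link set dev "$sniff" up
    run ovs-vsctl add-port "$bridge" "$sniff"
    run ovs-vsctl -- set Bridge "$bridge" mirrors=@m \
        -- --id=@out get Port "$sniff" \
        -- --id=@src get Port "$port" \
        -- --id=@m create Mirror name="mirror-$sniff" \
        select-dst-port=@src select-src-port=@src \
        output-port=@out select_all=1
}

# unmirror_port BRIDGE SNIFF_IF: undo mirror_port (clears all mirrors on BRIDGE).
unmirror_port() {
    bridge=$1 sniff=$2
    run ovs-vsctl clear Bridge "$bridge" mirrors
    run ovs-vsctl del-port "$bridge" "$sniff"
    run ip link del dev "$sniff"
}
```

For example, DRY_RUN=1 followed by mirror_port br1 eth0 sniff0 prints the four commands for review; without DRY_RUN they are executed as root, after which tcpdump can be run on sniff0.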

responded Mar 22, 2017 by Kaustubh_Kelkar (1,780 points)   2 2 3
0 votes

For downloads, you're probably using DNAT or SNAT. For uploads, I'm guessing
you're using floating IPs. Do uploads work for other VMs with a similar
configuration? It's rare that this would occur, so I would presume it's
firewall related (either a security group via OpenStack or a firewall on
the VM itself).

Another question: are incoming connections timing out, and is the security
group allowing connections from everyone or a subset? I ask because I
haven't seen the easy questions asked up front.

//adam

Adam Lawson

Principal Architect
Office: +1-916-794-5706

responded Mar 23, 2017 by Adam_Lawson (7,540 points)   4 7 12
0 votes

Just to clarify: Version: Mitaka with OVS only. Firewall driver:
Openvswitch, VM OS: Windows 10

Kaustubh: Thanks for your help on the mirroring part. In my reading
yesterday, I came across a thread stating that you can't mirror a patch
interface with OVS. That would explain why I wasn't seeing the expected
traffic on the mirror output ports when mirroring said patch interfaces.
Short of rewriting the flows that OS installs in OVS to add an additional
output port, and then tcpdumping that added output port, how would one
effectively troubleshoot network traffic issues when patch interfaces are
in use?

Adam: Thanks for chiming in on my issue! I appreciate it. The VMs are
placed directly on a provider network (external, flat) and, as such, have a
public IP assigned to their NICs. So for these VMs, the default gateway
is a physical router outside of OpenStack's control.

As a way to further isolate the issue, I moved ALL but one VM off of one
compute node. Several symptoms show there is a problem, but (on the
Windows VM) running something as simple as a speed test (speedtest.net)
works great on the download and totally fails on the upload. Looking at
all the drop flows on br-int, I noticed that this flow was incrementing
while the upload part of the test was active:

cookie=0xa9964f66f62764ad, duration=1494.495s, table=82, n_packets=5813, n_bytes=348780, idle_age=4, priority=50,ct_state=+inv+trk actions=drop

So I added this flow to mirror what would have been dropped to a dummy
interface (on port 2) that I could tcpdump to see what it was actually
dropping:

ovs-ofctl add-flow br-int table=82,priority=51,ct_state=+inv+trk,actions=output:2

From the tcpdump, I can see the traffic that the VM is missing, which is
likely causing this whole issue...

Anyone have any thoughts on this?

Thanks!
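
The trick used above (shadowing a drop flow with a slightly higher priority copy that outputs to a capture port) can be derived from the drop flow line itself. The following is a rough sketch with a hypothetical helper name, tap_flow_spec; it assumes the dump-flows line carries a table= field and that the second-to-last whitespace-separated field starts with priority=, as in the flow quoted above.

```shell
# Hypothetical helper: turn one dump-flows line (stdin) into an add-flow spec
# that sends the same match to OpenFlow port PORT at priority + 1.
# Usage: tap_flow_spec PORT < drop_flow_line
tap_flow_spec() {
    awk -v port="$1" '
        /priority=/ {
            table = $0
            sub(/.*table=/, "", table)
            sub(/,.*/, "", table)
            match_part = $(NF - 1)           # e.g. priority=50,ct_state=+inv+trk
            prio = match_part
            sub(/^priority=/, "", prio)
            sub(/,.*/, "", prio)
            rest = match_part
            sub(/^priority=[0-9]+,?/, "", rest)
            spec = "table=" table ",priority=" (prio + 1)
            if (rest != "") spec = spec "," rest
            print spec ",actions=output:" port
        }'
}
```

Fed the drop flow above with port 2, it should produce the same spec that was passed to ovs-ofctl add-flow br-int. The port number of the dummy interface can be read from ovs-ofctl show br-int, and the temporary flow can be removed afterwards with ovs-ofctl --strict del-flows br-int using the same table, priority and match.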

On Thu, Mar 23, 2017 at 11:49 AM, Adam Lawson alawson@aqorn.com wrote:

For downloads, you're using probably DNAT or SNAT. For uploads, you're
using floating IP's I'm guessing. Does uploads work for other VM's with a
similar configuration? It's rare that this would occur so I would presume
it's firewall related (either security group via OpenStack) or firewall on
the VM itself.

Another question, are incoming connections timing out, is the security
group allowing connections from everyone or a subset? i ask because I
haven't seen the easy questions asked up front.

//adam

Adam Lawson

Principal Architect
Office: +1-916-794-5706 <(916)%20794-5706>

On Wed, Mar 22, 2017 at 11:31 AM, Kaustubh Kelkar <
kaustubh.kelkar@casa-systems.com> wrote:

The select_all = 1 is supposed to mirror all the packets.

Referring to the documentation (http://openvswitch.org/suppor
t/dist-docs/ovs-vswitchd.conf.db.5.html),

select_all: boolean

          If true, every packet arriving  or  departing  on  any  port  is

          selected for mirroring.

And for OVS 2.5,

“In Open

   vSwitch 2.5 and later, mirroring  occurs  just  after  a  packet  first

   becomes  eligible, using the packet as it exists at that point; …

in Open vSwitch 2.4, the modifications are never visible to

   mirrors, whereas in Open  vSwitch  2.5  and  later  modifications

made

   before  the first output that makes it eligible for mirroring to a

par‐

   ticular destination are visible.

I believe, if the very first flow is dropping unicast packets, you might
not be able to mirror them.

Maybe you can monitor the flow-tables on each OVS bridge while sending
traffic and see which flows’ count increases. Something like,

watch –n 2 “ovs-ofctl dump-flows ”

-Kaustubh

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Wednesday, March 22, 2017 12:24 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Subject: Re: [Openstack] VM can receive traffic, but not send it

Here's was my first mirror setup:

ip link add name dummy3 type dummy
ip link set dev dummy3 up

ovs-vsctl add-port br-ex3 dummy3

ovs-vsctl -- set bridge br-ex3 mirrors=@m \
-- --id=@src get port pat-ex3-bss \
-- --id=@mir get port dummy3 \
-- --id=@m create mirror name=ovs_mirror3 select-dst-port=@src
select-src-port=@src output-port=@mir select-all=true

And here's the one I did by copying your example:

ip link add name dummy3 type dummy
ip link set dev dummy3 up

ovs-vsctl add-port br-ex3 dummy3

ovs-vsctl -- set Bridge br-ex3 mirrors=@m \
-- --id=@dummy3 get Port dummy3 \
-- --id=@pat-ex3-bss get Port pat-ex3-bss \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@pat-ex3-bss select-src-port=@pat-ex3-bss \
output-port=@dummy3 select_all=1

Both yield the same results. When I tcpdump the respective dummy
interface attached to br-ex3, I only see broadcast traffic for the VM in
question; I never see unicast traffic (case in point: if I ping the
broadcast address on the VM, then traffic shows up in the tcpdump). I can
do a tcpdump on the external interface and see the unicast traffic, but I
need to see where it's breaking in the OVS bridges.

Is there some trick to mirror unicast dataplane traffic?

Thanks in advance!

On Wed, Mar 22, 2017 at 10:07 AM, Kaustubh Kelkar <
kaustubh.kelkar@casa-systems.com> wrote:

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Tuesday, March 21, 2017 8:54 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Cc: Richard Jones rjones@suse.com; openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Thanks for everyone's kind help!

Steve: I will try and turn off the offload features and see if that
helps. Thanks!

Neil: I will also check and make sure neither RPF nor TTL are posing any
issues.

Kaustubh: Is there a reason the mirror approach only seems to work on
some of the OVS bridges, but not others? If I follow your instructions, I
can see traffic when I set up a mirror on some bridges, but not others...
Do I need to put these OVS bridges into promiscuous mode before the mirror
will work?

[Kaustubh] I don’t recall putting the bridge in promiscuous mode, but it
has been a while since I had looked at this. How are you setting up the
mirrors? You would need to mirror a specific port of the bridge, not the
bridge itself.

Thanks!!

On Tue, Mar 21, 2017 at 9:42 AM, Kaustubh Kelkar <
kaustubh.kelkar@casa-systems.com> wrote:

You can narrow down the point where the packets are being dropped by
mirroring and tracing packets on OVS bridge ports. I use a script that does
the following (as root):

ip link add name sniff0 type dummy
ip link set dev sniff0 up
ovs-vsctl add-port br1 sniff0
ovs-vsctl -- set Bridge br1 mirrors=@m \
-- --id=@sniff0 get Port sniff0 \
-- --id=@eth0 get Port eth0 \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@eth0 select-src-port=@eth0 \
output-port=@sniff0 select_all=1

and to delete,

ovs-vsctl clear Bridge br1 mirrors
ovs-vsctl del-port br1 sniff0
ip link del dev sniff0

where eth0 is the point of packet capture and br1 is the bridge eth0
resides in. Then, you can run tcpdump on sniff0.

Create such mirror ports on

1) phy-br-ex on external OVS bridge

2) int-br-ex on integration bridge

3) qvo-xxx on integration bridge

Also capture packets on qvb-xxx on the linux bridge having the tap
interface of the VM. Hopefully, this will provide us more clues.

-Kaustubh

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Monday, March 20, 2017 9:17 PM
To: Richard Jones rjones@suse.com
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Wow! Thanks for answering both of my questions!

So, I did some things you suggested, including setting the MSS in iperf
to something small (1000 bytes) and tested with no improvement. I then
changed the VM running on Openstack to have an MTU of 1000 and retested
with no improvement. I noticed that the node I was testing against was
reporting back to the VM on Openstack that it had an MSS of 8960, so just
for the heck of it, I changed the remote node's (server outside of
Openstack) MTU also to 1000 bytes and retested with no improvement. (The
effects of all of these tests were also validated by checking mss settings
in the tcp header via tcpdump).
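
As a quick sanity check on those numbers: for IPv4 without IP or TCP options, the advertised MSS is normally the MTU minus 40 bytes (20 for the IPv4 header, 20 for the TCP header). A minimal sketch of that arithmetic, where `mss_for_mtu` is only an illustrative helper name:

```shell
# Expected TCP MSS for a given IPv4 MTU, assuming no IP options and a
# 20-byte TCP header (hypothetical helper, for illustration only).
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

mss_for_mtu 1000   # MTU set on the VM for the test
mss_for_mtu 9000   # jumbo MTU that would produce an MSS of 8960
```

The first call gives 960, which is the `mss 960` option visible in the SYN packets in the captures further down; the second gives 8960, which suggests the remote node was originally using a 9000-byte MTU.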

To simplify the equation, I ditched the iperf for the time being and just
did a simple "telnet 'remote server' 8080" test from the remote server to
the VM in Openstack, while capturing packets all along the way (4 different
points along the network path). Every point saw the same packets, including
the VM's tap interface as expected. I then reversed the test by initiating
the tcp session on the VM in Openstack to the remote server while running
the packet captures at those same points having set the remote server to
respond with a TCP Reset. From VM to Remote server traffic looked correct
with expected TCP SYN. The TCP Reset that the remote server responded with
passed all 4 points of the network, including the external interface on the
Compute node where the VM resides, but the TAP interface that connects to
the VM NEVER sees the Reset. I can recreate this condition over and over.

So, thanks to your ideas Richard, I'm no longer convinced this is an MTU
issue. What would prevent a TCP related response from being forwarded from
the external interface to the intended VM? The security group we have
applied to this VM is wide open, so I can't imagine that is the cause...

Here are 2 packet captures where I initiated a telnet to the remote
server from the VM in Openstack. As said above, I set the remote server to
respond with a reset. The top one is from the physical interface on the
Compute node where the VM resides and the other, the tap interface to that
VM:

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni eth0 host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 3131027442, win 0, length 0
19:10:16.156520 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.157693 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0
19:10:22.157407 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
19:10:22.158682 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni tap3bbe0f9d-6b host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: tap3bbe0f9d-6b: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap3bbe0f9d-6b, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143739 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.156499 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:22.157384 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
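
To compare captures like these without reading every line, counting the resets at each capture point may be enough. This is a rough sketch that assumes tcpdump's default text output, where a reset shows up as `Flags [R]` or `Flags [R.]`; `count_rst` is a made-up helper name.

```shell
# Count TCP segments with the RST flag in tcpdump text output read from stdin.
# tcpdump prints the flags as "Flags [R]" or "Flags [R.]" (RST plus ACK).
count_rst() {
    # grep -c exits non-zero when nothing matches; keep the function's status 0.
    grep -c 'Flags \[R' || true
}

# Two lines taken from the eth0 capture: one SYN and one RST.
count_rst <<'EOF'
19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 3131027442, win 0, length 0
EOF
```

Fed the full captures, the eth0 side would count three resets and the tap side none, which is the whole problem in one number per capture point.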

Any ideas? Thanks in advance for your help!!

Steve

On Mon, Mar 20, 2017 at 4:17 PM, Richard Jones rjones@suse.com wrote:

You might consider taking a packet trace of the start of an upload to see
what the TCP MSS (Maximum Segment Size) options look like and perhaps
compare between the different configs. Also, you could consider either
using netperf and having it tweak the MSS to a smaller value (test-specific
-G option if I recall correctly), or just try dropping the MTU of your VM
before you try the upload.

Another way to use netperf to "probe" without tweaking MSS or MTU
settings would be to use the TCP_RR test with increasing request/response
sizes. If there is indeed an MTU issue somewhere along the way, as you
walk the request/response size up to the local MTU, you should see the test
performance drop off a cliff if not go fully to zero.
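
A sweep like that could be scripted roughly as follows. This is only a sketch: it prints the netperf commands rather than running them, the 200-byte step is arbitrary, `192.0.2.10` is a documentation placeholder address, and it assumes netserver is already running on the remote host. `rr_sweep` is a hypothetical helper name.

```shell
# Print netperf TCP_RR commands with request/response sizes stepping up to MTU.
# Usage: rr_sweep HOST MTU [STEP]
rr_sweep() {
    host=$1 mtu=$2 step=${3:-200}
    size=$step
    while [ "$size" -le "$mtu" ]; do
        # -r sets the request and response sizes for the TCP_RR test.
        echo "netperf -H $host -t TCP_RR -- -r $size,$size"
        size=$(( size + step ))
    done
}

# Placeholder host; review the printed commands, then pipe them to sh to run.
rr_sweep 192.0.2.10 1500
```

If the transaction rate collapses between two adjacent sizes, that is the region in which to look for an MTU mismatch.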

Does the port for the VM have a security group rule permitting ICMP
traffic in? Offhand I wouldn't expect that to be different between the two
network setups you've described because I'd not have expected the virtual
router to pay attention to an arriving ICMP Destination Unreachable,
Datagram Too Big message to have the routed version work, but it seemed a
reasonable straw at which to grasp.

rick jones

PS perhaps iperf has a similar option to set the TCP MSS, I've not looked.

Sterdnot Shaken sterdnotshaken@gmail.com 03/20/17 3:07 PM >>>

Our info:

Openstack version: Mitaka (using OVS 2.5)
Firewall driver: Openvswitch

Anyone know why VMs that are directly on a Flat Provider Network (so the
VM would have a public IP directly assigned to it) can download data just
fine, but when we try to upload anything (iperf where the VM is the client,
or even something like the upload portion of speedtest.net) the VM simply
can't get data out to the intended destination? Again, download works
great, upload doesn't.

If I take that VM and change its interface to be a tenant network one that
has an Openstack HA virtual router, everything (upload and download) works
perfectly. The problem only seems to be apparent when the VM is directly on
the external network.

It seems like an MTU issue, but I don't see how... Here are the MTUs of
the parts at play:

VM: 1500
br-int (specific interface connecting to VM) - 9216
br-ex - (can't tell what that MTU is set to)

Any help would be GREATLY appreciated.

Steve




Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
responded Mar 23, 2017 by Sterdnot_Shaken (900 points)   2 4 10
0 votes

From: Sterdnot Shaken sterdnotshaken@gmail.com
Sent: Thursday, March 23, 2017 2:04 PM
To: Adam Lawson alawson@aqorn.com
Cc: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com; openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Just to clarify: Version: Mitaka with OVS only. Firewall driver: Openvswitch, VM OS: Windows 10

Kaustubh: Thanks for your help on the mirroring part. In my reading yesterday, I came across a thread stating that you can't mirror a patch interface with OVS. That would explain why I wasn't seeing the expected traffic on the mirror output ports when mirroring said patch interfaces. Short of rewriting the flows that OpenStack installs in OVS to add an extra output port and then tcpdumping that port, how would one effectively troubleshoot network traffic issues when patch interfaces are in use?

Adam: Thanks for chiming in on my issue! I appreciate it. The VMs are placed directly on a provider network (external, flat) and, as such, have a public IP assigned to their NICs. So for these VMs, the default gateway is a physical router outside of Openstack's control.
As a way to further isolate the issue, I moved ALL but one VM off of one compute node. Several symptoms show there is a problem, but (on the Windows VM) something as simple as a speed test (speedtest.net) works great on the download and totally fails on the upload. Looking at all the drop flows on br-int, I noticed that this flow was incrementing while the upload part of the test was active:

cookie=0xa9964f66f62764ad, duration=1494.495s, table=82, n_packets=5813, n_bytes=348780, idle_age=4, priority=50,ct_state=+inv+trk actions=drop
[Kaustubh] The flow is dropping an invalid packet for a tracked connection. From [1], maybe the nf_conntrack* modules are not loaded on the compute? Without knowing the complete flow information, I may not be able to provide much help.

I still live in an era where security groups are implemented within iptables on Linux bridges!
[1] http://www.ovn.org/support/dist-docs-2.5/ovs-ofctl.8.pdf
-Kaustubh
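
One way to check the module question on the compute node is to filter `lsmod` output for the conntrack modules. This is a sketch only: `conntrack_modules` is a made-up helper name, and which modules are actually required depends on the kernel and OVS build, which the thread does not state.

```shell
# List loaded conntrack-related kernel modules from `lsmod` output on stdin.
# Prints one module name per line; prints nothing if none are loaded.
conntrack_modules() {
    awk '$1 ~ /^nf_conntrack/ { print $1 }'
}

# Typical use on the compute node:
#   lsmod | conntrack_modules
```

An empty result would support the theory that conntrack is not available to the OVS datapath; a non-empty one only rules out the simplest explanation.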
So I added this flow to mirror what would have been dropped to a dummy interface (of port 2) that I could tcpdump to see what it was actually dropping:

ovs-ofctl add-flow br-int table=82,priority=51,ct_state=+inv+trk,actions=output:2

From the tcpdump, I can see the traffic that the VM is missing that is likely causing this whole issue...
Anyone have any thoughts on this?
Thanks!

On Thu, Mar 23, 2017 at 11:49 AM, Adam Lawson alawson@aqorn.com wrote:
For downloads, you're probably using DNAT or SNAT. For uploads, you're using floating IPs, I'm guessing. Do uploads work for other VMs with a similar configuration? It's rare that this would occur, so I would presume it's firewall related (either the security group via OpenStack or a firewall on the VM itself).

Another question: are incoming connections timing out, and is the security group allowing connections from everyone or a subset? I ask because I haven't seen the easy questions asked up front.

//adam

Adam Lawson

Principal Architect
Office: +1-916-794-5706

On Wed, Mar 22, 2017 at 11:31 AM, Kaustubh Kelkar kaustubh.kelkar@casa-systems.com wrote:
The select_all = 1 is supposed to mirror all the packets.

Referring to the documentation (http://openvswitch.org/support/dist-docs/ovs-vswitchd.conf.db.5.html),

“select_all: boolean
          If true, every packet arriving or departing on any port is
          selected for mirroring.”

And for OVS 2.5,

“In Open vSwitch 2.5 and later, mirroring occurs just after a packet first
becomes eligible, using the packet as it exists at that point; …
in Open vSwitch 2.4, the modifications are never visible to mirrors,
whereas in Open vSwitch 2.5 and later modifications made before the first
output that makes it eligible for mirroring to a particular destination
are visible.”

I believe, if the very first flow is dropping unicast packets, you might not be able to mirror them.

Maybe you can monitor the flow-tables on each OVS bridge while sending traffic and see which flows’ count increases. Something like,
watch -n 2 "ovs-ofctl dump-flows <bridge>"

-Kaustubh

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Wednesday, March 22, 2017 12:24 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Subject: Re: [Openstack] VM can receive traffic, but not send it

Here was my first mirror setup:
ip link add name dummy3 type dummy
ip link set dev dummy3 up

ovs-vsctl add-port br-ex3 dummy3

ovs-vsctl -- set bridge br-ex3 mirrors=@m \
-- --id=@src get port pat-ex3-bss \
-- --id=@mir get port dummy3 \
-- --id=@m create mirror name=ovs_mirror3 select-dst-port=@src select-src-port=@src output-port=@mir select-all=true

And here's the one I did by copying your example:
ip link add name dummy3 type dummy
ip link set dev dummy3 up

ovs-vsctl add-port br-ex3 dummy3

ovs-vsctl -- set Bridge br-ex3 mirrors=@m \
-- --id=@dummy3 get Port dummy3 \
-- --id=@pat-ex3-bss get Port pat-ex3-bss \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@pat-ex3-bss select-src-port=@pat-ex3-bss \
output-port=@dummy3 select_all=1

Both yield the same results. When I tcpdump the respective dummy interface attached to br-ex3, I only see broadcast traffic for the VM in question; I never see unicast traffic (case in point: if I ping the broadcast address on the VM, then traffic shows up in the tcpdump). I can do a tcpdump on the external interface and see the unicast traffic, but I need to see where it's breaking in the OVS bridges.
Is there some trick to mirror unicast dataplane traffic?
Thanks in advance!

On Wed, Mar 22, 2017 at 10:07 AM, Kaustubh Kelkar kaustubh.kelkar@casa-systems.com wrote:

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Tuesday, March 21, 2017 8:54 PM
To: Kaustubh Kelkar kaustubh.kelkar@casa-systems.com
Cc: Richard Jones rjones@suse.com; openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Thanks for everyone's kind help!
Steve: I will try and turn off the offload features and see if that helps. Thanks!
Neil: I will also check and make sure neither RPF nor TTL are posing any issues.

Kaustubh: Is there a reason the mirror approach only seems to work on some of the OVS bridges, but not others? If I follow your instructions, I can see traffic when I set up a mirror on some bridges, but not others... Do I need to put these OVS bridges into promiscuous mode before the mirror will work?
[Kaustubh] I don’t recall putting the bridge in promiscuous mode, but it has been a while since I had looked at this. How are you setting up the mirrors? You would need to mirror a specific port of the bridge, not the bridge itself.
Thanks!!

On Tue, Mar 21, 2017 at 9:42 AM, Kaustubh Kelkar kaustubh.kelkar@casa-systems.com wrote:
You can narrow down the point where the packets are being dropped by mirroring and tracing packets on OVS bridge ports. I use a script that does the following (as root):

ip link add name sniff0 type dummy
ip link set dev sniff0 up
ovs-vsctl add-port br1 sniff0
ovs-vsctl -- set Bridge br1 mirrors=@m \
-- --id=@sniff0 get Port sniff0 \
-- --id=@eth0 get Port eth0 \
-- --id=@m create Mirror name=mirror0 \
select-dst-port=@eth0 select-src-port=@eth0 \
output-port=@sniff0 select_all=1

and to delete,
ovs-vsctl clear Bridge br1 mirrors
ovs-vsctl del-port br1 sniff0
ip link del dev sniff0

where eth0 is the point of packet capture and br1 is the bridge eth0 resides in. Then, you can run tcpdump on sniff0.
Create such mirror ports on
1) phy-br-ex on external OVS bridge
2) int-br-ex on integration bridge
3) qvo-xxx on integration bridge
Also capture packets on qvb-xxx on the linux bridge having the tap interface of the VM. Hopefully, this will provide us more clues.

-Kaustubh

From: Sterdnot Shaken [mailto:sterdnotshaken@gmail.com]
Sent: Monday, March 20, 2017 9:17 PM
To: Richard Jones rjones@suse.com
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it

Wow! Thanks for answering both of my questions!
So, I did some things you suggested, including setting the MSS in iperf to something small (1000 bytes) and tested with no improvement. I then changed the VM running on Openstack to have an MTU of 1000 and retested with no improvement. I noticed that the node I was testing against was reporting back to the VM on Openstack that it had an MSS of 8960, so just for the heck of it, I changed the remote node's (server outside of Openstack) MTU also to 1000 bytes and retested with no improvement. (The effects of all of these tests were also validated by checking mss settings in the tcp header via tcpdump).
To simplify the equation, I ditched the iperf for the time being and just did a simple "telnet 'remote server' 8080" test from the remote server to the VM in Openstack, while capturing packets all along the way (4 different points along the network path). Every point saw the same packets, including the VM's tap interface as expected. I then reversed the test by initiating the tcp session on the VM in Openstack to the remote server while running the packet captures at those same points having set the remote server to respond with a TCP Reset. From VM to Remote server traffic looked correct with expected TCP SYN. The TCP Reset that the remote server responded with passed all 4 points of the network, including the external interface on the Compute node where the VM resides, but the TAP interface that connects to the VM NEVER sees the Reset. I can recreate this condition over and over.
So, thanks to your ideas Richard, I'm no longer convinced this is an MTU issue. What would prevent a TCP related response from being forwarded from the external interface to the intended VM? The security group we have applied to this VM is wide open, so I can't imagine that is the cause...
Here are 2 packet captures where I initiated a telnet to the remote server from the VM in Openstack. As said above, I set the remote server to respond with a reset. The top one is from the physical interface on the Compute node where the VM resides and the other, the tap interface to that VM:

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni eth0 host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 3131027442, win 0, length 0
19:10:16.156520 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.157693 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0
19:10:22.157407 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
19:10:22.158682 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0

[(openstack-mitaka) root@prv-0-18-compute user]# tcpdump -nni tap3bbe0f9d-6b host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: tap3bbe0f9d-6b: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap3bbe0f9d-6b, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143739 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.156499 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:22.157384 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
Any ideas? Thanks in advance for your help!!
Steve

On Mon, Mar 20, 2017 at 4:17 PM, Richard Jones rjones@suse.com wrote:
You might consider taking a packet trace of the start of an upload to see what the TCP MSS (Maximum Segment Size) options look like and perhaps compare between the different configs. Also, you could consider either using netperf and having it tweak the MSS to a smaller value (test-specific -G option if I recall correctly), or just try dropping the MTU of your VM before you try the upload.

Another way to use netperf to "probe" without tweaking MSS or MTU settings would be to use the TCP_RR test with increasing request/response sizes. If there is indeed an MTU issue somewhere along the way, as you walk the request/response size up to the local MTU, you should see the test performance drop off a cliff if not go fully to zero.

Does the port for the VM have a security group rule permitting ICMP traffic in? Offhand I wouldn't expect that to be different between the two network setups you've described because I'd not have expected the virtual router to pay attention to an arriving ICMP Destination Unreachable, Datagram Too Big message to have the routed version work, but it seemed a reasonable straw at which to grasp.

rick jones

PS perhaps iperf has a similar option to set the TCP MSS, I've not looked.

Sterdnot Shaken sterdnotshaken@gmail.com 03/20/17 3:07 PM >>>
Our info:

Openstack version: Mitaka (using OVS 2.5)
Firewall driver: Openvswitch

Anyone know why VMs that are directly on a Flat Provider Network (so the
VM would have a public IP directly assigned to it) can download data just
fine, but when we try to upload anything (iperf where the VM is the client,
or even something like the upload portion of speedtest.net) the VM simply
can't get data out to the intended destination? Again, download works
great, upload doesn't.

If I take that VM and change its interface to be a tenant network one that
has an Openstack HA virtual router, everything (upload and download) works
perfectly. The problem only seems to be apparent when the VM is directly on
the external network.

It seems like an MTU issue, but I don't see how... Here are the MTUs of
the parts at play:

VM: 1500
br-int (specific interface connecting to VM) - 9216
br-ex - (can't tell what that MTU is set to)

Any help would be GREATLY appreciated.

Steve




Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
responded Mar 24, 2017 by Kaustubh_Kelkar (1,780 points)   2 2 3
...