
[openstack-dev] [Neutron][L3] Representing networks connected by routers


I'm looking for feedback from anyone interested but, in particular, I'd
like feedback from the following people for varying perspectives:
Mark McClain (proposed alternate), John Belamaric (IPAM), Ryan Tidwell
(BGP), Neil Jerram (L3 networks), Aaron Rosen (help understand
multi-provider networks) and you if you're reading this list of names
and thinking "he forgot me!"

We have been struggling to develop a way to model a network which is
composed of disjoint L2 networks connected by routers. The intent of
this email is to describe the two proposals and request input on the
two in an attempt to choose a direction forward. But, first:
requirements.

Requirements:

The network should appear to end users as a single network choice.
They should not be burdened with choosing between segments. It might
interest them that L2 communications may not work between instances on
this network but that is all. This has been requested by numerous
operators [1][4]. It can be useful for external networks and provider
networks.

The model needs to be flexible enough to support two distinct types of
addresses: 1) address blocks which are statically bound to a single
segment and 2) address blocks which are mobile across segments using
some sort of dynamic routing capability like BGP or programmatically
injecting routes into the infrastructure's routers with a plugin.

Overlay networks are not the answer to this. The goal of this effort
is to scale very large networks with many connected ports by doing L3
routing (e.g. to the top of rack) instead of using a large continuous
L2 fabric. Also, the operators interested in this work do not want
the complexity of overlay networks [4].

Proposal 1:

We refined this model [2] at the Neutron mid-cycle a couple of weeks
ago. This proposal has already resonated reasonably with operators,
especially those from GoDaddy who attended the Neutron sprint. Some
key parts of this proposal are:

  1. The routed super network is called a front network. The segments
    are called back(ing) networks.
  2. Backing networks are modeled as admin-owned private provider
    networks but otherwise are full-blown Neutron networks.
  3. The front network is marked with a new provider type.
  4. A Neutron router is created to link the backing networks with
    internal ports. It represents the collective routing ability of the
    underlying infrastructure.
  5. Backing networks are associated with a subset of hosts.
  6. Ports created on the front network must have a host binding and
    are actually created on a backing network when all is said and done.
    They carry the ID of the backing network in the DB.
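The flow in points 5 and 6 can be sketched in a few lines of Python (the class and field names here are purely illustrative, not Neutron's actual data model):

```python
from dataclasses import dataclass, field

# Illustrative classes only; these are not Neutron's actual models.
@dataclass
class BackingNetwork:
    id: str
    hosts: set            # key part 5: a segment serves a subset of hosts

@dataclass
class FrontNetwork:
    id: str
    provider_type: str = "routed"     # key part 3: new provider type
    backings: list = field(default_factory=list)

    def create_port(self, host):
        # Key part 6: a front-network port needs a host binding and is
        # realized on whichever backing network serves that host.
        for backing in self.backings:
            if host in backing.hosts:
                return {"network_id": self.id,
                        "binding:host_id": host,
                        "backing_network_id": backing.id}
        raise ValueError("no backing network serves host %s" % host)

front = FrontNetwork("front-net")
front.backings = [BackingNetwork("rack1-net", {"host-a", "host-b"}),
                  BackingNetwork("rack2-net", {"host-c"})]
port = front.create_port("host-c")
print(port["backing_network_id"])   # rack2-net
```

The end user only ever names "front-net"; the backing network ID is an internal detail carried on the port.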

Using Neutron networks to model the segments allows us to fully
specify the details of each network using the regular Neutron model.
They could be heterogeneous or homogeneous, it doesn't matter.

This proposal offers a clear separation between the statically bound
and the mobile address blocks by associating the former with the
backing networks and the latter with the front network. The mobile
addresses are modeled just like floating IPs are today but are
implemented by some plugin code (possibly without NAT).

This proposal also provides some advantages for integrating dynamic
routing. Since each backing network will, by necessity, have a
corresponding router in the infrastructure, the relationship between
dynamic routing speaker, router, and network is clear in the model:
network <-> speaker <-> router.

Proposal 2:

This alternate model has not been fully fleshed out. Some parts of it
are still unclear to me. The basic idea is to give the IPAM system
information about IP availability on a given host. When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

  1. This alternate model offers no way to distinguish the two types of
    address blocks.
  2. We don't have the benefit of modeling the segments with Neutron networks.
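As a rough illustration of the basic idea, here is a toy host-aware allocator (the host-to-block mapping and the class name are hypothetical; a real implementation would sit behind Neutron's pluggable IPAM interface):

```python
import ipaddress

# Toy host-aware IPAM: the port's binding host determines which
# address block the allocation comes from. CIDRs are made up.
class HostAwareIPAM:
    def __init__(self, host_to_block):
        self.host_to_block = {h: ipaddress.ip_network(b)
                              for h, b in host_to_block.items()}
        self.next_offset = {h: 10 for h in host_to_block}  # skip infra IPs

    def allocate(self, binding_host):
        # Port create/bind would pass the binding info; we pick from
        # the block that is routable on that host's segment.
        block = self.host_to_block[binding_host]
        addr = block.network_address + self.next_offset[binding_host]
        self.next_offset[binding_host] += 1
        return str(addr)

ipam = HostAwareIPAM({"host-a": "10.1.0.0/24",   # rack 1
                      "host-c": "10.2.0.0/24"})  # rack 2
print(ipam.allocate("host-c"))   # 10.2.0.10
```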

It was suggested that hierarchical port binding could help here but I
see it as orthogonal to this. Hierarchical port binding extends the
L2 properties of a port to a hierarchical infrastructure to achieve
continuous L2 connectivity. It is also intended for overlay networks.
That isn't what we're doing here and I don't think it fits.

I have also considered the multi-provider extension [3] for this.
This is not yet clear to me either. First, my understanding was that
this extension describes multi-segment continuous L2 fabrics. Second,
there doesn't seem to be any host binding aspect to the multi-provider
extension. Third, not all L2 plugins support this extension. It
seems silly to require L2 plugin support in order to enable routing
between segments.

It isn't clear to me how a dynamic routing speaker will fit into this
model. My first thought is that it must be integrated with IPAM
because the IPAM system has the understanding of how to map address
blocks to infrastructure. This pushes even more infrastructure
knowledge down to the IPAM system. If dynamic routing is pushed down
to the IPAM system, it will also be necessary to push the association
of mobile IPs or routed tenant subnets down into the IPAM system too.
This means Neutron needs to tell IPAM about every floating IP
association and every tenant subnet behind a Neutron router in the
same address scope as the external network. I'm not convinced that
IPAM and routing really belong together like this.

If you made it this far in this email, you must have some feedback.
Please help us out.

Carl Baldwin

[1] https://bugs.launchpad.net/neutron/+bug/1458890
[2] https://review.openstack.org/#/c/196812/
[3] http://developer.openstack.org/api-ref-networking-v2-ext.html#network_multi_provider-ext
[4] https://etherpad.openstack.org/p/Network_Segmentation_Usecases


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Jul 20, 2015 in openstack-dev by Carl_Baldwin

43 Responses


There are two routed network models:

  • I give my VM an address that bears no relation to its location and ensure
    the routed fabric routes packets there - this is very much the routing
    protocol method for doing things where I have injected a route into the
    network and it needs to propagate. It's also pretty useless because there
    are too many host routes in any reasonable sized cloud.

  • I give my VM an address that is based on its location, which only becomes
    apparent at binding time. This means that the semantics of a port changes -
    a port has no address of any meaning until binding, because its location
    is related to what it does - and it leaves open questions about what to do
    when you migrate.

Now, you seem to generally be thinking in terms of the latter model,
particularly since the provider network model you're talking about fits
there. But then you say:

On 20 July 2015 at 10:33, Carl Baldwin <carl@ecbaldwin.net> wrote:

When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

No, it wouldn't, because creating and binding a port are separate
operations. I can't give the port a location-specific address on creation
- not until it's bound, in fact, which happens much later.

On proposal 1: consider the cost of adding a datamodel to Neutron. It has
to be respected by all developers, it frequently has to be deployed by all
operators, and every future change has to align with it. Plus either it
has to be generic or optional, and if optional it's a burden to some
proportion of Neutron developers and users. I accept proposal 1 is easy,
but it's not universally applicable. It doesn't work with Neil Jerram's
plans, it doesn't work with multiple interfaces per host, and it doesn't
work with the IPv6 routed-network model I worked on.

Given that, I wonder whether proposal 2 could be rephrased.

1: some network types don't allow unbound ports to have addresses, they
just get placeholder addresses for each subnet until they're bound
2: 'subnets' on these networks are more special than subnets on other
networks. (More accurately, they don't use subnets. It's a shame subnets
are core Neutron, because they're pretty horrible and yet hard to replace.)
3: there's an independent (in an extension? In another API endpoint?)
datamodel that the network points to and that IPAM consults to find a port
an address. Bonus, people who aren't using funky network types can disable
this extension.
4: when the port is bound, the IPAM is referred to, and it's told the
binding information of the port.
5: when binding the port, once IPAM has returned its address, the network
controller probably does stuff with that address when it completes the
binding (like initialising routing).
6: live migration either has to renumber a port or forward old traffic to
the new address via route injection. This is an open question now, so I'm
mentioning it rather than solving it.

In fact, adding that hook to IPAM at binding plus setting aside a 'not set'
IP address might be all you need to do to make it possible. The IPAM needs
data to work out what an address is, but that doesn't have to take the form
of existing Neutron constructs.
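The steps above could look roughly like this (a toy sketch; `bind_port`, the placeholder convention, and the IPAM lookup are assumptions, not existing Neutron code):

```python
# Toy sketch: a 'not set' placeholder until binding (step 1); at bind
# time the binding info goes to IPAM, which returns the real address
# (step 4); routing setup (step 5) is left as a comment.
PLACEHOLDER = "0.0.0.0"

class Port:
    def __init__(self):
        self.fixed_ip = PLACEHOLDER    # step 1: no meaningful address yet
        self.host = None

def bind_port(port, host, ipam_lookup):
    port.host = host
    port.fixed_ip = ipam_lookup(host)  # step 4: IPAM told the binding
    # step 5: the network controller would now program routing for
    # port.fixed_ip as part of completing the binding.
    return port

port = bind_port(Port(), "host-a", lambda h: {"host-a": "10.1.0.5"}[h])
print(port.fixed_ip)   # 10.1.0.5
```

Live migration (step 6) would re-enter `bind_port` with a new host, which is exactly where the renumber-or-inject-routes question arises.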


responded Jul 20, 2015 by Ian_Wells

----- Original Message -----
I'm looking for feedback from anyone interested but, in particular, I'd
like feedback from the following people for varying perspectives:
Mark McClain (proposed alternate), John Belamaric (IPAM), Ryan Tidwell
(BGP), Neil Jerram (L3 networks), Aaron Rosen (help understand
multi-provider networks) and you if you're reading this list of names
and thinking "he forgot me!"

We have been struggling to develop a way to model a network which is
composed of disjoint L2 networks connected by routers. The intent of
this email is to describe the two proposals and request input on the
two in an attempt to choose a direction forward. But, first:
requirements.

Requirements:

The network should appear to end users as a single network choice.
They should not be burdened with choosing between segments. It might
interest them that L2 communications may not work between instances on
this network but that is all. This has been requested by numerous
operators [1][4]. It can be useful for external networks and provider
networks.

I think that [1] and [4] are conflating the problem statement with the
proposed solutions, and lacking some lower-level details regarding the
problem statement, which makes it a lot harder to engage in a discussion.

I'm looking at [4]:
What I don't see explicitly mentioned is: Does the same CIDR extend across racks,
or would each rack get its own CIDR(s)? I understand this can differ according to
the architectural choices you make in your data center, and that changes the
choices we'd need to make to Neutron in order to satisfy that requirement.

To clarify, option (1) means that a subnet is confined to a rack. Option (2)
means that a subnet may span across racks. I don't think we need to change the network/subnet
model at all to satisfy case (1). Each rack would have its own network/subnet
(Or perhaps multiple, if more than a single VLAN or other characteristic is desired).
Each network would be tagged with an AZ (This ties in nicely to the already proposed Neutron AZ spec),
and the Nova scheduler would become aware of Neutron network AZs. In this model
you don't want to connect to a network, you want Nova to schedule the VM and then have Nova choose
the network on that rack. If you want more than a single network in a rack, then there's
some difference between those networks that could be expressed in tags (Think: Network flavors),
such as the security zone. You'd then specify a tag that should be satisfied by the
network that the VM ends up connecting to, so that the tag may be added to the list
of Nova scheduler filters. Again, this goes back to keeping the Neutron network and subnet
just as they are but doing some work with AZs, tagging and the Nova scheduler.
We've known that the Nova scheduler must become network-aware for the past few years;
perhaps it's time to finally tackle that.
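The selection step described above might be sketched like this (network records, AZ names and tags are all made up for illustration; nothing here is an existing Nova or Neutron API):

```python
# Toy network records tagged with an AZ (one per rack) and flavor-like
# tags; the scheduler has already placed the VM, so its AZ is known.
NETWORKS = [
    {"id": "net-rack1-dmz",  "az": "rack1", "tags": {"dmz"}},
    {"id": "net-rack1-priv", "az": "rack1", "tags": {"private"}},
    {"id": "net-rack2-priv", "az": "rack2", "tags": {"private"}},
]

def pick_network(host_az, required_tag=None):
    # Choose a network in the host's AZ; a tag (e.g. a security zone)
    # disambiguates when a rack has more than one network.
    for net in NETWORKS:
        if net["az"] != host_az:
            continue
        if required_tag is not None and required_tag not in net["tags"]:
            continue
        return net["id"]
    raise LookupError("no network in AZ %r with tag %r"
                      % (host_az, required_tag))

print(pick_network("rack1", "dmz"))   # net-rack1-dmz
```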

I can see why option (2) may require a fundamental change to how Neutron models networks/subnets.
I think it's essentially a different problem, and we'd have to see how we model
Neutron networks/subnets so that something like Calico would feel better. That being said,
if option (1) is worth pursuing, that would be a reasonable first step because any changes required
by option (2) are, I think, unrelated.

The model needs to be flexible enough to support two distinct types of
addresses: 1) address blocks which are statically bound to a single
segment and 2) address blocks which are mobile across segments using
some sort of dynamic routing capability like BGP or programmatically
injecting routes into the infrastructure's routers with a plugin.

Overlay networks are not the answer to this. The goal of this effort
is to scale very large networks with many connected ports by doing L3
routing (e.g. to the top of rack) instead of using a large continuous
L2 fabric. Also, the operators interested in this work do not want
the complexity of overlay networks [4].

Proposal 1:

We refined this model [2] at the Neutron mid-cycle a couple of weeks
ago. This proposal has already resonated reasonably with operators,
especially those from GoDaddy who attended the Neutron sprint. Some
key parts of this proposal are:

  1. The routed super network is called a front network. The segments
    are called back(ing) networks.
  2. Backing networks are modeled as admin-owned private provider
    networks but otherwise are full-blown Neutron networks.
  3. The front network is marked with a new provider type.
  4. A Neutron router is created to link the backing networks with
    internal ports. It represents the collective routing ability of the
    underlying infrastructure.
  5. Backing networks are associated with a subset of hosts.
  6. Ports created on the front network must have a host binding and
    are actually created on a backing network when all is said and done.
    They carry the ID of the backing network in the DB.

Using Neutron networks to model the segments allows us to fully
specify the details of each network using the regular Neutron model.
They could be heterogeneous or homogeneous, it doesn't matter.

This proposal offers a clear separation between the statically bound
and the mobile address blocks by associating the former with the
backing networks and the latter with the front network. The mobile
addresses are modeled just like floating IPs are today but are
implemented by some plugin code (possibly without NAT).

This proposal also provides some advantages for integrating dynamic
routing. Since each backing network will, by necessity, have a
corresponding router in the infrastructure, the relationship between
dynamic routing speaker, router, and network is clear in the model:
network <-> speaker <-> router.

Proposal 2:

This alternate model has not been fully fleshed out. Some parts of it
are still unclear to me. The basic idea is to give the IPAM system
information about IP availability on a given host. When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

  1. This alternate model offers no way to distinguish the two types of
    address blocks.
  2. We don't have the benefit of modeling the segments with Neutron networks.

It was suggested that hierarchical port binding could help here but I
see it as orthogonal to this. Hierarchical port binding extends the
L2 properties of a port to a hierarchical infrastructure to achieve
continuous L2 connectivity. It is also intended for overlay networks.
That isn't what we're doing here and I don't think it fits.

I have also considered the multi-provider extension [3] for this.
This is not yet clear to me either. First, my understanding was that
this extension describes multi-segment continuous L2 fabrics. Second,
there doesn't seem to be any host binding aspect to the multi-provider
extension. Third, not all L2 plugins support this extension. It
seems silly to require L2 plugin support in order to enable routing
between segments.

It isn't clear to me how a dynamic routing speaker will fit into this
model. My first thought is that it must be integrated with IPAM
because the IPAM system has the understanding of how to map address
blocks to infrastructure. This pushes even more infrastructure
knowledge down to the IPAM system. If dynamic routing is pushed down
to the IPAM system, it will also be necessary to push the association
of mobile IPs or routed tenant subnets down into the IPAM system too.
This means Neutron needs to tell IPAM about every floating IP
association and every tenant subnet behind a Neutron router in the
same address scope as the external network. I'm not convinced that
IPAM and routing really belong together like this.

If you made it this far in this email, you must have some feedback.
Please help us out.

Carl Baldwin

[1] https://bugs.launchpad.net/neutron/+bug/1458890
[2] https://review.openstack.org/#/c/196812/
[3]
http://developer.openstack.org/api-ref-networking-v2-ext.html#network_multi_provider-ext
[4] https://etherpad.openstack.org/p/Network_Segmentation_Usecases


responded Jul 21, 2015 by Assaf_Muller

On 20/07/15 18:36, Carl Baldwin wrote:
I'm looking for feedback from anyone interested but, in particular, I'd
like feedback from the following people for varying perspectives:
Mark McClain (proposed alternate), John Belamaric (IPAM), Ryan Tidwell
(BGP), Neil Jerram (L3 networks), Aaron Rosen (help understand
multi-provider networks) and you if you're reading this list of names
and thinking "he forgot me!"

We have been struggling to develop a way to model a network which is
composed of disjoint L2 networks connected by routers. The intent of
this email is to describe the two proposals and request input on the
two in an attempt to choose a direction forward. But, first:
requirements.

Requirements:

The network should appear to end users as a single network choice.
They should not be burdened with choosing between segments. It might
interest them that L2 communications may not work between instances on
this network but that is all. This has been requested by numerous
operators [1][4]. It can be useful for external networks and provider
networks.

The model needs to be flexible enough to support two distinct types of
addresses: 1) address blocks which are statically bound to a single
segment and 2) address blocks which are mobile across segments using
some sort of dynamic routing capability like BGP or programmatically
injecting routes into the infrastructure's routers with a plugin.

FWIW, I hadn't previously realized (2) here.

Overlay networks are not the answer to this. The goal of this effort
is to scale very large networks with many connected ports by doing L3
routing (e.g. to the top of rack) instead of using a large continuous
L2 fabric. Also, the operators interested in this work do not want
the complexity of overlay networks [4].

Proposal 1:

We refined this model [2] at the Neutron mid-cycle a couple of weeks
ago. This proposal has already resonated reasonably with operators,
especially those from GoDaddy who attended the Neutron sprint. Some
key parts of this proposal are:

  1. The routed super network is called a front network. The segments
    are called back(ing) networks.
  2. Backing networks are modeled as admin-owned private provider
    networks but otherwise are full-blown Neutron networks.
  3. The front network is marked with a new provider type.
  4. A Neutron router is created to link the backing networks with
    internal ports. It represents the collective routing ability of the
    underlying infrastructure.
  5. Backing networks are associated with a subset of hosts.
  6. Ports created on the front network must have a host binding and
    are actually created on a backing network when all is said and done.
    They carry the ID of the backing network in the DB.

Using Neutron networks to model the segments allows us to fully
specify the details of each network using the regular Neutron model.
They could be heterogeneous or homogeneous, it doesn't matter.

You've probably seen Robert Kukura's comment on the related bug at
https://bugs.launchpad.net/neutron/+bug/1458890/comments/30, and there
is a useful detailed description of how the multiprovider extension
works at
https://bugs.launchpad.net/openstack-api-site/+bug/1242019/comments/3.
I believe it is correct to say that using multiprovider would be an
effective substitute for using multiple backing networks with different
{network_type, physical_network, segmentation_id}, and that logically
multiprovider is aiming to describe the same thing as this email thread
is, i.e. non-overlay mapping onto a physical network composed of
multiple segments.
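For concreteness, a multi-provider network's API representation carries a list of segment descriptors, roughly of this shape (a sketch based on the extension's API reference [3]; the physnet names and VLAN IDs are made up):

```python
# Rough shape of a multi-provider network body: one network mapping to
# several {network_type, physical_network, segmentation_id} tuples.
multiprovider_net = {
    "network": {
        "name": "multinet",
        "segments": [
            {"provider:network_type": "vlan",
             "provider:physical_network": "physnet1",
             "provider:segmentation_id": 100},
            {"provider:network_type": "vlan",
             "provider:physical_network": "physnet2",
             "provider:segmentation_id": 200},
        ],
    }
}
# Note: nothing in this structure ties a segment to a set of hosts.
print(len(multiprovider_net["network"]["segments"]))   # 2
```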

However, I believe multiprovider does not (per se) address the IP
addressing requirement(s) of the multi-segment scenario.

This proposal offers a clear separation between the statically bound
and the mobile address blocks by associating the former with the
backing networks and the latter with the front network. The mobile
addresses are modeled just like floating IPs are today but are
implemented by some plugin code (possibly without NAT).

Couldn't the mobile addresses be exactly like floating IPs already
are? Why is anything different from floating IPs needed here?

This proposal also provides some advantages for integrating dynamic
routing. Since each backing network will, by necessity, have a
corresponding router in the infrastructure, the relationship between
dynamic routing speaker, router, and network is clear in the model:
network <-> speaker <-> router.

I'm not sure exactly what you mean here by 'dynamic routing', but I
think this touches on a key point: can IP routing happen anywhere in a
Neutron network, without being explicitly represented by a router object
in the model?

I think the answer to that should be yes. It clearly already is in the
underlay if you are using tunnels - the tunnel between two compute hosts
may require multiple IP hops across the fabric. At the network level
that Neutron networks currently model, the answer is currently no, but I
think it's interesting to consider changing that.

Proposal 2:

This alternate model has not been fully fleshed out.

I should begin by admitting the blame here. Much of this is a
half-baked idea from me, that I haven't yet had time to explore
properly. However....

Some parts of it
are still unclear to me. The basic idea is to give the IPAM system
information about IP availability on a given host. When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

Right. A key requirement, for this to be possible, is that Nova's host
selection happens before the IPAM system is asked to allocate an IP
address. I have an action to investigate that, but if anyone happens to
know already, please do say.

  1. This alternate model offers no way to distinguish the two types of
    address blocks.

Agreed. But I wonder if normal floating IPs can be used for the mobile
IP addresses (as also suggested above).

  2. We don't have the benefit of modeling the segments with Neutron networks.

Agreed, but it appears that multiprovider has already taken a different
view here, and already provides the ability for a network to map to
multiple {network_type, physical_network, segmentation_id} tuples.

It was suggested that hierarchical port binding could help here but I
see it as orthogonal to this. Hierarchical port binding extends the
L2 properties of a port to a hierarchical infrastructure to achieve
continuous L2 connectivity. It is also intended for overlay networks.
That isn't what we're doing here and I don't think it fits.

I have also considered the multi-provider extension [3] for this.
This is not yet clear to me either. First, my understanding was that
this extension describes multi-segment continuous L2 fabrics.

https://bugs.launchpad.net/openstack-api-site/+bug/1242019/comments/3 says:

"Note that, although ML2 can manage binding to multi-segment networks,
neutron does not manage bridging between the segments of a multi-segment
network. This is assumed to be done administratively."

So I think it is not intended for a multiprovider network to be
"continuous".

Again, this touches on the point above about routing happening without
being explicitly represented in the Neutron model...

Second,
there doesn't seem to be any host binding aspect to the multi-provider
extension. Third, not all L2 plugins support this extension. It
seems silly to require L2 plugin support in order to enable routing
between segments.

Good point. If all plugins required the same kind of transformation to
support multiprovider, perhaps that's telling us that the multi-ness
should instead be in a layer above, more like your proposal 1.

It isn't clear to me how a dynamic routing speaker will fit into this
model. My first thought is that it must be integrated with IPAM
because the IPAM system has the understanding of how to map address
blocks to infrastructure. This pushes even more infrastructure
knowledge down to the IPAM system. If dynamic routing is pushed down
to the IPAM system, it will also be necessary to push the association
of mobile IPs or routed tenant subnets down into the IPAM system too.
This means Neutron needs to tell IPAM about every floating IP
association and every tenant subnet behind a Neutron router in the
same address scope as the external network. I'm not convinced that
IPAM and routing really belong together like this.

I'm afraid I don't yet sufficiently understand the 'dynamic routing'
requirements here. Can you say more about them?

If you made it this far in this email, you must have some feedback.
Please help us out.

There are a lot of moving parts here. I'm afraid I don't yet see any
clarity, but perhaps if we talk about this enough, that will eventually
emerge!

Regards,
Neil


responded Jul 21, 2015 by Neil_Jerram

On 20/07/15 23:27, Ian Wells wrote:
There are two routed network models:

  • I give my VM an address that bears no relation to its location and ensure the routed fabric routes packets there - this is very much the routing protocol method for doing things where I have injected a route into the network and it needs to propagate. It's also pretty useless because there are too many host routes in any reasonable sized cloud.
  • I give my VM an address that is based on its location, which only becomes apparent at binding time. This means that the semantics of a port changes - a port has no address of any meaning until binding, because its location is related to what it does - and it leaves open questions about what to do when you migrate.

Now, you seem to generally be thinking in terms of the latter model, particularly since the provider network model you're talking about fits there.

Right.

But then you say:

On 20 July 2015 at 10:33, Carl Baldwin <carl@ecbaldwin.net> wrote:
When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

No, it wouldn't, because creating and binding a port are separate operations. I can't give the port a location-specific address on creation - not until it's bound, in fact, which happens much later.

Thanks, good point. And does IP allocation currently happen when a port is created?

(By the way, (1) any faults in the IPAM-related proposal are really mine, not Carl's; he's just trying to present my half-baked idea as fairly as he can; (2) therefore I really appreciate your feedback on it!)

On proposal 1: consider the cost of adding a datamodel to Neutron. It has to be respected by all developers, it frequently has to be deployed by all operators, and every future change has to align with it. Plus either it has to be generic or optional, and if optional it's a burden to some proportion of Neutron developers and users.

I suppose any Neutron API enhancement will have some cost, and proposal 1 has the one specific aspect that Mark McClain pointed out, that ports end up with a backing network ID, not the front network ID, and that that may surprise a lot of existing plugin code. I don't really see your "has to be deployed by all operators", "every future change has to align with it" and "if optional it's a burden ..." points, though.

I accept proposal 1 is easy, but it's not universally applicable. It doesn't work with Neil Jerram's plans,

I'm not sure that this current discussion has to address all possible use cases. To be clear about what you mean, though: do you mean that representing [routed] in terms of proposal 1 would require a backing network for each VM? (If so, I agree that that wouldn't be good!)

[routed] https://review.openstack.org/#/c/198439/

it doesn't work with multiple interfaces per host,

Why not?

and it doesn't work with the IPv6 routed-network model I worked on.

Can you give a pointer?

Given that, I wonder whether proposal 2 could be rephrased.

1: some network types don't allow unbound ports to have addresses, they just get placeholder addresses for each subnet until they're bound

Should that say "some network types allow unbound ports not to have addresses"?

2: 'subnets' on these networks are more special than subnets on other networks. (More accurately, they dont use subnets. It's a shame subnets are core Neutron, because they're pretty horrible and yet hard to replace.)

Not sure on your exact point here, but what about new subnet pools?

3: there's an independent (in an extension? In another API endpoint?) datamodel that the network points to and that IPAM consults to find a port an address. Bonus, people who aren't using funky network types can disable this extension.

Well that would be IPAM-module-internal anyway; I'd say that - at least during phase 1 - it could do whatever it likes to work out sensible IP allocation. Longer term, sure, one could create a formal datamodel for this.

4: when the port is bound, the IPAM is referred to, and it's told the binding information of the port.
5: when binding the port, once IPAM has returned its address, the network controller probably does stuff with that address when it completes the binding (like initialising routing).

I believe it's already supported for a port's fixed IPs to change - so hopefully nothing particularly new here.

6: live migration either has to renumber a port or forward old traffic to the new address via route injection. This is an open question now, so I'm mentioning it rather than solving it.

In fact, adding that hook to IPAM at binding plus setting aside a 'not set' IP address might be all you need to do to make it possible. The IPAM needs data to work out what an address is, but that doesn't have to take the form of existing Neutron constructs.

Thanks! That's very much what I was thinking too, so really appreciate your support and adding useful flesh to the bone.

Regards,
Neil


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 21, 2015 by Neil_Jerram (8,580 points)   1 4 11
0 votes

[Sorry, mistaken send as mixed format, so quoting may not have come out
right. Hope this time is better...]

On 20/07/15 23:27, Ian Wells wrote:
There are two routed network models:

  • I give my VM an address that bears no relation to its location and
    ensure the routed fabric routes packets there - this is very much the
    routing protocol method for doing things where I have injected a route
    into the network and it needs to propagate. It's also pretty useless
    because there are too many host routes in any reasonable sized cloud.

  • I give my VM an address that is based on its location, which only
    becomes apparent at binding time. This means that the semantics of a
    port changes - a port has no address of any meaning until binding,
    because its location is related to what it does - and it leaves open
    questions about what to do when you migrate.

Now, you seem to generally be thinking in terms of the latter model,
particularly since the provider network model you're talking about
fits there.

Right.

But then you say:

On 20 July 2015 at 10:33, Carl Baldwin <carl@ecbaldwin.net
carl@ecbaldwin.net> wrote:

When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

No, it wouldn't, because creating and binding a port are separate
operations. I can't give the port a location-specific address on
creation - not until it's bound, in fact, which happens much later.

Thanks, good point. And does IP allocation currently happen when a port
is created?

(By the way, (1) any faults in the IPAM-related proposal are really
mine, not Carl's; he's just trying to present my half-baked idea as
fairly as he can; (2) therefore I really appreciate your feedback on it!)

On proposal 1: consider the cost of adding a datamodel to Neutron. It
has to be respected by all developers, it frequently has to be
deployed by all operators, and every future change has to align with
it. Plus either it has to be generic or optional, and if optional
it's a burden to some proportion of Neutron developers and users.

I suppose any Neutron API enhancement will have some cost, and proposal
1 has the one specific aspect that Mark McClain pointed out, that ports
end up with a backing network ID, not the front network ID, and that
that may surprise a lot of existing plugin code. I don't really see
your "has to be deployed by all operators", "every future change has to
align with it" and "if optional it's a burden ..." points, though.

I accept proposal 1 is easy, but it's not universally applicable.
It doesn't work with Neil Jerram's plans,

I'm not sure that this current discussion has to address all possible
use cases. To be clear about what you mean, though: do you mean that
representing [routed] in terms of proposal 1 would require a backing
network for each VM? (If so, I agree that that wouldn't be good!)

[routed] https://review.openstack.org/#/c/198439/

it doesn't work with multiple interfaces per host,

Why not?

and it doesn't work with the IPv6 routed-network model I worked on.

Can you give a pointer?

Given that, I wonder whether proposal 2 could be rephrased.

1: some network types don't allow unbound ports to have addresses,
they just get placeholder addresses for each subnet until they're bound

Should that say "some network types allow unbound ports not to have
addresses"?

2: 'subnets' on these networks are more special than subnets on other
networks. (More accurately, they don't use subnets. It's a shame
subnets are core Neutron, because they're pretty horrible and yet hard
to replace.)

Not sure on your exact point here, but what about new subnet pools?

3: there's an independent (in an extension? In another API endpoint?)
datamodel that the network points to and that IPAM refers to to find a
port an address. Bonus, people who aren't using funky network types
can disable this extension.

Well that would be IPAM-module-internal anyway; I'd say that - at least
during phase 1 - it could do whatever it likes to work out sensible IP
allocation. Longer term, sure, one could create a formal datamodel for
this.
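(Purely as an illustration of that phase-1 idea - the names and data here are hypothetical, not an existing Neutron interface - the IPAM module could map binding hosts to per-segment address pools along these lines:)

```python
# Hypothetical sketch: a host-aware IPAM allocator (not real Neutron code).
import ipaddress

# Assumed static mapping from host to the address blocks reachable from
# its segment; a real driver would learn this from deployment data.
HOST_POOLS = {
    "compute-rack1-01": ["10.1.0.0/24"],
    "compute-rack2-01": ["10.2.0.0/24"],
}

_allocated = set()

def allocate(host, ip_version=4):
    """Return the first free address from a block on the host's segment."""
    for cidr in HOST_POOLS.get(host, []):
        net = ipaddress.ip_network(cidr)
        if net.version != ip_version:
            continue
        for addr in net.hosts():
            if addr not in _allocated:
                _allocated.add(addr)
                return str(addr)
    raise RuntimeError("no address available on host %s" % host)
```

The point being that the mapping is internal to the IPAM module: nothing here needs a new core Neutron construct.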

4: when the port is bound, the IPAM is referred to, and it's told the
binding information of the port.
5: when binding the port, once IPAM has returned its address, the
network controller probably does stuff with that address when it
completes the binding (like initialising routing).

I believe it's already supported for a port's fixed IPs to change - so
hopefully nothing particularly new here.

6: live migration either has to renumber a port or forward old traffic
to the new address via route injection. This is an open question now,
so I'm mentioning it rather than solving it.

In fact, adding that hook to IPAM at binding plus setting aside a 'not
set' IP address might be all you need to do to make it possible. The
IPAM needs data to work out what an address is, but that doesn't have
to take the form of existing Neutron constructs.
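(A minimal sketch of that 'not set' placeholder idea - hypothetical names, not real Neutron code - might look like:)

```python
# Hypothetical sketch: address assignment deferred until port binding.
UNADDRESSED = "0.0.0.0"  # the 'not set' placeholder mentioned above

def create_port():
    # At creation time the port has no meaningful address.
    return {"fixed_ip": UNADDRESSED, "host": None}

def bind_port(port, host, ipam_allocate):
    # Binding supplies the host; only now can IPAM pick a
    # location-appropriate address (point 4 above), after which the
    # network controller can act on it (point 5, e.g. set up routing).
    port["host"] = host
    port["fixed_ip"] = ipam_allocate(host)
    return port
```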

Thanks! That's very much what I was thinking too, so really appreciate
your support and adding useful flesh to the bone.

Regards,
Neil


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 21, 2015 by Neil_Jerram
0 votes

A few comments inline.

Generally speaking the only thing I'd like to remark is that this use case
makes sense independently of whether you are using overlay, or any other
"SDN" solution (whatever SDN means to you).

Also, please note that this thread is now split in two - there's a new
branch starting with Ian's post. So perhaps let's make two threads.

On 21 July 2015 at 14:21, Neil Jerram Neil.Jerram@metaswitch.com wrote:

On 20/07/15 18:36, Carl Baldwin wrote:

I'm looking for feedback from anyone interested but, in particular, I'd
like feedback from the following people for varying perspectives:
Mark McClain (proposed alternate), John Belamaric (IPAM), Ryan Tidwell
(BGP), Neil Jerram (L3 networks), Aaron Rosen (help understand
multi-provider networks) and you if you're reading this list of names
and thinking "he forgot me!"

We have been struggling to develop a way to model a network which is
composed of disjoint L2 networks connected by routers. The intent of
this email is to describe the two proposals and request input on the
two in an attempt to choose a direction forward. But, first:
requirements.

Requirements:

The network should appear to end users as a single network choice.
They should not be burdened with choosing between segments. It might
interest them that L2 communications may not work between instances on
this network but that is all.

It is, however, important to ensure services like DHCP keep working as usual.
Treating segments as logical networks in their own right is the simplest
solution to achieve this IMHO.

This has been requested by numerous

operators [1][4]. It can be useful for external networks and provider
networks.

The model needs to be flexible enough to support two distinct types of
addresses: 1) address blocks which are statically bound to a single
segment and 2) address blocks which are mobile across segments using
some sort of dynamic routing capability like BGP or programmatically
injecting routes in to the infrastructure's routers with a plugin.

FWIW, I hadn't previously realized (2) here.

A "mobile address block" translates to a subnet whose network association
might change.
Achieving mobile address blocks does not seem simple to me at all. Route
injection (boring) and BGP might solve the networking aspect of the
problem, but we'd also need coordination with the compute service to ensure
that all the workloads using addresses from the mobile block migrate as
well; unless I've misunderstood the way these mobile address blocks work, I
struggle to see this as a requirement.

Overlay networks are not the answer to this. The goal of this effort
is to scale very large networks with many connected ports by doing L3
routing (e.g. to the top of rack) instead of using a large continuous
L2 fabric.

As a side note, I find it interesting that overlays were indeed proposed as
a solution to avoid hybrid L2/L3 networks or having to span VLANs across
the core and aggregation layers.

Also, the operators interested in this work do not want

the complexity of overlay networks [4].

Proposal 1:

We refined this model [2] at the Neutron mid-cycle a couple of weeks
ago. This proposal has already resonated reasonably with operators,
especially those from GoDaddy who attended the Neutron sprint. Some
key parts of this proposal are:

  1. The routed super network is called a front network. The segments
    are called back(ing) networks.
  2. Backing networks are modeled as admin-owned private provider
    networks but otherwise are full-blown Neutron networks.
  3. The front network is marked with a new provider type.
  4. A Neutron router is created to link the backing networks with
    internal ports. It represents the collective routing ability of the
    underlying infrastructure.
  5. Backing networks are associated with a subset of hosts.
  6. Ports created on the front network must have a host binding and
    are actually created on a backing network when all is said and done.
    They carry the ID of the backing network in the DB.
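(To make points 5 and 6 concrete, here is a rough sketch - hypothetical names and data, not actual Neutron code - of how a port created on the front network would land on a backing network:)

```python
# Hypothetical sketch of proposal 1's port-creation step.
FRONT_NETWORK = "front-net"

# Point 5: each backing network is associated with a subset of hosts.
BACKING_NETWORKS = {
    "backing-rack1": {"host-a", "host-b"},
    "backing-rack2": {"host-c"},
}

def create_port(network_id, host):
    if network_id != FRONT_NETWORK:
        # Ordinary network: no indirection needed.
        return {"network_id": network_id, "host": host}
    for backing_id, hosts in sorted(BACKING_NETWORKS.items()):
        if host in hosts:
            # Point 6: the port carries the backing network's ID in the DB.
            return {"network_id": backing_id, "host": host}
    raise ValueError("no backing network reaches host %s" % host)
```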

While the logical model and workflow you describe here make sense, I have
the impression that:
1) The front network is not a Neutron logical network, because it does not
really behave like a network; the only exception is that you can pass its
id to the Nova API. To reinforce this, consider that the front network
basically has no ports.
2) From a topological perspective the front network "kind of" behaves like
an external network; but it isn't one. The front network is not really a
common gateway for all backing networks; it's more like a label attached to
the router which interconnects all the backing networks.
3) More on topology: how can we know that all these segments will always be
connected by a single logical router? Using static routes (or, if one day
BGP becomes a thing, dynamic ones), it is already possible to implement
multi-segment networks with L3 connectivity using multiple logical routers,
isn't it?
4) Point #5 is making assumptions about network-aware scheduling. I am not
sure we already have the ability to inform the nova scheduler to deploy an
instance on a host where a given network is available.
5) I think that I would treat the "front" network as a "network group" or
"cluster". I noticed the term "subnet cluster" is used in the etherpad. I
find this term appropriate because it seems to me that in this scenario the
final user does not care at all about the network intended as an L2 segment.
6) It seems one of the purposes of using backing networks is to identify an
address block for the ports being created. But then how would that play
with mobile address blocks? From an instance workflow perspective, should
instances be associated with one or more address blocks at boot time?
7) What happens if a user attaches a router to a backing network and
connects that router to an external network? Does that become a gateway for
all backing networks or just for that one? And what would the workflow be
for uplinking a front network to an external network?

Using Neutron networks to model the segments allows us to fully
specify the details of each network using the regular Neutron model.
They could be heterogeneous or homogeneous, it doesn't matter.

You've probably seen Robert Kukura's comment on the related bug at
https://bugs.launchpad.net/neutron/+bug/1458890/comments/30, and there
is a useful detailed description of how the multiprovider extension
works at
https://bugs.launchpad.net/openstack-api-site/+bug/1242019/comments/3.
I believe it is correct to say that using multiprovider would be an
effective substitute for using multiple backing networks with different
{network_type, physical_network, segmentation_id}, and that logically
multiprovider is aiming to describe the same thing as this email thread
is, i.e. non-overlay mapping onto a physical network composed of
multiple segments.
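(For reference, a multiprovider network carries a `segments` list of per-segment provider attributes; the concrete values below are just illustrative:)

```python
# Request body shape for a multi-segment network under the multiprovider
# extension: one {network_type, physical_network, segmentation_id} tuple
# per segment (illustrative values).
multiprovider_network = {
    "network": {
        "name": "multinet",
        "segments": [
            {"provider:network_type": "vlan",
             "provider:physical_network": "physnet1",
             "provider:segmentation_id": 100},
            {"provider:network_type": "vlan",
             "provider:physical_network": "physnet2",
             "provider:segmentation_id": 200},
        ],
    }
}
```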

However, I believe multiprovider does not (per se) address the IP
addressing requirement(s) of the multi-segment scenario.

Indeed it does not. The multiprovider extension simply indicates that a
network can be built using different L2 segments.
It is then up to the operator to ensure that these segments are correct,
and it's up to whatever is running in the backend to ensure that instances
on the various segments can communicate with each other.

I believe the ask here is for Neutron to provide this capability (the
neutron reference control plane currently doesn't). It is not yet entirely
clear to me whether there's a real need to change the logical model, but
IP addressing implications might be a reason, as pointed out by Neil.

This proposal offers a clear separation between the statically bound
and the mobile address blocks by associating the former with the
backing networks and the latter with the front network. The mobile
addresses are modeled just like floating IPs are today but are
implemented by some plugin code (possibly without NAT).

Couldn't the mobile addresses be exactly like floating IPs already
are? Why is anything different from floating IPs needed here?

This proposal also provides some advantages for integrating dynamic
routing. Since each backing network will, by necessity, have a
corresponding router in the infrastructure, the relationship between
dynamic routing speaker, router, and network is clear in the model:
network <-> speaker <-> router.

Ok. But how does that change because of backing networks? I believe the
same relationship holds true for every network, or am I wrong?

I'm not sure exactly what you mean here by 'dynamic routing', but I
think this touches on a key point: can IP routing happen anywhere in a
Neutron network, without being explicitly represented by a router object
in the model?

I think the answer to that should be yes.

But this would also mean that we should consider doing without the very
concept of router in Neutron.
If we look at the scenarios we're describing here, I'd agree with you, but
unfortunately Neutron is required to serve a wide variety of scenarios.

It clearly already is in the
underlay if you are using tunnels - the tunnel between two compute hosts
may require multiple IP hops across the fabric. At the network level
that Neutron networks currently model, the answer is currently no, but I
think it's interesting to consider changing that.

Proposal 2:

This alternate model has not been fully fleshed out.

I should begin by admitting the blame here. Much of this is a
half-baked idea from me, that I haven't yet had time to explore
properly. However....

Some parts of it
are still unclear to me. The basic idea is to give the IPAM system
information about IP availability on a given host. When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

To make a link to proposal #1, I read this as informing the IPAM system of
which backing network(s) can be implemented on the host which has been
selected.
But I am not 100% convinced that the two proposals implement the same
workflow.

Right. A key requirement, for this to be possible, is that Nova's host
selection happens before the IPAM system is asked to allocate an IP
address. I have an action to investigate that, but if anyone happens to
know already, please do say.

I am 99.99% sure this is not possible at the moment unless something is
done to make the nova scheduler network-aware.
Also, this will add a point of coupling between the instance boot and
network provisioning processes, which are independent at the moment.

  1. This alternate model offers no way to distinguish the two types of
    address blocks.

Agreed. But I wonder if normal floating IPs can be used for the mobile
IP addresses (as also suggested above).

I get the concept, but it's not really a floating IP in neutron terms, as
that implies SNAT/DNAT.
Also, from what I gather it's not about single mobile addresses, but we're
talking about entire subnets that can be moved around.

  1. We don't have the benefit of modeling the segments with Neutron
    networks.

Agreed, but it appears that multiprovider has already taken a different
view here, and already provides the ability for a network to map to
multiple {network_type, physical_network, segmentation_id} tuples.

Modelling segments as logical networks is not necessarily a benefit in my
opinion;
it's more a convenience. For instance, the reference control plane might
implement provider networks in a way such that:
1) a "ghost router" is created in the l3 agent to ensure E-W traffic across
all segments (the router is "ghost" because it's not exposed as a neutron
logical router)
2) a distinct dnsmasq instance is started on every segment of the network
to ensure DHCP functionality
3) metadata services can be provided through the ghost router rather than
using isolated metadata

I think this alternative is worth exploring anyway.

It was suggested that hierarchical port binding could help here but I
see it as orthogonal to this. Hierarchical port binding extends the
L2 properties of a port to a hierarchical infrastructure to achieve
continuous L2 connectivity. It is also intended for overlay networks.
That isn't what we're doing here and I don't think it fits.

I have also considered the multi-provider extension [3] for this.
This is not yet clear to me either. First, my understanding was that
this extension describes multi-segment continuous L2 fabrics.

https://bugs.launchpad.net/openstack-api-site/+bug/1242019/comments/3
says:

"Note that, although ML2 can manage binding to multi-segment networks,
neutron does not manage bridging between the segments of a multi-segment
network. This is assumed to be done administratively."

So I think it is not intended for a multiprovider network to be
"continuous".

Again, this touches on the point above about routing happening without
being explicitly represented in the Neutron model...

Second,
there doesn't seem to be any host binding aspect to the multi-provider
extension. Third, not all L2 plugins support this extension. It
seems silly to require L2 plugin support in order to enable routing
between segments.

Good point. If all plugins required the same kind of transformation to
support multiprovider, perhaps that's telling us that the multi-ness
should instead be in a layer above, more like your proposal 1.

It isn't clear to me how a dynamic routing speaker will fit in to this
model. My first thought is that it must be integrated with IPAM
because the IPAM system has the understanding of how to map address
blocks to infrastructure. This pushes even more infrastructure
knowledge down to the IPAM system. If dynamic routing is pushed down
to the IPAM system, it will also be necessary to push the association
of mobile IPs or routed tenant subnets down in to the IPAM system too.
This means Neutron needs to tell IPAM about every floating IP
association and every tenant subnet behind a Neutron router in the
same address scope as the external network. I'm not convinced that
IPAM and routing really belong together like this.

I'm afraid I don't yet sufficiently understand the 'dynamic routing'
requirements here. Can you say more about them?

If you made it this far in this email, you must have some feedback.
Please help us out.

There are a lot of moving parts here. I'm afraid I don't yet see any
clarity, but perhaps if we talk about this enough, that will eventually
emerge!

Regards,
Neil


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 21, 2015 by salv.orlando_at_gmai
0 votes

On 21/07/15 01:47, Assaf Muller wrote:

----- Original Message -----

I'm looking for feedback from anyone interested but, in particular, I'd
like feedback from the following people for varying perspectives:
Mark McClain (proposed alternate), John Belamaric (IPAM), Ryan Tidwell
(BGP), Neil Jerram (L3 networks), Aaron Rosen (help understand
multi-provider networks) and you if you're reading this list of names
and thinking "he forgot me!"

We have been struggling to develop a way to model a network which is
composed of disjoint L2 networks connected by routers. The intent of
this email is to describe the two proposals and request input on the
two in an attempt to choose a direction forward. But, first:
requirements.

Requirements:

The network should appear to end users as a single network choice.
They should not be burdened with choosing between segments. It might
interest them that L2 communications may not work between instances on
this network but that is all. This has been requested by numerous
operators [1][4]. It can be useful for external networks and provider
networks.
I think that [1] and [4] are conflating the problem statement with the
proposed solutions, and lacking some lower level details regarding the
problem statement, which makes it a lot harder to engage in a discussion.

I'm looking at [4]:
What I don't see explicitly mentioned is: Does the same CIDR extend across racks,
or would each rack get its own CIDR(s)?

I think it's the latter, i.e. what you call option (1) below.

I understand this can differ according to
the architectural choices you make in your data center, and that changes the
choices we'd need to make to Neutron in order to satisfy that requirement.

To clarify, option (1) means that a subnet is confined to a rack. Option (2)
means that a subnet may span across racks. I don't think we need to change the network/subnet
model at all to satisfy case (1). Each rack would have its own network/subnet
(Or perhaps multiple, if more than a single VLAN or other characteristic is desired).
Each network would be tagged with an AZ (This ties in nicely to the already proposed Neutron AZ spec),
and the Nova scheduler would become aware of Neutron network AZs. In this model
you don't want to connect to a network, you want Nova to schedule the VM and then have Nova choose
the network on that rack. If you want more than a single network in a rack, then there's
some difference between those networks that could be expressed in tags (Think: Network flavors),
such as the security zone. You'd then specify a tag that should be satisfied by the
network that the VM ends up connecting to, so that the tag may be added to the list
of Nova scheduler filters. Again, this goes back to keeping the Neutron network and subnet
just as they are but doing some work with AZs, tagging and the Nova scheduler.
We've known that the Nova scheduler must become Network aware for the past few years,
perhaps it's time to finally tackle that.

Interesting. Perhaps we can do something along those lines that will
fly without lots of change in Nova/Neutron interactions:

  • allow a Neutron network to have tags associated with it

  • when launching a set of VMs, allow specifying a network tag, instead
    of a specific network name/ID, with the meaning that each VM can attach
    to any network that has that tag.

Longer term the Nova scheduler could become tag-aware, as you suggest,
but until then I think what will happen is that

  • Nova will choose a host independently of the network tag

  • if it isn't possible for Neutron to bind a port on that host to a
    network with the requested tag, it will bounce back to Nova, and Nova
    will try the next available host (?)

So, inefficient, but kind of already working.
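(The bounce-and-retry flow sketched above, in hypothetical pseudo-Python - none of these names is an existing Nova/Neutron interface:)

```python
# Hypothetical sketch of tag-based network selection with host retry.
NETWORKS = [
    {"id": "net-rack1", "tags": {"prod"}, "hosts": {"host-a"}},
    {"id": "net-rack2", "tags": {"prod"}, "hosts": {"host-b"}},
]

def bind_port(host, tag):
    """Bind on `host` to any network carrying `tag`; None if impossible."""
    for net in NETWORKS:
        if tag in net["tags"] and host in net["hosts"]:
            return net["id"]
    return None

def launch(candidate_hosts, tag):
    # Nova picks hosts in order, ignoring the tag; a Neutron binding
    # failure bounces the request back and the next host is tried.
    for host in candidate_hosts:
        net_id = bind_port(host, tag)
        if net_id is not None:
            return host, net_id
    raise RuntimeError("no host can reach a network tagged %r" % tag)
```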

Effectively, with this model, the tag is replacing Carl's front
network. Which nicely side-steps any confusion above which network ID a
port is expected to have.

Regards,
Neil


responded Jul 21, 2015 by Neil_Jerram
0 votes

On Jul 20, 2015 4:26 PM, "Ian Wells" ijw.ubuntu@cack.org.uk wrote:

There are two routed network models:

  • I give my VM an address that bears no relation to its location and
    ensure the routed fabric routes packets there - this is very much the
    routing protocol method for doing things where I have injected a route into
    the network and it needs to propagate. It's also pretty useless because
    there are too many host routes in any reasonable sized cloud.

  • I give my VM an address that is based on its location, which only
    becomes apparent at binding time. This means that the semantics of a port
    changes - a port has no address of any meaning until binding, because its
    location is related to what it does - and it leaves open questions about
    what to do when you migrate.

Now, you seem to generally be thinking in terms of the latter model,
particularly since the provider network model you're talking about fits
there. But then you say:

Actually, both. For example, GoDaddy assigns each VM an IP from the
location-based address blocks and optionally one from the routed,
location-agnostic ones. I would also like to assign router ports out of the
location-based blocks, which could host floating IPs from the other blocks.

On 20 July 2015 at 10:33, Carl Baldwin carl@ecbaldwin.net wrote:

When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

Implicit in both is a need to provide at least a hint at host binding. Or,
delay address assignment until binding. I didn't mention it because my
email was already long.
This is something we discussed, but it applies equally to both proposals.

No, it wouldn't, because creating and binding a port are separate
operations. I can't give the port a location-specific address on creation
- not until it's bound, in fact, which happens much later.

On proposal 1: consider the cost of adding a datamodel to Neutron. It
has to be respected by all developers, it frequently has to be deployed by
all operators, and every future change has to align with it. Plus either
it has to be generic or optional, and if optional it's a burden to some
proportion of Neutron developers and users. I accept proposal 1 is easy,
but it's not universally applicable. It doesn't work with Neil Jerram's
plans, it doesn't work with multiple interfaces per host, and it doesn't
work with the IPv6 routed-network model I worked on.

Please be more specific. I'm not following your argument here. My
proposal doesn't really add much new data model.

We've discussed this with Neil at length. I haven't been able to reconcile
our respective approaches into one model that works for both of us and
still provides value. The routed segments model needs to somehow handle
the L2 details of the underlying network. Neil's model confines L2 to the
port and routes to it. The two models can't just be squished together
unless I'm missing something.

Could you provide some links so that I can brush up on your ipv6 routed
network model? I'd like to consider it but I don't know much about it.

Given that, I wonder whether proposal 2 could be rephrased.

1: some network types don't allow unbound ports to have addresses, they
just get placeholder addresses for each subnet until they're bound
2: 'subnets' on these networks are more special than subnets on other
networks. (More accurately, they don't use subnets. It's a shame subnets
are core Neutron, because they're pretty horrible and yet hard to replace.)
3: there's an independent (in an extension? In another API endpoint?)
datamodel that the network points to and that IPAM refers to to find a port
an address. Bonus, people who aren't using funky network types can disable
this extension.
4: when the port is bound, the IPAM is referred to, and it's told the
binding information of the port.
5: when binding the port, once IPAM has returned its address, the network
controller probably does stuff with that address when it completes the
binding (like initialising routing).
6: live migration either has to renumber a port or forward old traffic to
the new address via route injection. This is an open question now, so I'm
mentioning it rather than solving it.

I left out the migration issue from my email because it also affects
both proposals equally.

In fact, adding that hook to IPAM at binding plus setting aside a 'not
set' IP address might be all you need to do to make it possible. The IPAM
needs data to work out what an address is, but that doesn't have to take
the form of existing Neutron constructs.

What about the L2 network for each segment? I suggested creating provider
networks for these. Do you have a different suggestion?

What about distinguishing the bound address blocks from the mobile address
blocks? For example, the address blocks bound to the segments could be
from a private space. A router port may get an address from this private
space and be the next hop for public addresses. Or, GoDaddy's model where
vms get an address from the segment network and optionally a floating ip
which is routed.
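(A toy illustration of that split - hypothetical data, nothing here is real Neutron or GoDaddy code: the bound address comes from the segment's private block, and the mobile/public address is realised as an injected host route via that bound address:)

```python
# Hypothetical sketch: statically bound (private, per-segment) blocks
# versus mobile (public, routed) addresses.
import ipaddress

SEGMENT_BLOCKS = {"rack1": "10.0.1.0/24", "rack2": "10.0.2.0/24"}

def injected_route(mobile_ip, segment, bound_ip):
    # The next hop must be a segment-local ("bound") address; the
    # infrastructure routers then learn a host route for the mobile IP.
    block = ipaddress.ip_network(SEGMENT_BLOCKS[segment])
    assert ipaddress.ip_address(bound_ip) in block
    return {"destination": "%s/32" % mobile_ip, "next_hop": bound_ip}
```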

Carl


responded Jul 21, 2015 by Carl_Baldwin
0 votes

On 21 July 2015 at 07:52, Carl Baldwin carl@ecbaldwin.net wrote:

Now, you seem to generally be thinking in terms of the latter model,
particularly since the provider network model you're talking about fits
there. But then you say:

Actually, both. For example, GoDaddy assigns each VM an IP from the
location-based address blocks and optionally one from the routed,
location-agnostic ones. I would also like to assign router ports out of the
location-based blocks, which could host floating IPs from the other blocks.

Well, routed IPs that are not location-specific are no different to normal
ones, are they? Why do they need special work that changes the API?

On 20 July 2015 at 10:33, Carl Baldwin carl@ecbaldwin.net wrote:

When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

Implicit in both is a need to provide at least a hint at host binding.
Or, delay address assignment until binding. I didn't mention it because my
email was already long.
This is something we discussed, but it applies equally to both proposals.

No, it doesn't - if the IP address is routed and not relevant to the
location of the host then yes, you would want to inject a route at
binding, but you wouldn't want to delay address assignment till binding
because it's location-agnostic.

No, it wouldn't, because creating and binding a port are separate
operations. I can't give the port a location-specific address on creation
- not until it's bound, in fact, which happens much later.

On proposal 1: consider the cost of adding a datamodel to Neutron. It
has to be respected by all developers, it frequently has to be deployed by
all operators, and every future change has to align with it. Plus either
it has to be generic or optional, and if optional it's a burden to some
proportion of Neutron developers and users. I accept proposal 1 is easy,
but it's not universally applicable. It doesn't work with Neil Jerram's
plans, it doesn't work with multiple interfaces per host, and it doesn't
work with the IPv6 routed-network model I worked on.

Please be more specific. I'm not following your argument here. My
proposal doesn't really add much new data model.

My point is that there's a whole bunch of work there to solve the question
of 'how do I allocate addresses to a port when addresses are location
specific' that assumes that there's one model for location specific
addresses that is a bunch of segments with each host on one segment. I can
break this model easily. Per the previous IPv6 proposal, I might choose my
address with more care than just by its location, to contain extra
information I care about. I might have multiple segments connected to one
host where either segment will do and the scheduler should choose the most
useful one.

If this whole model is built using reusable-ish concepts like networks, and
adds a field to ports, then basically it ends up in, or significantly
affecting, the model of core Neutron. Every Neutron developer to come will
have to read it, understand it, and not break it. Depending on how it's
implemented, every operator that comes along will have to deploy it and may
be affected by bugs in it (though that depends on precisely how much ends
up as an extension).

If we find a more general purpose interface - and per above, mostly the
interface is 'sometimes I want to pick my address only at binding' plus
'IPAM and address assignment is more complex than the subnet model we have
today' - then potentially these datamodels can be specific to IPAM, not
general purpose 'we have these objects around already' things we're
reusing. With a clean interface the models may not even be present as code
in a deployed system, which is the best proof they are not introducing
bugs.

Every bit of cruft we write, we have to carry. It makes more sense to make
the core extensible for this case, in my mind, than it does to introduce it
into the core.

We've discussed this with Neil at length. I haven't been able to
reconcile our respective approaches into one model that works for both of
us and still provides value.

QED.

Could you provide some links so that I can brush up on your IPv6 routed
network model? I'd like to consider it but I don't know much about it.

The best writeup I have is
http://datatracker.ietf.org/doc/draft-baker-openstack-ipv6-model/?include_text=1
(don't judge it by the place it was filed). But the concept was that (a)
VMs received v6 addresses, (b) they were location specific, (c) each had
their own L2 segment (per Neil's idea, and really the ultimate use of this
model), and (d) there was information in the address additional to just its
location and the entropy of choosing a random address.

1: some network types don't allow unbound ports to have addresses, they
just get placeholder addresses for each subnet until they're bound
2: 'subnets' on these networks are more special than subnets on other
networks. (More accurately, they don't use subnets. It's a shame subnets
are core Neutron, because they're pretty horrible and yet hard to replace.)
3: there's an independent (in an extension? In another API endpoint?)
datamodel that the network points to and that IPAM refers to in order to
find an address for a port. Bonus, people who aren't using funky network
types can disable
this extension.
4: when the port is bound, the IPAM is referred to, and it's told the
binding information of the port.
5: when binding the port, once IPAM has returned its address, the
network controller probably does stuff with that address when it completes
the binding (like initialising routing).
6: live migration either has to renumber a port or forward old traffic
to the new address via route injection. This is an open question now, so
I'm mentioning it rather than solving it.

I left the migration issue out of my email because it also affects
both proposals equally.

Understood, and wise at this point, I think.

In fact, adding that hook to IPAM at binding plus setting aside a 'not
set' IP address might be all you need to do to make it possible. The IPAM
needs data to work out what an address is, but that doesn't have to take
the form of existing Neutron constructs.
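That IPAM-at-binding hook (points 3-5 above) could be sketched roughly as follows - every name here is hypothetical, not the real Neutron IPAM driver interface:

```python
import ipaddress

class RoutedNetworkIpam:
    """Sketch: an independent datamodel the network points at, which
    IPAM consults with the port's binding information (point 3)."""

    def __init__(self, segment_blocks):
        # independent mapping of segment -> address block, not a core
        # Neutron construct
        self.blocks = {seg: ipaddress.ip_network(blk)
                       for seg, blk in segment_blocks.items()}
        self.allocated = set()

    def address_for_binding(self, port_id, binding):
        # point 4: IPAM is told the binding information of the port
        block = self.blocks[binding["segment"]]
        for addr in block.hosts():
            if addr not in self.allocated:
                self.allocated.add(addr)
                # point 5: the network controller would now complete the
                # binding with this address (e.g. initialise routing)
                return addr
        raise RuntimeError("block exhausted for %s" % binding["segment"])

ipam = RoutedNetworkIpam({"rack-1": "192.0.2.0/28"})
addr = ipam.address_for_binding("port-a", {"segment": "rack-1", "host": "c1"})
```

Until the hook fires, the port would carry only the 'not set' placeholder; nothing in the sketch requires the segment/block mapping to live in core Neutron.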

What about the L2 network for each segment?

What about it? There may be an L2 network per port and no more. True in
both my case and Neil's, in fact, where the network system you propose is
just storing address ranges and they're not really L2 segments at all.
(Also, your terminology usage is now getting confusing. ;)

My point is not that your model is wrong - it's that it's not generally
applicable
and therefore shouldn't be encoded in core Neutron as if it's
the only way this could possibly work. Better that we have a system where
that model is not core, and as an added bonus doesn't attempt to make use
of core objects like networks for things that they don't entirely suit.

I suggested creating provider networks for these. Do you have a different
suggestion?

My suggestion is per above - that using Neutron 'networks' as both 'a
network to which I can attach a VM' and 'a location specific addressing
construct' is overloading them considerably, and doesn't suit the general
purpose case. And pluggable IPAM, kinda, allows you to make IPAM-specific
addressing models (which are generally admin-only anyway) - or, better,
lets the IPAM system discover for itself how addressing works - which is a
nicer option. It's actually not difficult to create completely new objects
to represent this; what is difficult is getting that agreed as a change to
a core datamodel. So I think taking this and riffing on it works better
than saying 'this is how it shall be' with a somewhat limited solution.
People can still use the solutions in the meantime, particularly since
they're admin-facing interfaces.

You know, there comes a moment where I really want to get in front of a
whiteboard...


responded Jul 21, 2015 by Ian_Wells

Wow, a lot to digest in these threads. Let me see if I can summarize my understanding of the two proposals; let me know whether I've got it right. There are a couple of problems that need to be solved:

a. Scheduling based on host reachability to the segments
b. Floating IP functionality across the segments. I am not sure I am clear on this one but it sounds like you want the routers attached to the segments to advertise routes to the specific floating IPs. Presumably then they would do NAT or the instance would assign both the fixed IP and the floating IP to its interface?

In Proposal 1, (a) is solved by associating segments to the front network via a router - that association is used to provide a single hook into the existing API that limits the scope of segment selection to those associated with the front network. (b) is solved by tying the floating IP ranges to the same front network and managing the reachability with dynamic routing.

In Proposal 2, (a) is solved by tagging each network with some meta-data that the IPAM system uses to make a selection. This implies an IP allocation request that passes something other than a network/port to the IPAM subsystem. This is fine from the IPAM point of view, but there is no corresponding API for this right now. To solve (b), either the IPAM system has to publish the routes or the higher level management has to ALSO be aware of the mappings (rather than just IPAM).

To throw some fuel on the fire, I would argue also that (a) is not sufficient and that address availability needs to be considered as well (as described in [1]). Selecting a host based on reachability alone will fail when addresses are exhausted. Similarly, with (b) I think there needs to be consideration, during association of a floating IP, of the effect on routing. That is, rather than a huge number of host routes it would be ideal to allocate the floating IPs in blocks that can be associated with the backing networks (though we would want to be able to split these blocks as small as a /32 if necessary - but avoid it/optimize as much as possible).
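The block-over-host-routes preference can be illustrated with the Python standard library (the floating IPs here are hypothetical examples): contiguous /32s collapse into one covering block, and only stragglers stay as host routes.

```python
import ipaddress

# Hypothetical floating IPs already associated behind one backing network.
fips = [ipaddress.ip_network("203.0.113.%d/32" % i) for i in (0, 1, 2, 3, 9)]

# collapse_addresses merges adjacent /32s into the largest covering
# blocks, so the fabric advertises one route per block instead of one
# host route per floating IP; a lone FIP still falls back to a /32.
routes = list(ipaddress.collapse_addresses(fips))
# two routes (a /30 covering .0-.3, plus .9/32) instead of five /32s
```

Allocating floating IPs contiguously per backing network is what makes this aggregation possible; scattered allocation degenerates back to per-FIP host routes.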

In fact, I think that these proposals are more or less the same - it's just that in #1 the meta-data used to tie the backing networks together is another network. This allows it to fit in neatly with the existing APIs. You would still need to implement something, prior to IPAM or within IPAM, that would select the appropriate backing network.

As a (gulp) third alternative, we should consider that the front network here is in essence a layer 3 domain, and we have modeled layer 3 domains as address scopes in Liberty. The user is essentially saying "give me an address that is routable in this scope" - they don't care which actual subnet it gets allocated on. This is conceptually more in line with [2] - modeling the L3 domain separately from the existing Neutron concept of a network as a broadcast domain.
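That "give me an address that is routable in this scope" request could look roughly like this sketch (scope and subnet data are hypothetical; this is not the Liberty address-scope API):

```python
import ipaddress

# Hypothetical: an address scope groups subnets that are all routable in
# the same L3 domain, even though each sits on a different backing segment.
scope_subnets = {
    "public-v4": [ipaddress.ip_network("198.51.100.0/26"),
                  ipaddress.ip_network("198.51.100.64/26")],
}

def allocate_in_scope(scope, used):
    """Satisfy 'an address routable in this scope': the caller never
    picks (or even sees) which subnet/segment the address came from."""
    for subnet in scope_subnets[scope]:
        for addr in subnet.hosts():
            if addr not in used:
                used.add(addr)
                return addr
    raise RuntimeError("address scope exhausted: %s" % scope)

used = set()
first = allocate_in_scope("public-v4", used)
```

The subnet choice stays an internal detail of the allocator, which is the conceptual separation of the L3 domain from the broadcast domain.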

Fundamentally, however we associate the segments together, this comes down to a scheduling problem. Nova needs to be able to incorporate data from Neutron in its scheduling decision. Rather than solving this with a single piece of meta-data like network_id as described in proposal 1, it probably makes more sense to build out the general concept of utilizing network data for nova scheduling. We could still model this as in #1, or using address scopes, or some arbitrary data as in #2. But the harder problem to solve is the scheduling, not how we tag these things to inform that scheduling.
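As a sketch of that scheduling idea (hypothetical names, not the real Nova filter API), a host would pass only if it can reach a segment of the requested network and that segment still has addresses available:

```python
class SegmentAwareFilter:
    """Sketch of a Nova-style host filter: a host passes only if it
    reaches a segment of the requested network AND that segment still
    has free addresses (point (a) plus availability, per [1])."""

    def __init__(self, host_segment, free_addresses):
        self.host_segment = host_segment      # host -> segment id
        self.free_addresses = free_addresses  # segment id -> free IP count

    def host_passes(self, host, network_segments):
        seg = self.host_segment.get(host)
        return seg in network_segments and self.free_addresses.get(seg, 0) > 0

f = SegmentAwareFilter(host_segment={"c1": "seg-a", "c2": "seg-b"},
                       free_addresses={"seg-a": 0, "seg-b": 12})
# c1 reaches seg-a, but seg-a is out of addresses, so only c2 survives
candidates = [h for h in ("c1", "c2") if f.host_passes(h, {"seg-a", "seg-b"})]
```

Whatever tags the segments together - a front network, an address scope, or arbitrary meta-data - only supplies the `network_segments` input here; the filter logic is the hard part.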

The optimization of routing for floating IPs is also a scheduling problem, though one that would require a lot more changes to how FIP are allocated and associated to solve.

John

[1] https://review.openstack.org/#/c/180803/
[2] https://bugs.launchpad.net/neutron/+bug/1458890/comments/7

On Jul 21, 2015, at 10:52 AM, Carl Baldwin carl@ecbaldwin.net wrote:

On Jul 20, 2015 4:26 PM, "Ian Wells" ijw.ubuntu@cack.org.uk wrote:

There are two routed network models:

  • I give my VM an address that bears no relation to its location and ensure the routed fabric routes packets there - this is very much the routing protocol method for doing things where I have injected a route into the network and it needs to propagate. It's also pretty useless because there are too many host routes in any reasonable sized cloud.

  • I give my VM an address that is based on its location, which only becomes apparent at binding time. This means that the semantics of a port changes - a port has no address of any meaning until binding, because its location is related to what it does - and it leaves open questions about what to do when you migrate.

Now, you seem to generally be thinking in terms of the latter model, particularly since the provider network model you're talking about fits there. But then you say:

Actually, both. For example, GoDaddy assigns each vm an ip from the location based address blocks and optionally one from the routed location agnostic ones. I would also like to assign router ports out of the location based blocks which could host floating ips from the other blocks.

On 20 July 2015 at 10:33, Carl Baldwin carl@ecbaldwin.net wrote:

When creating a
port, the binding information would be sent to the IPAM system and the
system would choose an appropriate address block for the allocation.

Implicit in both is a need to provide at least a hint at host binding. Or, delay address assignment until binding. I didn't mention it because my email was already long.
This is something we discussed, but it applies equally to both proposals.

No, it wouldn't, because creating and binding a port are separate operations. I can't give the port a location-specific address on creation - not until it's bound, in fact, which happens much later.

On proposal 1: consider the cost of adding a datamodel to Neutron. It has to be respected by all developers, it frequently has to be deployed by all operators, and every future change has to align with it. Plus either it has to be generic or optional, and if optional it's a burden to some proportion of Neutron developers and users. I accept proposal 1 is easy, but it's not universally applicable. It doesn't work with Neil Jerram's plans, it doesn't work with multiple interfaces per host, and it doesn't work with the IPv6 routed-network model I worked on.

Please be more specific. I'm not following your argument here. My proposal doesn't really add much new data model.

We've discussed this with Neil at length. I haven't been able to reconcile our respective approaches into one model that works for both of us and still provides value. The routed segments model needs to somehow handle the L2 details of the underlying network. Neil's model confines L2 to the port and routes to it. The two models can't just be squished together unless I'm missing something.

Could you provide some links so that I can brush up on your IPv6 routed network model? I'd like to consider it but I don't know much about it.

Given that, I wonder whether proposal 2 could be rephrased.

1: some network types don't allow unbound ports to have addresses, they just get placeholder addresses for each subnet until they're bound
2: 'subnets' on these networks are more special than subnets on other networks. (More accurately, they don't use subnets. It's a shame subnets are core Neutron, because they're pretty horrible and yet hard to replace.)
3: there's an independent (in an extension? In another API endpoint?) datamodel that the network points to and that IPAM refers to in order to find an address for a port. Bonus, people who aren't using funky network types can disable this extension.
4: when the port is bound, the IPAM is referred to, and it's told the binding information of the port.
5: when binding the port, once IPAM has returned its address, the network controller probably does stuff with that address when it completes the binding (like initialising routing).
6: live migration either has to renumber a port or forward old traffic to the new address via route injection. This is an open question now, so I'm mentioning it rather than solving it.

I left the migration issue out of my email because it also affects both proposals equally.

In fact, adding that hook to IPAM at binding plus setting aside a 'not set' IP address might be all you need to do to make it possible. The IPAM needs data to work out what an address is, but that doesn't have to take the form of existing Neutron constructs.

What about the L2 network for each segment? I suggested creating provider networks for these. Do you have a different suggestion?

What about distinguishing the bound address blocks from the mobile address blocks? For example, the address blocks bound to the segments could be from a private space. A router port may get an address from this private space and be the next hop for public addresses. Or, GoDaddy's model where vms get an address from the segment network and optionally a floating ip which is routed.

Carl


responded Jul 21, 2015 by John_Belamaric
...