
[openstack-dev] [Cinder] A possible solution for HA Active-Active

0 votes

Hi all,

I know we've all been looking at the HA Active-Active problem in Cinder
and trying our best to figure out possible solutions to the different
issues, and since current plan is going to take a while (because it
requires that we finish first fixing Cinder-Nova interactions), I've been
looking at alternatives that allow Active-Active configurations without
needing to wait for those changes to take effect.

And I think I have found a possible solution, but since the HA A-A
problem has a lot of moving parts I ended up upgrading my initial
Etherpad notes to a post [1].

Even if we decide that this is not the way to go, which we'll probably
do, I still think that the post brings a little clarity on all the
moving parts of the problem, even some that are not reflected on our
Etherpad [2], and it can help us not miss anything when deciding on a
different solution.

Cheers,
Gorka.

[1]: http://gorka.eguileor.com/a-cinder-road-to-activeactive-ha/
[2]: https://etherpad.openstack.org/p/cinder-active-active-vol-service-issues


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
asked Jul 27, 2015 in openstack-dev by Gorka_Eguileor (2,960 points)   2 3

77 Responses

0 votes

Thanks for this work Gorka. Even if we don't end up taking the approach you
suggest, there are parts that are undoubtedly useful pieces of quality, well
thought out code, posted in clean patches, that can be used to easily try
out ideas that were not possible previously. I'm both impressed and
enthusiastic about moving forward on this for the first time in a while.
Appreciated.

--
Duncan Thomas

On 27 July 2015 at 22:35, Gorka Eguileor geguileo@redhat.com wrote:

Hi all,

I know we've all been looking at the HA Active-Active problem in Cinder
and trying our best to figure out possible solutions to the different
issues, and since current plan is going to take a while (because it
requires that we finish first fixing Cinder-Nova interactions), I've been
looking at alternatives that allow Active-Active configurations without
needing to wait for those changes to take effect.

And I think I have found a possible solution, but since the HA A-A
problem has a lot of moving parts I ended up upgrading my initial
Etherpad notes to a post [1].

Even if we decide that this is not the way to go, which we'll probably
do, I still think that the post brings a little clarity on all the
moving parts of the problem, even some that are not reflected on our
Etherpad [2], and it can help us not miss anything when deciding on a
different solution.

Cheers,
Gorka.

[1]: http://gorka.eguileor.com/a-cinder-road-to-activeactive-ha/
[2]:
https://etherpad.openstack.org/p/cinder-active-active-vol-service-issues


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Duncan Thomas


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 27, 2015 by Duncan_Thomas (16,160 points)   1 3 6
0 votes

On Mon, Jul 27, 2015 at 12:35 PM, Gorka Eguileor geguileo@redhat.com wrote:
I know we've all been looking at the HA Active-Active problem in Cinder
and trying our best to figure out possible solutions to the different
issues, and since current plan is going to take a while (because it
requires that we finish first fixing Cinder-Nova interactions), I've been
looking at alternatives that allow Active-Active configurations without
needing to wait for those changes to take effect.

And I think I have found a possible solution, but since the HA A-A
problem has a lot of moving parts I ended up upgrading my initial
Etherpad notes to a post [1].

Even if we decide that this is not the way to go, which we'll probably
do, I still think that the post brings a little clarity on all the
moving parts of the problem, even some that are not reflected on our
Etherpad [2], and it can help us not miss anything when deciding on a
different solution.

Based on IRC conversations in the Cinder room and hearing people's
opinions in the spec reviews, I'm not convinced the complexity that a
distributed lock manager adds to Cinder for both developers and the
operators who ultimately are going to have to learn to maintain things
like ZooKeeper as a result is worth it.

Key point: We're not scaling Cinder itself; it's about scaling to avoid
a build-up of operations from the storage backend solutions themselves.

Whatever people think ZooKeeper "scaling level" is going to accomplish
is not even a question. We don't need it, because Cinder isn't as
complex as people are making it.

I'd like to think the Cinder team is great at recognizing potential
cross project initiatives. Look at what Thang Pham has done with
Nova's version object solution. He made a generic solution into an
Oslo solution for all, and Cinder is using it. That was awesome, and
people really appreciated that there was a focus for other projects to
get better, not just Cinder.

Have people considered Ironic's hash ring solution? The project Akanda
is now adopting it [1], and I think it might have potential. I'd
appreciate it if interested parties could have this evaluated before
the Cinder midcycle sprint next week, to be ready for discussion.

[1] - https://review.openstack.org/#/c/195366/
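For readers who haven't seen the pattern, here is a minimal, hypothetical
sketch of a consistent hash ring along the lines Mike mentions; it is not
Ironic's actual implementation, and the host and volume names are made up:

    import bisect
    import hashlib


    def _hash(key):
        # Map any string onto the ring's integer key space.
        return int(hashlib.md5(key.encode()).hexdigest(), 16)


    class HashRing(object):
        """Consistent hash ring mapping resources (e.g. volumes) to hosts."""

        def __init__(self, hosts, replicas=100):
            # Each host gets several virtual points on the ring, so load
            # stays balanced and re-sharding moves only a small slice of
            # the resources when membership changes.
            self._ring = sorted(
                (_hash('%s-%d' % (host, i)), host)
                for host in hosts
                for i in range(replicas)
            )
            self._keys = [k for k, _ in self._ring]

        def get_host(self, resource_id):
            # Walk clockwise to the first point at or after the key.
            idx = bisect.bisect(self._keys, _hash(resource_id)) % len(self._ring)
            return self._ring[idx][1]


    ring = HashRing(['c-vol-1', 'c-vol-2', 'c-vol-3'])
    print(ring.get_host('volume-6cba6f7a'))  # deterministic owner for this volume

The appeal is that ownership is computed rather than coordinated: every
service that knows the member list agrees on who owns a volume without
taking a lock.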

--
Mike Perez


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Mike_Perez (13,120 points)   2 3 4
0 votes

On Fri, Jul 31, 2015 at 01:47:22AM -0700, Mike Perez wrote:
On Mon, Jul 27, 2015 at 12:35 PM, Gorka Eguileor geguileo@redhat.com wrote:

I know we've all been looking at the HA Active-Active problem in Cinder
and trying our best to figure out possible solutions to the different
issues, and since current plan is going to take a while (because it
requires that we finish first fixing Cinder-Nova interactions), I've been
looking at alternatives that allow Active-Active configurations without
needing to wait for those changes to take effect.

And I think I have found a possible solution, but since the HA A-A
problem has a lot of moving parts I ended up upgrading my initial
Etherpad notes to a post [1].

Even if we decide that this is not the way to go, which we'll probably
do, I still think that the post brings a little clarity on all the
moving parts of the problem, even some that are not reflected on our
Etherpad [2], and it can help us not miss anything when deciding on a
different solution.

Based on IRC conversations in the Cinder room and hearing people's
opinions in the spec reviews, I'm not convinced the complexity that a
distributed lock manager adds to Cinder for both developers and the
operators who ultimately are going to have to learn to maintain things
like ZooKeeper as a result is worth it.

Hi Mike,

I think you are right to bring up the cost that adding a DLM to the
solution brings to operators, as it is something important to take into
consideration. I would like to say that Ceilometer is already using
Tooz, so operators are already familiar with these DLMs, but unfortunately
that would be stretching the truth, since Cinder is present in 73% of
OpenStack production workloads while Ceilometer is only in 33% of them,
so we would certainly be disturbing some operators.

But we must not forget that the only operators who would need to worry
about deploying and maintaining the DLM are those wanting to deploy
Active-Active configurations (for Active-Passive configurations Tooz will
keep working with local file locks as we do now), and some of those may
think like Duncan does: "I already have to administer rabbit, mysql,
backends, horizon, load balancers, rate limiters... adding redis isn't
going to make it that much harder".

That's why I don't think this is such a big deal for the vast majority
of operators.

On the developer side I have to disagree: there is no difference between
using Tooz and using the current oslo synchronization mechanism for non
Active-Active deployments.
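To make that concrete, here is a minimal sketch of what code using Tooz
could look like; the backend URL and lock name are made-up examples, and
only the URL would change between Active-Passive and Active-Active:

    from tooz import coordination

    # Backend is chosen purely by URL: file locks for Active-Passive
    # (effectively what we do today), or a real DLM such as
    # 'zookeeper://127.0.0.1:2181' for Active-Active.
    coordinator = coordination.get_coordinator(
        'file:///var/lib/cinder/locks', b'cinder-volume-host-1')
    coordinator.start()

    # Same lock API either way, much like oslo's synchronized helper.
    with coordinator.get_lock(b'volume-45a9e0da'):
        pass  # critical section, e.g. safely extending the volume

    coordinator.stop()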

Key point: We're not scaling Cinder itself; it's about scaling to avoid
a build-up of operations from the storage backend solutions themselves.

You must also consider that an Active-Active solution will help
deployments where downtime is not an option or that have SLAs with uptime
or operational requirements; it's not only about increasing the volume of
operations and reducing times.

Whatever people think ZooKeeper "scaling level" is going to accomplish
is not even a question. We don't need it, because Cinder isn't as
complex as people are making it.

I'd like to think the Cinder team is great at recognizing potential
cross project initiatives. Look at what Thang Pham has done with
Nova's version object solution. He made a generic solution into an
Oslo solution for all, and Cinder is using it. That was awesome, and
people really appreciated that there was a focus for other projects to
get better, not just Cinder.

To be fair, Tooz is just one of those cross-project initiatives you are
describing: it's a generic solution that can be used in all projects, not
just Ceilometer.

Have people considered Ironic's hash ring solution? The project Akanda
is now adopting it [1], and I think it might have potential. I'd
appreciate it if interested parties could have this evaluated before
the Cinder midcycle sprint next week, to be ready for discussion.

I will have a look at the hash ring solution you mention and see if it
makes sense to use it.

And I would really love to see the HA A-A discussion enabled for remote
people, as some of us are interested in the discussion but won't be able
to attend. In my case the problem is living in the Old World :-(

In a way I have to agree with you that sometimes we make Cinder look
more complex than it really is, and in my case the solution I proposed
in the post was way too complex, as has been pointed out. I just
tried to solve the A-A problem and fix some other issues, like recovering
lost jobs (those waiting for locks), at the same time.

There is an alternative solution I am considering that will be much
simpler and will align with Walter's efforts to remove locks from the
Volume Manager. I just need to give it a hard think to make sure the
solution has all bases covered.

The main reason why I am suggesting using Tooz and a DLM is because I
think it will allow us to reach Active-Active faster and with less
effort, not because I think it will fix all our problems or that we'll
have to keep using it forever. It's basically replacing our current
local locks.

As I see it, the road to HA A-A for Cinder would look like this:

Step 1: Get A-A with Tooz locks and a DLM. There are other pieces of
the puzzle needed to solve this, but those pieces will carry over to the
final solution.

Step 2: Remove locks from the manager; at this point we'll still be
keeping locks in the drivers.

Step 3: See which drivers can work without locks in Active-Passive
configurations (for example, LVM will still need local file locks to
work, as seen in bug #1460692); in Active-Active configurations there may
be some file-based solutions that require additional locks.

Looking for an alternative to a DLM will require more work and
bring more bugs into the code, and for what? After all, we are going to
get rid of any additional mechanism in the manager and just use the DB
to return "resource is busy" errors.
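As a hedged sketch of that end state (the table and column names are
illustrative, not Cinder's actual schema), a conditional UPDATE makes the
DB itself the arbiter, so no lock is needed at all:

    import sqlalchemy as sa

    engine = sa.create_engine('mysql://cinder:secret@db/cinder')  # made up
    volumes = sa.table('volumes', sa.column('id'), sa.column('status'))

    def begin_delete(volume_id):
        # Atomic compare-and-swap on the status column: exactly one
        # service wins the transition; the others report the resource
        # as busy instead of blocking on a lock.
        with engine.begin() as conn:
            result = conn.execute(
                volumes.update()
                .where(volumes.c.id == volume_id)
                .where(volumes.c.status == 'available')
                .values(status='deleting'))
        if result.rowcount != 1:
            raise RuntimeError('Volume %s is busy' % volume_id)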

We know that our current locking mechanism works, so let's use that to
our advantage for a little while.

If people still think we should not go with a DLM I'll write a proposal
that doesn't need it, but it's going to be more work until we can see an
Active-Active configuration working and we'll probably still need a DLM
for some drivers.

Cheers,
Gorka.

PS: I have given a good deal of thought to the solution you proposed the
other day and I can discuss it now.

[1] - https://review.openstack.org/#/c/195366/

--
Mike Perez


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Gorka_Eguileor (2,960 points)   2 3
0 votes

Mike Perez wrote:
On Mon, Jul 27, 2015 at 12:35 PM, Gorka Eguileor geguileo@redhat.com wrote:

I know we've all been looking at the HA Active-Active problem in Cinder
and trying our best to figure out possible solutions to the different
issues, and since current plan is going to take a while (because it
requires that we finish first fixing Cinder-Nova interactions), I've been
looking at alternatives that allow Active-Active configurations without
needing to wait for those changes to take effect.

And I think I have found a possible solution, but since the HA A-A
problem has a lot of moving parts I ended up upgrading my initial
Etherpad notes to a post [1].

Even if we decide that this is not the way to go, which we'll probably
do, I still think that the post brings a little clarity on all the
moving parts of the problem, even some that are not reflected on our
Etherpad [2], and it can help us not miss anything when deciding on a
different solution.

Based on IRC conversations in the Cinder room and hearing people's
opinions in the spec reviews, I'm not convinced the complexity that a
distributed lock manager adds to Cinder for both developers and the
operators who ultimately are going to have to learn to maintain things
like ZooKeeper as a result is worth it.

Key point: We're not scaling Cinder itself; it's about scaling to avoid
a build-up of operations from the storage backend solutions themselves.

Whatever people think ZooKeeper "scaling level" is going to accomplish
is not even a question. We don't need it, because Cinder isn't as
complex as people are making it.

I agree with 'cinder isn't as complex as people are making it', and that
is very likely a good thing to keep in mind; whether ZooKeeper can help
or not is a different question. ZooKeeper IMHO is just another tool in
your toolset/belt, and as with any tool you have to know when to use it
(of course you can also just continue using chisels and such, too); I'd
rather people see that it is just that and avoid getting caught up on
the other aspects prematurely.

...random thought here, skip as needed... in all honesty, orchestration
solutions like Mesos
(http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
map-reduce solutions like Hadoop, and stream processing systems like
Apache Storm (...) are already using ZooKeeper. I'm not saying we should
just use it because they are, but the likelihood that they all picked it
for no reason is IMHO slim.

I'd like to think the Cinder team is great at recognizing potential
cross project initiatives. Look at what Thang Pham has done with
Nova's version object solution. He made a generic solution into an
Oslo solution for all, and Cinder is using it. That was awesome, and
people really appreciated that there was a focus for other projects to
get better, not just Cinder.

Have people considered Ironic's hash ring solution? The project Akanda
is now adopting it [1], and I think it might have potential. I'd
appreciate it if interested parties could have this evaluated before
the Cinder midcycle sprint next week, to be ready for discussion.

[1] - https://review.openstack.org/#/c/195366/

--
Mike Perez


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Joshua_Harlow (12,560 points)   1 4 4
0 votes

On Fri, Jul 31, 2015 at 8:56 AM, Joshua Harlow harlowja@outlook.com wrote:
...random thought here, skip as needed... in all honesty, orchestration
solutions like Mesos
(http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
map-reduce solutions like Hadoop, and stream processing systems like
Apache Storm (...) are already using ZooKeeper. I'm not saying we should
just use it because they are, but the likelihood that they all picked it
for no reason is IMHO slim.

I'd really like to see cross-project focus. I don't want Ceilometer to
depend on ZooKeeper, Cinder to depend on etcd, etc. This is not ideal
for an operator to have to deploy, learn and maintain each of these
solutions.

I think this is difficult when you consider that everyone wants the
option of using their preferred DLM. If we went this route, we should
pick one.

Regardless, I want to know if we really need a DLM. Does Ceilometer
really need a DLM? Does Cinder really need a DLM? Can we just use a
hash ring solution where operators don't even have to know or care
about deploying a DLM and running multiple instances of Cinder manager
just works?

--
Mike Perez


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Mike_Perez (13,120 points)   2 3 4
0 votes

On 31 July 2015 at 20:40, Mike Perez thingee@gmail.com wrote:

Regardless, I want to know if we really need a DLM. Does Ceilometer
really need a DLM? Does Cinder really need a DLM? Can we just use a
hash ring solution where operators don't even have to know or care
about deploying a DLM and running multiple instances of Cinder manager
just works?

There's a lot of circling around here about what we're trying to achieve
with 'H/A'.

Some people are interested in performance. For them, a hash ring solution
(deterministic load balancing) is fine. If the aim is availability (as mine
is) then I can't see how it helps. I might be missing something, of course
- if so, I'm happy to be corrected.

To be clear, my aim with H/A is to remove the situation where a single node
failure removes the control path for my storage. Currently, the only way to
avoid this is to use something like pacemaker to monitor the c-vol
services. Extensive experience suggests that pacemaker is a complex,
fragile piece of software. Every component of cinder except c-vol can be
deployed active/active[/active/...] - I'm aiming for consistency of
approach if nothing else.

If it ends up that trying to fix this adds too much complexity and/or
fragility to cinder itself, then I can accept that - if whatever we do
ends up being worse than pacemaker, we've taken a significant step
backwards.

Regardless of how H/A discussions go, the first part of Gorka's patch can
certainly be used to fix a few of the API races we have, and can do so with
rather nice, elegant, easy to understand code, so I think the whole process
has been productive whatever the H/A outcome.

--
Duncan Thomas


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Duncan_Thomas (16,160 points)   1 3 6
0 votes

Mike Perez wrote:
On Fri, Jul 31, 2015 at 8:56 AM, Joshua Harlow harlowja@outlook.com wrote:

...random thought here, skip as needed... in all honesty, orchestration
solutions like Mesos
(http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
map-reduce solutions like Hadoop, and stream processing systems like
Apache Storm (...) are already using ZooKeeper. I'm not saying we should
just use it because they are, but the likelihood that they all picked it
for no reason is IMHO slim.

I'd really like to see cross-project focus. I don't want Ceilometer to
depend on ZooKeeper, Cinder to depend on etcd, etc. This is not ideal
for an operator to have to deploy, learn and maintain each of these
solutions.

I think this is difficult when you consider everyone wants options of
their preferred DLM. If we went this route, we should pick one.

+1

Regardless, I want to know if we really need a DLM. Does Ceilometer
really need a DLM? Does Cinder really need a DLM? Can we just use a
hash ring solution where operators don't even have to know or care
about deploying a DLM and running multiple instances of Cinder manager
just works?

All very good questions, although IMHO a hash-ring is just a piece of
the puzzle, and is more equivalent to sharding resources, which, yes, is
one way to scale as long as each shard never touches anything from the
other shards. If those shards ever start to need to touch anything
shared then you get back into this same situation again with a DLM (and
at that point you really do need the 'distributed' part of the DLM,
because each shard is distributed).

And a few (maybe obvious) questions:

  • How would re-sharding work?
  • If sharding (the hash-ring partitioning) is based on entities
    (conductors/other) owning a 'bucket' of resources (i.e. entity 1 manages
    resources A-F, entity 2 manages resources G-M...), what happens if an
    entity dies? Does some other entity take over that bucket, and what
    happens if that entity hasn't really 'died' but is just disconnected
    from the network (partition tolerance...)? (If the answer is that there
    is a lock on the resource/s being used by each entity, then you get back
    into the LM question; see the membership sketch below.)

I'm unsure about how Ironic handles these problems (although I believe
they have a hash-ring and still have a locking scheme as well, so maybe
that's their answer for the dual-entities manipulating the same bucket
problem).
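As one hedged illustration of the 'died vs. just partitioned' question,
here is a sketch using Tooz group membership (this is not what Ironic
does, and the backend URL, group name and member IDs are made up): each
service heartbeats, the backend expires members whose heartbeats stop,
and a watcher triggers the rebalance:

    from tooz import coordination

    # Hypothetical setup: every c-vol service joins a group and
    # heartbeats; the backend (ZooKeeper, etcd, ...) expires members
    # whose heartbeats stop, whether they crashed or were partitioned.
    coordinator = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'c-vol-2')
    coordinator.start(start_heart=True)

    try:
        coordinator.create_group(b'cinder-volume').get()
    except coordination.GroupAlreadyExist:
        pass
    coordinator.join_group(b'cinder-volume').get()

    def on_leave(event):
        # A member's bucket is now ownerless; this is where the
        # hash ring would be rebalanced onto the survivors.
        print('%s left, rebalancing the ring' % event.member_id)

    coordinator.watch_leave_group(b'cinder-volume', on_leave)
    coordinator.run_watchers()  # call periodically from the service loop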

--
Mike Perez


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Joshua_Harlow (12,560 points)   1 4 4
0 votes

Joshua Harlow wrote:
Mike Perez wrote:

On Fri, Jul 31, 2015 at 8:56 AM, Joshua Harlow harlowja@outlook.com
wrote:

...random thought here, skip as needed... in all honesty, orchestration
solutions like Mesos
(http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
map-reduce solutions like Hadoop, and stream processing systems like
Apache Storm (...) are already using ZooKeeper. I'm not saying we should
just use it because they are, but the likelihood that they all picked it
for no reason is IMHO slim.

I'd really like to see cross-project focus. I don't want Ceilometer to
depend on ZooKeeper, Cinder to depend on etcd, etc. This is not ideal
for an operator to have to deploy, learn and maintain each of these
solutions.

I think this is difficult when you consider everyone wants options of
their preferred DLM. If we went this route, we should pick one.

+1

Regardless, I want to know if we really need a DLM. Does Ceilometer
really need a DLM? Does Cinder really need a DLM? Can we just use a
hash ring solution where operators don't even have to know or care
about deploying a DLM and running multiple instances of Cinder manager
just works?

All very good questions, although IMHO a hash-ring is just a piece of
the puzzle, and is more equivalent to sharding resources, which, yes, is
one way to scale as long as each shard never touches anything from the
other shards. If those shards ever start to need to touch anything
shared then you get back into this same situation again with a DLM (and
at that point you really do need the 'distributed' part of the DLM,
because each shard is distributed).

And a few (maybe obvious) questions:

  • How would re-sharding work?
  • If sharding (the hash-ring partitioning) is based on entities
    (conductors/other) owning a 'bucket' of resources (i.e. entity 1 manages
    resources A-F, entity 2 manages resources G-M...), what happens if an
    entity dies? Does some other entity take over that bucket, and what
    happens if that entity hasn't really 'died' but is just disconnected
    from the network (partition tolerance...)? (If the answer is that there
    is a lock on the resource/s being used by each entity, then you get back
    into the LM question.)

I'm unsure about how Ironic handles these problems (although I believe
they have a hash-ring and still have a locking scheme as well, so maybe
that's their answer for the dual-entities manipulating the same bucket
problem).

Code for some of this; maybe Ironic folks can chime in:

https://github.com/openstack/ironic/blob/2015.1.1/ironic/conductor/task_manager.py#L18
(using the DB as a DLM)

AFAIK, since Ironic has had a built-in hash ring and the above task
manager from the start (or from a very early commit), they have been
better able to accomplish the HA goal; retrofitting this on top of Nova,
Cinder, and others is not going to be as easy...
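For context, the task_manager pattern linked above amounts to using a
reservation column as the lock; a hedged sketch of the idea (table and
column names are illustrative, not Ironic's actual schema):

    import sqlalchemy as sa

    engine = sa.create_engine('mysql://ironic:secret@db/ironic')  # made up
    nodes = sa.table('nodes', sa.column('uuid'), sa.column('reservation'))

    def reserve(node_uuid, conductor_host):
        # The DB is the lock manager: an atomic conditional UPDATE
        # succeeds for exactly one conductor; everyone else backs off.
        with engine.begin() as conn:
            result = conn.execute(
                nodes.update()
                .where(nodes.c.uuid == node_uuid)
                .where(nodes.c.reservation == None)  # SQL "IS NULL"
                .values(reservation=conductor_host))
        return result.rowcount == 1

    def release(node_uuid, conductor_host):
        # Only the current holder may clear the reservation.
        with engine.begin() as conn:
            conn.execute(
                nodes.update()
                .where(nodes.c.uuid == node_uuid)
                .where(nodes.c.reservation == conductor_host)
                .values(reservation=None))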

--
Mike Perez


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Joshua_Harlow (12,560 points)   1 4 4
0 votes

On Fri, Jul 31, 2015 at 12:47:34PM -0700, Joshua Harlow wrote:
Joshua Harlow wrote:

Mike Perez wrote:

On Fri, Jul 31, 2015 at 8:56 AM, Joshua Harlow harlowja@outlook.com
wrote:

...random thought here, skip as needed... in all honesty, orchestration
solutions like Mesos
(http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
map-reduce solutions like Hadoop, and stream processing systems like
Apache Storm (...) are already using ZooKeeper. I'm not saying we should
just use it because they are, but the likelihood that they all picked it
for no reason is IMHO slim.

I'd really like to see cross-project focus. I don't want Ceilometer to
depend on ZooKeeper, Cinder to depend on etcd, etc. This is not ideal
for an operator to have to deploy, learn and maintain each of these
solutions.

I think this is difficult when you consider everyone wants options of
their preferred DLM. If we went this route, we should pick one.

+1

Regardless, I want to know if we really need a DLM. Does Ceilometer
really need a DLM? Does Cinder really need a DLM? Can we just use a
hash ring solution where operators don't even have to know or care
about deploying a DLM and running multiple instances of Cinder manager
just works?

All very good questions, although IMHO a hash-ring is just a piece of
the puzzle, and is more equivalent to sharding resources, which, yes, is
one way to scale as long as each shard never touches anything from the
other shards. If those shards ever start to need to touch anything
shared then you get back into this same situation again with a DLM (and
at that point you really do need the 'distributed' part of the DLM,
because each shard is distributed).

And a few (maybe obvious) questions:

  • How would re-sharding work?
  • If sharding (the hash-ring partitioning) is based on entities
    (conductors/other) owning a 'bucket' of resources (i.e. entity 1 manages
    resources A-F, entity 2 manages resources G-M...), what happens if an
    entity dies? Does some other entity take over that bucket, and what
    happens if that entity hasn't really 'died' but is just disconnected
    from the network (partition tolerance...)? (If the answer is that there
    is a lock on the resource/s being used by each entity, then you get back
    into the LM question.)

I'm unsure about how Ironic handles these problems (although I believe
they have a hash-ring and still have a locking scheme as well, so maybe
that's their answer for the dual-entities manipulating the same bucket
problem).

Code for some of this; maybe Ironic folks can chime in:

https://github.com/openstack/ironic/blob/2015.1.1/ironic/conductor/task_manager.py#L18
(using the DB as a DLM)

AFAIK, since Ironic has had a built-in hash ring and the above task
manager from the start (or from a very early commit), they have been
better able to accomplish the HA goal; retrofitting this on top of Nova,
Cinder, and others is not going to be as easy...

I would still like to find time, one day, to use etcd or ZooKeeper as
our DLM in Ironic. Not having TTLs etc. has been painful for us, though
we've mostly worked around it by now.

// jim


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Jim_Rollenhagen (12,800 points)   2 3 3
0 votes

Excerpts from Mike Perez's message of 2015-07-31 10:40:04 -0700:

On Fri, Jul 31, 2015 at 8:56 AM, Joshua Harlow harlowja@outlook.com wrote:

...random thought here, skip as needed... in all honesty, orchestration
solutions like Mesos
(http://mesos.apache.org/assets/img/documentation/architecture3.jpg),
map-reduce solutions like Hadoop, and stream processing systems like
Apache Storm (...) are already using ZooKeeper. I'm not saying we should
just use it because they are, but the likelihood that they all picked it
for no reason is IMHO slim.

I'd really like to see cross-project focus. I don't want Ceilometer to
depend on ZooKeeper, Cinder to depend on etcd, etc. This is not ideal
for an operator to have to deploy, learn and maintain each of these
solutions.

I think this is difficult when you consider everyone wants options of
their preferred DLM. If we went this route, we should pick one.

Regardless, I want to know if we really need a DLM. Does Ceilometer
really need a DLM? Does Cinder really need a DLM? Can we just use a
hash ring solution where operators don't even have to know or care
about deploying a DLM and running multiple instances of Cinder manager
just works?

So in the Ironic case, if two conductors decide they both own one IPMI
controller, chaos can ensue. They may, at different times, read that
the power is up, or down, and issue power control commands that may take
many seconds, and thus the next status poll by the other conductor may
cause it to react by reversing the change, and they'll just fight over
the node in a tug-of-war fashion.

Oh wait, except that's not true. Instead, they use the database as a
locking mechanism, and AFAIK, no nodes have been torn limb from limb by
two conductors thus far.

But a DLM would be more efficient, and would actually simplify failure
recovery for Ironic's operators. The database locks suffer from being a
little too conservative, and sometimes you just have to go into the DB
and delete a lock after something explodes (this was true 6 months ago;
it may have better automation now, I don't know).

Anyway, I'm all for the simplest possible solution. But, don't make it
too simple.
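A hedged sketch of how a TTL on the DB lock could avoid the manual
clean-up Clint describes (the column names are hypothetical; a DLM's
ephemeral locks give you this behaviour for free):

    import datetime

    import sqlalchemy as sa

    engine = sa.create_engine('mysql://ironic:secret@db/ironic')  # made up
    nodes = sa.table('nodes', sa.column('uuid'), sa.column('reservation'),
                     sa.column('reserved_at'))

    LOCK_TTL = datetime.timedelta(seconds=60)

    def reserve_with_ttl(node_uuid, host):
        # Treat a reservation older than the TTL as abandoned, so a
        # crashed holder's lock is stolen automatically instead of
        # being deleted by hand after something explodes.
        cutoff = datetime.datetime.utcnow() - LOCK_TTL
        with engine.begin() as conn:
            result = conn.execute(
                nodes.update()
                .where(nodes.c.uuid == node_uuid)
                .where(sa.or_(nodes.c.reservation == None,  # IS NULL
                              nodes.c.reserved_at < cutoff))
                .values(reservation=host,
                        reserved_at=datetime.datetime.utcnow()))
        return result.rowcount == 1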


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
responded Jul 31, 2015 by Clint_Byrum (40,940 points)   4 5 9
...