
[Openstack-operators] Nodes and configurations management in Puppet

0 votes

We maintain a fairly flat hiera structure, largely because our OS
infrastructure is still pretty simple.

Like Clayton & Matt, we use a "world" attribute to indicate dev/test/prod.
(Although in hindsight, I like the "echelon" term a lot better. We did
the same exercise of thinking of synonyms for "environment.") So the
structure looks like:

  • %{::world}/%{::clientcert}
  • %{::world}
  • global

The global file is empty, and almost all of the config is stored in the
world file. Over time, this has led to hiera sprawl, so the world files
have gotten quite messy. And there are a lot of items that aren't unique
to any one world, so they should really be in a global file. But, at the
same time, this gives us a [mostly] single source of truth and avoids the
"grep -R" issue Joe described.
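
To make the lookup order concrete, here is a minimal Python sketch of a
first-match lookup walking the three levels above. It is only an
illustration: the keys, values and node name are invented, and real Hiera
reads YAML files rather than an in-memory dict.

```python
# Hypothetical data standing in for the YAML files at each hierarchy level.
DATA = {
    "prod/node1.example.com": {"role": "controller"},
    "prod": {"ntp_server": "ntp.prod.example.com"},
    "global": {},  # empty, as described above
}


def hierarchy(world, clientcert):
    """Return the lookup order, most specific level first."""
    return [f"{world}/{clientcert}", world, "global"]


def lookup(key, world, clientcert, data=DATA):
    """Return (value, level) from the first level that defines key."""
    for level in hierarchy(world, clientcert):
        values = data.get(level, {})
        if key in values:
            return values[key], level
    raise KeyError(f"{key} not found for {clientcert} in {world}")


if __name__ == "__main__":
    print(lookup("role", "prod", "node1.example.com"))
    print(lookup("ntp_server", "prod", "node1.example.com"))
```

Returning the level alongside the value shows where a setting came from,
which is the question "grep -R" is usually answering.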

ENC at this point is done by specifying a "role" parameter in the
individual clientcert file for each node. This is a major downside, and
doesn't scale, so we need to figure out something better. Maybe we can
come up with a hostname scheme to encode the info there, like others have
done.
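
As a rough idea of what a hostname scheme could look like, the Python
sketch below derives the world and role from the node name. The
<world>-<role>-<number> naming pattern is purely an assumption for
illustration; it is not a scheme anyone in this thread has described.

```python
import re

# Assumed naming pattern: <world>-<role>-<number>[.domain]
# e.g. prod-compute-012.example.com (hypothetical, not from the thread).
HOSTNAME_RE = re.compile(
    r"^(?P<world>dev|test|prod)-(?P<role>[a-z]+)-(?P<index>\d+)(?:\.|$)"
)


def classify(hostname):
    """Derive world, role and index from a hostname following the pattern."""
    match = HOSTNAME_RE.match(hostname)
    if match is None:
        raise ValueError(f"hostname does not follow the scheme: {hostname}")
    return {
        "world": match["world"],
        "role": match["role"],
        "index": int(match["index"]),
    }


if __name__ == "__main__":
    print(classify("prod-compute-012.example.com"))
```

Since nothing here needs a master, the same idea could be exposed as a
custom fact on masterless nodes.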

We run all masterless, for a variety of reasons (which limits ENC options,
too). Ansible is used to kick off runs across the environment. r10k
deploys the Puppet environments ("master" and "prod", which correspond to
git branches), hiera data, and all the modules. Hiera data is in a
separate (private) git repo, but there's only a master branch there.

I've been a big fan of the role/profile model, too, and it's worked well
for us. One thing I've thought about is specifying a list of profile
classes for each node or node type in hiera, rather than maintaining a
mostly static role module. Then we can just hiera_include(), which is the
method we use in site.pp to include the role class now. I'd be interested
in others' thoughts on this idea. I can't really think of a compelling
reason to switch, other than it's kind of clever.
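
For anyone weighing this idea: hiera_include() collects the class list
with an array merge across every matching level, rather than stopping at
the first match. The Python sketch below mimics that merge so the effect
is easier to see. The profile class names are hypothetical.

```python
def classes_for(levels):
    """Merge the 'classes' arrays of all levels, most specific first.

    Duplicates are dropped while first-seen order is preserved.
    """
    merged = []
    for level in levels:
        for cls in level.get("classes", []):
            if cls not in merged:
                merged.append(cls)
    return merged


if __name__ == "__main__":
    # Hypothetical per-level data: node file, world file, global file.
    node = {"classes": ["profile::nova::compute"]}
    world = {"classes": ["profile::base", "profile::monitoring"]}
    global_level = {}
    print(classes_for([node, world, global_level]))
```

One consequence of the merge is that a node file can only add classes; it
cannot remove a class that a broader level already lists.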

Mike

On 9/26/14, 12:03 PM, "Mathieu Gagné" wrote:

Hi Joe,

Your experience and story about Puppet and OpenStack make me feel like
you are a long-lost co-worker. :)

On 2014-09-25 10:30 PM, Joe Topjian wrote:

> Hiera takes the cake for my love/hate of Puppet. I try really hard to
> keep the number of hierarchies small and even then I find it awkward
> sometimes. I love the concept of Hiera, but I find it can be
> unintuitive.

Same here. The aspect I hate about Hiera is that files become very big
and unorganized very fast due to the quantity of configs. So you try to
split them into multiple files instead, and then you have the problem you
describe below...

> Similar to the other replies, I have a "common" hierarchy
> where 90% of the data is stored. The other hierarchies either override
> "common" or append to it. When I need to know where a parameter is
> ultimately configured, I find myself thinking "is that parameter common
> across everything or specific to a certain location or node, and if so,
> why did I make it specific?", then doing a "grep -R" to find where it's
> located, and finally thinking "oh right - that's why it's there".

Yep. That's the feeling I was referring to when I said "heart attack".

And now, try to train a new co-worker and explain to him how it's organized:
"Oh, I felt the file was too big so I split it in the hope of restoring
sanity, which it did with limited success."

The other difficulty is the management of "common" configs like the keystone
auth URL. Multiple services need this value, yet they might be split across
multiple files, and the YAML anchor hack [1] I have used so far does not work
across YAML files. The same goes for database configs, which are needed by
the database server (to provision the user) and by services (for the
database connection string).

> Another area of Puppet that I'm finding difficult to work with is
> configuring HA environments. There are two main pain points here and
> they're pretty applicable to using Puppet with OpenStack:
>
> The other HA pain point is creating many-to-one configurations [...]
>
> I think a cleaner way of doing this is to introduce service discovery
> into my environment, but I haven't had time to look into this in more
> detail.

I wholly agree with you, and that's a concept I'm interested in exploring.
Come to think of it, it strangely looks like the "dependency inversion
principle" in software development.

I however feel that an external ENC becomes inevitable to achieve this
ease of use. Unfortunately, each time I look into it, I rapidly get
lost in my dream of a simple dashboard to manage everything. I feel I
rapidly come to the limits of what exported resources, Hiera and
PuppetDB can do.

One idea would be to export an haproxy::listen resource from one of the
controllers (which now becomes a pet, as you said) and realize it on the
HAProxy nodes with its associated haproxy::member resources.

> I should mention that some of these HA pains can be resolved by just
> moving all of the data to the HAProxy nodes themselves. So when I want
> to add a new service, such as RabbitMQ, to HAProxy, I add the RabbitMQ
> settings to the HAProxy role/profiles. But I want HAProxy to be "dumb"
> about what it's hosting. I want to be able to use it in a Juju-like
> fashion where I can introduce any arbitrary service and HAProxy
> configures itself without prior knowledge of the new service.

Yes! How do you guys think we can implement such discovery?

With Nova cells, this problem became much more apparent due to
inter-relations between the API cell and compute cells. The API cell has
to know about the compute cells and vice versa.

> In general, though, I really enjoy working with Puppet. Our current
> Puppet configurations allow us to stand up test OpenStack environments
> with little manual input as well as upgrade to newer releases of
> OpenStack with very little effort.

Yes, I really enjoy Puppet too. After all hardware/infrastructure
aspects are figured out, we are able to bootstrap a new OpenStack region
in less than an hour.

To summarize my current pain points:
- Out of control Hiera configuration files
- Lack of service auto-discovery

[1] https://dmsimard.com/2014/02/15/quick-hiera-tips/

--
Mathieu


OpenStack-operators mailing list
OpenStack-operators at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
asked Oct 3, 2014 in openstack-operators by Michael_Dorman (4,160 points)   5 11
retagged Apr 14, 2015 by admin

4 Responses

0 votes

We (http://www.csail.mit.edu) have a very complex (chaotic) puppet
world outside of OpenStack as we have dozens of research groups, each
with servers and workstations that all need slightly different though
mostly the same things along with "default" configs for these classes
of systems and the various infrastructure services that actually run
it all.

This gives us a deeper-than-I'd-like hiera hierarchy, though only the
top three are really used for OpenStack specific stuff:

:hierarchy:
- %{fqdn}
- %{role}
- %{cluster}
- %{group}
- %{lsbdistcodename}
- %{osfamily}
- common

$cluster is where we define "production", "test", etc.

$role specifies "compute-node" or "controller".

$fqdn is rarely used, but we maintain a testing host-aggregate for
last-stage tests in the production cloud. Occasionally the set of
fqdn.yaml files that match the hypervisors in that aggregate get
symlinked to some hiera data before actually putting it in the
$role.yaml that all the hypervisors get. I'm currently using this to
verify my ceph integration will really & truly work for ephemeral
storage...
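
As an illustration of how a hierarchy like the one above turns into a
list of data files for one node, here is a small Python sketch. The fact
values are invented, and skipping levels whose variable is unset is a
simplification made in this sketch, not a statement about how Hiera
itself treats them.

```python
import re

# The hierarchy from above, most specific level first.
HIERARCHY = [
    "%{fqdn}",
    "%{role}",
    "%{cluster}",
    "%{group}",
    "%{lsbdistcodename}",
    "%{osfamily}",
    "common",
]

VAR_RE = re.compile(r"%\{(\w+)\}")


def data_files(facts, hierarchy=HIERARCHY):
    """Expand each level with the node's facts; skip levels with unset facts."""
    files = []
    for level in hierarchy:
        names = VAR_RE.findall(level)
        if any(not facts.get(name) for name in names):
            continue  # this sketch ignores levels it cannot fully expand
        path = VAR_RE.sub(lambda m: facts[m.group(1)], level)
        files.append(path + ".yaml")
    return files


if __name__ == "__main__":
    # Hypothetical facts for a single hypervisor.
    facts = {
        "fqdn": "nova-1.example.com",
        "role": "compute-node",
        "cluster": "production",
        "osfamily": "Debian",
    }
    for name in data_files(facts):
        print(name)
```

With the fqdn level first in the list, a symlinked fqdn.yaml for a
hypervisor in the test aggregate is consulted before the shared
$role.yaml.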

Though I'm also still using the deprecated
https://github.com/stackforge/puppet-openstack.git module, last updated
for havana, on my icehouse cloud, so take that as you will (the actual
heavy-lifting mods like 'nova' are newer).

-Jon

responded Oct 3, 2014 by Jonathan_Proulx (2,200 points)   2 4 10
0 votes

Sounds like a great topic for the puppet operator session... one of the CERN people can explain how we're now doing it (since we're moving to a new structure after 12 months experience). Cells makes it even more interesting.

Tim

responded Oct 3, 2014 by Tim_Bell (16,440 points)   1 6 10
0 votes

On 2014-10-02 11:50 PM, Michael Dorman wrote:

> r10k deploys the Puppet environments ("master" and "prod", which
> correspond to git branches), hiera data, and all the modules. Hiera
> data is in a separate (private) git repo, but there's only a master
> branch there.

Are people maintaining the manifests/modules able to access the Hiera
private repository? Should someone wish to introduce a new manifest
requiring a new Hiera value, how do they make sure it gets added to the
private repository?

How do we make sure someone introducing a new Hiera config asks the
other people to add it to the private repository?

Are there tests in place combining your manifests/modules and Hiera
repositories to validate that the catalog compiles correctly?

We do have this test in one of our projects and it's kind of cool. But
manifests, some modules and Hiera are all in the same repository, easing
its maintenance, tests and deployment.

Our team is struggling to come up with a clever way to handle Hiera
secrets, as not all people contributing to our manifests/modules should
be able to access them. The challenges are related to tests, packaging
and distribution. We have yet to come up with ideas, so it's mostly
exploration and popular consultation for now.

> I've been a big fan of the role/profile model, too, and it's worked well
> for us. One thing I've thought about is specifying a list of profile
> classes for each node or node type in hiera, rather than maintaining a
> mostly static role module. Then we can just hiera_include(), which is the
> method we use in site.pp to include the role class now. I'd be interested
> in others' thoughts on this idea. I can't really think of a compelling
> reason to switch, other than it's kind of clever.

Unless you face strong limitations with your current model, I don't see
any reason to switch to a "pure" role model. =)

--
Mathieu

responded Oct 3, 2014 by Mathieu_Gagné (3,300 points)   1 3 6
0 votes

On 10/3/14, 3:56 PM, "Mathieu Gagné" wrote:

> On 2014-10-02 11:50 PM, Michael Dorman wrote:
>
> > r10k deploys the Puppet environments ("master" and "prod", which
> > correspond to git branches), hiera data, and all the modules. Hiera
> > data is in a separate (private) git repo, but there's only a master
> > branch there.
>
> Are people maintaining the manifests/modules able to access the Hiera
> private repository? Should someone wish to introduce a new manifest
> requiring a new Hiera value, how do they make sure it gets added to the
> private repository?
>
> How do we make sure someone introducing a new Hiera config asks the
> other people to add it to the private repository?

At this point, it's all the same group, so it just works. We did start
using hiera-eyaml a while back, which keeps any of the "secrets"
encrypted, so theoretically we could make this repo non-private. But then
it's back to the standard key management problem.

> Are there tests in place combining your manifests/modules and Hiera
> repositories to validate that the catalog compiles correctly?
>
> We do have this test in one of our projects and it's kind of cool. But
> manifests, some modules and Hiera are all in the same repository, easing
> its maintenance, tests and deployment.

No real integration testing to speak of today. The fact that we don't use
any branches on the hiera repo simplifies it a bit, but it does make it
tricky for testing across multiple branches of each repo.

> Our team is struggling to come up with a clever way to handle Hiera
> secrets, as not all people contributing to our manifests/modules should
> be able to access them. The challenges are related to tests, packaging
> and distribution. We have yet to come up with ideas, so it's mostly
> exploration and popular consultation for now.

You should check out hiera-eyaml if you haven't already
(https://github.com/TomPoulton/hiera-eyaml). Doesn't solve All The
Problems, but helps.

> > I've been a big fan of the role/profile model, too, and it's worked well
> > for us. One thing I've thought about is specifying a list of profile
> > classes for each node or node type in hiera, rather than maintaining a
> > mostly static role module. Then we can just hiera_include(), which is
> > the method we use in site.pp to include the role class now. I'd be
> > interested in others' thoughts on this idea. I can't really think of a
> > compelling reason to switch, other than it's kind of clever.
>
> Unless you face strong limitations with your actual model, I don't see
> any reason to switch to a "pure" role model. =)

Just because you can, doesn't mean you should, right?

> --
> Mathieu

responded Oct 6, 2014 by Michael_Dorman (4,160 points)   5 11
...