
[openstack-dev] [Marconi] Why is marconi a queue implementation vs a provisioning API?

0 votes

So this came up briefly at the tripleo sprint, and since I can't seem
to find a /why/ document
(https://wiki.openstack.org/wiki/Marconi/Incubation#Raised_Questions_.2B_Answers
and https://wiki.openstack.org/wiki/Marconi#Design don't supply this)
we decided at the TC meeting that I should raise it here.

Firstly, let me check my facts :) - Marconi is backed by a modular
'storage' layer which places some conceptual design constraints on the
storage backends that are possible (e.g. I rather expect a 0mq
implementation to be very tricky, at best (vs the RPC style front end
https://wiki.openstack.org/wiki/Marconi/specs/zmq/api/v1 )), and has a
hybrid control/data plane API implementation where one can call into
it to make queues etc, and to consume them.

The API for the queues is very odd from a queueing perspective -
https://wiki.openstack.org/wiki/Marconi/specs/api/v1#Get_a_Specific_Message
- you don't subscribe to the queue, you enumerate and ask for a single
message.

And the implementations in tree are mongodb (which is at best
contentious, due to the AGPL and many folks' reasonable concerns about
it), and mysql.

My desires around Marconi are:
- to make sure the queue we have is suitable for use by OpenStack
itself: we have a very strong culture around consolidating technology
choices, and it would be extremely odd to have Marconi be something
that isn't suitable to replace rabbitmq etc as the queue abstraction
in the fullness of time.
- to make sure that deployers with scale / performance needs can have
that met by Marconi
- to make my life easy as a deployer ;)

So my questions are:
- why isn't the API a queue friendly API (e.g. like
https://github.com/twitter/kestrel - kestrel which uses the memcache
API, puts put into the queue, gets get from the queue). The current
API looks like pretty much the worst case scenario there - CRUD rather
than submit/retrieve with blocking requests (e.g. longpoll vs poll).
- wouldn't it be better to expose other existing implementations of
HTTP message queues like nova does with hypervisors, rather than
creating our own one? E.g. HTTPSQS, RestMQ, Kestrel, queues.io.
- or even do what Trove does and expose the actual implementation directly?
- what's the plan to fix the API?
- is there a plan / desire to back onto actual queue services (e.g.
AMQP, any of the HTTP ones above, etc.)?
- what is the current performance - how many usecs does it take to
put a message, and get one back, in real world use? How many
concurrent clients can a single Marconi API server with one backing
server deliver today?
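The kestrel-style contract mentioned in the first question above (memcache verbs doubling as queue verbs: sets enqueue, gets dequeue) can be modeled with a toy sketch - this is an illustration of the idea only, not kestrel itself:

```python
# Kestrel-style contract: the memcache API reused as a queue API.
# A "set" on a key appends to that queue; a "get" pops the oldest item.
class KestrelLike:
    def __init__(self):
        self._queues = {}

    def set(self, queue, value):
        """memcache 'set' -> enqueue: puts put into the queue."""
        self._queues.setdefault(queue, []).append(value)

    def get(self, queue):
        """memcache 'get' -> dequeue: gets get from the queue."""
        items = self._queues.get(queue, [])
        return items.pop(0) if items else None

q = KestrelLike()
q.set("jobs", "transcode clip-1")
q.set("jobs", "transcode clip-2")
assert q.get("jobs") == "transcode clip-1"   # FIFO order
assert q.get("jobs") == "transcode clip-2"
assert q.get("jobs") is None                 # empty queue -> miss
```

The point of the contrast: submit/retrieve with two verbs, rather than CRUD over enumerable message resources.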

As background, 'implement a message queue in a SQL DB' is such a
horrid antipattern it's been a standing joke in many organisations I've
been in - and yet we're preparing to graduate exactly that, which is
frankly perplexing.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

asked Mar 18, 2014 in openstack-dev by Robert_Collins (27,200 points)   4 6 12
retagged Mar 8, 2015 by admin

26 Responses

0 votes

I think we can agree that a data-plane API only makes sense if it is
useful to a large number of web and mobile developers deploying their apps
on OpenStack. Also, it only makes sense if it is cost-effective and
scalable for operators who wish to deploy such a service.

Marconi was born of practical experience and direct interaction with
prospective users. When Marconi was kicked off a few summits ago, the
community was looking for a multi-tenant messaging service to round out
the OpenStack portfolio. Users were asking operators for something easier
to work with and more web-friendly than established options such as AMQP.

To that end, we started drafting an HTTP-based API specification that
would afford several different messaging patterns, in order to support the
use cases that users were bringing to the table. We did this completely in
the open, and received lots of input from prospective users familiar with
a variety of message broker solutions, including more "cloudy" ones like
SQS and Iron.io.

The resulting design was a hybrid that supported what you might call
"claim-based" semantics ala SQS and feed-based semantics ala RSS.
Application developers liked the idea of being able to use one or the
other, or combine them to come up with new patterns according to their
needs. For example:

  1. A video app can use Marconi to feed a worker pool of transcoders. When
    a video is uploaded, it is stored in Swift and a job message is posted to
    Marconi. Then, a worker claims the job and begins work on it. If the
    worker crashes, the claim expires and the message becomes available to be
    claimed by a different worker. Once the worker is finished with the job,
    it deletes the message so that another worker will not process it, and
    claims another message. Note that workers never "list" messages in this
    use case; those endpoints in the API are simply ignored.

  2. A backup service can use Marconi to communicate with hundreds of
    thousands of backup agents running on customers' machines. Since Marconi
    queues are extremely light-weight, the service can create a different
    queue for each agent, and additional queues to broadcast messages to all
    the agents associated with a single customer. In this last scenario, the
    service would post a message to a single queue and the agents would simply
    list the messages on that queue, and everyone would get the same message.
    This messaging pattern is emergent, and requires no special routing setup
    in advance from one queue to another.

  3. A metering service for an Internet application can use Marconi to
    aggregate usage data from a number of web heads. Each web head collects
    several minutes of data, then posts it to Marconi. A worker periodically
    claims the messages off the queue, performs the final aggregation and
    processing, and stores the results in a DB. So far, this messaging pattern
    is very much like example #1, above. However, since Marconi's API also
    affords the observer pattern via listing semantics, the metering service
    could run an auditor that logs the messages as they go through the queue
    in order to provide extremely valuable data for diagnosing problems in the
    aggregated data.
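The claim/expire/delete cycle in example 1 can be sketched with a toy in-memory model. All names below are hypothetical illustrations of the semantics, not Marconi's actual code:

```python
import time
import uuid

class MiniQueue:
    """Toy in-memory model of claim-based (SQS-style) semantics."""

    def __init__(self):
        self._messages = {}   # message id -> body
        self._claims = {}     # message id -> claim expiry timestamp

    def post(self, body):
        """Producer posts a job message."""
        msg_id = uuid.uuid4().hex
        self._messages[msg_id] = body
        return msg_id

    def claim(self, ttl, now=None):
        """Claim the first message that is unclaimed or whose claim expired."""
        now = time.time() if now is None else now
        for msg_id, body in self._messages.items():
            expiry = self._claims.get(msg_id, 0)
            if expiry <= now:           # unclaimed, or previous claim lapsed
                self._claims[msg_id] = now + ttl
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Worker finished: remove the message so no one reprocesses it."""
        self._messages.pop(msg_id, None)
        self._claims.pop(msg_id, None)

# A worker claims a job; if it crashes, the claim expires and another
# worker picks the same message up; deleting it acknowledges completion.
q = MiniQueue()
q.post({"video": "swift://bucket/clip.mp4"})
claimed = q.claim(ttl=60, now=1000.0)
assert claimed is not None
assert q.claim(ttl=60, now=1010.0) is None      # claim still live: hidden
assert q.claim(ttl=60, now=1061.0) is not None  # claim expired: visible again
```

Note the worker never lists messages; it only claims and deletes, exactly as in the use case above.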

Users are excited about what Marconi offers today, and we are continuing
to evolve the API based on their feedback.

Of course, app developers aren't the only audience Marconi needs to serve.
Operators want something that is cost-effective, scales, and is
customizable for the unique needs of their target market.

While Marconi has plenty of room to improve (who doesn't?), here is where
the project currently stands in these areas:

  1. Customizable. Marconi transport and storage drivers can be swapped out,
    and messages can be manipulated in-flight with custom filter drivers.
    Currently we have MongoDB and SQLAlchemy drivers, and are exploring Redis
    and AMQP brokers. Now, the v1.0 API does impose some constraints on the
    backend in order to support the use cases mentioned earlier. For example,
    an AMQP backend would only be able to support a subset of the current API.
    Operators occasionally ask about AMQP broker support, in particular, and
    we are exploring ways to evolve the API in order to support that.

  2. Scalable. Operators can use Marconi's HTTP transport to leverage their
    existing infrastructure and expertise in scaling out web heads. When it
    comes to the backend, for small deployments with minimal throughput needs,
    we are providing a SQLAlchemy driver as a non-AGPL alternative to MongoDB.
    For large-scale production deployments, we currently provide the MongoDB
    driver and will likely add Redis as another option (there is already a POC
    driver). And, of course, operators can provide drivers for NewSQL
    databases, such as VelocityDB, that are very fast and scale extremely
    well. In Marconi, every queue can be associated with a different backend
    cluster. This allows operators to scale both up and out, according to what
    is most cost-effective for them. Marconi's app-level sharding is currently
    done using a lookup table to provide for maximum operator control over
    placement, but I personally think it would be great to see this opened up
    so that we can swap in other types of drivers, such as one based on hash
    rings (TBD).

  3. Cost-effective. The Marconi team has done a lot of work to (1) provide
    several dimensions for scaling deployments that can be used according to
    what is most cost-effective for a given use case, and (2) make the Marconi
    service as efficient as possible, including time spent optimizing the
    transport layer (using Falcon in lieu of Pecan, reducing the work that the
    request handlers do, etc.), and tuning the MongoDB storage driver (the
    SQLAlchemy driver is newer and we haven't had the chance to tune it yet,
    but are planning to do so during Juno). Turnaround on requests is in the
    low ms range (including dealing with HTTP), not the usec range, but that
    works perfectly well for a large class of applications. We've been
    benchmarking with Tsung for quite a while now, and we are working on
    making the raw data more accessible to folks outside our team. I'll try to
    get some of the latest data up on the wiki this week.
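The placement discussion in point 2 contrasts the current lookup-table sharding with a possible hash-ring driver. The two strategies could be compared roughly as follows - a sketch under assumed interfaces, not Marconi's real sharding code:

```python
import bisect
import hashlib

def lookup_table_shard(queue, table, default):
    """Operator-controlled placement: an explicit queue -> cluster mapping."""
    return table.get(queue, default)

class HashRing:
    """Consistent-hash placement: queues spread over clusters automatically."""

    def __init__(self, clusters, replicas=100):
        # Place several virtual nodes per cluster around the ring for balance.
        self._ring = sorted(
            (self._hash(f"{c}:{i}"), c)
            for c in clusters for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard(self, queue):
        # First virtual node clockwise from the queue's hash owns it.
        idx = bisect.bisect(self._keys, self._hash(queue)) % len(self._ring)
        return self._ring[idx][1]

# Lookup table: the operator pins specific queues, with a fallback.
table = {"tenant-a/video-jobs": "mongo-cluster-2"}
assert lookup_table_shard("tenant-a/video-jobs", table, "mongo-cluster-1") == "mongo-cluster-2"
assert lookup_table_shard("unknown-queue", table, "mongo-cluster-1") == "mongo-cluster-1"

# Hash ring: no per-queue configuration, but placement is deterministic.
ring = HashRing(["mongo-cluster-1", "mongo-cluster-2", "redis-cluster-1"])
assert ring.shard("tenant-a/video-jobs") == ring.shard("tenant-a/video-jobs")
```

The trade-off mirrors the text: the table maximizes operator control over placement, while the ring removes the need to manage a mapping at all.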

Marconi was originally incubated because the community believed developers
building their apps on top of OpenStack were looking for this kind of
service, and it was a big missing gap in our portfolio. Since that time,
the team has worked hard to fill that gap.

Kurt

responded Mar 19, 2014 by Kurt_Griffiths (2,480 points)   2 3
0 votes

Kurt Griffiths,

Thanks for the detailed explanation. Is there a comparison between Marconi
and existing message brokers anywhere that you can point me to?
I can see how your examples can be implemented using other brokers like
RabbitMQ. So why is there a need for another broker? And what is wrong with
the currently deployed RabbitMQ that most OpenStack services are using
(typically via oslo.messaging RPC)?

On Wed, Mar 19, 2014 at 4:00 AM, Kurt Griffiths <
kurt.griffiths at rackspace.com> wrote:

[...]

OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Sincerely yours
Stanislav (Stan) Lagun
Senior Developer
Mirantis
35b/3, Vorontsovskaya St.
Moscow, Russia
Skype: stanlagun
www.mirantis.com
slagun at mirantis.com

responded Mar 19, 2014 by Stan_Lagun (4,160 points)   1 2 2
0 votes

Kurt already gave a quite detailed explanation of why Marconi, what
can you do with it and where it's standing. I'll reply in-line:

On 19/03/14 10:17 +1300, Robert Collins wrote:
So this came up briefly at the tripleo sprint, and since I can't seem
to find a /why/ document
(https://wiki.openstack.org/wiki/Marconi/Incubation#Raised_Questions_.2B_Answers
and https://wiki.openstack.org/wiki/Marconi#Design don't supply this)
we decided at the TC meeting that I should raise it here.

Firstly, let me check my facts :) - Marconi is backed by a modular
'storage' layer which places some conceptual design constraints on the
storage backends that are possible (e.g. I rather expect a 0mq
implementation to be very tricky, at best (vs the RPC style front end
https://wiki.openstack.org/wiki/Marconi/specs/zmq/api/v1 )), and has a
hybrid control/data plane API implementation where one can call into
it to make queues etc, and to consume them.

Those docs refer to a transport driver, not a storage driver. In
Marconi, it's possible to have different protocols on top of the API.
The current one is based on HTTP but there'll likely be others in the
future.

We've changed some things in the API to support amqp based storage drivers.
We had a session during the HKG summit about this and since then, we've
always kept amqp drivers in mind when doing changes on the API. I'm
not saying it's perfect, though.

The API for the queues is very odd from a queueing perspective -
https://wiki.openstack.org/wiki/Marconi/specs/api/v1#Get_a_Specific_Message
- you don't subscribe to the queue, you enumerate and ask for a single
message.

The current way to subscribe to queues is by using polling.
Subscribing is not just tied to the "API" but also to the transport
itself. As mentioned above, we currently just have support for HTTP.

Also, enumerating is not necessary. For instance, claiming with limit
1 will consume one message.
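To make "claiming with limit 1" concrete, here is what a consume-one helper against the v1 HTTP claims endpoint might look like. The endpoint path follows the wiki spec; the helper and the injected transport are hypothetical, so this stays a sketch rather than real Marconi client code:

```python
import json

def claim_one(post, base, queue, ttl=300, grace=60):
    """Consume a single message via POST /v1/queues/{queue}/claims?limit=1.

    `post` is any callable(url, body) -> list of claimed messages, so the
    sketch stays transport-agnostic (a real client would inject an HTTP call).
    `ttl` is the claim lifetime; `grace` extends message life while claimed.
    """
    url = f"{base}/queues/{queue}/claims?limit=1"
    body = json.dumps({"ttl": ttl, "grace": grace})
    msgs = post(url, body)
    return msgs[0] if msgs else None

# Fake transport standing in for an HTTP client:
def fake_post(url, body):
    assert url.endswith("/queues/jobs/claims?limit=1")
    return [{"href": "/v1/queues/jobs/messages/ab01?claim_id=c1",
             "body": {"task": "transcode"}}]

msg = claim_one(fake_post, "http://marconi.example.com/v1", "jobs")
assert msg["body"]["task"] == "transcode"
```

No enumeration is involved: the server hands back at most one message, already claimed.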

(Side note: At the incubation meeting, it was recommended not to put
effort into writing new transports but to stabilize the API and work on
a storage backend with a license != AGPL)

And the implementations in tree are mongodb (which is at best
contentious, due to the AGPL and many folks reasonable concerns about
it), and mysq.

Just to avoid misleading folks that are not familiar with Marconi, I
just want to point out that the driver is based on SQLAlchemy.

My desires around Marconi are:
- to make sure the queue we have is suitable for use by OpenStack
itself: we have a very strong culture around consolidating technology
choices, and it would be extremely odd to have Marconi be something
that isn't suitable to replace rabbitmq etc as the queue abstraction
in the fullness of time.

Although this could be done in the future, I've heard from many folks
in the community that replacing OpenStack's rabbitmq / qpid / etc layer
with Marconi is a no-go. I don't recall the exact reasons now but I
think I can grab them from logs or something (Unless those folks are
reading this email and want to chime in). FWIW, I'd be more than happy
to experiment with this in the future. Marconi is definitely not ready as-is.

  • to make sure that deployers with scale / performance needs can have
    that met by Marconi
  • to make my life easy as a deployer ;)

This has been part of our daily reviews, work and designs. I'm sure
there's room for improvement, though.

So my questions are:
- why isn't the API a queue friendly API (e.g. like

Define queue friendly

https://github.com/twitter/kestrel - kestrel which uses the memcache
API, puts put into the queue, gets get from the queue). The current

I don't know kestrel, but how is this different from what Marconi does?

API looks like pretty much the worst case scenario there - CRUD rather
than submit/retrieve with blocking requests (e.g. longpoll vs poll).

I agree there are some limitations from using HTTP for this job, hence
the support for different transports. Just saying the API is CRUD is
again misleading, and it doesn't highlight the value of having an
HTTP-based transport. It's just wrong to think about Marconi as just
another queuing system instead of considering the use cases it's
trying to solve.

There's rough support for websockets in an external project, but:

  1. It's not official... yet.
  2. It was written as a proof of concept for the transport layer.
  3. It likely needs to be updated.

https://github.com/FlaPer87/marconi-websocket

  • wouldn't it be better to expose other existing implementations of
    HTTP message queues like nova does with hypervisors, rather than
    creating our own one? E.g. HTTPSQS, RestMQ, Kestrel, queues.io.

We've discussed having support for API extensions in order to allow
some deployments to expose features from a queuing technology that we
don't necessarily consider part of the core API.

  • or even do what Trove does and expose the actual implementation directly?

    • what's the plan to fix the API?

Fix the API?

For starters, moving away from a data API to a provisioning API (or to
just exposing the queuing technologies' features) would not be fixing
it; it'd be re-writing Marconi (or actually a brand new project).

  • is there a plan / desire to back onto actual queue services (e.g.
    AMQP, any of the HTTP ones above, etc.)?

We have a plan to support an AMQP storage driver. It was delayed to focus on
the graduation requirements, but we've already made some changes in the
API in order to improve support for this type of storage.

https://blueprints.launchpad.net/marconi/+spec/storage-amqp

  • what is the current performance - how many usecs does it take to
    put a message, and get one back, in real world use? How many
    concurrent clients can a single Marconi API server with one backing
    server deliver today?

I don't have the results of the last bench test we did but I'm sure
other folks can provide them. It's not as fast as qpid's or rabbit's.
I don't think the HTTP API driver will be.

As background, 'implement a message queue in a SQL DB' is such a
horrid antipattern it's been a standing joke in many organisations I've
been in - and yet we're preparing to graduate exactly that, which is
frankly perplexing.

TBH, I could say the exact same thing about some of the supported drivers
that exist in some of the integrated projects and yet, they're
integrated. This comment was not necessary, and it's quite misleading
for folks that are not familiar with Marconi. The concerns about the
sqlalchemy driver could've been expressed differently.

FWIW, I think there's value in having an sqlalchemy driver. It's
helpful for newcomers, it integrates perfectly with the gate, and I
don't want to impose on other folks what they should or shouldn't use in
production. Marconi may be providing a data API, but it's still
non-opinionated and it wants to support other drivers - or at least provide
a nice way to implement them. Working on sqlalchemy instead of amqp (or
redis) was decided in the incubation meeting.

But again, it's an optional driver that we're talking about here. As
of now, our recommended driver is the MongoDB one and, as I already mentioned
in this email, we'll start working on an AMQP one, which will likely
become the recommended one. There's also support for redis.

As already mentioned, we have plans to complete the redis driver and
write an AMQP-based one and let them both live in the code base.
Having support for different storage drivers makes Marconi's sharding
feature more valuable.

Side Note:

When I say "this was decided in the incubation meeting" I'm not
blaming the meeting nor the TC. What I mean there is that it was
considered, at that point, to be the best thing to have in the
immediate future.

Cheers,
Flavio

--
@flaper87
Flavio Percoco

responded Mar 19, 2014 by Flavio_Percoco (36,960 points)   3 8 11
0 votes

Flavio Percoco wrote:
On 19/03/14 10:17 +1300, Robert Collins wrote:

My desires around Marconi are:
- to make sure the queue we have is suitable for use by OpenStack
itself: we have a very strong culture around consolidating technology
choices, and it would be extremely odd to have Marconi be something
that isn't suitable to replace rabbitmq etc as the queue abstraction
in the fullness of time.

Although this could be done in the future, I've heard from many folks
in the community that replacing OpenStack's rabbitmq / qpid / etc layer
with Marconi is a no-go. I don't recall the exact reasons now but I
think I can grab them from logs or something (Unless those folks are
reading this email and want to chime in). FWIW, I'd be more than happy
to experiment with this in the future. Marconi is definitely not ready
as-is.

That's the root of this thread. Marconi is not really designed to cover
Robert's use case, which would be to be consumed internally by OpenStack
as a message queue.

I classify Marconi as an "application building block" (IaaS+), a
convenient, SQS-like way for cloud application builders to pass data
around without having to spin up their own message queue in a VM. I
think that's a relevant use case, as long as performance is not an order
of magnitude worse than the "spin up your own in a VM" alternative.
Personally I don't consider "serving the internal needs of OpenStack" as
a feature blocker. It would be nice if it could, but the IaaS+ use case
is IMHO compelling enough.

--
Thierry Carrez (ttx)


responded Mar 19, 2014 by Thierry_Carrez (57,480 points)   3 8 13
0 votes

On Wed, 2014-03-19 at 10:17 +1300, Robert Collins wrote:
So this came up briefly at the tripleo sprint, and since I can't seem
to find a /why/ document
(https://wiki.openstack.org/wiki/Marconi/Incubation#Raised_Questions_.2B_Answers
and https://wiki.openstack.org/wiki/Marconi#Design don't supply this)

I think we need a slight reset on this discussion. The way this email
was phrased gives a strong sense of "Marconi is a dumb idea, it's going
to take a lot to persuade me otherwise".

That's not a great way to start a conversation, but it's easy to
understand - a TC member sees a project on the cusp of graduating and,
when they finally get a chance to look closely at it, a number of things
don't make much sense. "Wait! Stop! WTF!" is a natural reaction if you
think a bad decision is about to be made.

We've all got to understand how pressurized a situation these graduation
and incubation discussions are. Projects put an immense amount of work
into proving themselves worthy of being an integrated project, they get
fairly short bursts of interaction with the TC, TC members aren't
necessarily able to do a huge amount of due diligence in advance and yet
TC members are really, really keen to avoid either undermining a healthy
project around some cool new technology or undermining OpenStack by
including an unhealthy project or sub-par technology.

And then there's the time pressure where a decision has to be made by a
certain date and if that decision is "not this time", the six months
delay until the next chance for a positive decision can be really
draining on motivation and momentum when everybody had been so focused
on getting a positive decision this time around.

We really need cool heads here and, above all, to try our best to assume
good faith, intentions and ability on both sides.

Some of the questions Robert asked are common questions and I know they
were discussed during the incubation review. However, the questions
persist and it's really important that TC members (and the community at
large) feel they can stand behind the answers to those questions. If I'm
chatting to someone and they ask me "why does OpenStack need to
implement its own messaging broker?", I need to have a good answer.

How about we do our best to put the implications for the graduation
decision aside for a bit and focus on collaboratively pulling together a
FAQ that everyone can buy into? The "raised questions and answers"
section of the incubation review linked above is a good start, but I
think we can take this email as feedback that those questions and
answers need much improvement.

This could be a good pattern for all new projects - if the TC and the
new project can't work together to draft a solid FAQ like this, then
it's not a good sign for the project.

See below for my attempt to summarize the questions and how we might go
about answering them. Is this a reasonable start?

Mark.

Why isn't Marconi simply an API for provisioning and managing AMQP, Kestrel,
ZeroMQ, etc. brokers and queues? Why is a new broker implementation needed?

=> I'm not sure I can summarize the answer here - the need for an HTTP data
plane API, the need for multi-tenancy, etc.? Maybe a table listing the
required features and whether they're provided by these existing solutions.

Maybe there's also an element of "we think we can do a better job". If so,
the point probably worth addressing is "OpenStack shouldn't attempt to write
a new database, or a new hypervisor, or a new SDN controller, or a new block
storage implementation ... so why should we implement a new message
broker?" If this is just a bad analogy, explain why.

Implementing a message queue using an SQL DB seems like a bad idea, why is
Marconi doing that?

=> Perhaps explain why MongoDB is a good storage technology for this use case
and the SQLalchemy driver is just a toy.

Marconi's default driver depends on MongoDB which is licensed under the AGPL.
This license is currently a no-go for some organizations, so what plans does
Marconi have to implement another production-ready storage driver that supports
all API features?

=> Discuss the Redis driver plans?

Is Marconi designed to be suitable for use by OpenStack itself?

=> Discuss that it's not currently in scope and why not. In what way does the
OpenStack use case differ from the applications Marconi's current API
focused on?

How should a client subscribe to a queue?

=> Discuss that it's not by GET /messages but instead POST /claims?limit=N
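The "subscribe via POST /claims?limit=N" answer above amounts to a polling loop. A minimal sketch - all callables are injected stand-ins, not a real Marconi client:

```python
import time

def subscribe(claim_batch, handle, delete, interval=1.0, max_polls=None):
    """Emulate 'subscribe' on a claim-based API by polling for claims.

    claim_batch() -> list of messages (e.g. a POST /claims?limit=N call),
    handle(msg) processes one message, delete(msg) acks it. All three are
    injected so the sketch stays independent of any real client library.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        for msg in claim_batch():
            handle(msg)
            delete(msg)   # ack: remove so no other worker reprocesses it
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)

# Drive the loop with a fake in-memory backlog instead of HTTP:
seen, store = [], [{"id": 1}, {"id": 2}]
subscribe(
    claim_batch=lambda: [store.pop(0)] if store else [],
    handle=seen.append,
    delete=lambda m: None,
    interval=0.0,
    max_polls=3,
)
assert [m["id"] for m in seen] == [1, 2]
```

A longpoll transport would replace the sleep with a blocking claim call, which is exactly the gap Robert's longpoll-vs-poll point is about.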

responded Mar 19, 2014 by Mark_McLoughlin (5,180 points)   1 4 6
0 votes

On 03/19/2014 07:49 AM, Thierry Carrez wrote:
Flavio Percoco wrote:

On 19/03/14 10:17 +1300, Robert Collins wrote:

My desires around Marconi are: - to make sure the queue we have
is suitable for use by OpenStack itself: we have a very strong
culture around consolidating technology choices, and it would
be extremely odd to have Marconi be something that isn't
suitable to replace rabbitmq etc as the queue abstraction in
the fullness of time.

Although this could be done in the future, I've heard from many
folks in the community that replacing OpenStack's rabbitmq / qpid
/ etc layer with Marconi is a no-go. I don't recall the exact
reasons now but I think I can grab them from logs or something
(Unless those folks are reading this email and want to chime in).
FWIW, I'd be more than happy to experiment with this in the
future. Marconi is definitely not ready as-is.

That's the root of this thread. Marconi is not really designed to
cover Robert's use case, which would be to be consumed internally
by OpenStack as a message queue.

I classify Marconi as an "application building block" (IaaS+), a
convenient, SQS-like way for cloud application builders to pass
data around without having to spin up their own message queue in a
VM. I think that's a relevant use case, as long as performance is
not an order of magnitude worse than the "spin up your own in a VM"
alternative. Personally I don't consider "serving the internal
needs of OpenStack" as a feature blocker. It would be nice if it
could, but the IaaS+ use case is IMHO compelling enough.

This is my view, as well. I never considered replacing OpenStack's
current use of messaging within the scope of Marconi.

It's possible we could have yet another project that is a queue
provisioning project in the style of Trove. I'm not sure that
actually makes sense (an application template you can deploy may
suffice here). In any case, I view OpenStack's use case, and anyone
wanting to use qpid/rabbit/whatever directly, as separate and out of scope
for Marconi.

--
Russell Bryant

responded Mar 19, 2014 by Russell_Bryant

On 20 March 2014 01:06, Mark McLoughlin wrote:

I think we need a slight reset on this discussion. The way this email
was phrased gives a strong sense of "Marconi is a dumb idea, it's going
to take a lot to persuade me otherwise".

Thanks Mark, that's a great point to make. I don't think Marconi is
dumb, but I sure don't understand why <list of things discussed, and
that you've very nicely rephrased here>. Thank you!

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

responded Mar 19, 2014 by Robert_Collins

Let me start by saying that I want there to be a constructive discussion
around all this. I've done my best to keep my tone as non-snarky as I could
while still clearly stating my concerns. I've also spent a few hours
reviewing the current code and docs. Hopefully this contribution will be
beneficial in helping the discussion along.

For what it's worth, I don't have a clear understanding of why the Marconi
developer community chose to create a new queue rather than an abstraction
layer on top of existing queues. While my lack of understanding there isn't
a technical objection to the project, I hope they can address this in the
aforementioned FAQ.

The reference storage implementation is MongoDB. AFAIK, no integrated
projects require an AGPL package to be installed, and from the discussions
I've been part of, that would be a show-stopper if Marconi required
MongoDB. As I understand it, this is why sqlalchemy support was required
when Marconi was incubated. Saying "Marconi also supports SQLA" is
disingenuous because it is a second-class citizen with incomplete API
support, is clearly not the recommended storage driver, and is going to be
unusable at scale (I'll come back to this point in a bit).

Let me ask this. Which back-end is tested in Marconi's CI? That is the
back-end that matters right now. If that's Mongo, I think there's a
problem. If it's SQLA, then I think Marconi should declare any features
which SQLA doesn't support to be optional extensions, make SQLA the
default, and clearly document how to deploy Marconi at scale with a SQLA
back-end.

Then there's the db-as-a-queue antipattern, and the problems that I have
seen result from this in the past... I'm not the only one in the OpenStack
community with some experience scaling MySQL databases. Surely others have
their own experiences and opinions on whether a database (whether MySQL or
Mongo or Postgres or ...) can be used in such a way at scale and not fall
over from resource contention. I would hope that those members of the
community would chime into this discussion at some point. Perhaps they'll
even disagree with me!

A quick look at the code around claim (which, it seems, will be the most
commonly requested action) shows why this is an antipattern.

The MongoDB storage driver for claims requires four queries just to get a
message, with a serious race condition (but at least it's documented in the
code) if multiple clients are claiming messages in the same queue at the
same time. For reference:

https://github.com/openstack/marconi/blob/master/marconi/queues/storage/mongodb/claims.py#L119
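
To make the check-then-act race concrete, here's a toy in-memory model (emphatically not Marconi's driver code) of two clients interleaving a "find unclaimed" query with a separate "mark claimed" update:

```python
# Toy illustration (not Marconi's actual driver) of the claim race:
# "find unclaimed messages" and "mark them claimed" are separate steps,
# so a second client can slip in between them.

class ToyStore:
    def __init__(self):
        self.messages = [{"id": i, "claim": None} for i in range(3)]

    def find_unclaimed(self, limit):
        return [m for m in self.messages if m["claim"] is None][:limit]

    def mark_claimed(self, msgs, claimant):
        for m in msgs:
            m["claim"] = claimant

store = ToyStore()

# Client A runs its query first...
a_found = store.find_unclaimed(limit=3)
# ...client B runs both steps before A's update lands...
b_found = store.find_unclaimed(limit=3)
store.mark_claimed(b_found, "client-b")
# ...then A's stale update overwrites B's claims.
store.mark_claimed(a_found, "client-a")

# Both clients now believe they own the same three messages.
overlap = [m["id"] for m in a_found if m in b_found]
```

An atomic per-message test-and-set (which is what MongoDB's findAndModify provides) closes that window, but it has to be done in one operation, not spread across several queries.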

The SQLAlchemy storage driver is no better. It's issuing five queries
just to claim a message (including a query to purge all expired claims
every time a new claim is created). The performance of this transaction
under high load is probably going to be bad...

https://github.com/openstack/marconi/blob/master/marconi/queues/storage/sqlalchemy/claims.py#L83
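
For contrast, a single-statement test-and-set keeps the claim atomic within a transaction. A minimal sketch using SQLite (not Marconi's SQLAlchemy driver; note that some MySQL versions reject LIMIT inside an IN subquery, so a real driver would need a different formulation there):

```python
# Minimal sketch of an atomic claim as one UPDATE. Not Marconi's code;
# SQLite is used here only because it's in the stdlib.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT, claim_id TEXT)")
conn.executemany("INSERT INTO messages (body) VALUES (?)",
                 [("m1",), ("m2",), ("m3",)])

def claim(conn, claimant, limit):
    # Mark up to `limit` unclaimed rows in one statement; the inner
    # "claim_id IS NULL" check is the test-and-set.
    conn.execute(
        "UPDATE messages SET claim_id = ? WHERE id IN "
        "(SELECT id FROM messages WHERE claim_id IS NULL LIMIT ?)",
        (claimant, limit))
    return conn.execute(
        "SELECT id, body FROM messages WHERE claim_id = ?",
        (claimant,)).fetchall()

first = claim(conn, "worker-a", 2)   # claims two messages
rest = claim(conn, "worker-b", 5)    # only one message left to claim
```

Even this still leaves out claim expiry and the purge-on-create behavior noted above, which is where the extra queries come from.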

Lastly, it looks like the Marconi storage drivers assume the storage
back-end to be infinitely scalable. AFAICT, the mongo storage driver
supports mongo's native sharding -- which I'm happy to see -- but the SQLA
driver does not appear to support anything equivalent for other back-ends,
e.g. MySQL. This relegates any deployment using the SQLA backend to the
scale of "only what one database instance can handle". It's unsuitable for
any large-scale deployment. Folks who don't want to use Mongo are likely to
use MySQL and will be promptly bitten by Marconi's lack of scalability with
this back end.

While there is a lot of room to improve the messaging around what/how/why,
and I think a FAQ will be very helpful, I don't think that Marconi should
graduate this cycle because:
(1) support for a non-AGPL-backend is a legal requirement [*] for Marconi's
graduation;
(2) deploying Marconi with sqla+mysql will result in an incomplete and
unscalable service.

It's possible that I'm wrong about the scalability of Marconi with sqla +
mysql. If anyone feels that this is going to perform blazingly fast on a
single mysql db backend, please publish a benchmark and I'll be very happy
to be proved wrong. To be meaningful, it must have a high concurrency of
clients creating and claiming messages with (num queues) << (num clients)
<< (num messages), and all clients polling on a reasonably short interval,
based on whatever the recommended client-rate-limit is. I'd like the test
to be repeated with both Mongo and SQLA back-ends on the same hardware for
comparison.
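
A harness with that shape might look like the following sketch. The backend here is just an in-memory stand-in (`queue.Queue`) to show the workload parameters; a real run would point the client loop at Marconi's HTTP API and record latency per claim:

```python
# Sketch of the benchmark shape described above: few queues, many clients,
# many messages. The in-memory backend is a placeholder, not a real target.
import queue
import threading
import time

def run_benchmark(n_queues=2, n_clients=16, n_messages=2000):
    # Workload shape: (num queues) << (num clients) << (num messages).
    assert n_queues < n_clients < n_messages
    queues = [queue.Queue() for _ in range(n_queues)]
    for i in range(n_messages):
        queues[i % n_queues].put("msg-%d" % i)

    claimed = []
    lock = threading.Lock()

    def client(idx):
        q = queues[idx % n_queues]    # many clients contend on few queues
        while True:
            try:
                msg = q.get_nowait()  # stand-in for POST /claims
            except queue.Empty:
                return
            with lock:
                claimed.append(msg)

    start = time.time()
    workers = [threading.Thread(target=client, args=(i,))
               for i in range(n_clients)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return len(claimed), time.time() - start

total, elapsed = run_benchmark()
```

The interesting numbers come from swapping the stand-in for the Mongo and SQLA back ends on identical hardware and comparing claim throughput as client count grows.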

Regards,
Devananda

[*]
https://wiki.openstack.org/wiki/Marconi/Incubation/Graduation#Legal_requirements

responded Mar 19, 2014 by Devananda_van_der_Ve

Can someone please give more detail into why MongoDB being AGPL is a problem? The drivers that Marconi uses are Apache2 licensed, MongoDB is separated by the network stack and MongoDB is not exposed to the Marconi users so I don't think the 'A' part of the GPL really kicks in at all since the MongoDB "user" is the cloud provider, not the cloud end user?

Thanks,
Kevin



responded Mar 19, 2014 by Fox,_Kevin_M

On 03/19/2014 02:24 PM, Fox, Kevin M wrote:
Can someone please give more detail into why MongoDB being AGPL is a
problem? The drivers that Marconi uses are Apache2 licensed, MongoDB is
separated by the network stack and MongoDB is not exposed to the Marconi
users so I don't think the 'A' part of the GPL really kicks in at all
since the MongoDB "user" is the cloud provider, not the cloud end user?

Even if MongoDB was exposed to end-users, would that be a problem?

Obviously the source to MongoDB would need to be made available
(presumably it already is) but does the AGPL licence "contaminate" the
Marconi stuff? I would have thought that would fall under "mere
aggregation".

Chris

responded Mar 19, 2014 by Chris_Friesen
...