Scaling RabbitMQ
Jared W. Robinson
Vivint, Inc.
Who I am
- Father of five awesome children
- Enjoy programming and solving problems.
- Love Linux, and have been running it since 1995, currently using Ubuntu
- Enjoy working at Vivint, helping to scale our server-side infrastructure, which serves our customers and our security panels.
What I believe
- Use good tools
- Use what's available, practical
- Try it, test it, benchmark it. If one idea doesn't work, get feedback, and try something else. Fail fast.
What's covered: 1 of 2
- Basics
- Work queues, then and now
- Latency, then and now
- Speed & Throughput
- Fail-over
What's covered: 2 of 2
- limits.conf
- Transient vs durable trade-offs
- Backpressure and RabbitMQ death
- Grouping messages
- Resources
What's not covered
- Publish/Subscribe, Topics, RPC
- Alternate technologies:
Notes
Excellent summary of the differences and trade-offs between RabbitMQ and Apache Kafka by Stuart Charlton, of Pivotal Software: http://www.quora.com/Which-one-is-better-for-durable-messaging-with-good-query-features-RabbitMQ-or-Kafka
a) Use Kafka if you have a fire hose of events (100k+/sec) you need delivered in partitioned order 'at least once' with a mix of online and batch consumers, you want to be able to re-read messages, you can deal with current limitations around node-level HA (or can use trunk code), and/or you don't mind supporting incubator-level software yourself via forums/IRC.
b) Use Rabbit if you have messages (20k+/sec) that need to be routed in complex ways to consumers, you want per-message delivery guarantees, you don't care about ordered delivery, you need HA at the cluster-node level now, and/or you need 24x7 paid support in addition to forums/IRC.
Basics
- What is Rabbit, and how does it help?
- Clients: Java, .NET, C/C++, PHP, Python, Ruby, Go, etc.
- Publishers, Brokers, Consumers and mortgages
- Post-offices and messages: broadcast, direct, handlers
- Message, exchange, queue, and binding
- Routing key and "earnest money"
- Connection, Channel
- Work queue & direct exchange
Notes
- Rabbit's AMQP concepts: https://www.rabbitmq.com/tutorials/amqp-concepts.html
- Glossary of terms: http://pythonhosted.org/nucleon.amqp/glossary.html
In the beginning
- Work queues
- Round-robin distribution :( -- more on this later
- Durability & Persistence
Notes
- **** ASK: Who uses RabbitMQ?
- **** ASK Who uses work queues? What are they for?
- Why is round-robin distribution bad?
- Which part of the system should ideally persist messages? As much as possible, the original producers, or the leaves in the graph.
Did you know?
- Good book: RabbitMQ in Action
- Practical advice, born from experience
- Discusses High-Availability Pitfalls & Solutions
- especially with regard to durable queues
Latency
- Delivery by carrier pigeon isn't good enough. Nor do we want delivery by cargo-ship
- "Latency is the root of all that is evil on the Internet... or so the saying goes." -- Theo Schlossnagle
- In the beginning: Round-robin distribution & slow workers
- Database: Contention, Slow queries, Expensive queries, Non-indexed queries
- CPU contention and VMs on the same hypervisor
- Slow third-party services
Latency & Prefetch
- Prefetch protects us against slow consumers
- Declared by the client, at channel creation time
conn = amqp.Connection("my.broker.com:5672", ...)
channel = conn.channel()
channel.basic_qos(
prefetch_size=0,
prefetch_count=10,
a_global=False)
channel.basic_consume("weather", ...)
Notes
import amqplib.client_0_8 as amqp
conn = amqp.Connection("my.broker.com:5672", "myuser", "mypassword")
channel = conn.channel()
channel.basic_qos(prefetch_size=0, prefetch_count=10, a_global=False)
channel.basic_consume("weather", no_ack=False, callback=my_callback, consumer_tag=CONSUMER_TAG_CONST)
Did you know?
- When publishing to an exchange that isn't bound to a queue, there are no errors from the client perspective.
Throughput & Speed
- "Fast hands make light work"?
- Our max speed is ~10K messages/sec (w/o HiPE)
- HiPE
- 20-50% better performance
- not available on Windows
- Experimental. Disable if it segfaults.
- not available with RHEL/CentOS erlang
- Speed of hardware
- Is it running in VM? CPU/RAM contention.
- Do too many consumers slow it down?
I benchmarked the maximum throughput of our hardware around 10,000
messages/second for non-durable queues, non-persistent messages. Each
message was less than 4K. YMMV.
Throughput needs
- As of March, 2015:
- 50 user-initiated commands/second
- 1,500 inbound signals/second
- 7,000 messages/second on our busiest queue
- Doubling this year
- Need headroom for spikes and a buffer for growth beyond this year
- For coming year:
- 21,000 messages/second on our busiest queue
- Even with HiPE, a single work queue isn't sufficient
Throughput via sharding
"Many hands make light work"
Throughput: Are you my solution?
- rabbitmq-consistent-hash-exchange
- rabbitmq-sharding
- client-side, shared nothing
- Federated Queues
- Try it, test it, benchmark it. Fail fast & move on.
Consistent Hash Exchange
- Plugin ships with Rabbit!
- Gracefully handles the disappearance of a queue on a crashed rabbit node
- We determine link between the exchange and downstream queues (where queues live, how many there are, etc.)
- A single RabbitMQ management console shows the entire cluster!
Notes
- rabbitmq-plugins enable rabbitmq_consistent_hash_exchange
- Use in combination with RabbitMQ clustering.
- https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange
- http://rabbitmq.1065348.n5.nabble.com/Unexpected-Behavior-When-Using-the-quot-X-Consistent-Hash-quot-Exchange-Type-td30561.html
Consistent Hash Exchange configuration
$ umask 0022 -- important if you're doing this as root
$ rabbitmq-plugins enable rabbitmq_consistent_hash_exchange
The following plugins have been enabled:
rabbitmq_consistent_hash_exchange
Plugin configuration has changed. Restart RabbitMQ for changes to take effect.
$ service rabbitmq-server restart
$ /usr/sbin/rabbitmq-plugins list -E
Consistent Hash Exchange configuration
- Declare the exchange as type "x-consistent-hash"
- Bind the exchange to downstream queues (you decide where queues live, how many there are, etc.)
- When binding the queue to the exchange, the routing key must be a number-as-a-string
- Insufficient: "2", Better: "101"
- Producer must vary the per-message routing key -- random, or an id
- Not round-robin -- not an even distribution
- Don't forget cluster port and firewall configuration
Notes
- Gotchas of using the exchange: http://rabbitmq.1065348.n5.nabble.com/Unexpected-Behavior-When-Using-the-quot-X-Consistent-Hash-quot-Exchange-Type-td30561.html
- Port and firewall config: http://www.gettingcirrius.com/2013/01/configuring-iptables-for-rabbitmq.html
Documentation on the routing key:
The more points in the hash space each binding has, the closer the actual distribution will be to the desired distribution (as indicated by the ratio of points by binding). However, large numbers of points (many thousands) will substantially decrease performance of the exchange type.
Equally, it is important to ensure that the messages being published to the exchange have a range of different routing_keys: if a very small set of routing keys are being used then there's a possibility of messages not being evenly distributed between the various queues. If the routing key is a pseudo-random session ID or such, then good results should follow.
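The quoted behaviour can be illustrated with a small stand-alone simulation (a sketch of the idea only, not the plugin's actual Erlang implementation; the queue names and point-placement scheme are invented): each binding contributes as many ring points as its binding key specifies, and a message is delivered to the queue that owns the next point at or after the message's hash.

```python
import hashlib
import random
from bisect import bisect_left
from collections import Counter

def build_ring(queues, points_per_binding):
    """Place points_per_binding pseudo-random points on the ring for each
    queue, mimicking how a binding key of "101" adds 101 points."""
    ring = []
    for q in queues:
        for i in range(points_per_binding):
            h = int(hashlib.md5(f"{q}:{i}".encode()).hexdigest(), 16)
            ring.append((h, q))
    ring.sort()
    return ring

def route(ring, routing_key):
    """Deliver to the queue owning the first ring point at or after the
    message's hash, wrapping around at the top of the ring."""
    h = int(hashlib.md5(routing_key.encode()).hexdigest(), 16)
    idx = bisect_left(ring, (h,)) % len(ring)
    return ring[idx][1]

def distribution(points_per_binding, n_messages=10000):
    """Publish n_messages with pseudo-random routing keys and count
    how many land on each of four queues."""
    ring = build_ring(["q1", "q2", "q3", "q4"], points_per_binding)
    rng = random.Random(42)
    return Counter(route(ring, str(rng.random())) for _ in range(n_messages))

print(distribution(2))    # few points per binding: arcs differ wildly in size
print(distribution(101))  # many points per binding: far more even
```

With only a couple of points per binding, the four queues carve the ring into a handful of arcs of very different sizes, which is exactly the kind of uneven distribution the slides show; hundreds of points per binding smooth it out, at some cost in exchange performance.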
Consistent Hash Exchange Uneven Distribution
Notes
- The binding key for the queues to the exchange was "2"
Consistent Hash Exchange Uneven Distribution
Notes
- The binding key for the queues to the exchange was "10"
Consistent Hash Exchange Caveats
- Unlike a simple work-queue, requires differing routing keys -- random, or use an id
- Definitely consistent, but not an even distribution
- Has a chance of losing messages when unbinding or deleting a queue
- There's a window of time between determining which queue a message should be sent to and actually publishing it
- Clients must be configured to evenly connect to the queue shards
- Try it, test it, benchmark it. Fail fast & move on.
rabbitmq-sharding
- Built by the Pivotal/RabbitMQ folks
- Based on the consistent hash exchange
- Auto-creates and binds the queues to the exchange!
- Transparently auto-connects client to the shard with the least number of consumers!
- Doesn't ship with Rabbit
- https://github.com/rabbitmq/rabbitmq-sharding
https://github.com/rabbitmq/rabbitmq-sharding/blob/master/README.extra.md
rabbitmq-sharding configuration
Download from https://www.rabbitmq.com/community-plugins/v3.3.x/
cp rabbitmq_sharding-3.3.x.ez \
/usr/lib/rabbitmq/lib/rabbitmq_server-3.3.5/plugins/
rabbitmq-plugins enable rabbitmq_sharding
rabbitmq-plugins enable rabbitmq_consistent_hash_exchange
service rabbitmq-server restart
rabbitmqctl set_policy history-shard "^history" \
'{"shards-per-node": 2, "routing-key": "1234"}' \
--apply-to exchanges
rabbitmq-sharding client usage
- Clients declare the exchange type as "x-modulus-hash"
- Even message distribution when using a uniformly random routing key
- Use same routing key, and all messages will go to a single queue shard
- When nodes appear and disappear, the exchange automatically creates queue shards on the new node.
- It also re-creates shards that have been deleted from a node.
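The "x-modulus-hash" idea behind these bullets can be sketched stand-alone (this mimics the concept, not the plugin's actual hash function; the shard count and keys are made up): hash the routing key and take it modulo the number of queue shards.

```python
import hashlib
import uuid
from collections import Counter

def shard_for(routing_key, n_shards):
    """Pick a shard index: hash of the routing key, modulo the shard
    count -- the idea behind the "x-modulus-hash" exchange type."""
    digest = hashlib.sha256(routing_key.encode()).hexdigest()
    return int(digest, 16) % n_shards

# Uniformly random routing keys spread messages across all shards...
random_counts = Counter(shard_for(uuid.uuid4().hex, 4) for _ in range(4000))

# ...while a constant routing key pins every message to one shard.
fixed_counts = Counter(shard_for("same-key", 4) for _ in range(4000))
```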
rabbitmq-sharding caveats
- Auto-creates and binds the queues to the exchange!
- Must decide how many queue shards per node. Want 3 queue shards on a four-node Rabbit cluster? Sorry.
- Transparently auto-connects client to the shard with the least number of consumers!
- Only on a per node basis, not per cluster
- When one node fills up, producers block, even though another node may be ready to accept messages
- Has a chance of losing messages when unbinding or deleting a queue
- Try it, test it, benchmark it. Fail fast & move on.
Did you know?
- "Each connection uses a file descriptor on the server. Channels don't"
- Publishing a large message on one channel will block a connection while it goes out.
- "it's a good idea to separate publishing and consuming connections"
Source: http://stackoverflow.com/questions/18531072/rabbitmq-by-example-multiple-threads-channels-and-queues
Client-side sharding
- Doesn't ship with Rabbit -- more work to implement
- Shared-nothing RabbitMQ nodes
- Homegrown monitoring to combine metrics from multiple servers
- Implement once for each language we use: Python, Java
- Gives us fail-over!
- Must balance consumers among the shards
Client-side sharding
Producers, Consumers, YAML
- Producers: distribute messages round-robin
- Producers: auto-detect node failure and reconnect after a period of time
- Consumers: Choosing a shard to read from didn't give us an even distribution
- Consumers: "sticky" -- no fail-over to a separate shard
- need extra consumers to handle the failure of a single node
- Configuration is done via YAML
- Every producer knows about every shard
- Update and deploy using salt-cp
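A minimal sketch of the producer behaviour in the bullets above (all names here are invented, and real publishing over amqplib is stubbed behind publish_fn): round-robin across the shards listed in configuration, mark a failed shard dead, and retry it only after a cool-down period.

```python
import time

class ShardedProducer:
    """Round-robin publisher over shared-nothing RabbitMQ shards.
    Failed shards are skipped until retry_after seconds have passed."""

    def __init__(self, shards, publish_fn, retry_after=30.0, clock=time.monotonic):
        self.shards = list(shards)        # e.g. host:port strings from YAML
        self.publish_fn = publish_fn      # publish_fn(shard, message); raises on failure
        self.retry_after = retry_after
        self.clock = clock
        self.dead_until = {}              # shard -> timestamp when it may be retried
        self.next_idx = 0

    def publish(self, message):
        now = self.clock()
        for _ in range(len(self.shards)):
            shard = self.shards[self.next_idx]
            self.next_idx = (self.next_idx + 1) % len(self.shards)
            if self.dead_until.get(shard, 0) > now:
                continue                  # shard is still in its cool-down window
            try:
                self.publish_fn(shard, message)
                return shard
            except ConnectionError:
                self.dead_until[shard] = now + self.retry_after
        raise RuntimeError("all shards unavailable")
```

A consumer-side analogue stays pinned to one shard ("sticky"), which is why extra consumer capacity is needed to absorb the failure of a single node.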
Client-side sharding
Benefits
- If a node dies, we don't have to remove it from the cluster
- Durable queues work well in the face of node failure (shared nothing)
- Mix-and-match rabbit versions (atypical)
Client-side sharding monitoring
Homegrown monitoring
Client-side sharding Caveats
- More work to implement and monitor
- Producers still block when one of the nodes exerts backpressure (flow control)
Federated Queues
- Distribute "the same 'logical' queue... over many brokers."
- Configured via policy
- Mix-and-match RabbitMQ versions
- Performs "best when there is some degree of locality...."
Notes
https://www.rabbitmq.com/federated-queues.html
Pitfall: "cannot currently cause messages to traverse multiple hops between brokers based solely on need for messages in one place. For example, if you federate queues on nodes A, B and C, with A and B connected and B and C connected, but not A and C, then if messages are available at A and consumers waiting at C then messages will not be transferred from A to C via B unless there is also a consumer at B."
Federated Queues
Better together?
- Combine with rabbit sharding plugins or with client-side sharding
- Keep queues drained
Did you know?
- Client can publish to an exchange that doesn't exist, without an error.
Default limits.conf
Default (insufficient) limits.conf for RHEL/CentOS/Fedora
* soft nproc 1024 # threads/processes
* soft nofile 1024 # Number of open files
* hard nofile 4096 # Number of open files
Notes
See /etc/security/limits.d/90-nproc.conf and /etc/security/limits.conf
I don't see "nofile" in any configuration on RHEL/CentOS
limits.conf for RabbitMQ
Here's what I'm using for /etc/security/limits.conf
rabbitmq soft nproc 16384
rabbitmq hard nofile 16000
rabbitmq soft nofile 16000
Did you know?
- A client can't read from a queue that doesn't exist. It causes an error.
- Queues balloon in size when messages keep arriving and no consumers are listening
- Use message TTL
- Consider combining TTL with a dead letter exchange
- Dead-letter exchange can be configured by clients, or via RabbitMQ policy
- rabbitmqctl set_policy DLX ".*" '{"dead-letter-exchange":"my-dlx"}' --apply-to queues
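Clients can set both TTL and the dead-letter exchange at queue-declare time. A sketch of the argument dict (x-message-ttl and x-dead-letter-exchange are RabbitMQ's standard optional queue arguments; the queue and exchange names are examples):

```python
# Expire messages after 60 seconds and route expired or rejected messages
# to the dead-letter exchange "my-dlx" (an example name).
queue_args = {
    "x-message-ttl": 60000,             # TTL in milliseconds
    "x-dead-letter-exchange": "my-dlx",
}

# With amqplib this dict would be passed as the `arguments` parameter:
#   channel.queue_declare(queue="events", durable=True, arguments=queue_args)
# The policy-based alternative from the slide needs no client changes:
#   rabbitmqctl set_policy DLX ".*" '{"dead-letter-exchange":"my-dlx"}' --apply-to queues
```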
Notes
- http://www.rabbitmq.com/ttl.html
- https://www.rabbitmq.com/dlx.html
Fail-over
- What happens to your system when a rabbit node dies?
- Load balancer
- Client-side fail-over
- Consumer capacity
- Note: Durable queues can't be re-created on a separate node in the cluster
Transient vs Durable
- Opinion: The best place to persist messages is at the origin
- Origin can resend when not acknowledged (publisher confirms)
- Transient is faster than durable, although durable survives RabbitMQ restarts
- RabbitMQ is fastest when few or no messages are in the queue
- RabbitMQ stores transient messages to disk when the queue balloons in size
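The "persist at the origin" pattern can be sketched like this (a stand-alone illustration with invented names; the broker, channel, and confirm callbacks are abstracted behind send_fn and handle_confirm): keep every message in a local pending store until the broker confirms it, and resend whatever is still unconfirmed.

```python
class ConfirmingPublisher:
    """Keep unacknowledged messages in a local pending store and resend
    them, so durability lives at the origin rather than in the broker."""

    def __init__(self, send_fn):
        self.send_fn = send_fn       # send_fn(seq, message); broker acks later
        self.pending = {}            # seq -> message, kept until confirmed
        self.seq = 0

    def publish(self, message):
        self.seq += 1
        self.pending[self.seq] = message   # remember before sending
        self.send_fn(self.seq, message)
        return self.seq

    def handle_confirm(self, seq):
        self.pending.pop(seq, None)  # broker confirmed; safe to forget

    def resend_unconfirmed(self):
        for seq, message in sorted(self.pending.items()):
            self.send_fn(seq, message)
```

Combined with publisher confirms, this lets producers use fast transient queues while still being able to replay anything the broker lost.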
Did you know?
- A queue can be bound to multiple exchanges, or the same exchange, multiple times, with different binding keys.
Backpressure and RabbitMQ death
- When RabbitMQ gets busy (persisting messages to disk), it exerts backpressure on producers. Producers block on publish.
- What if we don't want a producer to block?
- RabbitMQ monitors disk space for persistent queues.
- Bug: Out-of-disk space crashes RabbitMQ when transient messages are being paged to disk
- Restarting doesn't clear/fix the disk space consumed
- Remove /var/rabbitmq/mnesia/rabbit@YourHost/msg_store_transient/
- Consider using configuration setting "{vm_memory_high_watermark_paging_ratio, 1.1}"
Notes
- http://stackoverflow.com/questions/10030227
- http://www.rabbitmq.com/memory.html
- https://www.rabbitmq.com/disk-alarms.html
RabbitMQ will block producers when free disk space drops below a certain limit. This is a good idea since even transient messages can be paged to disk at any time, and running out of disk space can cause the server to crash. By default RabbitMQ will block producers, and prevent memory-based messages from being paged to disk, when free disk space drops below 50MB. This will reduce but not eliminate the likelihood of a crash due to disk space being exhausted. In particular, if messages are being paged out rapidly it is possible to run out of disk space and crash in the time between two runs of the disk space monitor. A more conservative approach would therefore be to set the limit to the same as the amount of memory installed on the system (see the configuration below).
....By default 50MB is required to be free on the database partition.
Possible method to prevent RabbitMQ from crashing: Add the following setting to /etc/rabbitmq/rabbitmq.config
"{vm_memory_high_watermark_paging_ratio, 1.1}"
Reference: http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2013-September/030458.html
Grouping messages
- It may be possible to achieve higher throughput by combining multiple application messages into a single Rabbit message
- Trade-off: increased latency
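Grouping can be sketched with simple JSON framing (invented for illustration; any serialization works): the producer packs several application messages into one Rabbit message body, and the consumer unpacks and handles them individually.

```python
import json

def pack(messages):
    """Combine several small application messages into one Rabbit message body."""
    return json.dumps(messages).encode()

def unpack(body):
    """Recover the individual application messages on the consumer side."""
    return json.loads(body.decode())

batch = pack([{"id": 1}, {"id": 2}, {"id": 3}])   # one publish instead of three
```

Fewer, larger publishes amortize per-message broker overhead, but each message now waits until its batch fills or a flush timer fires, hence the added latency.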
Summary 1 of 2
- Configure prefetch
- Try HiPE
- When scaling horizontally, there's more than one way to do it
- Plan for fail-over
- Configure limits.conf
Summary 2 of 2
- Use good tools -- RabbitMQ is a good tool
- It's available, cost-effective, practical
- Try solutions, test solutions. If one idea doesn't work, get feedback, and try something else. Fail fast.