Hive

A Cross-Language, Multiprocess Bot Framework

About a year ago, Chris Done and I discussed the possibility of a bot framework that was essentially an IRC bouncer, with bot ‘plugins’ being separate processes that connect to the bouncer and respond to whatever messages they feel like handling.

Chris took the model, ran with it, and built a basic working bot. I, being the chronic bikeshedder that I am, got annoyed at the lack of data-sharing between plugins that could use it and at the difficulty of having plugins call or filter other plugins, and started devising a series of increasingly strange and complex protocols that could be embedded into the IRC stream.

Eventually I decided it would be wise to switch to DBus, which allows more structured data; the second generation of the resulting protocol is described here.

Goals

As far as I can tell, there are four different types of plugins that are useful to a bot framework:

Responders

are the simplest type of plugin; they receive a message from the server, and, if it matches what they handle, extract the necessary information from it and use it to construct a reply, which is sent straight back to the server. Examples of this type include simple ping-responders and handshaking plugins.

Automata

are very much like responders, but can send messages without being invoked by the server. Responders might seem like the more fundamental operation, but in fact responders can be modelled as automata, with the event source being the server (or upstream plugin) itself. Examples of this type include server bridges and bots that announce some kind of external event.

Filters

are plugins that don't create their own responses, but sit in front of a plugin, acting on its inputs and outputs without blocking them. They can modify their input (pagination, sanitisation), attach metadata (authentication) or simply store information (command statistics loggers).

Aggregators

are the most complex type of plugin. They need to be able to query other plugins and combine the results into their own response, such as lambdabot's ‘compose’ plugin.
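The relationships between these four types can be sketched in a few lines of Haskell. This is only an illustrative model, not Hive code: messages are plain strings, and every name here (Responder, Automaton, Filter, aggregate, pingResponder) is made up for the example.

```haskell
-- Illustrative model only; none of these names are part of Hive.
type Msg = String

-- A responder maps an incoming message to zero or more replies.
type Responder = Msg -> [Msg]

-- An automaton reacts to events from any source, not just the server.
data Event = FromServer Msg | External String
  deriving (Show, Eq)

type Automaton = Event -> [Msg]

-- Any responder is an automaton whose only event source is the server.
asAutomaton :: Responder -> Automaton
asAutomaton r (FromServer m) = r m
asAutomaton _ (External _)   = []

-- A filter wraps another plugin, rewriting its input and its output.
data Filter = Filter
  { onInput  :: Msg -> Msg
  , onOutput :: Msg -> Msg
  }

applyFilter :: Filter -> Responder -> Responder
applyFilter f r = map (onOutput f) . r . onInput f

-- An aggregator queries several plugins and combines their replies.
aggregate :: ([[Msg]] -> [Msg]) -> [Responder] -> Responder
aggregate combine rs m = combine (map ($ m) rs)

-- Example responder: answers PING with PONG, ignores everything else.
pingResponder :: Responder
pingResponder m = case words m of
  ("PING" : rest) -> [unwords ("PONG" : rest)]
  _               -> []
```

Here asAutomaton is the observation from the Automata entry made concrete: a responder is just an automaton that ignores every event not coming from the server.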

Design

The basic design is that of a tree: DBus signals provide an ideal mechanism for connecting multiple children, as the children can subscribe to signals broadcast by the parent. Likewise, child→parent communication can be accomplished by way of method calls. Thus, the only information that needs to be shared when constructing the tree is the parent's address. The ability to change this address with another signal introduces strong fault-tolerance: should one plugin fall over, its error-handler can simply broadcast a signal telling all its children to update their parent to its own parent, cutting it out of the process altogether.
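The reparenting step can be modelled as a pure function over a table of parent addresses. This is a minimal sketch under the assumption that each plugin stores only its parent's address; the Tree type, the cutOut function and the plugin names are hypothetical, and no actual DBus traffic is involved.

```haskell
import qualified Data.Map as Map

-- Hypothetical plugin names standing in for DBus addresses.
type Address = String

-- Each plugin records only its parent's address; the mouth has none.
type Tree = Map.Map Address (Maybe Address)

-- When a plugin fails, point each of its children at the failed
-- plugin's own parent and drop the failed plugin from the tree.
cutOut :: Address -> Tree -> Tree
cutOut failed tree =
  case Map.lookup failed tree of
    Nothing          -> tree  -- unknown plugin: nothing to do
    Just grandparent ->
      let reparent p
            | p == Just failed = grandparent
            | otherwise        = p
      in Map.map reparent (Map.delete failed tree)

-- A small example tree: mouth -> auth -> {echo, logger}.
example :: Tree
example = Map.fromList
  [ ("mouth",  Nothing)
  , ("auth",   Just "mouth")
  , ("echo",   Just "auth")
  , ("logger", Just "auth")
  ]
```

Calling cutOut "auth" example leaves echo and logger attached directly to mouth, which is the effect the ‘update your parent’ signal is meant to have.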

When a message arrives from the server, the server-handling component, or ‘mouth’, converts it into DBus format, attaches some metadata, and broadcasts it to any plugins listening. These plugins can then decide whether or not to handle the message; if a plugin considers the message not to have been completely handled, it may re-broadcast the message, edited or otherwise, and have its children handle it. If a plugin wishes to send a message out onto the server, it sends the message to its parent plugin, which can process the message and possibly send it, or an edited version, up to its own parent, and so on until it reaches the mouth, which sends it on to the server proper. This provides a simple model for the first three plugin types, and a slightly more convoluted one for aggregators, as we will see shortly.
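The flow of messages down and back up the tree can be sketched as a pure function. This is a simplification under stated assumptions: real plugins are separate processes talking over DBus asynchronously, whereas here a plugin is a record of functions and messages are plain strings. The Plugin type and the echo and prefixFilter examples are invented for illustration.

```haskell
import Data.Maybe (mapMaybe)

type Msg = String

-- A hypothetical in-process stand-in for a plugin.
data Plugin = Plugin
  { handle   :: Msg -> ([Msg], Maybe Msg)  -- own replies, and what to re-broadcast
  , outbound :: Msg -> Maybe Msg           -- rewrite or drop a child's reply
  , children :: [Plugin]
  }

-- Deliver one inbound message to a plugin; return what it hands to its parent.
deliver :: Plugin -> Msg -> [Msg]
deliver p m =
  let (own, forwarded) = handle p m
      fromChildren = case forwarded of
        Nothing -> []
        Just m' -> concatMap (`deliver` m') (children p)
  in own ++ mapMaybe (outbound p) fromChildren

-- The mouth broadcasts to its children and sends on whatever comes back.
mouth :: [Plugin] -> Msg -> [Msg]
mouth plugins m = concatMap (`deliver` m) plugins

-- A responder: replies to "echo ..." and passes anything else along.
echo :: Plugin
echo = Plugin
  { handle = \m -> case words m of
      ("echo" : rest) -> ([unwords rest], Nothing)
      _               -> ([], Just m)
  , outbound = Just
  , children = []
  }

-- A filter: strips a leading '!' on the way down, truncates replies on the way up.
prefixFilter :: Plugin
prefixFilter = Plugin
  { handle = \m -> case m of
      ('!' : rest) -> ([], Just rest)
      _            -> ([], Nothing)
  , outbound = Just . take 10
  , children = [echo]
  }
```

With this model, mouth [prefixFilter] "!echo hello world" passes "echo hello world" down to echo, and the reply is truncated by the filter on its way back up to the mouth.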

N.B. the DBus type of the messages, both inward- and outward-bound, is (aaya{sv}), a struct formed of an array of arrays of bytes followed by a dictionary mapping strings to variants (an array of string–variant dictionary items). This corresponds to the somewhat more expressive Haskell type:

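import qualified Data.Map
import qualified Data.Text
import qualified DBus.Types
import qualified Network.FastIRC

data HiveMessage = HiveMessage {
     messages :: [Network.FastIRC.Message],  -- the IRC messages, processed together
     metadata :: Data.Map.Map                -- the attached metadata
                   Data.Text.Text
                   DBus.Types.Variant
}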

The Messages must be expressed as ByteStrings (or aays) since we don't know their encodings. Dictionary keys (and values where applicable), however, must be valid UTF-8; when using the Haskell library, Data.Text and the Variable instance for HiveMessage should take care of this automatically. We send multiple IRC messages in one Hive message so that the sender can ensure they are processed together, rather than risking the possibility that they will be interleaved with another plugin's output.
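To see how the signature (aaya{sv}) is assembled from its parts, here is a small self-contained sketch that renders a description of a DBus type as a signature string. It covers only the handful of types a Hive message needs and does not use any DBus library; DType, signature and hiveMessageType are names made up for this example.

```haskell
-- A small subset of DBus types, enough to describe a Hive message.
data DType
  = DByte
  | DString
  | DVariant
  | DArray DType
  | DDict DType DType
  | DStruct [DType]

-- Render a type as a DBus signature string.
signature :: DType -> String
signature DByte        = "y"
signature DString      = "s"
signature DVariant     = "v"
signature (DArray t)   = 'a' : signature t
signature (DDict k v)  = "a{" ++ signature k ++ signature v ++ "}"
signature (DStruct ts) = "(" ++ concatMap signature ts ++ ")"

-- The shape of a Hive message as described above.
hiveMessageType :: DType
hiveMessageType = DStruct
  [ DArray (DArray DByte)   -- the IRC messages, as raw bytes
  , DDict DString DVariant  -- the metadata
  ]
```

Evaluating signature hiveMessageType gives "(aaya{sv})": the outer parentheses are the struct, aay is the array of byte arrays, and a{sv} is the string-to-variant dictionary.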

This gives rise to the following protocols for the plugins:

Code and possibly diagrams are forthcoming.