# Hive

## A Cross-Language, Multiprocess Bot Framework

About a year ago, Chris Done and I discussed the possibility of a bot framework that was essentially an IRC bouncer, with bot ‘plugins’ being separate processes that connect to the bouncer and respond to whatever messages they feel like handling.

Chris took the model, ran with it, and built a basic working bot. But I, chronic bikeshedder that I am, was annoyed by the lack of data-sharing between plugins that could have used it, and by the difficulty of having plugins call or filter other plugins, and started devising a series of increasingly strange and complex protocols that could be embedded into the IRC stream.

Eventually I decided it would be wise to switch to DBus, which allows for more structured data; the second generation of the protocol that arose from that switch is described here.

### Goals

As far as I can tell, there are four different types of plugins that are useful to a bot framework:

**Responders** are the simplest type of plugin; they receive a message from the server, and, if it matches what they handle, extract the necessary information from it and use it to construct a reply, which is sent straight back to the server. Examples of this type include simple ping-responders and handshaking plugins.

**Automata** are very much like responders, but can send messages independently of server-activated invocation. Responders might seem like the more fundamental operation, but in fact responders can be modelled as automata, with the event source being the server (or upstream plugin) itself. Examples of this type include announcement bots and bridging bots.

**Filters** are plugins that don't create their own responses, but sit in front of another plugin, acting on its inputs and outputs without blocking them. They can modify their input (pagination, sanitisation), attach metadata (authentication), or simply store information (command-statistics loggers).

**Aggregators** are the most complex type of plugin: they need to be able to query other plugins and combine the results into their own response, such as lambdabot's ‘compose’ plugin.

### Design

The basic design is that of a tree: DBus' signals provide an ideal mechanism for connecting multiple children, as the children can subscribe to signals broadcast by the parent. Likewise, child→parent communication can be accomplished by way of method calls. Thus, the only information that needs to be shared when constructing the tree is the parent's address. The ability to change this with another signal introduces strong fault-tolerance: in the case that one plugin should fall over, its error-handler can simply broadcast a signal telling all its children to update their parent to its parent, cutting it out of the process altogether.
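The cut-out behaviour can be sketched as a pure function over parent pointers. This is a minimal sketch under assumptions of mine: the plugin names are hypothetical, and a real plugin would announce the change by broadcasting a DBus signal to its children rather than editing a map.

```haskell
import qualified Data.Map as Map

-- Parent pointers: each plugin's bus name mapped to its parent's.
type Tree = Map.Map String String

-- When a plugin falls over, every child pointing at it is re-pointed
-- at the failed plugin's own parent, cutting it out of the tree.
cutOut :: String -> Tree -> Tree
cutOut dead tree = case Map.lookup dead tree of
  Nothing          -> tree
  Just grandparent ->
    Map.map (\p -> if p == dead then grandparent else p)
            (Map.delete dead tree)

-- cutOut "auth" (Map.fromList [("auth","mouth"),("echo","auth")])
--   == Map.fromList [("echo","mouth")]
```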

On receipt of a message from the server, the server-handling component, or ‘mouth’, converts it into DBus format, attaches some metadata, and broadcasts it to any plugins listening. These plugins can then decide whether or not to handle the message; if they consider the message not to have been completely handled, they may re-broadcast it, edited or otherwise, for their own children to handle. If a plugin wishes to send a message out onto the server, it sends the message to its parent plugin, which can process it and pass it (or an edited version) up to its own parent, and so on until it reaches the mouth, which sends it on to the server proper. This provides a simple model for the first three plugin types, and a slightly more convoluted one for aggregators, as we will see shortly.
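As a concrete example of the simplest case, a responder reduces to a function from an inbound line to an optional reply handed to its parent's out method. The sketch below is hypothetical (it works on raw strings rather than parsed IRC messages):

```haskell
import Data.List (stripPrefix)

-- A hypothetical ping-responder: given one raw inbound IRC line,
-- either produce a reply to hand to the parent's `out` method, or
-- return Nothing and let the message propagate to other plugins.
respond :: String -> Maybe String
respond line = fmap ("PONG :" ++) (stripPrefix "PING :" line)

-- respond "PING :irc.example.net" == Just "PONG :irc.example.net"
-- respond "PRIVMSG #hive :hi"     == Nothing
```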

N.B. the DBus type of the messages, both inward- and outward-bound, is (aaya{sv}), a struct formed of an array of arrays of bytes followed by a dictionary mapping strings to variants (an array of string–variant dictionary items). This corresponds to the somewhat more expressive Haskell type:

```haskell
data HiveMessage = HiveMessage {
    messages :: [Network.FastIRC.Message],
    metadata :: Data.Map.Map Data.Text.Text DBus.Types.Variant
}
```


The Messages must be expressed as ByteStrings (or aays) since we don't know their encodings. Dictionary keys (and values where applicable), however, must be valid UTF-8; when using the Haskell library, Data.Text and the Variable instance for HiveMessage should take care of this automatically. We send multiple IRC messages in one Hive message so that the sender can ensure they are processed together, rather than risking the possibility that they will be interwoven with something else's output.
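To make the (aaya{sv}) shape concrete, here is a sketch using a toy stand-in for DBus values. The Value type below is illustrative only, an assumption of this sketch; a real plugin would use the dbus library's variant type and its conversion machinery instead.

```haskell
import qualified Data.Map as Map
import Data.Word (Word8)

-- A toy stand-in for DBus values, just to show the (aaya{sv}) shape.
data Value
  = Bytes [Word8]               -- ay
  | Str String                  -- s
  | Arr [Value]                 -- a...
  | Dict (Map.Map String Value) -- a{sv}
  | Struct [Value]              -- (...)
  deriving (Eq, Show)

data HiveMessage = HiveMessage
  { messages :: [[Word8]]              -- raw IRC lines, encoding unknown
  , metadata :: Map.Map String Value   -- keys must be valid UTF-8
  }

-- Pack a HiveMessage into the (aaya{sv}) struct described above:
-- an array of byte-arrays, followed by a string-to-variant dictionary.
toWire :: HiveMessage -> Value
toWire (HiveMessage ms md) = Struct [Arr (map Bytes ms), Dict md]
```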

This gives rise to the following protocols for the plugins:

• Responders listen for their parents' signals, extract the necessary information, and then send the reply (if any) back to their parent via the out method.

• Automata simply send messages to the parent's out method when appropriate.

• Filters listen for a message from the appropriate source and send the modified message in the opposite direction: in→out for input filters, out→in for output filters.

• Aggregators are more complicated, and consist of two parts: the aggregator itself, and a reflector. A reflector is a simple filter that sits near the front of the tree and performs two functions:

• when passing any message through, it will tag it with a "reflector" metadatum containing its bus name; components further down the tree may send messages to the in method on this address to have them passed inwards as if they came from a parent of the reflector.

• on receiving a message via its out method with a "target" value in its metadata, the reflector sends it back to the component whose bus name is contained within instead of forwarding it on, via the component's out method.

The aggregator works by listening for the appropriate command, like a responder. When it encounters one, the aggregator crafts the messages it requires to build its response, tags them with a unique ID and with its own bus name as the "target", and sends them to the in method of the reflector named in the command's metadata. The reflector passes the messages down the chain individually as normal; when they reach the reflector again they are outward-bound, so it returns them to the original aggregator, which uses the unique IDs to combine the results and perform its final action.
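The final combining step can be sketched as a pure function: the aggregator remembers the unique IDs it attached, and completes only once a reflected reply has arrived for every one of them. This is a sketch under the assumption that replies carry their IDs back unchanged; a real aggregator would accumulate them asynchronously over DBus.

```haskell
import qualified Data.Map as Map

-- The unique IDs the aggregator tagged its outgoing requests with.
type ReplyId = Int

-- Combine reflected replies in request order. Returns Nothing while
-- any reply is still outstanding, Just the ordered results once all
-- have arrived.
combine :: [ReplyId] -> [(ReplyId, String)] -> Maybe [String]
combine wanted replies =
  traverse (\i -> Map.lookup i (Map.fromList replies)) wanted

-- combine [1,2] [(2,"b"),(1,"a")] == Just ["a","b"]
-- combine [1,2] [(1,"a")]         == Nothing
```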

Code and possibly diagrams are forthcoming.