YAMI - Yet Another Messaging Infrastructure


Specification

The intent of the YAMI project is to provide an infrastructure that can be used in different environments: it should be usable from scripting languages as well as from more expressive languages like C++. A single protocol simple enough for scripts would be very constraining for programmers working in more powerful languages, and vice versa: a protocol that is convenient to use from C++ would be impossible to implement in scripts.

That's why this specification is divided into levels.

The first level is meant to be simple enough for use from scripts, which do not distinguish data types. Higher levels can provide more functionality for more expressive languages.
Levels are built so that each higher level specifies a protocol that is a superset of the protocol defined by the level below it. This means that software written for the level 2 specification should be able to understand requests coming from software written for the level 1 specification, and so on.
Conversely, sending a request from the more capable software to software written for a lower level of the specification requires that the sender use only the facilities provided by the lower level.

Here, the format of the data sent over the network is defined using the External Data Representation Standard (XDR; see RFC 1832 for its description) with one important variation. XDR imposes big-endian byte ordering to ensure platform independence. The disadvantage of this approach is that when both communicating parties run on little-endian machines, both the sender and the receiver have to swap bytes to and from the common format. YAMI uses the receiver-makes-right concept instead: the sender sends the data in its native format, and the receiver swaps bytes only when it discovers that its own native format differs from that of the sender. This solution requires a flag in the data that marks the packet as little-endian or big-endian: every packet has to begin with four zero bytes if it comes from a little-endian machine; otherwise, at least one of those four bytes should be non-zero.
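The receiver-makes-right rule described above can be sketched in a few lines of Python. This is only an illustration, not part of the YAMI distribution: the function names detect_byte_order and read_int are hypothetical, and the big-endian marker used in the sample data (a single trailing 0x01 byte) is an assumption, since the text only requires that at least one of the four bytes be non-zero.

```python
import struct

def detect_byte_order(packet: bytes) -> str:
    """Return the struct prefix ('<' or '>') matching the sender's byte order."""
    if len(packet) < 4:
        raise ValueError("packet is shorter than the endianness indicator")
    # Four zero bytes mark a little-endian sender; anything else is big-endian.
    return "<" if packet[:4] == b"\x00\x00\x00\x00" else ">"

def read_int(packet: bytes, offset: int, order: str) -> int:
    """Read one 4-byte signed integer in the sender's byte order."""
    # struct converts from the sender's order to the host representation,
    # so a swap happens only when the two orders actually differ.
    return struct.unpack_from(order + "i", packet, offset)[0]

# Sample data: the same value 2 as sent by a little-endian and a big-endian host.
# The big-endian marker b"\x00\x00\x00\x01" is an assumed, arbitrary choice.
little = b"\x00\x00\x00\x00" + struct.pack("<i", 2)
big = b"\x00\x00\x00\x01" + struct.pack(">i", 2)

value_from_little = read_int(little, 4, detect_byte_order(little))
value_from_big = read_int(big, 4, detect_byte_order(big))
```

Both reads yield the same integer regardless of which kind of machine produced the packet, which is the point of the scheme: two hosts with the same byte order never pay for a swap.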

Level 1 and Level 2 common assumptions:

  1. Agents communicate using TCP/IP protocol (stream).
  2. The communication channel can be opened in the simplex mode or in the duplex mode.
  3. There are five types of data packets sent between agents: requests, responses, rejections, unknown-object notifications and overflow notifications.
  4. Every agent keeps open one listening socket. It will receive requests to that socket, which means that agents that do not have a listening socket are not able to receive requests.
    Sending a request to another agent is accomplished by opening a connection and sending the packet's data. An agent is allowed to keep a simplex connection open as long as it wants, or to close it between messages. This applies both to connections initiated by the agent (to send a message) and to connections initiated by the other agent (which result from listening on the socket). This makes it possible to write agents that use connection pooling to speed up communication. Agents should react properly if the connection is closed by the other party; for example, the agent can try to set up a different connection. Duplex connections should be kept open as long as possible.
  5. Agents should be fool-proof, which means that they should not crash if the peer closes the connection while the agent is writing to or reading from the socket.
  6. In each packet the first four bytes describe the endianness of the sending machine. They are all zero for a little-endian machine; for a big-endian machine at least one of them should be non-zero. The next four bytes encode the level number of the sender, the next four encode the message identifier and the next four encode the type of the message.
  7. All the packet's data (excluding the first four bytes) are encoded in the native byte-order of the sender.
  8. In simplex connections, an agent should reply to each valid packet received by sending one byte of any value to the originating agent, through the connection on which the data were received. This is a simple handshake protocol that can help in implementing a connection pool. Even if the agent does not implement connection pooling, it should still confirm the message this way. Duplex connections do not require any handshakes.
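As an illustration of assumption 6, the fixed 16-byte start of a packet (endianness indicator, level number, message identifier, type field) could be read as shown below. This is a sketch under stated assumptions: Header, parse_header and build_header are hypothetical names, and the non-zero marker written for big-endian senders is an arbitrary choice.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 16  # 4 bytes of endianness indicator + three 4-byte integers

class Header(NamedTuple):
    order: str       # struct prefix of the sender's byte order: '<' or '>'
    level: int       # level number of the sender (1 or 2)
    msgid: int       # message identifier
    type_field: int  # message type field, still undecoded

def parse_header(data: bytes) -> Header:
    """Decode the first 16 bytes of a packet."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 16 bytes for the packet header")
    # All-zero indicator means a little-endian sender.
    order = "<" if data[:4] == bytes(4) else ">"
    level, msgid, type_field = struct.unpack_from(order + "iii", data, 4)
    return Header(order, level, msgid, type_field)

def build_header(level: int, msgid: int, type_field: int, order: str = "<") -> bytes:
    """Encode a header in the given byte order (helper for the example only)."""
    # Assumed marker for big-endian senders: any non-zero pattern would do.
    marker = bytes(4) if order == "<" else b"\x00\x00\x00\x01"
    return marker + struct.pack(order + "iii", level, msgid, type_field)

# Round trip for a level 1 request (type 0) with message identifier 42.
header_le = parse_header(build_header(1, 42, 0, "<"))
header_be = parse_header(build_header(1, 42, 0, ">"))
```

After a valid packet has been read on a simplex connection, the receiver would additionally write a single byte back on the same socket, as required by assumption 8.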

Differences between Level 1 and Level 2:

  1. Level 2 is an extension of Level 1: it supports more data types in a parameter set.

Packet definition

(also in file specs.xdr)
/*
* general packet limitations
*/

/* max length of the parameter-set data */
const MAXPARAMSETSIZE = 1048576;

/* max length of the parameter's raw data */
const MAXPARAMSIZE = 65536;

/* max number of parameters in a set */
const MAXPARAMS = 65536;

/* max length of the name */
const MAXNAMELEN = 256;

/*
* packet types
*/
enum packettype
{
REQUEST = 0; /* request packet */
RESPONSE = 1; /* normal response packet */
REJECT = 2; /* rejection packet */
UNKOBJECT = 3; /* unknown object response packet */
OVERFLOW = 4; /* overflow on the server side */
REJECTBYAGENT = 5; /* reject by the destination agent */
};

/*
* connection modes
*/
enum connectionOptions
{
eNone = 0; /* simplex connection */
eFixedDuplex = 1; /* duplex connection */
};

/*
* IP socket address
*/
struct IPaddress
{
int ipaddr; /* four-byte IP address */
int ipport; /* port number */
};

/*
* wide char - not part of the XDR, so defined explicitly
* the wide characters are sent as integers
*/
typedef int wchar;

/*
* parameter type for level1
*/
enum paramtypel1
{
eString = 1; /* ASCII string */
eWString = 2; /* wide string */
};

/*
* parameter type for level2
*/
enum paramtypel2
{
eString = 1; /* ASCII string */
eWString = 2; /* wide string */
eInt = 3; /* integer */
eDouble = 4; /* double */
eByte = 5; /* opaque byte */
eBinary = 6; /* opaque binary data */
};

/*
* parameter for level1
*/
union parameterl1 switch (paramtypel1 type)
{
case eString:
/* max length of the ASCII string can be 65536 */
string strvalue<MAXPARAMSIZE>;
case eWString:
/* max length of the wide string can be 16384 */
wchar wstrvalue<MAXPARAMSIZE / 4>;
};

/*
* parameter for level2
*/
union parameterl2 switch (paramtypel2 type)
{
case eString:
/* max length of the ASCII string can be 65536 */
string strvalue<MAXPARAMSIZE>;
case eWString:
/* max length of the wide string can be 16384 */
wchar wstrvalue<MAXPARAMSIZE / 4>;
case eInt:
int intval;
case eDouble:
double dval;
case eByte:
opaque bval[1];
case eBinary:
/* max size of the raw binary data is 65536 */
opaque binval<MAXPARAMSIZE>;
};

/*
* level1 body definition
*/
union bodyl1 switch (packettype type)
{
case REQUEST:
/* return address for the reply */
IPaddress retaddr;

/* destination object's name */
string objname<MAXNAMELEN>;

/* name of the message sent */
string messagename<MAXNAMELEN>;

/* total size in bytes of the parameter set */
int paramsetsize;

/* parameter set */
parameterl1 paramset<MAXPARAMS>;

case RESPONSE:
/* total size in bytes of the parameter set */
int paramsetsize;

/* reply parameter set */
parameterl1 paramset<MAXPARAMS>;

case REJECT:
void;
case UNKOBJECT:
void;
case OVERFLOW:
void;
};

/*
* level2 body definition
*/
union bodyl2 switch (packettype type)
{
case REQUEST:
/* return address for the reply */
IPaddress retaddr;

/* destination object's name */
string objname<MAXNAMELEN>;

/* name of the message sent */
string messagename<MAXNAMELEN>;

/* total size in bytes of the parameter set */
int paramsetsize;

/* parameter set */
parameterl2 paramset<MAXPARAMS>;

case RESPONSE:
/* total size in bytes of the parameter set */
int paramsetsize;

/* reply parameter set */
parameterl2 paramset<MAXPARAMS>;

case REJECT:
void;
case UNKOBJECT:
void;
case OVERFLOW:
void;
};

/***********************************
*
* packet definition for both levels
*
***********************************/
union packet switch (int levelno)
{
case 1:
int msgid; /* message identifier */
bodyl1 body; /* body of the message */
case 2:
int msgid; /* message identifier */
bodyl2 body; /* body of the message */
};

/***********************************
*
* full packet definition
* (with endianness indicator)
*
***********************************/
struct fullpacket
{
/* endianness indicator for the rest of the packet */
opaque endianness[4];

/* the packet, in the sender's byte-order */
packet packettail;
};
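To make the definitions above more concrete, the following Python sketch encodes and decodes a single eString parameter. It assumes that the usual XDR layout rules apply unchanged apart from the byte order: a 4-byte union discriminant, a 4-byte length, and the string bytes zero-padded to a multiple of four. The function names are hypothetical and the sketch has not been checked against the reference implementation.

```python
import struct

E_STRING = 1          # paramtypel1 / paramtypel2 discriminant for ASCII strings
MAXPARAMSIZE = 65536  # max length of the parameter's raw data

def encode_string_param(value: str, order: str = "<") -> bytes:
    """Encode one eString parameter in the given byte order."""
    raw = value.encode("ascii")
    if len(raw) > MAXPARAMSIZE:
        raise ValueError("string exceeds MAXPARAMSIZE")
    padding = (-len(raw)) % 4  # XDR pads variable-length data to 4 bytes
    return struct.pack(order + "iI", E_STRING, len(raw)) + raw + bytes(padding)

def decode_string_param(data: bytes, offset: int = 0, order: str = "<"):
    """Decode one eString parameter; return (value, offset of the next item)."""
    ptype, length = struct.unpack_from(order + "iI", data, offset)
    if ptype != E_STRING:
        raise ValueError("not an eString parameter")
    if length > MAXPARAMSIZE:
        raise ValueError("declared length exceeds MAXPARAMSIZE")
    start = offset + 8
    value = data[start:start + length].decode("ascii")
    return value, start + length + (-length) % 4

encoded = encode_string_param("hello")
decoded, next_offset = decode_string_param(encoded)
```

Under these assumptions the five-character string occupies 16 bytes on the wire: 4 for the type, 4 for the length, 5 for the characters and 3 bytes of padding.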

Notes

  1. In the case of the REQUEST and RESPONSE messages, the total size of the parameter set is transmitted only to help agents allocate appropriate buffers.
  2. In the case of the REQUEST message, the msgid applies to the message itself.
    It should be generated by the sending agent and be unique within that agent.
    For the other message types, however, the msgid refers to the earlier message to which the current one is a reply. Thus message identifiers for types RESPONSE, REJECT, UNKOBJECT and OVERFLOW are not generated; they are copied from the original message.
  3. The packettype field is overloaded. Its lower 16 bits represent the actual packet type, as defined by the packettype enum. The higher 16 bits carry the requested connection mode, as defined by the connectionOptions enum.
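Note 3 amounts to simple bit manipulation on the 32-bit type field. A minimal Python sketch, with hypothetical helper names pack_type_field and split_type_field, might look like this:

```python
# Values taken from the packettype and connectionOptions enums above.
REQUEST = 0
OVERFLOW = 4
E_NONE = 0
E_FIXED_DUPLEX = 1

def pack_type_field(packet_type: int, connection_mode: int) -> int:
    """Combine the packet type (low 16 bits) and connection mode (high 16 bits)."""
    return ((connection_mode & 0xFFFF) << 16) | (packet_type & 0xFFFF)

def split_type_field(field: int):
    """Return (packet_type, connection_mode) from the overloaded field."""
    field &= 0xFFFFFFFF  # treat the field as an unsigned 32-bit quantity
    return field & 0xFFFF, (field >> 16) & 0xFFFF

duplex_request = pack_type_field(REQUEST, E_FIXED_DUPLEX)
packet_type, mode = split_type_field(duplex_request)
```

A request asking for a duplex connection therefore carries the value 0x00010000 in this field, while a plain simplex request carries 0.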