XEP-0115: Entity Capabilities
This document defines an XMPP protocol extension for broadcasting and discovering client, device, or generic entity capabilities in a way that minimizes network impact.
NOTICE: The protocol defined herein is a Draft Standard of the XMPP Standards Foundation. Implementations are encouraged and the protocol is appropriate for deployment in production systems, but some changes to the protocol are possible before it becomes a Final Standard.
Document Information
Series: XEP
Number: 0115
Publisher: XMPP Standards Foundation
Status: Draft
Type: Standards Track
Version: 1.3
Approving Body: XMPP Council
Dependencies: XMPP Core, XMPP IM, XEP-0030
Supersedes: None
Superseded By: None
Short Name: caps
Schema: <
Wiki Page: <
Author Information
Joe Hildebrand
Email: jhildebrand@jabber.com
JabberID: hildjj@jabber.org
Peter Saint-Andre
Email: stpeter@jabber.org
JabberID: stpeter@jabber.org
Remko Tronçon
Email: public@el-tramo.be
JabberID: public@el-tramo.be
Legal Notice
This XMPP Extension Protocol is copyright 1999 - 2007 by the XMPP Standards Foundation (XSF) and is in full conformance with the XSF's Intellectual Property Rights Policy. This material may be distributed only subject to the terms and conditions set forth in the Creative Commons Attribution License.
Discussion Venue
The preferred venue for discussion of this document is the Standards discussion list.
Relation to XMPP
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 3920) and XMPP IM (RFC 3921) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
Conformance Terms
The following keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
Table of Contents
1. Introduction
1.1. Motivation
1.2. How It Works
2. Assumptions
3. Requirements
4. Use Cases
4.1. Advertising Capabilities
4.2. Discovering Capabilities
4.3. Stream Feature
5. Server Optimizations
6. Implementation Notes
7. Error Codes
8. Security Considerations
9. IANA Considerations
10. XMPP Registrar Considerations
11. XML Schema
Notes
Revision History
1. Introduction
1.1 Motivation
It is often desirable for a Jabber/XMPP application (commonly but not necessarily a client) to take different actions depending on the capabilities of another application from which it receives presence information. Examples include:
Showing a different set of icons depending on the capabilities of other clients.
Not sending XHTML-IM [1] content to plaintext clients such as cell phones.
Allowing the initiation of Voice over IP (VoIP) sessions only to clients that support VoIP.
Not showing a "Send a File" button if another user's client does not support File Transfer [2].
Some older Jabber clients send one Service Discovery [3] and one Software Version [4] request to each entity from which they received presence after login. That "disco+version flood" results in an excessive use of bandwidth and is impractical on a larger scale, particularly for users or applications with large rosters. Therefore this document proposes a more robust and scalable solution: namely, a presence-based mechanism [5] for exchanging information about entity capabilities. Clients SHOULD NOT engage in the older "disco+version flood" behavior and instead SHOULD use Entity Capabilities as specified herein.
1.2 How It Works
This section provides a friendly, non-normative introduction to the workings of entity capabilities.
Imagine that you are a Shakespearean character named Juliet and one of your contacts, a handsome fellow named Romeo, becomes available. His client wants to publish its capabilities, and does this by adding a <c/> element to its presence packets. As a result, your client receives the following presence packet:

<c xmlns='http://jabber.org/protocol/caps'
   node='http://exodus.jabberstudio.org/caps'
   ver='0.9'/>

The 'node' attribute represents the client Romeo is using, and the 'ver' attribute represents the specific version of this client. At this point, your client has no idea what the capabilities are of someone with a client string 'http://exodus.jabberstudio.org/caps' and a version string '0.9'. Your client therefore sends a query to Romeo, asking what his client version can do (using service discovery):

<query xmlns='http://jabber.org/protocol/disco#info'
       node='http://exodus.jabberstudio.org/caps#0.9'/>

The response is:

node='http://exodus.jabberstudio.org/caps#0.9'>







At this point, your client knows that anyone using the client string 'http://exodus.jabberstudio.org/caps' and the version string '0.9' has a client that can do MUC. Your client remembers this information, such that it does not need to explicitly query the capabilities of a contact with the exact same client and version string. For example, Benvolio may send you the following presence:

<c xmlns='http://jabber.org/protocol/caps'
   node='http://exodus.jabberstudio.org/caps'
   ver='0.9'/>

Now your client automatically knows that Benvolio can do MUC, without needing to ask him explicitly via service discovery.
On the other hand, for a person with the following presence ...

<c xmlns='http://jabber.org/protocol/caps'
   node='http://exodus.jabberstudio.org/caps'
   ver='0.10'/>

... or the following presence ...

<c xmlns='http://jabber.org/protocol/caps'
   node='http://psi-im.org/caps'
   ver='0.9'/>

... you have no information about what this contact's client is capable of (as he is using a different client/version), and you therefore need to query his capabilities explicitly again.
So that is how the 'node' and 'ver' attributes work. But most clients also allow certain features to be turned on or off at run time. For example, clients can allow users to turn off the exchange of chat states (for privacy reasons). Such a client cannot simply advertise chat states as part of its base capabilities as was done before, since this capability depends on whether the user has turned the option on or off (and capabilities for the same client/version are assumed to be identical for all contacts). Such capabilities are advertised using "extensions". For example:

<c xmlns='http://jabber.org/protocol/caps'
   node='http://exodus.jabberstudio.org/caps'
   ver='0.9'
   ext='csn'/>

In this case, Benvolio is using an extension that his client calls 'csn'. Again, your client has no clue what 'csn' means, as it is a name arbitrarily chosen by Benvolio's client. It therefore queries Benvolio's client to find out what a client 'http://exodus.jabberstudio.org/caps' means when it says it supports 'csn'. It does this by sending a disco#info request to the node#ext (not the node#ver as was done to find the base capabilities):

<query xmlns='http://jabber.org/protocol/disco#info'
       node='http://exodus.jabberstudio.org/caps#csn'/>

The response is:

node='http://exodus.jabberstudio.org/caps#csn'>



Now, your client knows that Benvolio's client (and anyone using the same client node as Benvolio and extension 'csn', since the 'ext' values MUST be stable across client versions) supports chat state notifications.
The sum total of what Benvolio's client supports is the set of features advertised in disco#info responses to the node#ver (base features) and the set of features advertised in disco#info responses to each node#ext (extension features).
However, suppose Bill logs on:

<c xmlns='http://jabber.org/protocol/caps'
   node='http://psi-im.org/caps'
   ver='0.9'
   ext='csn'/>

Although you know that 'csn' meant "chat state notifications" for Benvolio (and for everyone using the same client as Benvolio), you do not know what this means for Bill, because he is using a different client! So you need to query his client:

<query xmlns='http://jabber.org/protocol/disco#info'
       node='http://psi-im.org/caps#csn'/>

and the response ...

node='http://psi-im.org/caps#csn'>



... reveals that this client uses 'csn' to denote the capability of "stanza session negotiation" (formerly known as "chat session negotiation"). So although the 'ext' values are stable for each client (and for all versions thereof), they are not stable across different clients.
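The lookup behavior described in this section can be sketched as follows. This is a minimal, non-normative illustration in Python: the CapsCache class and its method names are hypothetical, and the feature namespaces stored below are illustrative stand-ins for whatever a real disco#info response contains.

```python
from typing import Dict, FrozenSet, Iterable, Optional


class CapsCache:
    """Remembers disco#info feature sets, keyed by service discovery node."""

    def __init__(self) -> None:
        self._by_node: Dict[str, FrozenSet[str]] = {}

    def store(self, caps_node: str, suffix: str, features: Iterable[str]) -> None:
        # suffix is either a 'ver' value or a single 'ext' name.
        self._by_node[f"{caps_node}#{suffix}"] = frozenset(features)

    def lookup(self, caps_node: str, ver: str,
               exts: Iterable[str] = ()) -> Optional[FrozenSet[str]]:
        """Union of base and extension features, or None if a query is needed."""
        combined = set()
        for suffix in (ver, *exts):
            known = self._by_node.get(f"{caps_node}#{suffix}")
            if known is None:
                return None
            combined |= known
        return frozenset(combined)


EXODUS = "http://exodus.jabberstudio.org/caps"
PSI = "http://psi-im.org/caps"
MUC = "http://jabber.org/protocol/muc"                   # illustrative feature
CHATSTATES = "http://jabber.org/protocol/chatstates"     # illustrative feature

cache = CapsCache()
cache.store(EXODUS, "0.9", [MUC])         # learned from Romeo
cache.store(EXODUS, "csn", [CHATSTATES])  # learned from Benvolio

print(cache.lookup(EXODUS, "0.9"))        # known, so no query is needed
print(cache.lookup(EXODUS, "0.10"))       # None: different version
print(cache.lookup(PSI, "0.9", ["csn"]))  # None: different client
```

Keying the cache by the full node#ver or node#ext string mirrors the service discovery node used on the wire, which is why 'csn' learned from one client says nothing about 'csn' on another.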
2. Assumptions
This document makes several assumptions:
The type of client I am using is of interest to the people on my roster.
Clients for the people on my roster might want to make user interface decisions based on my capabilities.
Different instances of the same client (including version) have the same base capabilities.
Some clients will have bundles of functionality that can be enabled and disabled.
One instance of a given client may not know about all of the possible bundles of functionality that can be enabled and disabled (for example, plugins written to a client SDK).
Members of a community tend to cluster around a small set of clients. More specifically, multiple people in my roster use the same client, and they upgrade versions relatively slowly (commonly a few times a year, perhaps once a week at most, certainly not once a minute).
Some clients are running against servers without server-to-server connectivity enabled, and without access to the Internet via HTTP.
Conversations are possible between users who are not on each other's roster.
Client capabilities may change over the course of a session, due to features being enabled and disabled.
3. Requirements
The protocol defined herein addresses the following requirements:
Clients MUST be able to participate even if they support only XMPP Core [6], XMPP IM [7], and XEP-0030.
Clients MUST be able to participate even if they are on networks without connectivity to other XMPP servers, services offering specialized XMPP extensions, or HTTP servers. [8]
Clients MUST be able to retrieve information without querying each user.
Since presence is normally broadcast to many users, the byte size of the proposed extension MUST be as small as possible.
It MUST be possible to write a Multi-User Chat [10] implementation that passes the given information along.
It MUST be possible to publish a change in capabilities within a single session.
Server infrastructure above and beyond that defined in XMPP Core and XMPP IM MUST NOT be required for this approach to work, although additional server infrastructure MAY be used for optimization purposes.
4. Use Cases
4.1 Advertising Capabilities
Each time a conformant client sends presence, it annotates that presence with an element that specifies the client type, the version of that client, and which feature bundles (if any) are currently enabled. Unless the server optimizations shown later are being used, the client MUST send this annotation with every presence change (except for unavailable presence) so that existing servers can remember the last presence for use in responding to probes. The client MUST send the 'node' and 'ver' attributes.
In addition, the client MAY send an 'ext' attribute (short for "extensions") if it has one or more feature bundles to advertise. A feature bundle is any non-standard addition or extension to the core application, such as a client plugin. If more than one feature bundle is advertised, the value of the 'ext' attribute MUST be a space-separated list of bundle names. [11] The client MUST NOT send an 'ext' attribute if there are no interesting non-core features enabled. The names of the feature bundles MUST NOT be used for semantic purposes: they are merely opaque identifiers that will be used in other use cases. However, a client MUST ensure that the same 'ext' value refers to the same feature bundle across client versions (i.e., different values of the 'ver' attribute). If bundles are added or removed during an entity's session (e.g., a user plugs in a video camera), the entity SHOULD update the value of the 'ext' attribute to reflect the changed capabilities and send a new presence broadcast. If a feature bundle itself changes in any way (e.g., a user installs an updated version of a client plugin), the application MUST change the bundle name and SHOULD send a new presence broadcast.
Note: The values of the 'node', 'ver', and 'ext' attributes MUST NOT contain the '#' character, since that character is used as a separator in the Discovering Capabilities use case.
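The attribute rules above might be enforced when building the annotation, as in the following non-normative Python sketch. The helper name build_caps_element is hypothetical, and the whitespace test is only a partial stand-in for full NMTOKEN validation.

```python
import xml.etree.ElementTree as ET
from typing import Sequence

CAPS_NS = "http://jabber.org/protocol/caps"


def build_caps_element(node: str, ver: str,
                       bundles: Sequence[str] = ()) -> ET.Element:
    """Build the caps annotation for a presence stanza (hypothetical helper)."""
    for value in (node, ver, *bundles):
        if not value:
            raise ValueError("caps values must be non-empty")
        if "#" in value:
            # '#' is reserved as the separator in node#ver and node#ext.
            raise ValueError(f"'#' is not allowed in caps values: {value!r}")
    for name in bundles:
        if any(ch.isspace() for ch in name):
            raise ValueError(f"bundle name contains whitespace: {name!r}")
    elem = ET.Element(f"{{{CAPS_NS}}}c", {"node": node, "ver": ver})
    if bundles:
        # 'ext' is omitted entirely when no feature bundles are enabled.
        elem.set("ext", " ".join(bundles))
    return elem


annotation = build_caps_element(
    "http://exodus.jabberstudio.org/caps", "0.9", ["93j", "1g"])
print(ET.tostring(annotation, encoding="unicode"))
```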
Example 1. Annotated presence sent

<c xmlns='http://jabber.org/protocol/caps'
   node='http://exodus.jabberstudio.org/caps'
   ver='0.9'/>

Example 2. Annotated presence sent, with feature extensions

<c xmlns='http://jabber.org/protocol/caps'
   node='http://exodus.jabberstudio.org/caps'
   ver='0.9'
   ext='93j 1g'/>

4.2 Discovering Capabilities
Once someone on my roster knows what client I am using, they need to be able to figure out what features are supported by that client. In the deprecated "disco flood" approach, this was done by sending one disco#info request to each entity in a user's roster. Entity capabilities makes that unnecessary through the use of annotated presence. In particular, a client that receives the annotated presence sends a disco#info request (as defined in XEP-0030: Service Discovery) to exactly one of the users that sent a particular combination of node and ver. If the requestor has received the same annotation from multiple JIDs, the requestor SHOULD pick a random JID from that list to which the requestor will send the disco#info request.
The disco#info request is sent to a JID + node combination that consists of the chosen full JID (user@host/resource) and a service discovery node that is constructed as follows: concatenate (1) the value of the caps 'node' attribute, (2) the "#" character, and (3) the version number specified in the caps 'ver' attribute.
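The node construction and the random choice of a target can be sketched as follows. This is a non-normative Python illustration; the function names and the example JIDs are hypothetical.

```python
import random
from typing import Optional, Sequence


def disco_node(caps_node: str, ver: str) -> str:
    """Concatenate the caps 'node' value, '#', and the caps 'ver' value."""
    if "#" in caps_node or "#" in ver:
        raise ValueError("caps values must not contain '#'")
    return f"{caps_node}#{ver}"


def pick_target(full_jids: Sequence[str],
                rng: Optional[random.Random] = None) -> str:
    """Pick one random full JID among those that sent the same annotation."""
    if not full_jids:
        raise ValueError("no JID has sent this annotation")
    return (rng or random).choice(list(full_jids))


# Hypothetical JIDs that all advertised the same node and ver.
senders = ["romeo@montague.net/orchard", "benvolio@montague.net/home"]
target = pick_target(senders)
node = disco_node("http://exodus.jabberstudio.org/caps", "0.9")
print(target, node)
```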
Example 3. Disco#info request for client#version

<query xmlns='http://jabber.org/protocol/disco#info'
       node='http://exodus.jabberstudio.org/caps#0.9'/>

The randomly chosen entity then returns all of the capabilities supported by the base installation of the application without plugins or other add-ons:
Example 4. Disco#info response for client#version

node='http://exodus.jabberstudio.org/caps#0.9'>







Subsequent requests MAY be made to determine the supported features associated with each extension. These requests MUST be sent to a random full JID (user@host/resource) that sent a caps annotation that included a particular combination of node and ext. The disco#info request shall be sent to a JID + node combination that consists of the chosen JID and a service discovery node that is constructed as follows: concatenate (1) the value of the caps 'node' attribute, (2) the "#" character, and (3) the extension name specified by one of the space-separated names in the caps 'ext' attribute. The requestor SHOULD try to use different JIDs for each of these requests, as well as for the first request.
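One way to spread the extension queries over different JIDs is sketched below in Python. This is non-normative: plan_ext_queries and its parameters are hypothetical names, and the JIDs are made up for illustration.

```python
import random
from typing import Iterable, List, Mapping, Optional, Sequence, Tuple


def plan_ext_queries(caps_node: str,
                     ext_names: Iterable[str],
                     senders_by_ext: Mapping[str, Sequence[str]],
                     already_asked: Iterable[str] = (),
                     rng: Optional[random.Random] = None) -> List[Tuple[str, str]]:
    """Return (full JID, disco node) pairs, one per extension name."""
    rng = rng or random.Random()
    used = set(already_asked)
    plan = []
    for name in ext_names:
        candidates = list(senders_by_ext.get(name, ()))
        if not candidates:
            continue  # nobody has advertised this extension yet
        # Prefer a JID that has not been queried; fall back to any sender.
        fresh = [jid for jid in candidates if jid not in used]
        jid = rng.choice(fresh or candidates)
        used.add(jid)
        plan.append((jid, f"{caps_node}#{name}"))
    return plan


EXODUS = "http://exodus.jabberstudio.org/caps"
advertisers = ["romeo@montague.net/orchard", "benvolio@montague.net/home"]
plan = plan_ext_queries(EXODUS, ["93j", "1g"],
                        {"93j": advertisers, "1g": advertisers})
for jid, node in plan:
    print(jid, node)
```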
Example 5. Disco#info request for client#extension

<query xmlns='http://jabber.org/protocol/disco#info'
       node='http://exodus.jabberstudio.org/caps#93j'/>

Example 6. Disco#info response for client#extension

node='http://exodus.jabberstudio.org/caps#93j'>






Example 7. Disco#info request for client#extension

<query xmlns='http://jabber.org/protocol/disco#info'
       node='http://exodus.jabberstudio.org/caps#1g'/>

Example 8. Disco#info response for client#extension

node='http://exodus.jabberstudio.org/caps#1g'>




Note: The set of features that a given entity advertises in response to a "client#version" request and all "client#extension" requests MUST be equivalent to the response it gives to a disco#info request with no 'node' attribute:
Example 9. Generic disco#info response













All of the responses to the disco#info queries SHOULD be cached. If a particular entity cannot store the responses, it SHOULD NOT make the requests. An entity SHOULD NOT make the service discovery requests unless the information is required for some local functionality. An entity MUST NOT ever make a request to another entity that has the same version of the same application as the requesting entity, except for extensions that are not supported by the requestor's installation (e.g., one "Exodus 0.9" client MUST NOT query another "Exodus 0.9" client unless the second client has advertised an extension or plugin that the first client does not have).
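The rules in the preceding paragraph can be combined into a single decision, sketched here in Python. This is a non-normative illustration; the Caps tuple and the nodes_to_query function are hypothetical names, and the flags can_cache and needed stand for the "cannot store" and "required for some local functionality" conditions.

```python
from typing import AbstractSet, FrozenSet, List, NamedTuple


class Caps(NamedTuple):
    node: str
    ver: str
    exts: FrozenSet[str] = frozenset()


def nodes_to_query(own: Caps, peer: Caps, cached: AbstractSet[str],
                   can_cache: bool = True, needed: bool = True) -> List[str]:
    """List the disco nodes that still have to be queried for a peer."""
    if not (can_cache and needed):
        return []
    same_app = (own.node, own.ver) == (peer.node, peer.ver)
    # Never ask for the base features of our own application and version.
    wanted = [] if same_app else [f"{peer.node}#{peer.ver}"]
    for name in sorted(peer.exts):
        if same_app and name in own.exts:
            continue  # we have this extension ourselves
        wanted.append(f"{peer.node}#{name}")
    return [node for node in wanted if node not in cached]


EXODUS = "http://exodus.jabberstudio.org/caps"
me = Caps(EXODUS, "0.9", frozenset({"93j"}))
peer = Caps(EXODUS, "0.9", frozenset({"93j", "1g"}))
print(nodes_to_query(me, peer, cached=set()))
```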
4.3 Stream Feature
A server MAY include its own entity capabilities in a stream feature element so that connecting clients and peer servers do not need to send service discovery requests each time they connect:
Example 10. Stream feature element including capabilities

<c xmlns='http://jabber.org/protocol/caps'
   node='http://jabberd.org/entity'
   ver='1.6.1'/>

5. Server Optimizations
A server that is managing an entity's session MAY choose to optimize traffic through the server. In this case, the server MAY strip off redundant capabilities annotations. Because of this, receivers of annotations MUST NOT expect an annotation on every presence packet they receive. If the server wants to perform this traffic optimization, it MUST ensure that the first presence each subscriber receives contains the annotation. The server MUST also ensure that any changes in the annotation (typically in the 'ext' attribute) are sent to all subscribers.
A client MAY query the server using disco#info to determine if the server supports the 'http://jabber.org/protocol/caps' feature. If so, the server MUST perform the optimization delineated above, and the client MAY choose to only send the capabilities annotation on the first presence packet, as well as whenever its capabilities change.
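The bookkeeping a server needs for this optimization might look like the following Python sketch. It is non-normative and rests on assumptions: the CapsOptimizer class is hypothetical, an annotation is modeled as a (node, ver, ext) tuple, and the JIDs are made up.

```python
from typing import Dict, Optional, Tuple

Annotation = Tuple[str, str, Optional[str]]  # (node, ver, ext)


class CapsOptimizer:
    """Tracks which annotation each subscriber has already received."""

    def __init__(self) -> None:
        self._current: Dict[str, Annotation] = {}
        self._delivered: Dict[Tuple[str, str], Annotation] = {}

    def outgoing(self, sender: str, subscriber: str,
                 annotation: Optional[Annotation]) -> Optional[Annotation]:
        """Return the annotation to attach, or None to send presence bare."""
        if annotation is not None:
            self._current[sender] = annotation
        current = self._current.get(sender)
        if current is None or self._delivered.get((sender, subscriber)) == current:
            return None  # nothing known, or subscriber is already up to date
        self._delivered[(sender, subscriber)] = current
        return current

    def unavailable(self, sender: str) -> None:
        """Forget all state for a session that has ended."""
        self._current.pop(sender, None)
        for key in [k for k in self._delivered if k[0] == sender]:
            del self._delivered[key]


opt = CapsOptimizer()
romeo = "romeo@montague.net/orchard"
base = ("http://exodus.jabberstudio.org/caps", "0.9", None)
print(opt.outgoing(romeo, "juliet@capulet.com", base))   # first presence: kept
print(opt.outgoing(romeo, "juliet@capulet.com", base))   # redundant: stripped
print(opt.outgoing(romeo, "benvolio@montague.net", None))  # new subscriber: added
```

Storing the sender's latest annotation lets the server re-attach it for a subscriber that has not seen it yet, even when the client itself has stopped repeating it.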
Example 11. Disco#info request for server optimization
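<iq from='juliet@capulet.com/balcony'
    to='capulet.com'
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>

<iq from='capulet.com'
    to='juliet@capulet.com/balcony'
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='http://jabber.org/protocol/caps'/>
    ...
  </query>
</iq>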
6. Implementation Notes
If two entities exchange messages but do not normally exchange presence (i.e., via presence subscription), the entities MAY choose to send directed presence to each other, where the presence information SHOULD be annotated with the same capabilities information as each entity sends in broadcast presence.
If capabilities information has not been received from another entity, an application MUST assume that the other entity does not support capabilities.
7. Error Codes
No application-specific error codes are defined by this document. See XEP-0030: Service Discovery for a list of potential service discovery error codes.
8. Security Considerations
Use of the protocol specified in this document might make some client-specific forms of attack slightly easier, since the attacker could more easily determine the type of client being used. However, since most clients respond to jabber:iq:version requests without performing access control checks, there is no new vulnerability. Entities that wish to restrict access to capabilities information SHOULD use the privacy lists protocol defined in XMPP IM to define appropriate communications blocking (e.g., an entity MAY choose to allow IQ requests only from "trusted" entities, such as those with whom it has a subscription of "both").
It is possible (though unlikely) for a bad actor or rogue application to poison other entities by providing incorrect information in response to disco#info requests. To guard against such poisoning, a requesting entity MAY send disco#info requests to multiple entities that match the same node and ver or node and ext combination and then compare the results to ensure consistency. The requesting entity SHOULD NOT send the same request to more than five entities and MUST ensure that the entities are truly different by not sending the same request to multiple entities for which the user@host portion of the JID matches.
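The consistency check against poisoning can be sketched as follows. This Python illustration is non-normative: the function names and JIDs are hypothetical, and the bare JID comparison is a plain string comparison without the normalization a real implementation would apply.

```python
from typing import Iterable, List

MAX_VERIFIERS = 5  # the same request SHOULD NOT go to more than five entities


def bare_jid(jid: str) -> str:
    """Strip the resource, leaving the user@host portion."""
    return jid.split("/", 1)[0]


def select_verifiers(full_jids: Iterable[str],
                     limit: int = MAX_VERIFIERS) -> List[str]:
    """Choose at most `limit` JIDs, no two sharing a user@host portion."""
    chosen: List[str] = []
    seen = set()
    for jid in full_jids:
        if len(chosen) >= limit:
            break
        bare = bare_jid(jid)
        if bare in seen:
            continue  # same user on another resource does not count
        seen.add(bare)
        chosen.append(jid)
    return chosen


def results_consistent(results: Iterable[Iterable[str]]) -> bool:
    """True when every response advertised the same set of features."""
    return len({frozenset(features) for features in results}) <= 1


candidates = ["romeo@montague.net/orchard", "romeo@montague.net/home",
              "benvolio@montague.net/street"]
print(select_verifiers(candidates))
print(results_consistent([["a", "b"], ["b", "a"]]))
```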
9. IANA Considerations
This document requires no interaction with the Internet Assigned Numbers Authority (IANA) [13].
10. XMPP Registrar Considerations
The XMPP Registrar [14] includes 'http://jabber.org/protocol/caps' in its registries of protocol namespaces and service discovery features.
If it is useful or interesting, the Registrar may also provide registration of the URIs to be used in the 'node' attribute, but since these URIs can be scoped according to well-defined existing rules, this is not necessary.
11. XML Schema

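<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='http://jabber.org/protocol/caps'
    xmlns='http://jabber.org/protocol/caps'
    elementFormDefault='qualified'>

  <xs:annotation>
    <xs:documentation>
      The protocol documented by this schema is defined in
      XEP-0115: http://www.xmpp.org/extensions/xep-0115.html
    </xs:documentation>
  </xs:annotation>

  <xs:element name='c'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='empty'>
          <xs:attribute name='ext' type='xs:NMTOKENS' use='optional'/>
          <xs:attribute name='node' type='xs:string' use='required'/>
          <xs:attribute name='ver' type='xs:string' use='required'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:simpleType name='empty'>
    <xs:restriction base='xs:string'>
      <xs:enumeration value=''/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>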
Notes
1. XEP-0071: XHTML-IM.
2. XEP-0096: File Transfer.
3. XEP-0030: Service Discovery.
4. XEP-0092: Software Version.
5. This proposal is not limited to clients, and can be used by any entity that exchanges presence with another entity, e.g., a gateway. However, this document uses the example of clients throughout.
6. RFC 3920: Extensible Messaging and Presence Protocol (XMPP): Core.
7. RFC 3921: Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence.
8. These first two requirements effectively eliminated Publish-Subscribe [9] as a possible implementation of entity capabilities.
9. XEP-0060: Publish-Subscribe.
10. XEP-0045: Multi-User Chat.
11. Each extension name MUST be of type NMTOKEN, where multiple extension names are separated by the white space character #x20, resulting in a tokenized attribute type of NMTOKENS (see Section 3.3.1 of XML 1.0 [12]).
12. Extensible Markup Language (XML) 1.0 (Fourth Edition).
13. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes.
14. The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used in the context of XMPP extension protocols approved by the XMPP Standards Foundation.
Revision History
Version 1.3 (2007-04-10)
Added developer-friendly introduction; specified that ext names must be stable across application versions; further clarified examples; added stream feature use case; removed message example (send directed presence instead).
(psa/rt/jjh)
Version 1.2 (2007-02-15)
Clarified motivation and handling of service discovery requests.
(psa)
Version 1.1 (2004-10-29)
Clarified meaning of service discovery results for client#ver and client#ext.
(psa)
Version 1.0 (2004-08-01)
Per a vote of the Jabber Council, advanced status to Draft.
(psa)
Version 0.7 (2004-06-29)
Added several items to the Security Considerations; clarified naming requirements regarding 'node', 'ver', and 'ext' attributes.
(jjh/psa)
Version 0.6 (2004-04-25)
Made a number of editorial adjustments.
(psa)
Version 0.5 (2004-01-05)
Specified that the protocol can be used whenever presence is used (e.g., by gateways); improved the XML schema; made several editorial adjustments.
(psa)
Version 0.4 (2003-09-04)
IQ gets must be to a resource, since they are intended to go to a particular session.
(jjh)
Version 0.3 (2003-09-02)
Servers MUST strip extras changed to MAY, due to implementer feedback.
(jjh)
Version 0.2 (2003-08-28)
Add more clarifying assumptions and requirements, make it clear that clients don't have to send capabilities every time if the server is optimizing.
(jjh)
Version 0.1 (2003-08-27)
Initial version.
(jjh)