protocol.tex

\section {Crossfire}

The Crossfire protocol is an asynchronous, bi-directional protocol designed to
enable the full functionality of the Firebug debugger in a multi-process or
remote scenario. Where it was possible, the design of the protocol took cues
from existing debug protocols such as DBGP\cite{dbgp}, Opera
Scope\cite{opera-scope}, Google's Chrome Dev Tools\cite{chrome-dev-tools} , as
well as common Web technologies (e.g. HTTP, JSON\cite{json}). Certain
features unique to Firebug and to debugging code running inside a Web Browser
had to be taken into account in the design of the protocol.
We give an overview of the protocol and discuss some aspects that are important to its design.

\subsection {Overview}
Debugging the code that implements a Web page or Web application differs in some
significant ways from debugging applications developed for other types of
systems. HTML and CSS are used to declaratively specify the structure and style of the
user-interface, which is rendered by the Web browser. Developers cannot (easily)
debug the rendering code itself. Instead, built-in tools like Firebug allow the
developers to interact with the rendering engine by modifying the input and
observing the output in real time, via live CSS and DOM editing. JavaScript code
on the page can be stepped through when it is executing, however there is no
guarantee that any JavaScript code is necessarily running at any given moment.
JavaScript code on a Web page can be triggered by timers, user interactions or
network events. There is no outermost main or idle loop to return to; when a
section of JavaScript code is finished executing, control returns to the Web
browser. This confounds attempts at things such as a simple 'halt' command,
which is common among debuggers for other systems. The closest analog in Firebug
is the \textit{break-on-next} feature.

The Crossfire protocol is heavily event-driven, and requests made via Crossfire
are asynchronous. This differs from many other debug protocols which are often
synchronous or a combination of synchronous and asynchronous calls.  The reasons
for this decision stem from the nature of code running in a web page, and the
fact that in some scenarios, especially the intermediate scenarios we wished to
achieve, we would have a remote client connected to a server which also had a
co-resident debug UI (Firebug). In other words, a complete Firebug and Crossfire
server implementation would be running in a single Firefox process, and clients
could connect to the Crossfire server. The result of this scenario is that
clients cannot assume or rely on being the only agent acting on the runtime
engine. For instance, a Crossfire client cannot safely assume that the debugger
will remain suspended on a line of code until the client issues a request to
resume. It is also possible that this action was triggered by the user from
another client (in this case it is helpful to think of Firebug's in-process UI
as another client to the debugger). However any connected Crossfire client will
receive an event whenever the JavaScript debugger suspends or resumes, and
should react accordingly.

Implementations of the protocol differ based on whether the implementation is
intended to operate as a client or server. A Crossfire server resides in or is
connected to the process which is acting as the runtime platform for the Web
page, application, or other code which is to be debugged. This is typically a
Web Browser, although supporting other runtime environments is envisioned. A
Crossfire client connects to a server in order to receive events and issue
requests, typically in order to provide a user-interface
for debugging, (e.g. GUI or command-line debugger). It is not necessary for the
client and server to reside in the same process or even the same host machine.


%\begin{figure}[htp]
%  \includegraphics  [width = 86 mm] {figures/crossfire-example-packet.png}
%  \caption{An example of a Crossfire message packet.}
% \label{fig:crossfire-packet}
%\end{figure}
% \texttt{
% Content-Length:219
% \\r\\n\\r\\n
% {
%   "type":"request",
%   "command":"setbreakpoint",
%   "context_id":"http://localhost:8080/test.html",
%   "seq":21,
%   "arguments": {
%                  "position":0,
%                  "enabled":true,
%                  "target":"http://localhost:8080/test/some_javascript.js",
%                  "line":9,
%                  "ignoreCount":0,
%                  "type":"line"
%                }
% }
%\\r\\n
%}
\subsection {Connection and Handshake}
To avoid conflicts with existing ports,
Crossfire does not specify a standard or well-known port. Port agreement is left
up to the user, or the client software must start the server listening on
the same port it will attempt to connect to.

The connection protocol is purposefully conventional.
The Crossfire server listens for a TCP connection on the specified port (greater
than 1024).  A client wishing to connect sends the string ``CrossfireHandshake''
followed by a CRLF (a blank line specified by a carriage-return followed by a
line-feed character, as with HTTP). An optional second handshake line may
contain a comma-separated list of tool names to be enabled immediately, followed
by another CRLF. The server replies with the same handshake string, at which
point the connection is established and the client may begin sending requests and
receiving events from the server.

\subsection {Client/Server Behavior}
Once a connection has been established and a successful handshake is completed,
the server may begin sending events to the connected client using the same TCP
connection used for the handshake. A client may also begin sending requests to
the server using the same connection. Clients should not expect the server to
respond synchronously to requests or in order. The most common example is a
Crossfire server sending one or more events to the client before responding to a
client's request, because the events occurred during the time the request was
being sent or processed.

\subsection {Message Packets}
As described above, the packet format follows the design of the Google Chrome browser\cite{V8}.
A well-formed Crossfire packet contains one or more headers consisting of the
header name, followed by a colon (``:``), the header value, and terminated by a
CRLF. A ``Content-Length'' header containing the number of characters in the
message body is required, and additional headers are allowed.

The message body is separated from the headers by another CRLF blank line.
The blank line is followed by a well-formed JSON string. The message must contain a ``type'' field with the value one of
``request'', ``response'', or ``event'', and a ``seq'' field which contains the
sequence number of the packet. The sequence number of each message should
be greater than that of the last message received.

\subsection {Contexts}
Unlike desktop or server application debuggers, a Web browser typically runs
multiple applications or Web pages. A developer is likely to debug one or two of these applications,
while the rest are unrelated to the application. The debugger must
have a mechanism to focus on the particular page being debugged.
Firebug represents an instance of a Web page via an object called a
\textit{Context}. The context object allows Firebug's panels and modules to
share information about a web page that is being debugged, therefore it has a
central role in Firebug's architecture.

Most of the events that occur in a Web browser that are of interest to a debug
UI are related to individual pages, and therefore individual Firebug contexts.
Examples of context-specific events include loading (or reloading) of a page,
loading and compiling a script, errors being generated from an executing script,
DOM elements being added or modified, a breakpoint being added to a script, etc.

The Crossfire protocol uses contexts for
most requests / events. Crossfire represents a context as a mapping of the
unique context ID and the URL of the page. This allows a connected Crossfire
client to distinguish between separate loads of the same URL, as is often the
case when a developer reloads a page several times in the course of developing
or debugging the page.
Firebug's TabWatcher component monitors loading and unloading of Firefox windows
and tabs.  The Crossfire server assigns a unique identifier to each
context, and passes this ID as part of most event and response packets.

\subsection {Breakpoints}
Breakpoint debugging is a standard tool for debugging software at runtime in
many languages and environments. The Web Browser environment creates several
challenges for designing a remote protocol which supports breakpoint debugging.
Firebug also introduces several types of breakpoints which are not present in
other environments \cite{jjb-www2010}.

Even the simplest case, a JavaScript line breakpoint, has design implications
that must be considered. Typically, such a breakpoint is identified by a line
number and the URL of the script. However, existing JavaScript debugging APIs
such as Firefox's do not contain the concept of Firebug's contexts.  Therefore
if a user places a breakpoint on line 23 of a URL http://localhost/script.js,
then that breakpoint will exist for all occurrences of that URL. This may or may
not be what the user actually desires.  Considering the increasing use of
JavaScript libraries, it is entirely likely that a script from the same URL is
loaded into two completely unrelated pages. It is conceivable that a user would
wish to debug his or her code and the interaction with the library, without
affecting code running in another tab or window.

Crossfire's breakpoint protocol allows breakpoints to be set in one of two ways,
either with or without a context ID. If a Crossfire client specifies a context
ID along with a request to set a breakpoint, then that breakpoint should be
enabled for that location in any existing context, or a future context which is
created with the same URL in the same container (i.e. the page is reloaded).

If a client does not specify a context (by passing null as the value of the
context ID), then the behavior is to set a breakpoint for the specified location
in any future contexts. The intended use case for this behavior is setting a
breakpoint in a client UI such as an editor, where the source code location has
changed (due to editing), but the changes have not yet been applied to the page
in the browser.

More advanced breakpoints, such as the HTML element breakpoints, are supported
by specifying that the \textit{location} property of a breakpoint in Crossfire
is an arbitrary JSON object. A location object is defined depending on
the type of breakpoint.  A JavaScript line breakpoint has a location object
which consists of a target URL and line number. Firebug's HTML breakpoints have
a location object which consists of an XPath expression that identifies the
target element.  Eventually other location types may be added, e.g. to support
network-related breakpoints such as Firebug's \textit{Break-On-XHR}
feature\cite{jjb-www2010}.

\subsection {Extensibility}
One of the goals of Crossfire is to support remote and multi-process versions of
Firebug. One of Firebug's features is its ability to be extended, and there are
already many existing extensions. Therefore, we have developed what an API for
Crossfire, called the \textit{Crossfire Tools API}. The Tools API allows Firebug
extension developers a clean and consistent way to access the Crossfire client
and/or server connection.

On the server-side, the Tools API allows an extension to send custom events
and handle custom requests using Crossfire's connection and transport mechanism.
A client extension can listen for these events and respond to the requests.
Using this API, it will be possible for Firebug extensions to continue to adapt
to architectural changes in future versions of Firebug.

One consequence of this design choice is that the set of possible commands or
event names cannot be specified definitively by the protocol. A Crossfire
client or server must therefore be able to accept and respond to any well-formed
message packet, even if it may not know how to handle a particular command or
event type.