Thursday, January 28, 2010

Academic paper on Chromium extensions

Protecting Browsers from Extension Vulnerabilities is a paper that covers some of the interesting security features of the Chromium extension system.

The isolated worlds feature that I wrote about earlier is described toward the end. Isolated worlds separate each JavaScript program that has access to a web page's DOM. Each program can modify the DOM and see changes made by other programs, but programs cannot exchange JavaScript references. This setup prevents privileges from accidentally leaking between programs. Isolated worlds are now implemented directly in WebKit (thanks to Adam Barth), so they could show up in other WebKit applications in the future.

My other favorite feature is that an extension's unique ID is a public key. The extension is signed with the corresponding private key, which means it is impossible to have ID collisions. Even if a developer copies an existing extension to get started, he won't be able to copy the extension's ID because he would need the private key in order to sign it.

Thursday, December 24, 2009

Hey I used that license, too

Was amused by Evan's take on JSMin's software license. Apparently all of Crockford's work includes the clause "The software shall be used for Good, not Evil" in addition to the standard MIT license text.

I think at one point I saw this, internalized it, and spat it back out as "This code is public domain. Please use it for good, not evil."

I know that this stuff is really serious business to some people, and they get frustrated by these random acts of frivolity. "Don't you know," they say with a stern face, "software licenses are no place for fun and games".

Pfffft. The license is awesome. I appreciated the joke when I first saw it, and am appreciating the ongoing humor it is providing.

I do see how it would prevent some organizations from using the code. Mostly those that have become large, risk-averse, and so disconnected from the community that they are actually, honestly afraid that Crockford is going to come after them for using JSMin for evil.

Sadly, Google apparently now falls into that bucket.

Friday, July 17, 2009

Bundling multiple versions of binary XPCOM components

I happened to come up with what I think is a clever hack for getting around a sticky compatibility problem in Firefox extensions.

The problem is that if your extension includes a binary XPCOM component that uses unfrozen interfaces (and it is hard not to, lots of things are unfrozen), then it is highly likely that you will have to recompile your component for each new version of the Gecko SDK (and therefore most new major versions of Firefox).

The Mozilla wiki proposed the idea of using a "stub component" to sniff the environment version and load the right XPCOM component, but this is pretty complex to implement.

I realized that you can just use JavaScript to do the same, and it is trivial. Hope this helps someone else out (I've also updated the wiki).

Friday, June 26, 2009

The designs of dustincurtis.com

Are beautiful.

I say designs because I just realized that each article has its own unique design.

I read most of these when they were posted, but I think the fact that they were posted relatively far apart made me miss the fact that each one has a different look.

Some of the designs are very intricate and clearly were a lot of work. For example, check out about.html. Tying things together is a common header and footer that brings continuity and allows you to navigate between articles.

I attempted to do something like this once for my own blog, but found that I simply did not have the time and energy to come up with a good design for every single post.

I'm glad that someone else did, and I'm looking forward to the next article.

Wednesday, April 15, 2009

JSON Schema, part 2

So the JSON Schema stuff worked out:

chromium.tabs.createTab = function(tab, callback) {
  validate(arguments, arguments.callee.params);
  sendRequest(CreateTab, tab, callback);
};
chromium.tabs.createTab.params = [
  {
    type: "object",
    properties: {
      windowId: chromium.types.optPInt,
      url: chromium.types.optStr,
      selected: chromium.types.optBool
    },
    additionalProperties: false
  },
  chromium.types.optFun
];

--http://codereview.chromium.org/66006/diff/1087/1101

I'm really happy with how this came out because it means that all our APIs will get great error messages for free; it's not something that we'll have to think about (and therefore get wrong) on a case-by-case basis.

I tend to obsess about error messages because they are basically the first-run experience for any library. No matter how well designed, people are going to typically call your API wrong the first time, especially in a loosely-typed language like JavaScript. What happens in that case? Does the API stare blankly back at them, leaving them to wonder if their code is even running? Or does it helpfully tell them what they did wrong?

Chromium APIs will give errors. Specifically, they will retort something like:
Invalid value for parameter 0. Property windowId: expected integer, got string.
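
For illustration, here is a minimal sketch of how a validator could arrive at a message like that one. It is not the Chromium code: describeType, check, and this simplified validate are names invented for the example, and the schema handling covers only type, optional, properties, and additionalProperties.

```javascript
// Hypothetical, simplified validator; not the actual Chromium implementation.

// Names the type of a value using the vocabulary of the schemas below.
function describeType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "number" && Number.isInteger(value)) return "integer";
  return typeof value; // "string", "boolean", "object", "function", ...
}

// Returns null when the value matches the schema, otherwise a short description.
function check(value, schema) {
  if (value === undefined) {
    return schema.optional ? null : "value is required";
  }
  const actual = describeType(value);
  const matches = actual === schema.type ||
      (schema.type === "number" && actual === "integer");
  if (!matches) {
    return "expected " + schema.type + ", got " + actual;
  }
  if (schema.type !== "object") return null;

  const properties = schema.properties || {};
  for (const key of Object.keys(value)) {
    const known = Object.prototype.hasOwnProperty.call(properties, key);
    if (!known && schema.additionalProperties === false) {
      return "Property " + key + ": unexpected property";
    }
  }
  for (const key of Object.keys(properties)) {
    const problem = check(value[key], properties[key]);
    if (problem) return "Property " + key + ": " + problem;
  }
  return null;
}

// Checks each argument against the schema at the same position.
function validate(args, schemas) {
  for (let i = 0; i < schemas.length; i++) {
    const problem = check(args[i], schemas[i]);
    if (problem) {
      const detail = problem.charAt(0).toUpperCase() + problem.slice(1);
      throw new Error("Invalid value for parameter " + i + ". " + detail + ".");
    }
  }
}

// Stand-ins for the chromium.types.* entries used above (assumed shapes).
const createTabParams = [
  {
    type: "object",
    properties: {
      windowId: { type: "integer", optional: true },
      url: { type: "string", optional: true },
      selected: { type: "boolean", optional: true }
    },
    additionalProperties: false
  },
  { type: "function", optional: true }
];

try {
  validate([{ windowId: "7" }], createTabParams);
} catch (e) {
  console.log(e.message);
  // Invalid value for parameter 0. Property windowId: expected integer, got string.
}
```

The point of the design is that the message is assembled from the schema alone, so each API only has to declare its parameters.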

Monday, April 13, 2009

Content Scripts in Chromium

Here's an interesting factoid about browser extensions: lots of them are not about extending the browser at all. By my count, about 75% of this week's top 20 Firefox extensions are more about extending the web content rendered by the browser than extending the browser itself. Similar trends exist in other browser extension systems.

Chromium extensions will be able to interact with web content too, using a feature we're calling content scripts (we've gone around and around on the name, this may not be final). The code for this is at a pretty good stopping point now, so I wanted to pause and write down what we did, why we did it, and some ideas I have for future improvements.

If you want to try it out, you can check out the beginnings of our Extension Tutorial, which covers most of what I'll talk about here.

First, some background on the feature...

Content scripts are basically the same thing as Greasemonkey scripts, with some important improvements.

You register your content scripts declaratively in your extension's manifest, like this:
{
  "name": "My first extension",
  "description": "The first extension that I made",
  "version": "1.0",
  "content_scripts": [
    {
      "matches": ["http://www.google.com/*", "http://mail.google.com/"],
      "css": ["foo.css", "bar.css"],
      "js": ["hot.js", "dog.js"],
      "run_at": "document_start"
    }
  ]
}
The syntax for matching URLs is slightly different than in Greasemonkey. The reason for this is that we wanted to eliminate a common bug in Greasemonkey scripts, where people accidentally match URLs more loosely than they intend. A classic example is the common Greasemonkey pattern @include *.google.com*, which matches any URL that contains ".google.com" anywhere, on any domain, not just URLs on google.com and its subdomains.

The matching syntax used in content scripts separates the domain portion of the pattern from the path portion, making it more explicit which sites a script will run on. One way we could use this is to someday do UI like this:

==============================================
Install 'My extension'?
----------------------------------------------
This extension will be able to interact with
web pages on:

www.google.com
mail.google.com

[ok] [cancel]
==============================================
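
To make the difference in matching concrete, here is a rough sketch that compares a Greasemonkey-style glob applied to the whole URL with a pattern that is split into host and path. The helpers globToRegExp, matchesGlob, and matchesPattern are hypothetical, evil.example is a made-up host, and the real content script syntax has more rules than this simplified version handles.

```javascript
// Converts a glob, where * matches any run of characters, into a RegExp.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Greasemonkey-style: the glob is tested against the entire URL string.
function matchesGlob(glob, url) {
  return globToRegExp(glob).test(url);
}

// Simplified content-script-style matching (an assumption for this sketch):
// scheme and host must match exactly, and only the path is treated as a glob.
function matchesPattern(pattern, url) {
  const parts = /^(\w+):\/\/([^\/]+)(\/.*)$/.exec(pattern);
  if (!parts) throw new Error("Malformed pattern: " + pattern);
  const scheme = parts[1];
  const host = parts[2];
  const pathGlob = parts[3];
  const target = new URL(url);
  return target.protocol === scheme + ":" &&
         target.hostname === host &&
         globToRegExp(pathGlob).test(target.pathname + target.search);
}

// The loose glob matches a page that merely mentions .google.com.
console.log(matchesGlob("*.google.com*", "http://www.google.com/search"));        // true
console.log(matchesGlob("*.google.com*", "http://www.google.com.evil.example/")); // true

// The split pattern only matches the host it names.
console.log(matchesPattern("http://www.google.com/*", "http://www.google.com/search"));        // true
console.log(matchesPattern("http://www.google.com/*", "http://www.google.com.evil.example/")); // false
```

Because the host is a separate, exact component, it is also easy to list the affected sites in an install prompt like the one above.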

Other minor feature differences:
  • A content script can consist of multiple physical JavaScript files or CSS files, and it can also reference images or other resources included in the extension by URL.
  • Content scripts support "early injection", which allows them to request being injected before any nodes have been added to the document by using the optional "run_at" key.


Execution Environment

To understand the execution environment for content scripts, it helps to first understand the execution environment of normal web page JavaScript.

All JavaScript is defined in a context. Each DOM window gets its own context, one purpose of which is to hold the prototypes of all the global objects (Object, Array, String, and so on). This is why when you extend Array.prototype in one frame, it doesn't affect Arrays created in other frames.

Importantly, you can call functions and access objects across contexts. This happens normally when you do something like window.frames['otherframe'].someFunction().
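
The same behavior can be observed outside a browser. The sketch below assumes Node.js and uses its built-in vm module, whose contexts stand in for the per-frame contexts described above; it is only an analogy, not how Chromium wires up frames.

```javascript
// Assumes Node.js; a vm context plays the role of another frame's context.
const vm = require("vm");

const otherContext = vm.createContext({});

// Extend Array.prototype inside the other context only.
vm.runInContext(
  "Array.prototype.shout = function () { return 'hi'; };",
  otherContext
);

const shoutThere = vm.runInContext("typeof [].shout", otherContext);
const shoutHere = typeof [].shout;
console.log(shoutThere); // function
console.log(shoutHere);  // undefined

// Functions and objects can still cross between contexts.
const makeArray = vm.runInContext(
  "(function () { return [1, 2, 3]; })",
  otherContext
);
const foreign = makeArray();
console.log(foreign.length);           // 3
console.log(foreign instanceof Array); // false: it uses the other context's Array
console.log(Array.isArray(foreign));   // true
console.log(typeof foreign.shout);     // function
```

The instanceof result is the visible consequence of each context holding its own set of prototypes.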

Here's a diagram that explains the relationship between the various objects in pretty picture form (thanks, Gliffy!):


Each context also has a single global object. When you access global variables in a JavaScript program, you are really interacting with the properties of this global object. In HTML, the global object is of course the Window object.

To make property hiding work, in Chromium's implementation, the global object is not actually the same JavaScript object that represents ("wraps") the C++ DOMWindow. There is actually a separate JavaScript object whose __proto__ points to that object. When you define global variables, it is this object where the properties are actually defined.

Ok, so how do content scripts fit into this?

Content scripts run in a very similar-looking environment. They run in a separate context, and have a separate global object. But that global object's __proto__ points at the same JS object that represents the Window.


So content scripts get their own global scope and their own set of prototypes. Variables defined in the web page won't be "visible" by default in content scripts, and the same is true in reverse. Other than that, the environment for content scripts is exactly the same as for normal JavaScript running in web pages. Writing content scripts should be exactly the same as writing JavaScript for web pages.
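
The arrangement can be modeled in a few lines of plain JavaScript. This is only a toy model under stated assumptions: windowWrapper stands in for the JS object that wraps the C++ DOMWindow, and pageGlobal and contentScriptGlobal stand in for the two global objects.

```javascript
// Stand-in for the JS object that wraps the C++ DOMWindow (hypothetical shape).
const windowWrapper = { document: { title: "Example page" } };

// Each context gets its own global object whose prototype is the shared wrapper.
const pageGlobal = Object.create(windowWrapper);
const contentScriptGlobal = Object.create(windowWrapper);

// Defining a "global variable" adds a property to that context's own global object.
pageGlobal.counter = 1;
contentScriptGlobal.counter = 100;

console.log(pageGlobal.counter);          // 1
console.log(contentScriptGlobal.counter); // 100
console.log("counter" in windowWrapper);  // false: globals never land on the wrapper

// Both sides reach the same DOM through the prototype chain.
console.log(pageGlobal.document === contentScriptGlobal.document); // true
contentScriptGlobal.document.title = "Changed by content script";
console.log(pageGlobal.document.title);   // Changed by content script
```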

Sometimes it is useful to access the page's global variables. For example, in Gmail there is an API that allows Greasemonkey scripts to drive some parts of the UI. To allow this kind of functionality, the content script environment has a special contentWindow global variable defined that can be used to access the global scope of the page's JavaScript.


Permissions

Another difference from Greasemonkey is the model for accessing privileged APIs. Greasemonkey scripts have direct access to some privileged APIs. The most popular of these is GM_xmlhttpRequest, which provides access to origins other than the one for the current document. These APIs are very useful, but there have been bugs where they leaked into web content, which was bad.

In order to prevent this from being possible, Chromium extensions are split into two main pieces: a privileged part (I'll call it just 'the extension' from now on) that has access to special powerful APIs, and an unprivileged part (the content script) that runs in the renderer and has no special APIs.

The two parts cannot interact directly. In fact, they run in separate OS processes, so direct interaction is impossible. The only way they can communicate is via message passing APIs, similar to postMessage().


(NOTE: The implementation of content script messaging is still in progress and is incomplete in current trunk and dev builds)

It is the extension developer's responsibility to send only specific messages to the extension process from the renderer, and to validate those messages carefully. Extension developers need to be aware that malicious web pages could send them messages exactly the same way their content scripts can.
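
As a rough sketch of what that validation might look like on the extension side, the handler below accepts only a fixed set of message types and checks every field before acting. Everything here is hypothetical: handleMessage, the lookupWord message, and the JSON string format are invented for the example and are not part of any Chromium API.

```javascript
// Hypothetical extension-side handlers; message names and shapes are invented.
const handlers = {
  // The only request this extension supports, with a narrowly checked payload.
  lookupWord(payload) {
    if (typeof payload.word !== "string" || !/^[a-z]{1,40}$/i.test(payload.word)) {
      throw new Error("Rejected: bad word");
    }
    return { ok: true, word: payload.word.toLowerCase() };
  }
};

// Treats every incoming message as untrusted, since a web page could have sent it.
function handleMessage(rawMessage) {
  let message;
  try {
    message = JSON.parse(rawMessage);
  } catch (e) {
    return { ok: false, error: "Malformed message" };
  }
  if (message === null || typeof message !== "object") {
    return { ok: false, error: "Malformed message" };
  }
  // Look up only handlers defined above, never inherited properties.
  const known = Object.prototype.hasOwnProperty.call(handlers, message.type);
  if (!known) {
    return { ok: false, error: "Unknown message type" };
  }
  if (message.payload === null || typeof message.payload !== "object") {
    return { ok: false, error: "Malformed payload" };
  }
  try {
    return handlers[message.type](message.payload);
  } catch (e) {
    return { ok: false, error: e.message };
  }
}

console.log(handleMessage('{"type":"lookupWord","payload":{"word":"Hello"}}').ok);    // true
console.log(handleMessage('{"type":"openAnyUrl","payload":{}}').error);               // Unknown message type
console.log(handleMessage('{"type":"lookupWord","payload":{"word":"<script>"}}').ok); // false
```

Keeping the list of message types short and specific limits what a malicious page can ask the privileged side to do.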

This design is modeled after the way Chromium itself works, where the renderers are untrusted and have to send messages to the browser process to get interesting work done.


Future Directions

I have a couple ideas for where I'd like to take this next...


Idea 1: Completely separate content scripts and page JavaScript

Right now, the way that JavaScript access to the DOM is implemented, there is essentially a global table of JavaScript wrappers for each C++ DOM object. Whenever code needs to find the JS object for a given C++ object, it consults this table:


This single table creates a bridge between any two JavaScript contexts that have access to the same DOM nodes. For example, if page JavaScript does something like document.body.onclick = function() { ... }, any other code that has access to document.body will also have access to the onclick function handler that the page JavaScript defined.

This makes sense for web pages, where you want frames in the same origin to see the same sets of JavaScript variables. But for content scripts, it would be nice to wall these two worlds off from each other. It is relatively infrequent for content scripts to need to see the JavaScript environment of pages. It is more typical to only need access to the DOM.

In order to isolate content scripts from page JavaScript, we'd have to have separate mapping tables: one for the page JavaScript, and one for each content script. A C++ DOM node could have multiple wrappers, one for each of these "worlds". Then, when we needed to get a JavaScript object for a particular C++ object, we'd decide which table to look in based on which context the calling code was running in. Every context could only be in one "world".

We could even add assertions to the JavaScript engine that worlds are never bridged. That way if we ever had a bug, in the worst case we'd crash the renderer, not have a security problem.
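
Here is a toy model of the idea, written in JavaScript rather than the C++ it would really live in. NativeNode, World, wrapperFor, and assertOwns are hypothetical names chosen for this sketch; it only shows one wrapper table per world and an assertion that wrappers never cross.

```javascript
// Stand-in for a C++ DOM node.
class NativeNode {
  constructor(tagName) {
    this.tagName = tagName;
  }
}

// One wrapper table per world instead of a single global table.
class World {
  constructor(name) {
    this.name = name;
    this.wrappers = new Map(); // NativeNode -> wrapper object for this world
  }

  // Returns this world's wrapper for the node, creating it on first use.
  wrapperFor(node) {
    let wrapper = this.wrappers.get(node);
    if (!wrapper) {
      wrapper = { world: this, node: node, onclick: null };
      this.wrappers.set(node, wrapper);
    }
    return wrapper;
  }

  // Assertion: fail loudly rather than let a wrapper bridge two worlds.
  assertOwns(wrapper) {
    if (wrapper.world !== this) {
      throw new Error("World boundary violated");
    }
  }
}

const body = new NativeNode("body");
const pageWorld = new World("page");
const scriptWorld = new World("content script");

// The page attaches a handler to its own wrapper for the body node.
pageWorld.wrapperFor(body).onclick = function () { return "page handler"; };

// The content script sees the same underlying node, but not the page's handler.
console.log(scriptWorld.wrapperFor(body).node === body); // true
console.log(scriptWorld.wrapperFor(body).onclick);       // null
```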

If we can wall these worlds off from each other, then we can offer some increased privileges to content scripts directly, because we'd be confident that they couldn't leak to web content. You'd no longer have to go to the extension process to get cross-origin XHR, for example. This would also have the advantage of not requiring extension developers to carefully validate their messages, since we would know that page JavaScript could not send extensions messages.

We'd still probably need content scripts as they exist today if you want to interact with the JS defined by the page (for example for the Gmail API). But lots of use cases don't need that, and this idea would decrease complexity for those cases.


Idea 2: DOM Access from Extension Processes

Another idea is to offer some form of DOM access directly to extension processes. There is a team in Chromium working on an out-of-process version of the web inspector. This will clearly need some form of DOM access to work, so we can probably reuse what they do to give extension developers the ability to interact with page DOM directly from their extension process.

I can imagine something simple based on querySelectorAll(). You ask for some nodes based on a CSS expression, get back a snapshot, and then send some updates. Of course, there are problems with races: the nodes might be gone by the time you send the update. But I think in most cases this would work pretty nicely. Again, I think we'd want to keep content scripts as they are today for more complex needs.
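
To show the shape of the snapshot-and-update idea and the race it has to tolerate, here is a small simulation. Nothing in it is a real or planned API: PageDom, snapshot, and applyUpdates are invented names, nodes are identified by made-up numeric ids, and selection is by tag name only instead of a full CSS expression.

```javascript
// Hypothetical renderer-side store; ids and method names are invented.
class PageDom {
  constructor() {
    this.nodes = new Map(); // id -> { tag, text }
    this.nextId = 1;
  }

  add(tag, text) {
    const id = this.nextId++;
    this.nodes.set(id, { tag: tag, text: text });
    return id;
  }

  remove(id) {
    this.nodes.delete(id);
  }

  // Returns plain-data copies, so the caller never holds live node references.
  snapshot(tag) {
    const result = [];
    for (const [id, node] of this.nodes) {
      if (node.tag === tag) result.push({ id: id, text: node.text });
    }
    return result;
  }

  // Applies updates by id and returns the ids whose nodes have disappeared.
  applyUpdates(updates) {
    const stale = [];
    for (const update of updates) {
      const node = this.nodes.get(update.id);
      if (node) {
        node.text = update.text;
      } else {
        stale.push(update.id);
      }
    }
    return stale;
  }
}

const dom = new PageDom();
const introId = dom.add("h2", "Intro");
const detailsId = dom.add("h2", "Details");

// The extension process takes a snapshot and prepares its updates.
const snap = dom.snapshot("h2");
const updates = snap.map(function (n) {
  return { id: n.id, text: n.text.toUpperCase() };
});

// Meanwhile the page removes one of the nodes: this is the race.
dom.remove(detailsId);

const stale = dom.applyUpdates(updates);
console.log(stale.length);               // 1
console.log(dom.nodes.get(introId).text); // INTRO
```

Reporting stale ids back to the caller is one possible way to surface the race; silently dropping them would be another.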


Yawn... Greasemonkey is great, but when do we get real extensions?

I know, I know. These aren't "real" extensions. You want to know when you'll be able to put things in the Chrome UI. Good news: that is well underway. Hopefully my next blog post will be about how to add "toolstrips" to Chromium.

Until then, have a look at content scripts and let us know what you think.

Sunday, April 5, 2009

Yo dawg...

So on the extensions project, I've been working on what I think is the somewhat novel idea of API non-design.

To be more specific, I'm using the CRUD pattern as a starting point for all the major sub-systems' APIs. My hope is that this will have a number of positive effects:
  • Minimize API design hand-wringing
  • Provide a large base of functionality quickly
  • Make it easy for Chromium developers to add new APIs
  • Make it easy for extension developers to learn new APIs
We decided to use JSON heavily in the implementation. For example, the createTab() API looks like this:
chromium.tabs.createTab({
  "url": "http://www.google.com/",
  "selected": true,
  "tabIndex": 3
});
So I got this all working for a few methods, and then I got to writing the validation code. I could write the code by hand, but that's so much work. And why bother when somebody has gone and invented JSON Schema?

That's right, it's a schema language for JSON. And of course it has a schema, written in JSON schema. Whee!

So we should be able to just declare the expected structure for our API parameters and push the validate() button. Probably there will have to be extra stuff around the edges, but this should get rid of a majority of the grunt work.