The browser machine learning API proposals are rapidly maturing, paving the way for standardised on-device AI capabilities across the web. For developers working with these cutting-edge technologies, staying current is key. AiBrow, our on-device AI browser extension, is committed to this journey, and our latest release, AiBrow 2.0.0, brings full support for the newest API specifications.

Over recent iterations, the API landscape has seen significant positive changes, moving towards a more finalised and robust structure. If you're building with browser-based AI, you'll know these proposals are now segmented into three distinct feature sets.

This restructuring is a positive step, offering clearer and more focused interfaces. However, it does mean that users of older API versions will need to migrate to leverage the latest advancements in browser AI and continue using AiBrow. Thankfully, this migration largely involves adapting to improved namespacing.

Key updates

The key updates in version 2.0.0 include:

Removal of the CoreModel API: The LanguageModel API now consolidates these functionalities, offering a more unified experience.

Renaming grammar to responseConstraint: The API spec now supports generating JSON output and restricting output as required through the responseConstraint option. AiBrow follows this change and renames the grammar field that we originally introduced to responseConstraint.

const grammar = {
  "type": "object",
  "properties": {
    "first_name": {
      "type": "string"
    },
    "last_name": {
      "type": "string"
    },
    "country": {
      "type": "string"
    }
  }
}

// Old
const stream = await session.promptStreaming(prompt, { grammar: grammar })

// New
const stream = await session.promptStreaming(prompt, { responseConstraint: grammar })
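For codebases that still pass grammar in many places, a small shim can translate option objects while the migration is in progress. The sketch below is an assumption for illustration, not part of AiBrow or the browser API: migratePromptOptions is a hypothetical helper that renames the grammar key to responseConstraint and leaves every other option untouched.

```javascript
// Hypothetical migration helper (not part of AiBrow or the browser API).
// Renames the legacy `grammar` option to `responseConstraint`.
function migratePromptOptions (options = {}) {
  // Pull the legacy key out so it is never forwarded to the new API.
  const { grammar, ...rest } = options

  // An explicitly supplied responseConstraint wins over a legacy grammar value.
  if (grammar !== undefined && rest.responseConstraint === undefined) {
    return { ...rest, responseConstraint: grammar }
  }
  return rest
}
```

With this in place, an existing call could become session.promptStreaming(prompt, migratePromptOptions({ grammar })) until the call site itself is updated.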

Changing the top-level API namespace: The proposals originally placed every API under the window.ai namespace. These have now moved to top-level globals. AiBrow continues to place its custom APIs under the window.aibrow namespace when the extension is installed.

// Old
await window.ai.languageModel.create()
await window.aibrow.languageModel.create()

// New
await window.LanguageModel.create()
await window.aibrow.LanguageModel.create()
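Code that has to run against both old and new versions can feature-detect the entry point rather than hard-coding one namespace. The following is a minimal sketch under stated assumptions: resolveLanguageModel is a hypothetical helper, and the order of preference (AiBrow first, new naming before legacy) is a choice made for this example only.

```javascript
// Hypothetical helper: looks up a LanguageModel entry point on a
// window-like object. The preference order is an assumption of this sketch.
function resolveLanguageModel (root = globalThis) {
  return (
    root.aibrow?.LanguageModel ?? // AiBrow extension, new naming
    root.LanguageModel ??         // browser built-in, new top-level API
    root.aibrow?.languageModel ?? // AiBrow extension, legacy naming
    root.ai?.languageModel ??     // browser built-in, legacy window.ai
    null                          // nothing available
  )
}
```

In a page this would be called as resolveLanguageModel(window), and a null result means neither the extension nor a built-in implementation was found.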

Moving capabilities to availability: The original API proposal introduced a capabilities call to most APIs. In the latest iteration, this has been changed to provide a single availability enum. AiBrow moves forward with this change and also adds a compatibility call that gives additional information about model compatibility.

// Old
console.log(await window.ai.languageModel.capabilities()) // { availability: 'readily', score: 0.9, ... }

// New
console.log(await window.LanguageModel.availability()) // 'available'
console.log(await window.aibrow.LanguageModel.compatibility()) // { score: 0.9, ... }
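During migration it can help to normalise both shapes to the new enum. The sketch below is illustrative only: toAvailability and getAvailability are hypothetical helpers, and the mapping of legacy strings ('readily', 'after-download', 'no') to new values ('available', 'downloadable', 'unavailable') is an assumption that should be checked against the spec version you target.

```javascript
// Assumed mapping from legacy capability strings to the newer enum values.
const LEGACY_TO_AVAILABILITY = new Map([
  ['readily', 'available'],
  ['after-download', 'downloadable'],
  ['no', 'unavailable']
])

// Hypothetical helper: converts one legacy string, defaulting to 'unavailable'.
function toAvailability (legacyValue) {
  return LEGACY_TO_AVAILABILITY.get(legacyValue) ?? 'unavailable'
}

// Hypothetical helper: works with either API generation.
async function getAvailability (api) {
  if (typeof api.availability === 'function') {
    return api.availability() // new API: already returns the enum
  }
  if (typeof api.capabilities === 'function') {
    const caps = await api.capabilities()
    // The field name differs between implementations; both are tried here (assumption).
    return toAvailability(caps.availability ?? caps.available)
  }
  return 'unavailable'
}
```

Unknown legacy values fall back to 'unavailable' so that callers never attempt to create a session on an unrecognised state.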

The plan ahead

AiBrow is already closely aligned with the current browser specifications, at about 95% compatibility. The main parts that still need to be implemented are the multimodal capabilities of the LanguageModel API, which enable interactions with images and audio.

We're already hard at work bringing these into AiBrow, and the technical foundation is in place:

  • The underlying llama-cpp library that powers AiBrow already supports image processing, so we plan to add this in.
  • There are already robust audio processing tools that allow you to turn speech into text. We'll be mixing this into the AiBrow library to add easy audio capabilities to the extension.

Our goal is to continue bringing these powerful multimodal features to both the native AiBrow extension and our WebGPU implementation, unlocking a new dimension of intelligent, on-device browser experiences. Stay tuned!