TasteBrain Documentation

A transformer-based, multimodal model for understanding human taste and aesthetic preference.

Bestomer has built a foundation-derived model, fine-tuned from Qwen-Omni at scale on ~20M curated product images spanning ~7M products. The model fine-tunes the attention layers and uses a custom projector to map per-token hidden states into a structured 128-dimensional multi-vector embedding space, along with a pooled embedding.
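To make the described output shape concrete, here is a minimal sketch of what a multi-vector-plus-pooled embedding could look like. The type names and the use of mean pooling are illustrative assumptions, not the actual projector or pooling used by the model:

```typescript
// Illustrative sketch only: one 128-d vector per token (the multi-vector
// embedding) plus a single pooled summary vector. Mean pooling is an
// assumption here; the real system's pooling may differ.
const DIM = 128;

interface TasteEmbedding {
  tokenVectors: number[][]; // [numTokens][DIM], from the custom projector
  pooled: number[];         // [DIM], single summary vector
}

// Compute a pooled vector as the per-dimension mean of the token vectors.
function meanPool(tokenVectors: number[][]): number[] {
  const pooled = new Array(tokenVectors[0].length).fill(0);
  for (const vec of tokenVectors) {
    for (let i = 0; i < vec.length; i++) pooled[i] += vec[i];
  }
  return pooled.map((x) => x / tokenVectors.length);
}
```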

This shared embedding space represents the global geometry of taste: styles, brands, materials, form factors, and aesthetic signals form coherent regions. On top of this shared space, the system learns lightweight per-user models that identify each individual's personal taste manifold — modeling attraction and aversion without retraining or fragmenting the backbone.
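One way to picture a lightweight per-user model over a frozen shared space is a linear head: a per-user weight vector with the same dimensionality as the embeddings, where positive components model attraction and negative ones aversion. This is a hedged sketch of the idea, not Bestomer's actual formulation; the names `UserTasteModel` and `tasteScore` are hypothetical:

```typescript
// Sketch: a per-user taste model as a linear head over the shared 128-d
// embedding space. The backbone is never retrained; only the small
// per-user weight vector is learned.
type Embedding = number[]; // pooled embedding from the shared space

interface UserTasteModel {
  weights: Embedding; // same dimensionality as the embeddings
  bias: number;
}

// Higher score = stronger predicted attraction for this user.
function tasteScore(user: UserTasteModel, item: Embedding): number {
  let score = user.bias;
  for (let i = 0; i < item.length; i++) score += user.weights[i] * item[i];
  return score;
}
```

Because the head is just a dot product, scoring a candidate set stays cheap enough to run at query time for every user.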

What the system does

TasteBrain encodes a universal representation of aesthetic preference. Given any input — an image, a text description, a product, an audio clip — the model projects it into a shared 128-dimensional space where proximity means aesthetic similarity. This enables:

  • Cross-modal search — image-to-products, text-to-products, product-to-products, audio-to-products
  • Personalized ranking — reorder results through a lightweight per-user taste model, without retraining
  • Brand and domain matching — identify which brands or retailers share the closest aesthetic signature to a given input
  • Grounded language generation — VQA-style supervision unifying representation, personalization, and explanation in a single system
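The "proximity means aesthetic similarity" claim can be sketched as a nearest-neighbor ranking: embed the query (from any modality), then sort candidate product embeddings by cosine similarity. The function names below are illustrative, and a production system would use an approximate-nearest-neighbor index rather than a linear scan:

```typescript
// Sketch: rank candidates by cosine similarity to a query embedding.
// Cross-modal search works because images, text, products, and audio
// all project into the same shared space.
type Vec = number[];

function cosine(a: Vec, b: Vec): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function rankBySimilarity(
  query: Vec,
  candidates: { id: string; vec: Vec }[],
): { id: string; score: number }[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosine(query, c.vec) }))
    .sort((a, b) => b.score - a.score);
}
```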

How the data stays fresh

Bestomer operates proprietary crawling infrastructure that revisits every retail domain weekly, tracking how catalogs, assortments, and styles change over time. Beyond products and brands, the system crawls restaurants, music, reviews, and cultural sources — allowing taste to be modeled as a living, evolving system rather than a static dataset.

Architecture levels

The platform is organized into three tiers, each designed for different developer needs. Every level is built for LLM-assisted development — the tooling, documentation, and APIs are collectively called the Software Vibe Kit (SVK).

The SVK

The Software Vibe Kit ships as a Claude Code plugin with skills for API interaction, plus reference implementations you can fork and extend:

  • Claude Code plugin — skills for calling Prism, Shopkeep, and using UI components
  • Reference apps — InstaShop, Streams, BookTaste, Dress Code
  • API reference — full endpoint documentation for both Prism and Shopkeep
  • Code cookbook — ready-to-use TypeScript examples with error handling
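In the spirit of the cookbook's error-handling examples, here is a minimal TypeScript sketch of a retry wrapper for transient API failures. The names `PrismError` and `withRetry` are assumptions for illustration, not part of the real SDK; the retry policy (retry 5xx, rethrow 4xx) is one common convention:

```typescript
// Hypothetical error type for Prism API failures (illustrative only).
class PrismError extends Error {
  constructor(message: string, readonly status?: number) {
    super(message);
  }
}

// Generic retry wrapper with exponential backoff. Retries transient
// failures (no status, or 5xx) and rethrows client errors immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (err instanceof PrismError && err.status && err.status < 500) {
        throw err; // client error: retrying will not help
      }
      if (attempt < maxAttempts) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```

A caller would wrap any request in it, e.g. `withRetry(() => fetchProducts(query))`, keeping the error policy in one place instead of repeating it per endpoint.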