TasteBrain Documentation

A transformer-based, multimodal model for understanding human taste and aesthetic preference.

Bestomer has built a foundation-derived model, fine-tuned from Qwen-Omni at scale on ~20M curated product images spanning ~7M products. The model fine-tunes the attention layers and uses a custom projector to map per-token hidden states into a structured 128-dimensional multi-vector embedding space, along with a pooled embedding.
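To make the described output shape concrete, here is a minimal sketch of what a multi-vector-plus-pooled embedding could look like. The type names and the use of mean pooling are illustrative assumptions, not the actual projector or pooling used by the model:

```typescript
// Illustrative sketch only: one 128-d vector per token (the multi-vector
// embedding) plus a single pooled summary vector. Mean pooling is an
// assumption here; the real system's pooling may differ.
const DIM = 128;

interface TasteEmbedding {
  tokenVectors: number[][]; // [numTokens][DIM], from the custom projector
  pooled: number[];         // [DIM], single summary vector
}

// Compute a pooled vector as the per-dimension mean of the token vectors.
function meanPool(tokenVectors: number[][]): number[] {
  const pooled = new Array(tokenVectors[0].length).fill(0);
  for (const vec of tokenVectors) {
    for (let i = 0; i < vec.length; i++) pooled[i] += vec[i];
  }
  return pooled.map((x) => x / tokenVectors.length);
}
```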

This shared embedding space represents the global geometry of taste: styles, brands, materials, form factors, and aesthetic signals form coherent regions. On top of this shared space, the system learns lightweight per-user models that identify each individual's personal taste manifold — modeling attraction and aversion without retraining or fragmenting the backbone.
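One way to picture a lightweight per-user model over a frozen shared space is a linear head: a per-user weight vector with the same dimensionality as the embeddings, where positive components model attraction and negative ones aversion. This is a hedged sketch of the idea, not Bestomer's actual formulation; the names `UserTasteModel` and `tasteScore` are hypothetical:

```typescript
// Sketch: a per-user taste model as a linear head over the shared 128-d
// embedding space. The backbone is never retrained; only the small
// per-user weight vector is learned.
type Embedding = number[]; // pooled embedding from the shared space

interface UserTasteModel {
  weights: Embedding; // same dimensionality as the embeddings
  bias: number;
}

// Higher score = stronger predicted attraction for this user.
function tasteScore(user: UserTasteModel, item: Embedding): number {
  let score = user.bias;
  for (let i = 0; i < item.length; i++) score += user.weights[i] * item[i];
  return score;
}
```

Because the head is just a dot product, scoring a candidate set stays cheap enough to run at query time for every user.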

What the system does

TasteBrain encodes a universal representation of aesthetic preference. Given any input — an image, a text description, a product, an audio clip — the model projects it into a shared 128-dimensional space where proximity means aesthetic similarity. This enables:

  • Cross-modal search — image-to-products, text-to-products, product-to-products, audio-to-products
  • Personalized ranking — reorder results through a lightweight per-user taste model, without retraining
  • Brand and domain matching — identify which brands or retailers share the closest aesthetic signature to a given input
  • Grounded language generation — VQA-style supervision unifying representation, personalization, and explanation in a single system
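The "proximity means aesthetic similarity" claim can be sketched as a nearest-neighbor ranking: embed the query (from any modality), then sort candidate product embeddings by cosine similarity. The function names below are illustrative, and a production system would use an approximate-nearest-neighbor index rather than a linear scan:

```typescript
// Sketch: rank candidates by cosine similarity to a query embedding.
// Cross-modal search works because images, text, products, and audio
// all project into the same shared space.
type Vec = number[];

function cosine(a: Vec, b: Vec): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function rankBySimilarity(
  query: Vec,
  candidates: { id: string; vec: Vec }[],
): { id: string; score: number }[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosine(query, c.vec) }))
    .sort((a, b) => b.score - a.score);
}
```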

How the data stays fresh

Bestomer operates proprietary crawling infrastructure that revisits every retail domain weekly, tracking how catalogs, assortments, and styles change over time. Beyond products and brands, the system crawls restaurants, music, reviews, and cultural sources — allowing taste to be modeled as a living, evolving system rather than a static dataset.

Architecture levels

The platform is organized into three tiers, each designed for different developer needs. Every level is built for LLM-assisted development — the tooling, documentation, and APIs are collectively called the Software Vibe Kit (SVK).

The SVK

The Software Vibe Kit ships as a Claude Code plugin with skills for API interaction, plus reference implementations you can fork and extend:

  • Claude Code plugin — skills for calling Prism, Shopkeep, and using UI components
  • Reference apps — InstaShop, Streams, BookTaste, Dress Code
  • API reference — full endpoint documentation for both Prism and Shopkeep
  • Code cookbook — ready-to-use TypeScript examples with error handling
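In the spirit of the cookbook's error-handling examples, here is a minimal TypeScript sketch of a retry wrapper for transient API failures. The names `PrismError` and `withRetry` are assumptions for illustration, not part of the real SDK; the retry policy (retry 5xx, rethrow 4xx) is one common convention:

```typescript
// Hypothetical error type for Prism API failures (illustrative only).
class PrismError extends Error {
  constructor(message: string, readonly status?: number) {
    super(message);
  }
}

// Generic retry wrapper with exponential backoff. Retries transient
// failures (no status, or 5xx) and rethrows client errors immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (err instanceof PrismError && err.status && err.status < 500) {
        throw err; // client error: retrying will not help
      }
      if (attempt < maxAttempts) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```

A caller would wrap any request in it, e.g. `withRetry(() => fetchProducts(query))`, keeping the error policy in one place instead of repeating it per endpoint.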