TasteBrain Documentation
A transformer-based, multimodal model for understanding human taste and aesthetic preference.
Bestomer has built a foundation model derived from Qwen-Omni, trained at scale on ~20M curated product images spanning ~7M products. The model fine-tunes the attention layers and uses a custom projector that maps per-token hidden states into a structured 128-dimensional multi-vector embedding space, along with a pooled embedding.
This shared embedding space represents the global geometry of taste: styles, brands, materials, form factors, and aesthetic signals form coherent regions. On top of this shared space, the system learns lightweight per-user models that identify each individual's personal taste manifold — modeling attraction and aversion without retraining or fragmenting the backbone.
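To make the per-user idea concrete, here is a minimal sketch of one plausible "lightweight per-user model" over a frozen shared embedding space: attraction and aversion modeled as two centroids, with items scored by their relative similarity to each. Every name and the scoring rule itself are illustrative assumptions, not Bestomer's actual formulation.

```typescript
type Embedding = number[]; // a vector from the shared 128-dim taste space

function dot(a: Embedding, b: Embedding): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function cosine(a: Embedding, b: Embedding): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Average a set of embeddings into a single centroid.
function centroid(vectors: Embedding[]): Embedding {
  const dim = vectors[0].length;
  const out: number[] = new Array(dim).fill(0);
  for (const v of vectors) for (let i = 0; i < dim; i++) out[i] += v[i];
  return out.map((x) => x / vectors.length);
}

// Score = attraction minus aversion: similarity to what the user likes,
// penalized by similarity to what they dislike. The backbone is never
// retrained; the two centroids are the only per-user state.
function tasteScore(
  item: Embedding,
  liked: Embedding[],
  disliked: Embedding[],
): number {
  return cosine(item, centroid(liked)) - cosine(item, centroid(disliked));
}
```

The key property this sketch captures is that personalization lives entirely in a few per-user vectors, so adding a user costs almost nothing and never fragments the shared backbone.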
What the system does
TasteBrain encodes a universal representation of aesthetic preference. Given any input — an image, a text description, a product, an audio clip — the model projects it into a shared 128-dimensional space where proximity means aesthetic similarity. This enables:
- Cross-modal search — image-to-products, text-to-products, product-to-products, audio-to-products
- Personalized ranking — reorder results through a lightweight per-user taste model, without retraining
- Brand and domain matching — identify which brands or retailers share the closest aesthetic signature to a given input
- Grounded language generation — VQA-style supervision lets the model explain its matches in natural language, unifying representation, personalization, and explanation in a single system
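Because every modality lands in the same space, cross-modal search reduces to nearest-neighbor ranking over embeddings. The sketch below assumes the embedding step has already happened; the type names and `search` helper are placeholders for illustration, not the real API.

```typescript
type Vec = number[];

// Cosine similarity: proximity in the shared space means aesthetic similarity.
function cosine(a: Vec, b: Vec): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface Product {
  id: string;
  embedding: Vec; // precomputed 128-dim embedding of the product
}

// Rank a catalog by aesthetic similarity to a query embedding. The query
// can come from any modality (image, text, audio) since all of them are
// projected into the same space.
function search(query: Vec, catalog: Product[], topK = 10): string[] {
  return catalog
    .map((p) => ({ id: p.id, score: cosine(query, p.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((r) => r.id);
}
```

Personalized ranking, in this picture, is simply a second pass that reorders the candidates with a per-user taste score instead of (or blended with) raw similarity.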
How the data stays fresh
Bestomer operates proprietary crawling infrastructure that revisits every retail domain weekly, tracking how catalogs, assortments, and styles change over time. Beyond products and brands, the system crawls restaurants, music, reviews, and cultural sources — allowing taste to be modeled as a living, evolving system rather than a static dataset.
Architecture levels
The platform is organized into three tiers, each designed for different developer needs. Every level is built for LLM-assisted development — the tooling, documentation, and APIs are collectively called the Software Vibe Kit (SVK).
- Level 1 — Direct access to the Qwen-Omni-derived backbone, multi-vector embeddings, and the raw 128-dimensional taste space. For researchers and teams running custom inference.
- Level 2 — Stateless API primitives, Prism (personalized search) and Shopkeep (non-personalized search), for cross-modal queries, sentiment reranking, and brand matching.
- Level 3 — Stateful services that maintain per-user taste models (Windows, Sentiments, Tasteboards), plus retailer intelligence tools and context ingest pipelines.
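As a rough sense of how a stateless primitive might be called, here is a hypothetical request builder for a Prism-style search. The endpoint URL, field names, and response shape are all assumptions made for illustration; the actual contract lives in the API reference.

```typescript
// Hypothetical request shape for a stateless cross-modal search call.
interface SearchRequest {
  query: { text?: string; imageUrl?: string }; // any one modality works
  userId?: string; // present => personalized (Prism); absent => Shopkeep-style
  topK: number;
}

function buildSearchRequest(
  text: string,
  userId?: string,
  topK = 20,
): SearchRequest {
  return { query: { text }, userId, topK };
}

// Sending the request (placeholder URL, shown for shape only):
// async function runSearch(req: SearchRequest) {
//   const res = await fetch("https://api.example.com/v1/search", {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: JSON.stringify(req),
//   });
//   return res.json();
// }
```

Statelessness is the point of Level 2: each call carries everything it needs, so the same primitive serves both personalized and non-personalized queries without server-side session state.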
The SVK
The Software Vibe Kit ships as a Claude Code plugin with skills for API interaction, plus reference implementations you can fork and extend:
- Claude Code plugin — skills for calling Prism and Shopkeep, and for using UI components
- Reference apps — InstaShop, Streams, BookTaste, Dress Code
- API reference — full endpoint documentation for both Prism and Shopkeep
- Code cookbook — ready-to-use TypeScript examples with error handling