
How MXP's Recommendation System works — models, containers, and inference

The Recommendation System provides ML-driven product recommendations across different page types. It is built as two independent FastAPI microservices: edit_app (model management) and predict_app (real-time inference).

What it solves

Algorithmic ranking handles the search use case but doesn't serve adjacent needs — what to show on a product detail page when there's no active query, what to suggest in the cart, or what to show on the homepage. Recommendations fill that gap with personalized, context-aware product lists.

Architecture

edit_app  (model config + training)

    └── stores model configs in GCS per tenant
    └── triggers training pipelines (optional)

predict_app  (inference)

    └── reads model configs from GCS at request time
    └── routes requests to the backend model (Elasticsearch or an external ML model)
    └── returns ranked product list
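The split above can be sketched as two functions that share only the config store. An in-memory dict stands in for the GCS bucket here, and all function and field names are illustrative, not the services' actual API:

```python
# Illustrative sketch of the edit_app / predict_app split. A dict stands in
# for the GCS bucket both services share; names are hypothetical.

CONFIG_STORE = {}  # stands in for GCS: key = tenant, value = tenant config


def save_model_config(tenant: str, config: dict) -> None:
    """edit_app side: persist a tenant's model configuration."""
    CONFIG_STORE[tenant] = config


def predict(tenant: str, page_type: str) -> list:
    """predict_app side: read config at request time and route."""
    config = CONFIG_STORE[tenant]
    model = config["containers"][page_type]  # container -> model lookup
    # ...here the real service would call the model's backend;
    # we fake a ranked product list for illustration.
    return [f"{model}-product-{i}" for i in range(3)]


save_model_config("acme", {"containers": {"pdp": "similar_items"}})
print(predict("acme", "pdp"))
```

Because predict_app reads the config at request time rather than at startup, a config change saved by edit_app takes effect on the next request with no redeploy.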

Recommendation models

Each tenant can have multiple recommendation models configured, one per use case. Model types:

Type                          When to use
similar_items                 "You may also like" on product detail pages
frequently_bought_together    "Customers also bought" on cart or checkout
recently_viewed               Personalized history on home or category pages
generic                       Configurable for custom scoring; can wrap any backend model
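A tenant's model configuration stored in GCS might look like the following. The JSON shape and field names are illustrative, since the real schema isn't documented here:

```json
{
  "tenant": "acme",
  "models": [
    {"id": "pdp-similar",  "type": "similar_items",              "backend": "elasticsearch"},
    {"id": "cart-fbt",     "type": "frequently_bought_together", "backend": "elasticsearch"},
    {"id": "home-recent",  "type": "recently_viewed",            "backend": "external_ml"}
  ]
}
```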

Recommendation containers

A container maps a page type to a model. When predict_app receives a /get_predict request with pageType=pdp, it looks up the container configured for pdp in the tenant's config and routes the request to the right model.

This indirection lets merchants switch models per placement without code changes: only the container configuration needs updating, which is done via the Merch Module UI.
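The container lookup itself is a small pageType-to-model mapping. A minimal sketch, assuming a config shape and names that are hypothetical:

```python
# Hypothetical container configuration for one tenant: pageType -> model id.
# Switching the model for a placement means editing this mapping only.
CONTAINERS = {
    "pdp": "similar_items_v2",
    "cart": "frequently_bought_together",
    "home": "recently_viewed",
}


def resolve_model(page_type: str) -> str:
    """Return the model id configured for a page type."""
    try:
        return CONTAINERS[page_type]
    except KeyError:
        raise ValueError(f"no container configured for pageType={page_type!r}")


print(resolve_model("pdp"))  # similar_items_v2
```

Raising a clear error for an unconfigured page type (rather than silently falling back) makes misconfigured placements visible immediately.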

Training pipelines

The pipelines/ directory in the recommendation system repo contains training jobs for offline model training. Trained models are stored and referenced from model configurations in edit_app.

Inference flow

  1. Client calls POST /get_predict with tenant, pageType, and request context
  2. predict_app loads the tenant's container configuration from GCS
  3. Container resolves which model to use for the given pageType
  4. Request is routed to the model's backend (e.g. Elasticsearch similarity or an external model endpoint)
  5. Results are re-ranked and returned to the caller
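The five steps above can be sketched end to end in a few lines. The config shape, backend stubs, and function names are illustrative stand-ins, not the production code:

```python
# End-to-end sketch of the inference flow. Dicts stand in for GCS configs
# and backend model calls; all names are hypothetical.

TENANT_CONFIGS = {  # step 2: stands in for configs loaded from GCS
    "acme": {"containers": {"pdp": "similar_items"}},
}

BACKENDS = {  # step 4: stand-ins for Elasticsearch / external model backends
    "similar_items": lambda ctx: [("sku-2", 0.4), ("sku-1", 0.9), ("sku-3", 0.1)],
}


def get_predict(tenant: str, page_type: str, context: dict) -> list:
    """Handle a /get_predict-style request (step 1: tenant, pageType, context)."""
    config = TENANT_CONFIGS[tenant]              # step 2: load tenant config
    model = config["containers"][page_type]      # step 3: container -> model
    scored = BACKENDS[model](context)            # step 4: call the backend
    ranked = sorted(scored, key=lambda p: p[1], reverse=True)  # step 5: re-rank
    return [sku for sku, _ in ranked]


print(get_predict("acme", "pdp", {}))  # ['sku-1', 'sku-2', 'sku-3']
```

The re-rank in step 5 is where business rules (boosts, dedupe, stock filters) would slot in before the list is returned to the caller.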