Getting started with Feast

How to set up Feast and walk through examples

This guide points to the resources needed to install Feast alongside Kubeflow, describes how to use Feast with Kubeflow components, and links to examples that you can follow to test your setup.

For an overview of Feast, please read Introduction to Feast.

Installing Feast with Kubeflow

Overview

  • This guide assumes that you already have a running Kubeflow cluster. If you don't have Kubeflow installed, see the Kubeflow installation guide.
  • This guide also assumes that you have a running online feature store that Feast supports (Redis, Datastore, DynamoDB).
  • The latest version of Feast does not need to be installed into Kubernetes. It is possible to run Feast entirely from CI or as a client library (during training or inference).
  • Feast requires three things: a bucket (S3, GCS, Minio, etc.) to hold the feature registry, an online feature store to serve feature values, and a scheduler to keep the online store up to date.
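The three requirements above can be sketched in a few lines of Python. This is a minimal illustration, not an official setup script. It assumes the `project`, `registry`, `provider`, and `online_store` keys of Feast's `feature_store.yaml` and the `materialize_incremental` method of the Python SDK's `FeatureStore`; check both against the Feast version you run. The helper names, bucket path, and Redis address are hypothetical.

```python
from datetime import datetime, timezone


def render_feature_store_yaml(project, registry_uri, redis_connection):
    """Return the text of a minimal feature_store.yaml.

    Key names are assumed from Feast's repository config format;
    the values passed in are placeholders chosen by the caller.
    """
    lines = [
        f"project: {project}",
        # The registry lives in a bucket (S3, GCS, Minio, ...).
        f"registry: {registry_uri}",
        "provider: local",
        # The online store serves feature values at inference time.
        "online_store:",
        "  type: redis",
        f'  connection_string: "{redis_connection}"',
    ]
    return "\n".join(lines) + "\n"


def refresh_online_store(store, now=None):
    """Load feature values up to `now` into the online store.

    `store` is expected to be a feast.FeatureStore. A scheduler
    (cron, a pipeline, etc.) would call this function periodically.
    """
    end_date = now or datetime.now(timezone.utc)
    store.materialize_incremental(end_date=end_date)
    return end_date


if __name__ == "__main__":
    # Hypothetical bucket and Redis address, for illustration only.
    print(render_feature_store_yaml(
        project="kubeflow_demo",
        registry_uri="gs://example-bucket/registry.db",
        redis_connection="redis.example.svc:6379",
    ))
```

Because Feast runs as a client library, the scheduler can be any job that is able to import the SDK and reach both the registry bucket and the online store.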

Installation

To use Feast with Kubeflow, follow these steps.

Advanced

  • Please see this guide which provides best practices for running Feast in a production context.
  • Please see this guide for upgrading from Feast 0.9 (Spark-based) to the latest Feast (0.12+).

Accessing Feast from Kubeflow

Once Feast is installed within the same Kubernetes cluster as Kubeflow, users can access its APIs directly without any additional steps.

Feast APIs can roughly be grouped into the following sections:

  • Feature definition and management: Feast provides both a Python SDK and CLI for interacting with Feast Core. Feast Core allows users to define and register features and entities, along with their associated metadata and schemas. End users typically run the Python SDK from a Jupyter notebook to administer Feast, but ML teams may opt to version control feature specifications in order to follow a GitOps-based approach.

  • Model training: The Feast Python SDK can be used to trigger the creation of training datasets. The most natural place to use it is in a Kubeflow Pipeline step that creates the training dataset before model training.

  • Model serving: The Feast Python SDK can also be used for online feature retrieval. The SDK retrieves feature values for inference with model serving systems like KFServing, TFX, or Seldon.
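As a rough illustration of the training and serving paths, the sketch below wraps the two retrieval calls of the Python SDK. It assumes `FeatureStore.get_historical_features(entity_df=..., features=...)` and `FeatureStore.get_online_features(features=..., entity_rows=...)` as found in recent Feast releases; older releases named the feature list argument differently, so check the API reference for your version. The wrapper names and the feature reference in the usage comment are hypothetical.

```python
def build_training_dataset(store, entity_df, features):
    """Return a point-in-time correct training frame.

    `store` is expected to be a feast.FeatureStore, and `entity_df`
    a DataFrame with entity keys and an event timestamp column.
    A Kubeflow Pipeline step could call this before model training.
    """
    job = store.get_historical_features(entity_df=entity_df, features=features)
    return job.to_df()


def fetch_online_features(store, features, entity_rows):
    """Return the latest feature values for the given entities.

    A model server would call this at request time to build the
    model's input.
    """
    response = store.get_online_features(
        features=features, entity_rows=entity_rows
    )
    return response.to_dict()


# Usage sketch (requires an initialised Feast repository; the feature
# reference and entity key below are made up for illustration):
#
#   from feast import FeatureStore
#   store = FeatureStore(repo_path=".")
#   online = fetch_online_features(
#       store,
#       features=["driver_stats:trips_today"],
#       entity_rows=[{"driver_id": 1001}],
#   )
```

Taking the store as a parameter keeps both helpers easy to call from a pipeline component or a serving container, and lets them be exercised with a stub store in tests.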

Examples

Please see our tutorials section for a full list of examples.

Next steps