Version: 4.4

Feature Store

Initialize Feature Store

Initialize a Feature Store.

from katonic.fs.feature_store import FeatureStore

feature_store = FeatureStore(
    user_name="your-name",
    project_name="new-project",
    description="project-description",
)

Define Entity Key

An entity key (a unique ID) acts as the primary key for retrieving features.

from katonic.fs.entities import Entity
from katonic.fs.value_type import ValueType

entity = Entity(name="id", value_type=ValueType.INT64)

Define Data Source

File Source - CSV

from katonic.fs.core.offline_stores import FileSource

data_source = FileSource(
    path="path/to/your/csv/data/source/file",
    file_format="csv",
    event_timestamp_column="event-timestamp-column",
)
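
As a rough illustration of the kind of file such a source could point to, the sketch below writes a tiny CSV containing an event timestamp column, using only the Python standard library. The file name, the column names (`id`, `price`, `event_timestamp`), and the ISO 8601 timestamp format are assumptions made for this example; they are not requirements stated by the Katonic documentation.

```python
import csv
import os
import tempfile
from datetime import datetime

# Hypothetical rows: an entity key, one feature, and the time the value was observed.
rows = [
    {"id": 1, "price": 250000.0, "event_timestamp": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "price": 310000.0, "event_timestamp": datetime(2024, 1, 1, 9, 5)},
]

# Write into a temporary directory so the sketch does not touch real data.
csv_path = os.path.join(tempfile.mkdtemp(), "housing_features.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "price", "event_timestamp"])
    writer.writeheader()
    for row in rows:
        # Serialize the timestamp as ISO 8601 text (an assumed format).
        writer.writerow({**row, "event_timestamp": row["event_timestamp"].isoformat()})

# Read the file back to confirm the header and the number of rows.
with open(csv_path, newline="") as f:
    written = list(csv.DictReader(f))
```

Under these assumptions, `csv_path` would be passed as `path` and `"event_timestamp"` as `event_timestamp_column` in the `FileSource` call above.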

File Source - Parquet

from katonic.fs.core.offline_stores import FileSource

data_source = FileSource(
    path="path/to/your/parquet/data/source/file",
    file_format="parquet",
    event_timestamp_column="event-timestamp-column",
)

DataFrame Source - Pandas

from katonic.fs.core.offline_stores import DataFrameSource

batch_source = DataFrameSource(
    df=pandas_dataframe,
    event_timestamp_column="event-timestamp-column",
    created_timestamp_column="created-timestamp-column",
)
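
For readers who do not yet have a `pandas_dataframe`, here is a minimal sketch of one that carries both timestamp columns. The column names and values are hypothetical, and the snippet assumes pandas is installed; it only shows the shape of the data, not anything specific to Katonic.

```python
import pandas as pd

# Hypothetical feature data: an entity key, one feature, and an event timestamp.
pandas_dataframe = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "price": [250000.0, 310000.0, 199000.0],
        "event_timestamp": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-03"]
        ),
    }
)

# Record when the rows were created (a single assumed ingestion time).
pandas_dataframe["created_timestamp"] = pd.Timestamp("2024-01-04")
```

With this frame, `"event_timestamp"` and `"created_timestamp"` would take the place of the placeholder column names in the `DataFrameSource` call above.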

DataFrame Source - Spark

from katonic.fs.core.offline_stores import DataFrameSource

batch_source = DataFrameSource(
    df=spark_dataframe,
    mode="append",
    event_timestamp_column="event-timestamp-column",
    created_timestamp_column="created-timestamp-column",
)

Feature View

A feature view is a group of features.

from katonic.fs.entities import FeatureView

feature_view = FeatureView(
    name="feature-view-name",
    entities=["entity-key"],
    ttl="2d",  # time to live: a number of days/months/years/hours
    features=features_list,
    batch_source=batch_source,
)

Write Data to Offline Store

Store the data in the offline store.

from katonic.fs.entities import Entity, FeatureView
from katonic.fs.value_type import ValueType

entity = Entity(name="id", value_type=ValueType.INT64)
feature_view = FeatureView(
    name="feature-view-name",
    entities=["entity-key"],
    ttl="2d",  # time to live: a number of days/months/years/hours
    features=features_list,
    batch_source=batch_source,
)
# Register the entity and the feature view, and write the data to the offline store.
feature_store.write_table([entity, feature_view])

Historical Data Retrieval

Retrieve training data from the offline store.

training_df = feature_store.get_historical_features(
    entity_df=entity_df,
    feature_view=["feature-view-name"],
    features=features_list,
).to_df()

Publish Data to Online Store

Publishing materializes the latest features from the offline feature store into the online store.

feature_store.publish_table(
    start_ts=start_date_as_datetime_object,
    end_ts=end_date_as_datetime_object,
)
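
The two placeholders are ordinary `datetime` objects. A minimal sketch of building them follows; the seven-day window and the specific dates are hypothetical, and whether Katonic expects naive or timezone-aware values is not stated here, so naive values are an assumption.

```python
from datetime import datetime, timedelta

# Hypothetical window: the seven days ending at a fixed point in time.
end_date_as_datetime_object = datetime(2024, 1, 8)
start_date_as_datetime_object = end_date_as_datetime_object - timedelta(days=7)
```

These values would then be passed as `end_ts` and `start_ts` to `publish_table` as shown above.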

Online Features Retrieval

Use this to fetch the latest feature values at low latency, for example for online serving.

feature_store.get_online_features(
    entity_rows=[{"entity-key": entity_value}],
    feature_view=["feature-view-name"],
    features=features_list,
).to_df()
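
The `entity_rows` argument is a list of dictionaries, one per entity to look up. A small sketch of building that list from a set of IDs is shown below; the key name `id` and the values are hypothetical.

```python
# Hypothetical entity key name and the values to look up.
entity_key = "id"
entity_values = [1, 2, 3]

# One dictionary per entity, mapping the key name to its value.
entity_rows = [{entity_key: value} for value in entity_values]
```

The resulting list would be passed as `entity_rows` in the call above.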

Feature Store Registry

The Feature Store Registry is a tracking engine for feature definitions and their related metadata.

List Entities

Lists all entities present in the Feature Registry, across all projects.

from katonic.fs.feature_store import FeatureStore

feature_store = FeatureStore(
    user_name="your-name",
    project_name="new-project",
    description="project-description",
)
feature_store.list_entities()

List Feature Views

Lists all feature views present in the Feature Registry, across all projects.

feature_store.list_feature_views()

Get Registry Info - Given User Name

Returns all metadata in the Feature Registry associated with the given user name.

feature_store.get_registry_info(user_name='user')

Get Registry Info - Given Project Name

Returns all metadata in the Feature Registry associated with the given project name.

feature_store.get_registry_info(project_name='housing_price')

Get Registry Info - Given User Name, Project Name

Returns all metadata in the Feature Registry associated with both the given user name and the given project name.

feature_store.get_registry_info(user_name='user', project_name='housing_price')