Feature Store
Initialize Feature Store
Initialize a Feature Store.
from katonic.fs.feature_store import FeatureStore
feature_store = FeatureStore(
    user_name="your-name",
    project_name="new-project",
    description="project-description",
)
Define Entity Key
Entity keys (unique IDs) act as primary keys for retrieving features.
from katonic.fs.entities import Entity
from katonic.fs.value_type import ValueType
entity = Entity(name="id", value_type=ValueType.INT64)
Define Data Source
File Source - CSV
from katonic.fs.core.offline_stores import FileSource
data_source = FileSource(
    path="path/to/your/csv/data/source/file",
    file_format="csv",
    event_timestamp_column="event-timestamp-column",
)
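To make the file source more concrete, here is a minimal sketch that writes a small CSV file of the kind a `FileSource` could point at. The column names (`id`, `event_timestamp`, `feature_a`), the ISO 8601 timestamp format, and the temporary file location are all assumptions made for illustration; the source does not specify which timestamp formats the feature store accepts.

```python
import csv
import os
import tempfile
from datetime import datetime

# Hypothetical column names: "id" plays the role of the entity key and
# "event_timestamp" is the column you would pass as event_timestamp_column.
fieldnames = ["id", "event_timestamp", "feature_a"]
rows = [
    {"id": 1, "event_timestamp": datetime(2023, 1, 1, 12, 0).isoformat(), "feature_a": 0.5},
    {"id": 2, "event_timestamp": datetime(2023, 1, 2, 12, 0).isoformat(), "feature_a": 1.5},
]

# Write the rows to a CSV file in a fresh temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), "features.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm its shape; csv returns every value as a string.
with open(csv_path, newline="") as f:
    loaded = list(csv.DictReader(f))
```

In this sketch, `csv_path` would be the value given to `path`, and `"event_timestamp"` the value given to `event_timestamp_column`.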
File Source - Parquet
from katonic.fs.core.offline_stores import FileSource
data_source = FileSource(
    path="path/to/your/parquet/data/source/file",
    file_format="parquet",
    event_timestamp_column="event-timestamp-column",
)
DataFrame Source - Pandas
from katonic.fs.core.offline_stores import DataFrameSource
batch_source = DataFrameSource(
    df=pandas_dataframe,
    event_timestamp_column="event-timestamp-column",
    created_timestamp_column="created-timestamp-column",
)
DataFrame Source - Spark
from katonic.fs.core.offline_stores import DataFrameSource
batch_source = DataFrameSource(
    df=spark_dataframe,
    mode="append",
    event_timestamp_column="event-timestamp-column",
    created_timestamp_column="created-timestamp-column",
)
Feature View
A feature view is a group of features.
from katonic.fs.entities import FeatureView
feature_view = FeatureView(
    name="feature-view-name",
    entities=["entity-key"],
    ttl="2d",  # number of days/months/years/hours
    features=features_list,
    batch_source=batch_source,
)
Write Data to Offline Store
Write the entity and feature view data to the offline store.
from katonic.fs.entities import Entity, FeatureView
from katonic.fs.value_type import ValueType
entity = Entity(name="id", value_type=ValueType.INT64)
feature_view = FeatureView(
    name="feature-view-name",
    entities=["entity-key"],
    ttl="2d",  # number of days/months/years/hours
    features=features_list,
    batch_source=batch_source,
)
feature_store.write_table([entity, feature_view])
Historical Data Retrieval
Retrieve training data from the offline store.
training_df = feature_store.get_historical_features(
    entity_df=entity_df,
    feature_view=["feature-view-name"],
    features=features_list,
).to_df()
Publish Data to Online Store
Materialize the latest features from the offline store into the online store.
feature_store.publish_table(
    start_ts=start_date_as_datetime_object,
    end_ts=end_date_as_datetime_object,
)
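As a rough illustration of what "latest features" within a time window means, the sketch below builds a start and end `datetime` and picks the newest row per entity inside that window. This is a conceptual model only, not the library's implementation: the sample rows, the helper `latest_per_entity`, and the choice to treat both window bounds as inclusive are assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical window: the seven days before a fixed end date.
end_ts = datetime(2023, 1, 8)
start_ts = end_ts - timedelta(days=7)

# Hypothetical offline rows: (entity id, event timestamp, feature value).
offline_rows = [
    (1, datetime(2023, 1, 2), 10.0),
    (1, datetime(2023, 1, 5), 12.5),
    (2, datetime(2023, 1, 3), 7.0),
    (2, datetime(2022, 12, 25), 99.0),  # falls outside the window
]


def latest_per_entity(rows, start, end):
    """Keep, for each entity id, the newest row with start <= timestamp <= end."""
    latest = {}
    for entity_id, ts, value in rows:
        if not (start <= ts <= end):
            continue  # ignore rows outside the window
        if entity_id not in latest or ts > latest[entity_id][0]:
            latest[entity_id] = (ts, value)
    return latest


# Conceptual stand-in for the online store contents after publishing.
online_view = latest_per_entity(offline_rows, start_ts, end_ts)
```

Objects built like `start_ts` and `end_ts` here are the kind of `datetime` values the `start_ts` and `end_ts` arguments expect.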
Online Features Retrieval
Retrieve the latest features at low latency for online serving.
feature_store.get_online_features(
    entity_rows=[{"entity-key": entity_value}],
    feature_view=["feature-view-name"],
    features=features_list,
).to_df()
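The `entity_rows` argument is a list with one dictionary per entity to look up. A minimal sketch of building that list from a collection of IDs follows; the key name `"id"` and the ID values are hypothetical placeholders.

```python
# Hypothetical entity key name and ids; replace with your own.
entity_key = "id"
entity_ids = [101, 102, 103]

# One dictionary per entity to look up, keyed by the entity key name.
entity_rows = [{entity_key: entity_id} for entity_id in entity_ids]
```

The resulting list can be passed as `entity_rows` to request features for several entities in one call.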
Feature Store Registry
The Feature Store Registry is a tracking engine for feature definitions and their related metadata.
List Entities
List all entities in the Feature Registry across all projects.
from katonic.fs.feature_store import FeatureStore
feature_store = FeatureStore(
    user_name="your-name",
    project_name="new-project",
    description="project-description",
)
feature_store.list_entities()
List Feature Views
List all feature views in the Feature Registry across all projects.
feature_store.list_feature_views()
Get Registry Info - Given User Name
Get all metadata in the Feature Registry associated with the given user name.
feature_store.get_registry_info(user_name='user')
Get Registry Info - Given Project Name
Get all metadata in the Feature Registry associated with the given project name.
feature_store.get_registry_info(project_name='housing_price')
Get Registry Info - Given User Name, Project Name
Get all metadata in the Feature Registry associated with the given user name and project name.
feature_store.get_registry_info(user_name='user', project_name='housing_price')