Retrieving the Training Data from the Feature Store
We can use the offline store to create a training dataset and train our models with it.
Let's create a Training Dataset
- To keep the training and testing datasets consistent, we will use an entity dataframe, which consists of an id column and an event timestamp column.
- Using this entity dataframe, the feature store finds the matching historical records by performing point-in-time joins.
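To make the idea of a point-in-time join concrete, here is a minimal sketch in plain pandas. It is not how the feature store implements the join internally; it only illustrates the behaviour, using `pandas.merge_asof`. The data, the `loan_amount` feature, and the `event_timestamp` column name are made up for illustration.

```python
import pandas as pd

# Hypothetical entity dataframe: the (id, event_timestamp) pairs we want features for.
entity_df = pd.DataFrame({
    "id": [1, 1, 2],
    "event_timestamp": pd.to_datetime(["2021-01-10", "2021-03-01", "2021-02-15"]),
})

# Hypothetical feature history: the same id has different values over time.
feature_df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "event_timestamp": pd.to_datetime(
        ["2021-01-01", "2021-02-01", "2021-02-01", "2021-03-01"]
    ),
    "loan_amount": [100, 200, 300, 400],
})

# merge_asof requires both frames to be sorted by the time key.
# direction="backward" picks, for each entity row, the latest feature row
# whose timestamp is at or before the entity's event timestamp.
joined = pd.merge_asof(
    entity_df.sort_values("event_timestamp"),
    feature_df.sort_values("event_timestamp"),
    on="event_timestamp",
    by="id",
    direction="backward",
)

print(joined)
```

In this sketch, the row for id 2 at 2021-02-15 receives the value recorded on 2021-02-01 and never the one recorded on 2021-03-01, so no information from the future leaks into the training data.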
import pandas as pd

# Let's load the entity dataframe.
entity_df = pd.read_csv("entity_df.csv")
Note: Make sure the data types in this dataframe are correct. For example, the event timestamp column should be a datetime rather than a string.
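One way to check and fix the data types after loading is sketched below. The inline CSV stands in for `entity_df.csv`, and the column names `id` and `event_timestamp` are assumptions; adjust them to match your own file.

```python
import io

import pandas as pd

# Hypothetical stand-in for entity_df.csv; the column names are assumptions.
csv_text = (
    "id,event_timestamp\n"
    "1,2021-01-10 00:00:00\n"
    "2,2021-02-15 00:00:00\n"
)

entity_df = pd.read_csv(io.StringIO(csv_text))

# read_csv loads timestamps as strings unless told otherwise,
# so convert the column to a datetime type explicitly.
entity_df["event_timestamp"] = pd.to_datetime(entity_df["event_timestamp"])

# Inspect the resulting types before passing the dataframe to the feature store.
print(entity_df.dtypes)
```

Passing `parse_dates=["event_timestamp"]` to `pd.read_csv` is an alternative that does the conversion at load time.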
Retrieving Historical Features
# Now let's use this entity dataframe to create a training dataset with the historical features.
train_data = fs.get_historical_features(
    entity_df=entity_df,  # Your entity dataframe.
    feature_view=["default_loan_feature_view"],  # The name you gave your feature view when you created it.
    features=cols,  # The features we want to retrieve from the offline store.
).to_df()  # Convert the result directly to a pandas DataFrame.
Your training dataset is ready. You can now use it to train your models.