Module: hp_tune_select

The hp_tune_select module in the RAISING folder implements RAISING, a two-stage supervised deep learning framework for hyperparameter tuning and feature selection. It contains two primary functions: hp_optimization and feature_importance. In the first stage, the user performs hyperparameter tuning through the hp_optimization function and passes the optimal neural network (NN) architecture to feature_importance, which trains the architecture on the entire dataset and estimates feature importance. The module supports outputs of continuous, binary, and multiclass nature (a single output variable with more than two categories).

Functions in the Module

1. Hyperparameter Optimization Function

The hp_optimization function conducts hyperparameter optimization using one of several methods: Bayesian optimization, Hyperband, RandomSearch, or RSLM. Its goal is to identify the optimal neural network (NN) architecture.

Function Signature

def hp_optimization(input_data, output_data, output_class, objective_fun, utag, Standardize=True, window_size=None, algorithm="Bayesian", config_file=None, cross_validation=False, model_file="ANN_architecture_HP.keras", NCombination=100000, **kwargs):

Parameters

  • input_data: Dataframe or array-like input dataset.
  • output_data: Dataframe or array-like output dataset.
  • output_class: "continuous", "binary" or "multiclass". If output_class is "multiclass", the function expects a one-hot encoded dataframe.
  • objective_fun: "loss", "val_loss" or a multi-objective list such as [keras_tuner.Objective("loss", direction="min"), keras_tuner.Objective("val_loss", direction="min")] (see the sketch after this parameter list).
  • utag (Optional, default=""): str, user-defined tag appended to the default output file and algorithm directory names.
  • algorithm (Optional, default="Bayesian"): str, "Bayesian", "Hyperband", "RandomSearch" or "RSLM"; the hyperparameter optimization algorithm.
  • Standardize (Optional, default=True): Boolean flag to standardize the output data if output_class is "continuous".
  • window_size (Optional, default=None): required if hyperparameter tuning is performed on a subset of the input data; window_size = [start, end].
  • config_file (Optional): user-defined hyperparameter space in a config file of .json format; see the example hyperparameter_config.json file.
  • cross_validation (Optional, default=False): set to True to perform cross-validation.
  • model_file (Optional, default="ANN_architecture_HP.keras"): str, user-defined file name to save the model architecture.
  • NCombination (Optional, default=100000): number of hyperparameter combinations on which the "RSLM" algorithm performs prediction.
  • **kwargs: Additional keyword arguments specific to keras and tensorflow libraries used in the package.
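
A minimal sketch of the multi-objective form of objective_fun together with a few of the optional arguments above. X_data and y_data stand for user-supplied input/output dataframes, and the utag value and algorithm choice are illustrative only.

import keras_tuner
from RAISING.hp_tune_select import hp_optimization

# Jointly minimize training and validation loss (multi-objective objective_fun)
multi_objective = [keras_tuner.Objective("loss", direction="min"),
                   keras_tuner.Objective("val_loss", direction="min")]

# X_data, y_data: user-supplied dataframes (placeholders)
model = hp_optimization(input_data=X_data, output_data=y_data,
                        output_class="continuous",
                        objective_fun=multi_objective,
                        utag="run1",                   # tag appended to output file/directory names
                        algorithm="Hyperband",         # or "Bayesian", "RandomSearch", "RSLM"
                        cross_validation=True,
                        model_file="ANN_architecture_HP.keras")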

Returns

The function returns the best ANN architecture.

Usage

from RAISING.hp_tune_select import hp_optimization
objective_test = "val_loss"
# X_data: input dataset; y_data: output dataset (user-supplied dataframes)
# additional keras/TensorFlow keyword arguments can be passed through **kwargs
model = hp_optimization(input_data=X_data, output_data=y_data, objective_fun=objective_test, output_class="continuous")
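
A further sketch, assuming window_size = [start, end] selects the subset of the input data described in the parameter list; the indices, NCombination value, and algorithm choice below are illustrative only.

# Tune on a subset of the input data with the RSLM algorithm
model_subset = hp_optimization(input_data=X_data, output_data=y_data,
                               objective_fun="val_loss", output_class="continuous",
                               utag="subset_run",
                               window_size=[0, 500],   # [start, end] of the subset (illustrative values)
                               algorithm="RSLM",
                               NCombination=50000)     # hyperparameter combinations RSLM predicts over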

2. Feature Importance Estimation Function

The feature_importance function performs the NN architecture training and feature selection. It requires the NN architecture from hp_optimization to perform training and implements the DeepFeatImp, DeepExplainer, and KernelExplainer methods for feature selection.

Function Signature

def feature_importance(input_data, output_data, feature_set, model_file, iteration=1, window=None, Standardize=True, feature_method="DeepFeatImp", train_model_file="Trained_ANN_Architecture.h5", output_class="continuous", **kwargs):

Parameters

  • input_data: Dataframe or array-like input dataset.
  • output_data: Dataframe-like output dataset; required to extract column names.
  • feature_set: a list of input dataset feature names.
  • model_file (Optional, default="ANN_architecture_HP.keras"): str, name of the model architecture file (model_file) used in hp_optimization.
  • iteration (Optional, default=1): int, number of times to replicate the DNN architecture training.
  • window (Optional, default=None): required if hyperparameter tuning was performed on a subset of the input data; should be of the same length as specified in hp_optimization. window = [start, end].
  • Standardize (Optional, default=True): Boolean flag to standardize the output data if output_class is "continuous".
  • feature_method (Optional, default="DeepFeatImp"): str, "DeepFeatImp", "DeepExplainer" or "KernelExplainer"; the feature selection method (see the sketch after this parameter list).
  • train_model_file (Optional, default="Trained_ANN_Architecture.h5"): str, file name to save the trained DNN architecture.
  • output_class (Optional, default="continuous"): str, "continuous", "binary" or "multiclass". If output_class is "multiclass", the function expects a one-hot encoded dataframe.
  • **kwargs: Additional keyword arguments specific to keras and tensorflow libraries used in the package.
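
A minimal sketch of calling feature_importance with the optional arguments above. X_data and y_data stand for the same dataframes used during hyperparameter tuning, and the iteration count and feature_method choice are illustrative only.

from RAISING.hp_tune_select import feature_importance

# X_data, y_data: the dataframes used during hyperparameter tuning (placeholders)
feature_est = feature_importance(input_data=X_data, output_data=y_data,
                                 feature_set=X_data.columns.to_list(),
                                 model_file="ANN_architecture_HP.keras",         # architecture saved by hp_optimization
                                 iteration=5,                                    # replicate DNN training 5 times
                                 feature_method="DeepExplainer",                 # or "DeepFeatImp", "KernelExplainer"
                                 train_model_file="Trained_ANN_Architecture.h5",
                                 output_class="continuous")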

Returns

The function returns the trained ANN architecture and feature importance dataframe.

Usage

from RAISING.hp_tune_select import hp_optimization, feature_importance
objective_test = "val_loss"
# X_data: input dataset; y_data: output dataset (user-supplied dataframes)
# additional keras/TensorFlow keyword arguments can be passed through **kwargs
model_DNN = hp_optimization(input_data=X_data, output_data=y_data, objective_fun=objective_test, output_class="continuous")
feature_est = feature_importance(input_data=X_data, output_data=y_data, feature_set=X_data.columns.to_list())
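
As a hedged follow-up, a sketch of persisting and ranking the feature importance estimates, assuming feature_est above is (or contains) a pandas dataframe with one row per feature; the column name "importance" used for sorting is hypothetical and should be replaced with the actual column name in the returned dataframe.

# Hypothetical post-processing of the feature importance output; "importance" is an assumed column name
feature_est.to_csv("feature_importance_estimates.csv", index=False)
top_features = feature_est.sort_values(by="importance", ascending=False).head(20)
print(top_features)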

This README is part of the documentation for the RAISING package.