The `hp_tune_select` module in the RAISING folder implements RAISING: a two-stage supervised deep-learning framework for hyperparameter tuning and feature selection. It contains two primary functions: `hp_optimization` and `feature_importance`. In the first stage, the user performs hyperparameter tuning through the `hp_optimization` function and passes the optimal neural network (NN) architecture to `feature_importance`, which trains the architecture on the entire dataset and estimates feature importance. The module supports outputs of continuous, binary, and multiclass nature (a single output variable with more than two categories).
The `hp_optimization` function conducts hyperparameter optimization using methods such as Bayesian optimization, Hyperband, RandomSearch, and RSLM, and identifies the optimal neural network (NN) architecture.
```python
def hp_optimization(input_data, output_data, output_class, objective_fun, utag,
                    Standardize=True, window_size=None, algorithm="Bayesian",
                    config_file=None, cross_validation=False,
                    model_file="ANN_architecture_HP.keras",
                    NCombination=100000, **kwargs):
```
- `input_data`: DataFrame or array-like input dataset.
- `output_data`: DataFrame or array-like output dataset.
- `output_class`: `"continuous"`, `"binary"`, or `"multiclass"`. If `output_class` is `"multiclass"`, the function expects a one-hot encoded dataframe.
- `objective_fun`: `"loss"`, `"val_loss"`, or a multi-objective specification such as `[keras_tuner.Objective("loss", direction="min"), keras_tuner.Objective("val_loss", direction="min")]`.
- `utag` (optional, default=`""`): str, user-defined tag appended to the default output file and algorithm directory names.
- `algorithm` (optional, default=`"Bayesian"`): str, hyperparameter optimization algorithm: `"Bayesian"`, `"Hyperband"`, `"RandomSearch"`, or `"RSLM"`.
- `Standardize` (optional, default=`True`): Boolean flag to standardize the output data if `output_class` is `"continuous"`.
- `window_size` (optional): required if hyperparameter tuning is performed on a subset of the input data; `window_size = [start, end]`.
- `config_file` (optional): user-defined hyperparameter space in a config file of `.json` format; see the example `hyperparameter_config.json` file.
- `cross_validation` (optional, default=`False`): set to `True` if cross-validation is to be performed.
- `model_file` (optional, default=`"ANN_architecture_HP.keras"`): str, user-defined file name to save the model architecture.
- `NCombination` (optional, default=`100000`): number of hyperparameter combinations on which the `"RSLM"` algorithm will perform prediction.
- `**kwargs`: additional keyword arguments specific to the Keras and TensorFlow libraries used in the package.

The function returns the best ANN architecture.
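The README does not spell out what `Standardize=True` does to a continuous output; a common convention, assumed here purely for illustration, is z-score standardization (zero mean, unit variance):

```python
import numpy as np

def zscore(y):
    """Z-score standardization (assumed behavior, for illustration only):
    subtract the column mean and divide by the column standard deviation."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean(axis=0)) / y.std(axis=0)

y = np.array([10.0, 20.0, 30.0, 40.0])
y_std = zscore(y)  # zero-mean, unit-variance version of y
```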
```python
from RAISING.hp_tune_select import hp_optimization

objective_test = "val_loss"
# X_data: input dataset, y_data: output dataset
model = hp_optimization(input_data=X_data, output_data=y_data,
                        objective_fun=objective_test, output_class="continuous")
```
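For a `"multiclass"` output, `hp_optimization` expects a one-hot encoded dataframe rather than a single label column. One way to produce that encoding with pandas (illustrative; the labels below are made up):

```python
import pandas as pd

# Hypothetical three-class label vector
labels = pd.Series(["A", "B", "C", "A"], name="class")

# pd.get_dummies expands the single column into one indicator column
# per category, the one-hot format expected when output_class="multiclass"
y_onehot = pd.get_dummies(labels)
print(y_onehot.columns.tolist())  # ['A', 'B', 'C']
```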
The `feature_importance` function performs NN architecture training and feature selection. It requires the NN architecture from `hp_optimization` to perform training, and implements the DeepFeatImp, DeepExplainer, and KernelExplainer methods for feature selection.
```python
def feature_importance(input_data, output_data, feature_set, model_file,
                       iteration=1, window=None, Standardize=True,
                       feature_method="DeepFeatImp",
                       train_model_file="Trained_ANN_Architecture.h5",
                       output_class="continuous", **kwargs):
```
- `input_data`: DataFrame or array-like input dataset.
- `output_data`: DataFrame-like output dataset; required to extract column names.
- `feature_set`: a list of input dataset feature names.
- `model_file` (optional, default=`"ANN_architecture_HP.keras"`): str, name of the model architecture file (`model_file`) used in `hp_optimization`.
- `iteration` (optional, default=`1`): int, number of times to replicate DNN architecture training.
- `window` (optional): required if hyperparameter tuning was performed on a subset of the input data; should be of the same length as specified in `hp_optimization`; `window = [start, end]`.
- `Standardize` (optional, default=`True`): Boolean flag to standardize the output data if `output_class` is `"continuous"`.
- `feature_method` (optional, default=`"DeepFeatImp"`): str, feature selection method: `"DeepFeatImp"`, `"DeepExplainer"`, or `"KernelExplainer"`.
- `train_model_file` (optional, default=`"Trained_ANN_Architecture.h5"`): str, file name to save the trained DNN architecture.
- `output_class`: str, `"continuous"`, `"binary"`, or `"multiclass"`. If `output_class` is `"multiclass"`, the function expects a one-hot encoded dataframe.
- `**kwargs`: additional keyword arguments specific to the Keras and TensorFlow libraries used in the package.

The function returns the trained ANN architecture and a feature importance dataframe.
```python
from RAISING.hp_tune_select import hp_optimization, feature_importance

objective_test = "val_loss"
# X_data: input dataset, y_data: output dataset
model_DNN = hp_optimization(input_data=X_data, output_data=y_data,
                            objective_fun=objective_test,
                            output_class="continuous")
feature_est = feature_importance(input_data=X_data, output_data=y_data,
                                 feature_set=X_data.columns.to_list())
```
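The importance dataframe returned by `feature_importance` can then be ranked to pick the most influential features. The column names below (`"feature"`, `"importance"`) are hypothetical stand-ins, since the actual schema depends on the chosen `feature_method`; the mock table only illustrates the ranking step:

```python
import pandas as pd

# Mock table standing in for the feature_importance output;
# real column names may differ ("feature"/"importance" are assumed).
feature_est = pd.DataFrame({
    "feature": ["snp1", "snp2", "snp3", "snp4"],
    "importance": [0.10, 0.45, 0.05, 0.40],
})

# Rank features by importance score and keep the top two
top2 = feature_est.sort_values("importance", ascending=False).head(2)
print(top2["feature"].tolist())  # ['snp2', 'snp4']
```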
This README is part of the documentation for the RAISING package.