data_types.Categorical.Categorical

class data_types.Categorical.Categorical

Support for categorical variables, such as fruits: [‘apple’, ‘banana’, ‘coconut’], or home_ownership: [‘rent’, ‘own’].
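As a rough illustration of what handling a categorical variable involves, the sketch below maps each level to an integer index, the form that embedding layers typically consume. The helper names (`build_vocabulary`, `encode`) are hypothetical and not part of this class; the transformation the library actually applies may differ.

```python
def build_vocabulary(values):
    """Map each distinct level to an integer index, in sorted order."""
    return {level: index for index, level in enumerate(sorted(set(values)))}


def encode(values, vocabulary, unknown_index=None):
    """Replace each level with its index; unseen levels map to unknown_index."""
    return [vocabulary.get(value, unknown_index) for value in values]


# Illustrative data only
fruits = ["apple", "banana", "coconut", "banana"]
fruit_vocabulary = build_vocabulary(fruits)
encoded_fruits = encode(fruits, fruit_vocabulary)
```

Sorting the levels keeps the mapping deterministic across runs, which matters when the same vocabulary must be reused to decode predictions later.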

__init__()

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__() Initialize self.
input_nub_generator(variable, …) Generate an input layer and input ‘nub’ for a Keras network.
output_inverse_transform(y_pred, …) Undo the transformation that was applied to get data into a Keras model.
output_nub_generator(variable, …) Generate an output layer for a Keras network.
output_suggested_loss()
_check_output_support()
static input_nub_generator(variable, transformed_observations)

Generate an input layer and input ‘nub’ for a Keras network.

  • input_layer: The input layer accepts data from the outside world.
  • input_nub: The input nub will always include the input_layer as its first layer. It may also include other layers for handling the data type in specific ways.

Parameters:
  • variable (str) – Name of the variable
  • transformed_observations (pandas.DataFrame) – A dataframe, containing either the specified variable, or derived variables
Returns:

A tuple containing the input layer, and the last layer of the nub
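To make the role of transformed_observations more concrete, here is a minimal sketch of how an input nub for a categorical variable might be sized, assuming the column already holds non-negative integer level indices. The function `embedding_dimensions` and its heuristic (about half the number of levels, capped at 50) are assumptions for illustration, not the documented behaviour of input_nub_generator.

```python
def embedding_dimensions(encoded_column, max_output_dim=50):
    """Return (input_dim, output_dim) for a hypothetical embedding layer.

    input_dim: number of distinct integer codes the layer must accept.
    output_dim: illustrative heuristic, about half of input_dim, capped.
    """
    input_dim = int(max(encoded_column)) + 1
    output_dim = int(min((input_dim + 1) / 2, max_output_dim))
    return input_dim, output_dim


# Hypothetical integer-encoded home_ownership column: 0 = own, 1 = rent
home_ownership_codes = [1, 0, 1, 1, 0]
input_dim, output_dim = embedding_dimensions(home_ownership_codes)
```

Under these assumptions, the input layer would accept one integer per observation, and the last layer of the nub would emit a vector of length output_dim.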

output_inverse_transform(y_pred, response_transform_pipeline)

Undo the transformation that was applied to get data into a Keras model. This inverse transformation renders the predictions in the natural scale of the data provided by the user, so that the two can be compared.

Parameters:
  • y_pred – The data predicted by Keras
  • response_transform_pipeline – An SKLearn transformation pipeline, trained on the same variable as the model which produced y_pred
Returns:

The same data, in the natural basis
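The idea of returning predictions to the natural scale can be sketched without any library, assuming the model emits one probability per class and that class labels are ordered by the integer index used during encoding. The function `inverse_transform_predictions` and the plain `classes` list are hypothetical stand-ins for whatever response_transform_pipeline does internally.

```python
def inverse_transform_predictions(y_pred, classes):
    """Pick the most probable class per row and map it back to its label.

    y_pred: list of per-class probability rows, as a softmax output would give.
    classes: labels ordered by the integer index used during encoding.
    """
    labels = []
    for row in y_pred:
        # Index of the largest probability in this row
        best_index = max(range(len(row)), key=lambda i: row[i])
        labels.append(classes[best_index])
    return labels


# Illustrative values only
classes = ["apple", "banana", "coconut"]
y_pred = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]
predicted_labels = inverse_transform_predictions(y_pred, classes)
```

In the documented method, the mapping back to labels is presumably delegated to response_transform_pipeline rather than to a plain list, which is why the pipeline must be trained on the same variable as the model.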

output_nub_generator(variable, transformed_observations)

Generate an output layer for a Keras network.

  • output_layer: A keras layer, which is formatted to correctly accept the response variable
Parameters:
  • variable (str) – A Variable contained in the input_df
  • transformed_observations (pandas.DataFrame) – A dataframe, containing either the specified variable, or derived variables
Returns:

output_layer

output_suggested_loss()