RACER package
Submodules
RACER.RACER module
- class RACER.RACER.RACER(alpha=0.9, suppress_warnings=False, benchmark=False)[source]
Bases: object
- _bool2str(bool_arr: ndarray) str [source]
Converts a boolean array to a human-readable string
- Args:
bool_arr (np.ndarray): The input boolean array
- Returns:
str: Human-readable string output
- _closest_match(X: ndarray) ndarray [source]
Find the closest matching rule to X (This will be extended later)
- Args:
X (np.ndarray): Input rule X
- Returns:
np.ndarray: Matched rule
- _composable(idx1: int, idx2: int) bool [source]
Returns true if two rules indicated by their indices are composable
- Args:
idx1 (int): Index of the first rule
idx2 (int): Index of the second rule
- Returns:
bool: True if the labels match and neither rule is covered. False otherwise.
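The composability check can be illustrated with a minimal sketch. It assumes each rule's "then" part is a row of a one-hot boolean matrix and that a boolean flag per rule records whether it has been covered. The function `composable` and the names `rule_then` and `is_covered` are hypothetical and do not reflect the class's internal attributes.

```python
import numpy as np

def composable(rule_then: np.ndarray, is_covered: np.ndarray,
               idx1: int, idx2: int) -> bool:
    """Hypothetical stand-alone version of the composability check."""
    # Labels must be identical and neither rule may already be covered.
    same_label = np.array_equal(rule_then[idx1], rule_then[idx2])
    return bool(same_label and not is_covered[idx1] and not is_covered[idx2])

# Three rules: the first two share a label, the third has a different one.
rule_then = np.array([[1, 0], [1, 0], [0, 1]], dtype=bool)
is_covered = np.array([False, False, False])

same_label_pair = composable(rule_then, is_covered, 0, 1)
different_label_pair = composable(rule_then, is_covered, 0, 2)

# Marking rule 1 as covered makes the first pair non-composable.
is_covered[1] = True
covered_pair = composable(rule_then, is_covered, 0, 1)
```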
- _compose(rule1: ndarray, rule2: ndarray) ndarray [source]
Composes rule1 with rule2
- Args:
rule1 (np.ndarray): The first rule
rule2 (np.ndarray): The second rule
- Returns:
np.ndarray: The composed rule which is simply the bitwise OR of the two rules
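As a small illustration of the composition described above, the sketch below assumes rules are stored as NumPy boolean arrays. `compose` is a hypothetical stand-alone helper, not the class method itself.

```python
import numpy as np

def compose(rule1: np.ndarray, rule2: np.ndarray) -> np.ndarray:
    """Hypothetical helper: element-wise OR of two boolean rules."""
    return np.logical_or(rule1, rule2)

rule1 = np.array([True, False, False, True])
rule2 = np.array([False, False, True, True])

# A bit is set in the composed rule if it is set in either input rule.
composed = compose(rule1, rule2)
```

Because OR can only set bits, the composed rule is at least as general as each of its inputs.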
- _confusion(rule_if: ndarray, rule_then: ndarray) Tuple[ndarray, ndarray] [source]
Returns n_correct and n_covered for instances classified by a rule.
- Args:
rule_if (np.ndarray): If part of rule (x)
rule_then (np.ndarray): Then part of rule (y)
- Returns:
Tuple[np.ndarray, np.ndarray]: (n_covered, n_correct)
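The two counts can be sketched as follows, assuming boolean feature rows and one-hot boolean labels. Unlike the method, this hypothetical `confusion` function takes the data `X` and `y` explicitly and returns plain integers in the order given under Returns.

```python
import numpy as np
from typing import Tuple

def confusion(X: np.ndarray, y: np.ndarray,
              rule_if: np.ndarray, rule_then: np.ndarray) -> Tuple[int, int]:
    """Hypothetical sketch returning (n_covered, n_correct) for one rule."""
    # An instance is covered when it sets no bit that the rule leaves unset.
    covered_mask = ~(X & ~rule_if).any(axis=1)
    # A covered instance is correct when its label equals the rule's label.
    correct_mask = covered_mask & (y == rule_then).all(axis=1)
    return int(covered_mask.sum()), int(correct_mask.sum())

X = np.array([[1, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=bool)
y = np.array([[1, 0],
              [0, 1],
              [1, 0]], dtype=bool)
rule_if = np.array([1, 0, 1, 1], dtype=bool)
rule_then = np.array([1, 0], dtype=bool)

n_covered, n_correct = confusion(X, y, rule_if, rule_then)
```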
- _covered(X: ndarray, rule_if: ndarray) ndarray [source]
Returns the indices of instances in X that are covered by rule_if. A rule covers an instance if, at every bit position i, EITHER of the following holds:
1. instance[i] == 0
2. instance[i] == 1 AND rule[i] == 1
- Args:
X (np.ndarray): Instances
rule_if (np.ndarray): If part of rule (x)
- Returns:
np.ndarray: An array containing indices in X that are covered by rule_if
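The coverage condition is equivalent to saying that an instance sets no bit that the rule leaves unset. A minimal sketch under that reading, assuming boolean NumPy arrays; `covered` is a hypothetical stand-alone function, not the method's actual implementation.

```python
import numpy as np

def covered(X: np.ndarray, rule_if: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: indices of rows in X covered by rule_if."""
    # A violation is a bit set in the instance but not in the rule.
    violations = X & ~rule_if  # rule_if broadcasts across the rows of X
    return np.where(~violations.any(axis=1))[0]

X = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 0]], dtype=bool)
rule_if = np.array([1, 0, 1, 1], dtype=bool)

# Only row 0 avoids setting bit 1, which the rule leaves unset.
idx = covered(X, rule_if)
```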
- _fitness_fn(rule_if: ndarray, rule_then: ndarray) ndarray [source]
Returns fitness for a given rule according to the RACER paper
- Args:
rule_if (np.ndarray): If part of a rule (x)
rule_then (np.ndarray): Then part of a rule (y)
- Returns:
np.ndarray: Fitness score for the rule as defined in the RACER paper
- _generalize_extants() None [source]
Generalize the extants by flipping every 0 to a 1 and checking if the fitness improves.
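The flip-and-check idea can be sketched for a single rule as below. This is an illustration under assumptions: the function `generalize`, the `fitness` callable, and the strict-improvement acceptance test are all hypothetical, and the toy fitness is not the fitness defined in the RACER paper.

```python
import numpy as np
from typing import Callable

def generalize(rule_if: np.ndarray,
               fitness: Callable[[np.ndarray], float]) -> np.ndarray:
    """Hypothetical sketch: greedily flip 0 bits to 1 while fitness improves."""
    best = rule_if.copy()
    best_fit = fitness(best)
    for i in np.flatnonzero(~rule_if):
        candidate = best.copy()
        candidate[i] = True
        cand_fit = fitness(candidate)
        # Assumption: a flip is kept only on strict improvement.
        if cand_fit > best_fit:
            best, best_fit = candidate, cand_fit
    return best

# Toy fitness for demonstration only: bits 0 and 2 help, bit 1 hurts.
weights = np.array([1.0, -1.0, 1.0])

def toy_fitness(rule: np.ndarray) -> float:
    return float(weights[rule].sum())

start = np.array([True, False, False])
result = generalize(start, toy_fitness)
```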
- _get_majority() ndarray [source]
Return the majority rule_then from self._y
- Returns:
np.ndarray: Majority rule_then
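A minimal sketch of picking a majority label, assuming the targets are stored as a one-hot boolean matrix with one row per instance. `get_majority` is a hypothetical stand-alone function that takes `y` as an argument rather than reading it from the instance.

```python
import numpy as np

def get_majority(y: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: one-hot vector of the most frequent class in y."""
    # Column sums of a one-hot matrix are the per-class counts.
    counts = y.sum(axis=0)
    majority = np.zeros(y.shape[1], dtype=bool)
    majority[np.argmax(counts)] = True
    return majority

y = np.array([[1, 0],
              [0, 1],
              [1, 0]], dtype=bool)
majority = get_majority(y)
```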
- _label_to_int(label: ndarray) int [source]
Converts dummy label to int
- Args:
label (np.ndarray): Label to convert
- Returns:
int: Converted label
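One plausible reading of this conversion, assuming the dummy label is a one-hot vector and the integer is the position of its set bit. `label_to_int` is a hypothetical stand-alone function.

```python
import numpy as np

def label_to_int(label: np.ndarray) -> int:
    """Hypothetical sketch: index of the set bit in a one-hot label."""
    return int(np.argmax(label))

label = np.array([False, False, True])
label_int = label_to_int(label)
```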
- _process_rules(idx1: int, idx2: int) None [source]
Processes two rules indicated by their indices
- Args:
idx1 (int): Index of the first rule
idx2 (int): Index of the second rule
- _update_extants(index: int, new_rule_if: ndarray, new_rule_then: ndarray, new_rule_fitness: ndarray)[source]
Remove all rules from current extants that are covered by new_rule. Then append new rule to extants.
- Args:
index (int): Index of new_rule
new_rule_if (np.ndarray): If part of new_rule (x)
new_rule_then (np.ndarray): Then part of new_rule (y)
new_rule_fitness (np.ndarray): Fitness of new_rule
- fit(X: ndarray, y: ndarray) None [source]
Fits the RACER algorithm to input data X and targets y. The code closely follows the pseudo-code provided in the RACER paper, with some slight modifications.
- Args:
X (np.ndarray): Features vector
y (np.ndarray): Targets vector
- predict(X: ndarray, convert_dummies=True) ndarray [source]
Given input X, predict label using RACER
- Args:
X (np.ndarray): Input features vector
convert_dummies (bool, optional): Whether to convert dummy labels back to integer format. Defaults to True.
- Returns:
np.ndarray: Label as predicted by RACER
RACER.preprocessing module
- class RACER.preprocessing.RACERPreprocessor(target: str = 'auto', max_n_bins=32, max_num_splits=32, use_optimal_quantizer=False)[source]
Bases: object
- fit(X: DataFrame | ndarray, y: DataFrame | ndarray)[source]
Fits the preprocessor on X and y for downstream transformations.
- Args:
X (Union[pd.DataFrame, np.ndarray]): Features vector
y (Union[pd.DataFrame, np.ndarray]): Targets vector
- fit_transform(X: DataFrame | ndarray, y: DataFrame | ndarray) Tuple[ndarray, ndarray] [source]
Preprocesses the dataset by replacing nominal values with dummy variables, then converts the result to NumPy boolean arrays and returns it. All numerical values are discretized using an optimal binning strategy that employs a decision tree as a preprocessing step.
- Args:
X (Union[pd.DataFrame, np.ndarray]): Features matrix
y (Union[pd.DataFrame, np.ndarray]): Targets vector
- Returns:
Tuple[np.ndarray, np.ndarray]: Transformed features matrix and targets vector.
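To illustrate what replacing nominal values with dummy variables produces, here is a NumPy-only sketch for a single column. It is not the preprocessor's encoder: `one_hot` is a hypothetical helper, and the decision-tree-based binning of numerical values is not shown.

```python
import numpy as np
from typing import Tuple

def one_hot(column: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: encode a nominal column as boolean dummies."""
    # np.unique returns the sorted categories and, per value, its category index.
    categories, codes = np.unique(column, return_inverse=True)
    # Row k of the identity matrix is the dummy vector for category k.
    dummies = np.eye(len(categories), dtype=bool)[codes]
    return categories, dummies

colors = np.array(["red", "green", "red", "blue"])
categories, dummies = one_hot(colors)
```

Each row of `dummies` has exactly one True entry, which matches the boolean array form that the RACER class's methods operate on.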
- fit_transform_pandas(X: DataFrame | ndarray, y: DataFrame | ndarray) Tuple[ndarray, ndarray] [source]
Preprocesses the dataset by replacing nominal values with dummy variables, then converts the result to NumPy boolean arrays and returns it. All numerical values are discretized using an optimal binning strategy that employs a decision tree as a preprocessing step. This variant uses the legacy pandas dummy encoder; use it to retain full backward compatibility with earlier code.
- Args:
X (Union[pd.DataFrame, np.ndarray]): Features matrix
y (Union[pd.DataFrame, np.ndarray]): Targets vector
- Returns:
Tuple[np.ndarray, np.ndarray]: Transformed features matrix and targets vector.
- transform(X: DataFrame | ndarray, y: DataFrame | ndarray) Tuple[ndarray, ndarray] [source]
Transforms new X and y using the previously fitted preprocessor.
- Args:
X (Union[pd.DataFrame, np.ndarray]): Features matrix
y (Union[pd.DataFrame, np.ndarray]): Targets vector
- Returns:
Tuple[np.ndarray, np.ndarray]: Transformed features matrix and targets vector.