Getting Started with Logistic Regression
=========================================

With Mlektic, you can perform univariate or multivariate logistic regression (with one or more classes), using the log loss method or any of the optimization options available in the `optimizer_archt` module. Supported optimization methods include:

- 'mle'
- 'sgd-standard'
- 'sgd-stochastic'
- 'sgd-mini-batch'
- 'sgd-momentum'
- 'nesterov'
- 'adagrad'
- 'adadelta'
- 'rmsprop'
- 'adam'
- 'adamax'
- 'nadam'

For more details on the `optimizer_archt` module, please refer to the `optimizer_archt documentation `_.

You can also apply regularization to improve model generalization. The `regularizer_archt` module supports the following regularization methods:

- 'l1' (default)
- 'l2'
- 'elastic_net'

To learn more about the `regularizer_archt` module, please refer to the `regularizer_archt documentation `_.

For example, you can train a model using logistic regression with standard gradient descent and L1 regularization with the `LogisticRegressionArcht` module as follows:

.. code-block:: python

    import pandas as pd
    import numpy as np
    from mlektic.logistic_reg import LogisticRegressionArcht
    from mlektic import preprocessing
    from mlektic import methods

    # Generate random data.
    np.random.seed(42)
    n_samples = 100
    feature1 = np.random.rand(n_samples)
    feature2 = np.random.rand(n_samples)
    target = (3 * feature1 + 5 * feature2 + np.random.randn(n_samples) * 0.5) > 4.0
    target = target.astype(np.float32)

    # Create pandas dataframe from the data.
    df = pd.DataFrame({
        'feature1': feature1,
        'feature2': feature2,
        'target': target
    })

    # Create train and test sets.
    train_set, test_set = preprocessing.pd_dataset(df, ['feature1', 'feature2'], 'target', 0.8)

    # Define regularizer and optimizer.
    regularizer = methods.regularizer_archt('l1', lambda_value=0.01)
    optimizer = methods.optimizer_archt('sgd-standard', learning_rate=0.1)

    # Configure the model.
    log_reg = LogisticRegressionArcht(iterations=1000, optimizer=optimizer, regularizer=regularizer)

    # Train the model.
    log_reg.train(train_set)

.. code-block:: text

    Epoch 100, Loss: 0.5152596235275269, Accuracy: 0.862500011920929
    Epoch 200, Loss: 0.4489741921424866, Accuracy: 0.862500011920929
    Epoch 300, Loss: 0.4166463613510132, Accuracy: 0.875
    Epoch 400, Loss: 0.39809101819992065, Accuracy: 0.887499988079071
    Epoch 500, Loss: 0.38631850481033325, Accuracy: 0.887499988079071
    Epoch 600, Loss: 0.37834054231643677, Accuracy: 0.887499988079071
    Epoch 700, Loss: 0.37267810106277466, Accuracy: 0.875
    Epoch 800, Loss: 0.3685190677642822, Accuracy: 0.875
    Epoch 900, Loss: 0.36538296937942505, Accuracy: 0.875
    Epoch 1000, Loss: 0.362968385219574, Accuracy: 0.875

To learn more about the `LogisticRegressionArcht` module, please refer to the `LogisticRegressionArcht documentation `_.

The cost evolution can be plotted with the `plot_cost `_ method:

.. code-block:: python

    from mlektic.plot_utils import plot_cost

    cost_history = log_reg.get_cost_history()
    plot_cost(cost_history, dim=(7, 5))

.. figure placeholder: cost plot
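For reference, the cost being plotted is the log loss (binary cross-entropy). With the L1 regularization configured above, the training objective has, up to implementation details such as how the penalty term is scaled, the standard form:

.. math::

    J(\mathbf{w}, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right] + \lambda \sum_{j} \lvert w_j \rvert

where :math:`\hat{y}_i = \sigma(\mathbf{w}^\top \mathbf{x}_i + b)` is the sigmoid output for sample :math:`i`, :math:`m` is the number of training samples, and :math:`\lambda` corresponds to `lambda_value`.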


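The same workflow applies to the other optimizers and regularizers listed at the top of this guide. As a minimal sketch, assuming the constructors accept the same arguments used above, you could swap in Adam with L2 regularization:

.. code-block:: python

    # Sketch: retrain with Adam and L2 regularization. Both methods are
    # listed as supported above; only argument names already shown in
    # this guide are used here, any others would be assumptions.
    regularizer_l2 = methods.regularizer_archt('l2', lambda_value=0.01)
    optimizer_adam = methods.optimizer_archt('adam', learning_rate=0.01)

    log_reg_adam = LogisticRegressionArcht(iterations=1000,
                                           optimizer=optimizer_adam,
                                           regularizer=regularizer_l2)
    log_reg_adam.train(train_set)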
Different evaluation metrics can be obtained:

.. code-block:: python

    categorical_crossentropy = log_reg.eval(test_set, 'categorical_crossentropy')
    binary_crossentropy = log_reg.eval(test_set, 'binary_crossentropy')
    accuracy = log_reg.eval(test_set, 'accuracy')
    precision = log_reg.eval(test_set, 'precision')
    recall = log_reg.eval(test_set, 'recall')
    f1_score = log_reg.eval(test_set, 'f1_score')
    confusion_matrix = log_reg.eval(test_set, 'confusion_matrix')

    print(f'Categorical Crossentropy: {categorical_crossentropy}')
    print(f'Binary Crossentropy: {binary_crossentropy}')
    print(f'Accuracy: {accuracy}')
    print(f'Precision: {precision}')
    print(f'Recall: {recall}')
    print(f'F1 Score: {f1_score}')
    print(f'Confusion Matrix: \n{confusion_matrix}')

.. code-block:: text

    Categorical Crossentropy: 0.22856256365776062
    Accuracy: 0.949999988079071
    Precision: 1.0
    Recall: 0.9090909361839294
    F1 Score: 0.952380895614624
    Confusion Matrix: 
    [[10.  0.]
     [ 1.  9.]]

Print the parameters obtained by training:

.. code-block:: python

    print("Weights:", log_reg.get_parameters())
    print("Intercept:", log_reg.get_intercept())

.. code-block:: text

    Weights: [[-1.506539   1.506539 ]
     [-3.4472232  3.4472237]]
    Intercept: [ 2.4116907 -2.4116907]

And make predictions:

.. code-block:: python

    prob_prediction = log_reg.predict_prob([2.0, 3.0])

    print(f'Predicted probability for class 0: {prob_prediction[0][0]}')
    print(f'Predicted probability for class 1: {prob_prediction[0][1]}')

.. code-block:: text

    Predicted probability for class 0: 3.1259400623540046e-10
    Predicted probability for class 1: 1.0

.. code-block:: python

    class_prediction = log_reg.predict_class([2.0, 3.0])
    print('Predicted class:', class_prediction[0])

.. code-block:: text

    Predicted class: 1

Finally, you can save the model parameters in JSON format for future use:

.. code-block:: python

    log_reg.save_model('logistic_regression_model.json')
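Since the file is written as plain JSON, it can be inspected later with Python's standard `json` module. A minimal sketch (the exact schema of the saved file is defined by the library, so this assumes only that the file contains valid JSON):

.. code-block:: python

    import json

    # Load and print the saved model file. Only valid JSON is assumed;
    # the exact keys and layout depend on the library.
    with open('logistic_regression_model.json') as f:
        saved_model = json.load(f)

    print(saved_model)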