
Collate Model Coefficients Across Multiple Test-train Splits From Sklearn

I would like to combine the model/feature coefficients from multiple (random) test-train splits into a single dataframe in Python. Currently, my approach is to generate model

Solution 1:

As you mentioned, you can do this with a for loop:

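import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# X, y, and the model `log` are the objects defined in the question
# start by creating the first features column
coeff_table = pd.DataFrame(X.columns, columns=["features"])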

# iterate over random states while keeping track of `i`
for i, state in enumerate([11, 444, 21, 109, 1900]):
    train_x, test_x, train_y, test_y = train_test_split(
        X, y, stratify=y, test_size=0.3, random_state=state)
    log.fit(train_x, train_y)  # fit final model

    coeff_table[f"coefficients_{i+1}"] = np.transpose(log.coef_)
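
The loop above relies on X, y, and log from the question. For anyone who wants to run it end to end, here is a minimal sketch under stated assumptions: the data is synthetic (make_classification), the model is a plain LogisticRegression, and the helper name collate_coefficients is hypothetical. It also assumes a binary target, so that coef_ has a single row.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the question's X and y (assumption, not the asker's data)
X_arr, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"feat_{j}" for j in range(4)])

# Stand-in for the question's `log` model (assumption)
log = LogisticRegression(max_iter=1000)


def collate_coefficients(model, X, y, states, test_size=0.3):
    """Fit `model` on one stratified split per random state and
    return a table with one coefficient column per split."""
    table = pd.DataFrame({"features": X.columns})
    for i, state in enumerate(states, start=1):
        train_x, _, train_y, _ = train_test_split(
            X, y, stratify=y, test_size=test_size, random_state=state)
        model.fit(train_x, train_y)
        # For a binary target, coef_ has shape (1, n_features); take row 0
        table[f"coefficients_{i}"] = model.coef_[0]
    return table


coeff_table = collate_coefficients(log, X, y, [11, 444, 21, 109, 1900])

# Optional row-wise summary of how stable each coefficient is across splits
coef_cols = coeff_table.filter(like="coefficients_")
coeff_table["mean"] = coef_cols.mean(axis=1)
coeff_table["std"] = coef_cols.std(axis=1)
print(coeff_table)
```

Using model.coef_[0] instead of np.transpose(log.coef_) gives a flat 1-D array per column. For a multiclass model coef_ has one row per class, so this sketch would need one column per class and split.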

Note that the predict and predict_proba calls are dropped from the loop because your code overwrote their results on every iteration. You can add them back with similar logic, storing each split's output under its own key or column instead of reusing one variable.
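
One possible way to keep the predictions is sketched below. Because each test set has a different length from the feature list, this sketch collects them in a separate long-format table rather than in coeff_table. That layout is a design choice of the example, not something from the question. The synthetic data, the LogisticRegression stand-in, and the column names (split, row, y_true, y_pred, proba_1) are all assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the question's X, y, and `log` (assumptions)
X_arr, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"feat_{j}" for j in range(4)])
log = LogisticRegression(max_iter=1000)

pred_frames = []
for i, state in enumerate([11, 444, 21, 109, 1900], start=1):
    train_x, test_x, train_y, test_y = train_test_split(
        X, y, stratify=y, test_size=0.3, random_state=state)
    log.fit(train_x, train_y)

    # One small frame per split; the scalar `split` is broadcast to every row
    pred_frames.append(pd.DataFrame({
        "split": i,
        "row": test_x.index.to_numpy(),   # original row labels of the test set
        "y_true": test_y,
        "y_pred": log.predict(test_x),
        "proba_1": log.predict_proba(test_x)[:, 1],  # probability of class 1
    }))

# Stack the per-split frames into a single table
predictions = pd.concat(pred_frames, ignore_index=True)
print(predictions.groupby("split").size())
```

With this layout, per-split accuracy can be computed by comparing y_true and y_pred within each split group.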
