i. 创建一个存储过程来创建训练好的模型,在本例中是一个 Extra Trees 分类器算法。该过程将从上一步创建的 cancer 表中读取数据。
下面是用于创建过程的代码
*-- Stored procedure that generates a PyCaret model using the cancer data using Extra Trees Classifier Algorithm*
DROP PROCEDURE IF EXISTS generate_cancer_pycaret_model;
Go
CREATE PROCEDURE generate_cancer_pycaret_model (@trained_model varbinary(max) OUTPUT) AS
BEGIN
EXECUTE sp_execute_external_script
@language = N'Python'
, @script = N'
import pycaret.classification as cp
import pickle
trail1 = cp.setup(data = cancer_data, target = "Class", silent = True, n_jobs=None)
*# Create Model*
et = cp.create_model("et", verbose=False)
*#To improve our model further, we can tune hyper-parameters using tune_model function.
#We can also optimize tuning based on an evaluation metric. As our choice of metric is F1-score, lets optimize our algorithm!*
tuned_et = cp.tune_model(et, optimize = "F1", verbose=False)
*#The finalize_model() function fits the model onto the complete dataset.
#The purpose of this function is to train the model on the complete dataset before it is deployed in production*
final_model = cp.finalize_model(tuned_et)
*# Before saving the model to the DB table, convert it to a binary object*
trained_model = []
prep = cp.get_config("prep_pipe")
trained_model.append(prep)
trained_model.append(final_model)
trained_model = pickle.dumps(trained_model)'
, @input_data_1 = N'select "Class", "age", "menopause", "tumor_size", "inv_nodes", "node_caps", "deg_malig", "breast", "breast_quad", "irradiat" from dbo.cancer'
, @input_data_1_name = N'cancer_data'
, @params = N'@trained_model varbinary(max) OUTPUT'
, @trained_model = @trained_model OUTPUT;
END;
GO
ii. 创建一个表,用于存储训练好的模型对象
DROP TABLE IF EXISTS dbo.pycaret_models;
GO
CREATE TABLE dbo.pycaret_models (
model_id INT NOT NULL PRIMARY KEY,
dataset_name VARCHAR(100) NOT NULL DEFAULT('default dataset'),
model_name VARCHAR(100) NOT NULL DEFAULT('default model'),
model VARBINARY(MAX) NOT NULL
);
GO
*--Insert the results of the predictions for test set into a table*
INSERT INTO [pycaret_cancer_predictions]
EXEC pycaret_predict_cancer 2, 'cancer', 'Extra Trees Classifier algorithm';
iv. 执行下面的 SQL 查看预测结果
*-- Select contents of the table*
SELECT * FROM [pycaret_cancer_predictions];
四、结论
在这篇文章中,我们学习了如何使用 PyCaret 在 SQL Server 中构建分类模型。类似地,你可以根据业务问题的需求构建和运行其他类型的监督和无监督机器学习模型。
我未来的文章将是关于如何在 SQL Server 中使用 Python 和 PyCaret 探索其他监督和无监督学习技术的教程。
五、重要链接
我的前一篇文章** 详细介绍了如何整合** 与. 在本文中,我将提供详细的步骤,说明如何使用(PyCaret 是一个 Python 中的低代码机器学习库)在 SQL Server 中训练和部署一个监督机器学习分类模型。