Two popular PHP libraries for machine learning are Rubix ML and PHP-ML. Rubix ML provides a comprehensive toolkit that includes Linear Regression among many other algorithms, while PHP-ML is a simpler library that still covers a wide range of algorithms.
Implementing Simple Linear Regression with Rubix ML
Using Rubix ML, we’ll set up a simple example that predicts housing prices based on square footage.
Step 1: Prepare the Data
For this example, let’s use a small dataset with square footage and price.
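As a minimal sketch of this step, the rows below reuse the values from the complete listing in Step 4. The variable names $rows, $samples, and $labels are illustrative. The split is written in plain PHP so that it runs without the library installed, and the closing comment shows the Rubix ML call that the complete listing uses to do the same thing.

```php
<?php
// Each row is [square footage, price]; values match the complete listing in Step 4.
$rows = [
    [800, 160000],
    [900, 180000],
    [1000, 200000],
    [1100, 220000],
    [1200, 240000],
    [1300, 260000],
    [1400, 280000],
];

// Split each row into a feature vector and its label (the last column).
$samples = [];
$labels = [];
foreach ($rows as $row) {
    $labels[] = array_pop($row); // last column: price
    $samples[] = $row;           // remaining columns: features
}

printf("%d samples, first sample: [%d] => %d\n", count($samples), $samples[0][0], $labels[0]);

// With Rubix ML installed, the library performs the same split:
//   $dataset = \Rubix\ML\Datasets\Labeled::fromIterator($rows);
```

Keeping each sample as an array, even with a single feature, means further columns such as number of bedrooms can be added later without changing the shape of the data.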
Step 2: Train the Model
Now, we’ll create a Ridge Regression model (a regularized form of linear regression, which helps reduce overfitting) and train it on our dataset.
$estimator = new Ridge(1.0); // 1.0 is the regularization strength
$estimator->train($dataset);
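To make the effect of the regularization strength concrete, here is a minimal plain-PHP sketch of single-feature ridge regression in closed form. The helper name fitRidge1d and the choice to leave the intercept unpenalized are assumptions made for illustration. This is not Rubix ML's implementation, only the underlying idea.

```php
<?php
/**
 * Fit y = intercept + slope * x with an L2 penalty on the slope only.
 * Hypothetical helper for illustration; not part of Rubix ML.
 */
function fitRidge1d(array $xs, array $ys, float $penalty): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    foreach ($xs as $i => $x) {
        $sxy += ($x - $meanX) * ($ys[$i] - $meanY);
        $sxx += ($x - $meanX) ** 2;
    }

    // The penalty is added to the denominator, which shrinks the slope toward zero.
    $slope = $sxy / ($sxx + $penalty);
    $intercept = $meanY - $slope * $meanX;

    return ['slope' => $slope, 'intercept' => $intercept];
}

// Same square footage and price values as the tutorial's dataset.
$xs = [800, 900, 1000, 1100, 1200, 1300, 1400];
$ys = [160000, 180000, 200000, 220000, 240000, 260000, 280000];

$plain = fitRidge1d($xs, $ys, 0.0); // ordinary least squares
$ridge = fitRidge1d($xs, $ys, 1.0); // penalty of 1.0, as in the tutorial

printf("No penalty:  slope = %.6f\n", $plain['slope']);
printf("Penalty 1.0: slope = %.6f\n", $ridge['slope']);
printf("Ridge prediction for 2200 sq ft: %.2f\n", $ridge['intercept'] + $ridge['slope'] * 2200);
```

On this dataset the prices are an exact multiple of the square footage, so a penalty of 1.0 barely changes the fitted slope. A larger penalty, or features on a smaller scale, would shrink it more noticeably.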
Step 3: Make Predictions
Once trained, we can use the model to make predictions on new data.
use Rubix\ML\Datasets\Unlabeled;

// Make a prediction for a new house with 2,200 sq ft
$newSample = [2200];
$newDataset = new Unlabeled([$newSample]);
$prediction = $estimator->predict($newDataset);

// Show results
echo "Sample size: 2200 sq.ft";
echo "\nPredicted Price for: $" . number_format($prediction[0], decimals: 2);
Step 4: Evaluate the Model
To measure the model’s accuracy, we can use a metric like Mean Squared Error (MSE), which calculates the average of squared differences between predicted and actual values.
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Regressors\Ridge;
use Rubix\ML\CrossValidation\Metrics\MeanSquaredError;

// Sample data: each row is [square footage, price]
$samples = [
    [800, 160000],
    [900, 180000],
    [1000, 200000],
    [1100, 220000],
    [1200, 240000],
    [1300, 260000],
    [1400, 280000],
];

// Create a dataset from our samples (the last column becomes the label)
$dataset = Labeled::fromIterator($samples);

// Create and train the Ridge regression model
// 1.0 is the regularization strength, which controls how strongly we guard against overfitting
$estimator = new Ridge(1.0);
$estimator->train($dataset);

// Predict the price for a 2,200 sq ft house
$newSample = [2200];
$newDataset = new Unlabeled([$newSample]);
$prediction = $estimator->predict($newDataset);

// Show results
echo 'Sample size: 2200 sq.ft';
echo "\nPredicted Price for: $" . number_format($prediction[0], decimals: 2);

// Check how well the model fits the training data using Mean Squared Error
// Rubix ML reports metrics as higher-is-better, so the score is the negated MSE: closer to 0 is better
$predictions = $estimator->predict($dataset);
$mse = new MeanSquaredError();
echo "\n\nMean Squared Error: " . number_format($mse->score($predictions, $dataset->labels()), 10);
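To show what the metric computes, here is a minimal plain-PHP sketch of Mean Squared Error. The function name meanSquaredError and the sample values are hypothetical and chosen only so the arithmetic is easy to follow. Unlike the Rubix ML metric, which its documentation describes as returning the negated error so that higher scores are better, this sketch returns the conventional non-negative value.

```php
<?php
/**
 * Average of squared differences between predictions and actual values.
 * Hypothetical helper for illustration; not part of Rubix ML.
 */
function meanSquaredError(array $predictions, array $labels): float
{
    if (count($predictions) !== count($labels) || count($labels) === 0) {
        throw new InvalidArgumentException('Need equal-length, non-empty arrays.');
    }

    $sum = 0.0;
    foreach ($predictions as $i => $predicted) {
        $sum += ($predicted - $labels[$i]) ** 2;
    }

    return $sum / count($labels);
}

// Hypothetical values: two predictions are off by 1,000 and one is exact.
$actual = [160000, 180000, 200000];
$predicted = [161000, 179000, 200000];

// (1000^2 + 1000^2 + 0^2) / 3
printf("MSE: %.2f\n", meanSquaredError($predicted, $actual));
```

Because the errors are squared, the result is in squared price units and large misses dominate the average. Taking the square root gives a value in the same units as the price.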
Advanced Techniques: Feature Scaling and Regularization
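When working with Linear Regression, it’s essential to consider feature scaling. Differences in feature scales (e.g., square footage vs. number of bedrooms) can cause the model to weigh one feature more heavily than another, particularly with regularized models such as Ridge, where the penalty depends on the size of each coefficient. Standardizing features to a similar scale lets the penalty treat features evenly and often improves results.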
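In Rubix ML, you can apply feature scaling with the Z Scale Standardizer transformer (ZScaleStandardizer), which is applied to a dataset through the dataset's apply() method.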
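A minimal sketch of what z-score standardization does, written in plain PHP so it runs without the library. The helper name standardizeColumns, the use of the population standard deviation, and the sample rows are assumptions made for illustration. The closing comment shows the corresponding Rubix ML call.

```php
<?php
/**
 * Rescale every column to zero mean and unit standard deviation.
 * Hypothetical helper for illustration; not part of Rubix ML.
 */
function standardizeColumns(array $samples): array
{
    $n = count($samples);
    $columns = count($samples[0]);
    $scaled = $samples;

    for ($j = 0; $j < $columns; $j++) {
        $column = array_column($samples, $j);
        $mean = array_sum($column) / $n;

        $squaredDiffs = 0.0;
        foreach ($column as $value) {
            $squaredDiffs += ($value - $mean) ** 2;
        }
        $stdDev = sqrt($squaredDiffs / $n); // population standard deviation (assumed)

        foreach ($samples as $i => $sample) {
            // Constant columns have zero spread, so map them to 0 instead of dividing by zero.
            $scaled[$i][$j] = $stdDev > 0.0 ? ($sample[$j] - $mean) / $stdDev : 0.0;
        }
    }

    return $scaled;
}

// Hypothetical rows of [square footage, number of bedrooms].
$samples = [[800, 2], [1100, 3], [1400, 4]];
$scaled = standardizeColumns($samples);

foreach ($scaled as $row) {
    printf("%8.4f %8.4f\n", $row[0], $row[1]);
}

// With Rubix ML installed, the library transformer is applied to a dataset object:
//   $dataset->apply(new \Rubix\ML\Transformers\ZScaleStandardizer());
```

After scaling, both columns cover the same range even though the raw square footage values are hundreds of times larger than the bedroom counts, so a penalty on the coefficients no longer favors one feature because of its units.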
Linear Regression is a powerful tool for predicting continuous values, and implementing it in PHP with Rubix ML and PHP-ML provides a straightforward approach to creating predictive models. By understanding the underlying mechanics, using regularization to prevent overfitting, and scaling features, you can build effective regression models in PHP.