Can machine learning model simple mathematical functions?
Use machine learning to model some basic mathematical functions
Applying machine learning to all kinds of tasks has become common practice: seemingly every emerging technology on Gartner's hype cycle involves machine learning in some way. These algorithms are often treated as figure-it-out-yourself models: decompose any kind of data into a set of features, throw a few black-box machine learning models at it, evaluate each one, and pick the one with the best result.
But can machine learning really solve all problems, or is it only suited to a narrow set of tasks? In this article we try to answer a more basic question: can machine learning infer the mathematical relationships that come up most often in everyday life? Here I will use several popular machine learning techniques to fit a few basic functions and observe whether these algorithms can identify and model the underlying mathematical relationships.
Functions we will try:
- Linear function
- Exponential function
- Logarithmic function
- Power function
- Modular function
- Trigonometric function
Machine learning algorithms that will be used:
- Linear regression
- Support Vector Regression (SVR)
- Decision tree
- Random forest
- Multilayer perceptron (feedforward neural network)
I will keep the dimension of the independent variable X at 4 (there is no particular reason for choosing this number). So the relationship between X (the independent variable) and Y (the dependent variable) is:

Y = f(X) + ε

- f: the function we will fit
- ε: random noise (to make Y a bit more realistic, since real-world data always carries some noise)
Each function type is parameterized by a set of coefficients, which are generated as random numbers as follows: randint() is used for the power function's parameters, so that the values of Y do not become too small; normal() is used in all other cases.
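The article does not show the parameter-generation code itself; here is a minimal sketch of what it could look like. The randint() bounds are my assumption, chosen only so that the exponents cover the hard-coded power-function values used later in the article:

```python
import numpy as np

# Power function: integer exponents via randint(), so that Y does not become
# vanishingly small. The bounds (2, 10) are an assumption, chosen to cover the
# hard-coded exponents [2, 8, 9, 2] that appear later in the article.
power_paras = np.random.randint(2, 10, size=5)

# All other function types: coefficients drawn with normal() from a standard normal.
other_paras = np.random.normal(size=5)
```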
Generate the independent variable (i.e. X):
```python
import numpy as np

function_type = 'Linear'

if function_type == 'Logarithmic':
    X_train = abs(np.random.normal(loc=5, size=(1000, 4)))
    X_test = abs(np.random.normal(loc=5, size=(500, 4)))
else:
    X_train = np.random.normal(size=(1000, 4))
    X_test = np.random.normal(size=(500, 4))
```
For logarithmic functions, use a normal distribution with a mean of 5 (the mean is much greater than the variance) to avoid negative values.
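As a quick illustration (not part of the original code): with loc=5 and the default scale=1, a draw below zero is roughly a five-sigma event, and abs() guards against even those rare cases, so the logarithm is always well defined. The seed below is my addition, for reproducibility:

```python
import numpy as np

np.random.seed(0)  # seed added here only for reproducibility

# Mean 5 with standard deviation 1: negative draws are a ~5-sigma event,
# and abs() handles the rare exceptions.
X = abs(np.random.normal(loc=5, size=(1000, 4)))
logX = np.log(X)  # well defined, since every entry of X is positive
```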
Get the dependent variable (i.e. Y):
```python
def get_Y(X, function_type, paras):
    X1 = X[:, 0]
    X2 = X[:, 1]
    X3 = X[:, 2]
    X4 = X[:, 3]
    if function_type == 'Linear':
        [a0, a1, a2, a3, a4] = paras
        noise = np.random.normal(scale=(a1*X1).var(), size=X.shape[0])
        Y = a0 + a1*X1 + a2*X2 + a3*X3 + a4*X4 + noise
    elif function_type == 'Exponential':
        [a0, a1, a2, a3, a4] = paras
        noise = np.random.normal(scale=(a1*np.exp(X1)).var(), size=X.shape[0])
        Y = a0 + a1*np.exp(X1) + a2*np.exp(X2) + a3*np.exp(X3) + a4*np.exp(X4) + noise
    elif function_type == 'Logarithmic':
        [a0, a1, a2, a3, a4] = paras
        noise = np.random.normal(scale=(a1*np.log(X1)).var(), size=X.shape[0])
        Y = a0 + a1*np.log(X1) + a2*np.log(X2) + a3*np.log(X3) + a4*np.log(X4) + noise
    elif function_type == 'Power':
        [a0, a1, a2, a3, a4] = paras
        noise = np.random.normal(scale=np.power(X1, a1).var(), size=X.shape[0])
        Y = a0 + np.power(X1, a1) + np.power(X2, a2) + np.power(X3, a3) + np.power(X4, a4) + noise
    elif function_type == 'Modulus':
        [a0, a1, a2, a3, a4] = paras
        noise = np.random.normal(scale=(a1*np.abs(X1)).var(), size=X.shape[0])
        Y = a0 + a1*np.abs(X1) + a2*np.abs(X2) + a3*np.abs(X3) + a4*np.abs(X4) + noise
    elif function_type == 'Sine':
        [a0, a1, b1, a2, b2, a3, b3, a4, b4] = paras
        noise = np.random.normal(scale=(a1*np.sin(X1)).var(), size=X.shape[0])
        Y = (a0 + a1*np.sin(X1) + b1*np.cos(X1) + a2*np.sin(X2) + b2*np.cos(X2)
             + a3*np.sin(X3) + b3*np.cos(X3) + a4*np.sin(X4) + b4*np.cos(X4) + noise)
    else:
        raise ValueError('Unknown function type')
    return Y

if function_type == 'Linear':
    paras = [0.35526578, -0.85543226, -0.67566499, -1.97178384, -1.07461643]
elif function_type == 'Exponential':
    paras = [0.15644562, -0.13978794, -1.8136447, 0.72604755, -0.65264939]
elif function_type == 'Logarithmic':
    paras = [0.63278503, -0.7216328, -0.02688884, 0.63856392, 0.5494543]
elif function_type == 'Power':
    paras = [2, 2, 8, 9, 2]
elif function_type == 'Modulus':
    paras = [0.15829356, 1.01611121, -0.3914764, -0.21559318, -0.39467206]
elif function_type == 'Sine':
    paras = [-2.44751615, 1.89845893, 1.78794848, -2.24497666, -1.34696884,
             0.82485303, 0.95871345, -1.4847142, 0.67080158]

Y_train = get_Y(X_train, function_type, paras)
Y_test = get_Y(X_test, function_type, paras)
```
The noise is randomly sampled from a zero-mean normal distribution. I set the scale of the noise equal to the variance of f(X) so that the "signal" and the "noise" in our data are comparable: the noise does not drown out the signal, and vice versa.
Note: no hyperparameter tuning was done for any model. The idea is only to get a rough estimate of how the models listed above perform on these functions, so the models were deliberately left unoptimized.
```python
import xgboost as xgb
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

model_type = 'MLP'

if model_type == 'XGBoost':
    model = xgb.XGBRegressor()
elif model_type == 'Linear Regression':
    model = LinearRegression()
elif model_type == 'SVR':
    model = SVR()
elif model_type == 'Decision Tree':
    model = DecisionTreeRegressor()
elif model_type == 'Random Forest':
    model = RandomForestRegressor()
elif model_type == 'MLP':
    model = MLPRegressor(hidden_layer_sizes=(10, 10))

model.fit(X_train, Y_train)
```
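The article quotes R-squared numbers but omits the scoring step. Here is a self-contained sketch of how each fitted model could be scored with scikit-learn's r2_score; the small linear dataset below is stand-in data fabricated for the snippet (in the article, X_train, Y_train, etc. come from the generation code above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Stand-in data for the sketch (the article uses the X/Y generated by get_Y above)
rng = np.random.RandomState(0)
X_train = rng.normal(size=(1000, 4))
X_test = rng.normal(size=(500, 4))
coef = np.array([-0.85, -0.67, -1.97, -1.07])
Y_train = 0.35 + X_train @ coef + rng.normal(scale=0.5, size=1000)
Y_test = 0.35 + X_test @ coef + rng.normal(scale=0.5, size=500)

model = LinearRegression()
model.fit(X_train, Y_train)

# R-squared on held-out data: the metric aggregated per model/function pair
r2 = r2_score(Y_test, model.predict(X_test))
```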
Most of the results are far better than the mean baseline, and the average R-squared across all runs is 70.83%. So we can say that machine learning techniques can indeed model these simple mathematical functions effectively.
But this experiment tells us more than whether machine learning can model these functions: it also shows how different machine learning techniques perform on the various basic functions.

Some results are surprising (at least to me), others are reasonable. Overall, the results confirm some prior intuitions and also suggest new ones.
Finally, we can get the following conclusions:
- Although linear regression is a simple model, it outperforms the other models on linearly related data
- In most cases the ranking is decision tree < random forest < XGBoost, based on this experiment's results (it holds clearly in 5 of the 6 cases)
- Contrary to a recent trend in practice, XGBoost (best in only 2 of the 6 results) should not be a one-stop solution for all kinds of tabular data; each model still needs to be compared fairly
- Contrary to what one might guess, the linear function is not necessarily the easiest to predict: the best aggregate R-squared, 92.98%, was obtained on the logarithmic function
- The (relative) performance of the various techniques differs greatly across the basic functions, so choosing a technique for a task deserves careful thought and experimentation
See my GitHub for the complete code.
Like, comment and share. Constructive criticism and feedback are always welcome!