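I'm developing a website using a Django project. I take sale prices (based on a keyword) and make future price predictions with machine learning, using data pulled from the eBay API database (for that keyword). I'm using TensorFlow's Keras, along with train_test_split from sklearn.model_selection, to make the price predictions. All of my feature and target (sale price) arrays are scaled down to between -1 and 1, with a mean of 0 and a standard deviation of 1. All of the predictions come back as 1.0, and I don't know why. It seems like I'm "overfitting" the model. I'm wondering if anyone can help me. I'll show my neural network file below.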
neural_network.py
import pandas as pd
import numpy as np
from tensorflow import keras
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
class Neural_Network:
    def neural_network(self, n_df):
        df = n_df.copy()
        df = df.replace('^\s*$', np.nan, regex=True)
        #df['itemId'] = df['itemId'].astype(int)
        df['listingType'] = pd.get_dummies(df['listingType'])
        df['endPrice'] = df['endPrice'].astype(float)
        df['shippingServiceCost'] = df['shippingServiceCost'].astype(float)
        #df['shippingServiceCost'] = df['shippingServiceCost'].interpolate()
        df['shippingServiceCost'] = df['shippingServiceCost'].fillna(df['shippingServiceCost'].mean())
        df['bidCount'] = df['bidCount'].astype(np.float)
        #df['bidCount'] = df['bidCount'].interpolate()
        df['bidCount'] = df['bidCount'].fillna(df['bidCount'].mean())
        df['watchCount'] = df['watchCount'].astype(np.float)
        #df['watchCount'] = df['watchCount'].interpolate()
        df['watchCount'] = df['watchCount'].fillna(df['watchCount'].mean())
        df['returnsAccepted'] = pd.get_dummies(df['returnsAccepted'])
        df['handlingTime'] = df['handlingTime'].astype(int)
        df['sellerUserName'] = pd.get_dummies(df['sellerUserName'])
        df['feedbackScore'] = df['feedbackScore'].astype(int)
        df['positiveFeedbackPercent'] = df['positiveFeedbackPercent'].astype(float)
        df['topRatedSeller'] = pd.get_dummies(df['topRatedSeller'])
        df['endDate'] = pd.get_dummies(df['endDate'])
        print('\nnull values in dataframe are:\n', df.isnull().any())
        features_df = df.drop(['itemId','title','endPrice','location','endTime','startTime','endTimeOfDay'], axis=1)
        num_of_cols = len(features_df.columns)
        features = features_df.values
        target = df.endPrice.values

        print('\ntarget values:\n', target)
        print('\nfeatures values:\n', features)
        print('\ntarget shape:\n', target.shape)
        print('\nfeatures shape:\n', features.shape)

        X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.3, random_state=124)
        print('\nTRAIN TEST SPLIT EXECUTED\n')

        X_train = MinMaxScaler(feature_range=(-1,1)).fit_transform(X_train)
        X_test = MinMaxScaler(feature_range=(-1,1)).fit_transform(X_test)
        print('\nX_train and X_test scaled\n')

        y_train = y_train.reshape(-1,1)
        y_test = y_test.reshape(-1,1)
        y_train = MinMaxScaler(feature_range=(-1,1)).fit_transform(y_train)
        y_test = MinMaxScaler(feature_range=(-1,1)).fit_transform(y_test)
        y_train = y_train.reshape(-1)
        y_test = y_test.reshape(-1)

        print('\nshape of X_train:\n', X_train.shape)
        print('\nshape of X_test:\n', X_test.shape)
        print('\nshape of y_train:\n', y_train.shape)
        print('\nshape of y_test:\n', y_test.shape)

        model = keras.Sequential()
        input_layer = keras.layers.Dense(16, input_dim=num_of_cols, activation='sigmoid')
        model.add(input_layer)
        hidden_layer = keras.layers.Dense(num_of_cols, input_dim=16, activation='sigmoid')
        model.add(hidden_layer)
        output_layer = keras.layers.Dense(1, input_dim=num_of_cols, activation='softmax')
        model.add(output_layer)

        model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
        history = model.fit(X_train, y_train, validation_split=0.2, batch_size=32, epochs=100, shuffle=True)

        predictions = model.predict(X_test, verbose=0, steps=1)
        print('\npredictions shape:\n', predictions.shape)

        pred_nn_df = pd.DataFrame({'predictions':pd.Series(np.round(predictions.reshape(-1),2)),'actual_sell_prices':pd.Series(y_test)})
        return pred_nn_df, history
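I would show an example of the dataframe columns and values, but I can't post images, and copy/pasting into stackoverflow results in a mess. So I'm going to assume you know enough about eBay to imagine typical values for the features used in the neural network file. I've tried looking online for people with similar problems and couldn't find anything that worked. I'm sure I'm overfitting the model somehow.
Example of the "predictions" array (the output I get every time):
[1.0 1.0 1.0 1.0 ... 1.0 1.0]
Example of the "actual_sell_prices" array (scaled to between (-1, 1)):
[-0.104930 -0.866221 0.279235 ... 1.000000 -0.201099]
Any ideas?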
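Using softmax on an output layer with a single neuron will only ever output a constant value of 1.0. You need to use an appropriate activation function, probably tanh, since its output is in the range [-1, 1].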
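To see why a single softmax unit is constant, here is a minimal sketch in plain Python (no Keras needed). The `softmax` helper and the sample pre-activation values are made up for illustration and are not taken from the question's model. Softmax normalizes across the units of a layer, so with one unit it computes exp(z) / exp(z), which is 1.0 regardless of z.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pre-activation values for a single output neuron.
pre_activations = [-5.0, -0.3, 0.0, 2.7]

# Softmax over a one-element layer: the only unit gets all the probability mass.
single_unit_outputs = [softmax([z])[0] for z in pre_activations]

# tanh on the same values varies with the input and stays within [-1, 1].
tanh_outputs = [math.tanh(z) for z in pre_activations]

print(single_unit_outputs)
print(tanh_outputs)
```

In the question's model, this would presumably correspond to changing `activation='softmax'` to `activation='tanh'` on the final `Dense(1, ...)` layer, so the output can actually span the scaled target range.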