Why does my deep learning application need so much RAM?

Posted 2024-06-11 14:08:18


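I am new to deep learning, which I have been learning through a Coursera course. I downloaded the course application to run it on my own dataset, and it turns out to need far more memory than I expected. I tried it on Google Colab with 25 GB of RAM, but it still does not work.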

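Another dataset with 200 images of 64*64 pixels runs perfectly. My dataset has 500 images of 800*800 pixels, and the run crashes for lack of RAM. I think that is where the problem lies. Now I have two questions: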

  1. How can I optimize this application so that it can train on my dataset?
  2. Is there a better NN model that would solve the RAM problem?
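
For a rough sense of scale, the size of the input matrix alone can be estimated from the numbers above. This is only a sketch under assumptions the question does not state: RGB images (3 channels), pixel values held as float64, and the smaller dataset read as 200 images of 64*64. The helper name `input_matrix_gib` is made up for this illustration.

```python
def input_matrix_gib(num_images, height, width, channels=3, bytes_per_value=8):
    """Size in GiB of X with shape (height*width*channels, num_images).

    channels=3 (RGB) and bytes_per_value=8 (float64) are assumptions.
    """
    n_x = height * width * channels
    return n_x * num_images * bytes_per_value / 2**30

small = input_matrix_gib(200, 64, 64)    # the dataset that works
large = input_matrix_gib(500, 800, 800)  # the dataset that crashes

print(f"200 images of 64x64:   {small:.4f} GiB")
print(f"500 images of 800x800: {large:.2f} GiB")
print(f"ratio: {large / small:.1f}x")
```

Under these assumptions a single copy of `X` for the larger dataset is a little over 7 GiB, and every temporary array of the same shape adds that much again, so a handful of such arrays could plausibly exhaust 25 GB.
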
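    import numpy as np
    import matplotlib.pyplot as plt

    # The helpers called below (initialize_parameters, linear_activation_forward,
    # compute_cost, linear_activation_backward, update_parameters) are the ones
    # implemented earlier in the course and are not shown here.

    # GRADED FUNCTION: two_layer_model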
    def two_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):
        """
        Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.
        
        Arguments:
        X -- input data, of shape (n_x, number of examples)
        Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
        layers_dims -- dimensions of the layers (n_x, n_h, n_y)
        num_iterations -- number of iterations of the optimization loop
        learning_rate -- learning rate of the gradient descent update rule
        print_cost -- If set to True, this will print the cost every 100 iterations 
        
        Returns:
        parameters -- a dictionary containing W1, W2, b1, and b2
        """
        
        np.random.seed(1)
        grads = {}
        costs = []                              # to keep track of the cost
        m = X.shape[1]                           # number of examples
        (n_x, n_h, n_y) = layers_dims
        
        # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = initialize_parameters(n_x, n_h, n_y)
        ### END CODE HERE ###
        
        # Get W1, b1, W2 and b2 from the dictionary parameters.
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]
        
        # Loop (gradient descent)
    
        for i in range(0, num_iterations):
    
            # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID. Inputs: "X, W1, b1, W2, b2". Output: "A1, cache1, A2, cache2".
            ### START CODE HERE ### (≈ 2 lines of code)
            A1, cache1 = linear_activation_forward(X, W1, b1, activation='relu')
            A2, cache2 = linear_activation_forward(A1, W2, b2, activation='sigmoid')
            ### END CODE HERE ###
            
            # Compute cost
            ### START CODE HERE ### (≈ 1 line of code)
            cost = compute_cost(A2, Y)
            ### END CODE HERE ###
            
            # Initializing backward propagation
            dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))
            
            # Backward propagation. Inputs: "dA2, cache2, cache1". Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
            ### START CODE HERE ### (≈ 2 lines of code)
            dA1, dW2, db2 =  linear_activation_backward(dA2, cache2, activation='sigmoid')
            dA0, dW1, db1 =  linear_activation_backward(dA1, cache1, activation='relu')
            ### END CODE HERE ###
            
            # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
            grads['dW1'] = dW1
            grads['db1'] = db1
            grads['dW2'] = dW2
            grads['db2'] = db2
            
            # Update parameters.
            ### START CODE HERE ### (approx. 1 line of code)
            parameters = update_parameters(parameters, grads, learning_rate)
            ### END CODE HERE ###
    
            # Retrieve W1, b1, W2, b2 from parameters
            W1 = parameters["W1"]
            b1 = parameters["b1"]
            W2 = parameters["W2"]
            b2 = parameters["b2"]
            
            # Print and record the cost every 100 iterations
            if print_cost and i % 100 == 0:
                print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
                costs.append(cost)
           
        # plot the cost
    
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per hundreds)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()
        
        return parameters
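
One detail in the loop above may matter for memory: the first-layer backward step returns `dA0`, which the code comment marks as not used. If the course helper computes it as `W1.T @ dZ1` (an assumption, since the helper's source is not shown here), that array has the same shape as `X`. Below is a minimal sketch of a first-layer backward step that never forms it. The name `first_layer_backward_no_dA0` is hypothetical, and the sketch assumes the pre-activation `Z1` can be recovered from `cache1`.

```python
import numpy as np

def first_layer_backward_no_dA0(dA1, Z1, X):
    """Gradients of a LINEAR->RELU first layer without forming dA0.

    dA1 -- gradient w.r.t. the layer's activation, shape (n_h, m)
    Z1  -- pre-activation values from the forward pass, shape (n_h, m)
    X   -- input data, shape (n_x, m)
    Returns dW1 of shape (n_h, n_x) and db1 of shape (n_h, 1).
    """
    m = X.shape[1]
    dZ1 = dA1 * (Z1 > 0)          # ReLU derivative: keep gradient where Z1 > 0
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1

# Tiny check with random data (shapes only, not real images)
rng = np.random.default_rng(0)
X = rng.standard_normal((12, 5))
Z1 = rng.standard_normal((3, 5))
dA1 = rng.standard_normal((3, 5))
dW1, db1 = first_layer_backward_no_dA0(dA1, Z1, X)
print(dW1.shape, db1.shape)
```

The parameter updates only need `dW1` and `db1`, so skipping the input gradient changes nothing about training while avoiding one array as large as `X` per iteration.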

The two_layer_model function above is the main routine I use to train on my dataset. After running this statement:

    parameters = two_layer_model(train_x, train_y, layers_dims = (n_x, n_h, n_y), num_iterations = 2500, print_cost=True)

I expect output like the following:

    Cost after iteration 100: 0.6464320953428849
    Cost after iteration 200: 0.6325140647912678
    Cost after iteration 300: 0.6015024920354665
    Cost after iteration 400: 0.5601966311605747
    Cost after iteration 500: 0.5158304772764729
    Cost after iteration 600: 0.4754901313943325
    Cost after iteration 700: 0.43391631512257495
    Cost after iteration 800: 0.4007977536203887
    Cost after iteration 900: 0.35807050113237976
    Cost after iteration 1000: 0.33942815383664127
    Cost after iteration 1100: 0.30527536361962654
    Cost after iteration 1200: 0.2749137728213016
    Cost after iteration 1300: 0.24681768210614846
    Cost after iteration 1400: 0.19850735037466097
    Cost after iteration 1500: 0.17448318112556657
    Cost after iteration 1600: 0.1708076297809689
    Cost after iteration 1700: 0.11306524562164715
    Cost after iteration 1800: 0.09629426845937145
    Cost after iteration 1900: 0.08342617959726863
    Cost after iteration 2000: 0.07439078704319078
    Cost after iteration 2100: 0.06630748132267933
    Cost after iteration 2200: 0.0591932950103817
    Cost after iteration 2300: 0.05336140348560554
    Cost after iteration 2400: 0.04855478562877016

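However, my application crashes right after the first line of output, the cost after iteration 100, has been computed. Below is the application log from Colab: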

[Colab application log table; contents not preserved]

The complete application code can be found here.

Any help would be greatly appreciated.
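
For the first question, one commonly used way to lower peak memory is to keep the images as `uint8` and convert only a small batch to floating point at a time. The sketch below is not the course's code: `iter_minibatches`, `batch_size`, and the choice of `float32` are assumptions made for illustration, and the training loop would need to be adapted to update the parameters once per batch.

```python
import numpy as np

def iter_minibatches(X_uint8, Y, batch_size=32, seed=0):
    """Yield (X_batch, Y_batch) pairs covering every example once.

    X_uint8 -- raw pixel data, shape (n_x, m), dtype uint8
    Y       -- labels, shape (1, m)
    Each X_batch is scaled to [0, 1] as float32, shape (n_x, <= batch_size).
    """
    m = X_uint8.shape[1]
    order = np.random.default_rng(seed).permutation(m)
    for start in range(0, m, batch_size):
        idx = order[start:start + batch_size]
        # Only this slice is converted, so the float copy stays small.
        X_batch = X_uint8[:, idx].astype(np.float32) / 255.0
        yield X_batch, Y[:, idx]

# Small synthetic stand-in for the real dataset
rng = np.random.default_rng(1)
X_raw = rng.integers(0, 256, size=(48, 10), dtype=np.uint8)
Y = rng.integers(0, 2, size=(1, 10))
for X_batch, Y_batch in iter_minibatches(X_raw, Y, batch_size=4):
    print(X_batch.shape, X_batch.dtype, Y_batch.shape)
```

If the images are RGB, a batch of 32 images of 800*800 in `float32` would take about 32 × 1,920,000 × 4 bytes, roughly 246 MB. Downscaling the images before flattening them is another option, since it reduces `n_x` directly.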

