Using Python to create a column of random numbers drawn from a distribution, with the mean and std taken from other columns

Posted 2024-09-25 00:21:36


Essentially, I want to be able to pass a column to the np.random.normal() function.

I have the following:

def calc_z(w,S,a1,a2,yt1,yt2):

    mu = w * S
    print 'Mu' , mu
    sigma = mt.sqrt(0.5)
    z = np.array(np.random.normal(mu,sigma))
    u = [a1,a2,z]
    yt = [yt1,yt2,1]
    thetaset = np.random.rand(len(u))
    m = [i for i in range(len(u))]

    max_iter = 30

#Calculate E-step
    for i in range(max_iter):

        print 'Iteration:', i
        print 'z:', z
        print 'thetaset', thetaset

        devLz = eq6(var,w,S,z,yt,u,thetaset,m)
        dev2Lz2 = eq9(var,thetaset,u)

#Calculate M-Step
        z = z - (devLz / dev2Lz2)
        w = lambdaw * z

        for i in range(len(thetaset)):

            devLTheta = eq7(yt,u,thetaset,lambdatheta)
            dev2LTheta2 = eq10(thetaset,u,lambdatheta)           

            thetaset = thetaset - (devLTheta / dev2LTheta2)

    return float(z)

calc_z_udf = udf(calc_z,FloatType())

data.show()

data = data.withColumn('z', calc_z(data['w'],data['Org_Depth_Diff_S'],data['proximity_rank_a1'],data['cotravel_count_a2'],data['cotravel_yt1'],data['proximity_yt2']))

But when I pass S in, np.random.normal does not accept a column and raises the following error:

Traceback (most recent call last):
  File "/home/taylorr2/PySparkLatent3.py", line 125, in <module>
    data = data.withColumn('z', calc_z(data['w'],data['Org_Depth_Diff_S'],data['proximity_rank_a1'],data['cotravel_count_a2'],data['cotravel_yt1'],data['proximity_yt2']))
  File "/home/taylorr2/PySparkLatent3.py", line 90, in calc_z
    z = np.array(np.random.normal(mu,sigma))
  File "mtrand.pyx", line 1282, in mtrand.RandomState.normal (numpy/random/mtrand/mtrand.c:6920)
ValueError: setting an array element with a sequence.
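
A note on the error: np.random.normal does accept array-valued means, so the ValueError here most likely comes from `mu` not being numeric at all. The traceback shows calc_z running directly in `<module>`, where `w * S` is apparently a Spark Column expression rather than a number. The following is only a minimal sketch of the NumPy side, assuming plain arrays with made-up values stand in for the `w` and `Org_Depth_Diff_S` columns:

```python
import math

import numpy as np

# Hypothetical stand-ins for the 'w' and 'Org_Depth_Diff_S' columns.
w = np.array([0.5, 1.0, 2.0])
S = np.array([10.0, 20.0, 30.0])

mu = w * S                # one mean per row
sigma = math.sqrt(0.5)    # same standard deviation as in calc_z

# Seeded only so the sketch is repeatable.
rng = np.random.RandomState(0)

# loc is an array, so NumPy returns one draw per element of mu.
z = rng.normal(mu, sigma)

print(z.shape)
```

With numeric arrays the call works element-wise; it cannot evaluate a Spark Column, because a Column describes a computation and holds no values.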

I'm looking for a way to make this function accept the column's values, or for a different approach altogether.

Thanks!
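
One possible direction, offered as a sketch rather than a definitive fix: the code above defines calc_z_udf, but the withColumn call uses calc_z itself, so the function receives Column objects instead of per-row values. Calling the UDF wrapper should let Spark invoke the function once per row with plain Python numbers. The example below reduces the function to the random draw only. The names `draw_z` and `add_z_column` are hypothetical, and it assumes the `w` and `Org_Depth_Diff_S` columns contain no nulls.

```python
import math

import numpy as np


def draw_z(w, S):
    """Draw one value from a normal with mean w * S and std sqrt(0.5)."""
    mu = float(w) * float(S)
    sigma = math.sqrt(0.5)
    # Convert to a built-in float so Spark's FloatType can serialize it.
    return float(np.random.normal(mu, sigma))


def add_z_column(data):
    """Attach a 'z' column to a Spark DataFrame (assumes pyspark is installed)."""
    # Imported here so draw_z can be tried without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import FloatType

    draw_z_udf = udf(draw_z, FloatType())
    # Call the UDF wrapper, not draw_z itself, so Spark evaluates it per row.
    return data.withColumn('z', draw_z_udf(data['w'], data['Org_Depth_Diff_S']))


# Per-row behaviour can be checked without a Spark session.
print(draw_z(2.0, 3.0))
```

The same change would apply to the full function: keep the body of calc_z as it is and pass calc_z_udf(...) to withColumn.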

