<p>In addition to N.p.'s answer, you can do it like this:</p>
<pre><code>import pandas as pd
import numpy as np


def generate_df(df_len):
    # Each row is an independent Bernoulli(p=0.1) draw: 1 = success, 0 = failure
    values = np.random.binomial(n=1, p=0.1, size=df_len)
    return pd.DataFrame({'value': values})


df = generate_df(1000)
</code></pre>
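<p>Because every row is drawn independently, the share of 1s will only be close to 0.1, not exactly 0.1. A minimal sketch to check this, assuming a fixed seed purely so the run is reproducible (the seed and the name <code>share</code> are illustrative and not part of the original answer):</p>

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: seeded only to make the illustration reproducible


def generate_df(df_len):
    # Independent Bernoulli(p=0.1) draws, as in the snippet above
    values = np.random.binomial(n=1, p=0.1, size=df_len)
    return pd.DataFrame({'value': values})


df = generate_df(1000)
share = df['value'].mean()  # observed proportion of 1s
print(len(df), share)
```

<p>The observed share fluctuates around 0.1 from run to run; if you need exactly 10% ones, the proportion has to be fixed explicitly.</p>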
<p>Edit:</p>
<p>A more complete function:</p>
<pre><code>import numpy as np
import pandas as pd


def generate_df(df_len, option, p_success=0.1):
    '''
    Generate a pandas DataFrame with a single column filled with
    1s and 0s in proportion p_success and with length df_len.
    Input:
        - df_len: int, length of the 1st dimension of the DataFrame
        - option: string, determines how the sample is generated
            * random: according to a Bernoulli distribution with p=p_success
            * fixed: failures first, and fixed proportion of successes p_success
            * fixed_shuffled: fixed proportion of successes p_success, random order
        - p_success: proportion of successes among total
    Output:
        - df: pandas DataFrame
    '''
    if option == 'random':
        values = np.random.binomial(n=1, p=p_success, size=df_len)
    elif option in ('fixed_shuffled', 'fixed'):
        # round() avoids float truncation, e.g. int(100 * 0.29) == 28
        n_success = int(round(df_len * p_success))
        n_fail = df_len - n_success
        values = [0] * n_fail + [1] * n_success
        if option == 'fixed_shuffled':
            np.random.shuffle(values)  # shuffles the list in place
    else:
        raise ValueError('Unknown option: {}'.format(option))
    df = pd.DataFrame({'value': values})
    return df
</code></pre>
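<p>To see how the three options differ, here is a short usage sketch. It carries a compact variant of the function above so that it runs on its own (this variant raises <code>ValueError</code> for an unknown option); the seed and the <code>counts</code> dictionary are illustrative assumptions, not part of the original answer:</p>

```python
import numpy as np
import pandas as pd


def generate_df(df_len, option, p_success=0.1):
    # Compact variant of the function above, repeated so the snippet is self-contained
    if option == 'random':
        values = np.random.binomial(n=1, p=p_success, size=df_len)
    elif option in ('fixed_shuffled', 'fixed'):
        n_success = int(round(df_len * p_success))
        values = [0] * (df_len - n_success) + [1] * n_success
        if option == 'fixed_shuffled':
            np.random.shuffle(values)  # in-place shuffle
    else:
        raise ValueError('Unknown option: {}'.format(option))
    return pd.DataFrame({'value': values})


np.random.seed(0)  # assumption: seeded only for reproducibility

# Count the successes produced by each option for 1000 rows
counts = {}
for option in ('random', 'fixed', 'fixed_shuffled'):
    df = generate_df(1000, option)
    counts[option] = int(df['value'].sum())
print(counts)
```

<p>With <code>df_len=1000</code> and the default <code>p_success=0.1</code>, both <code>fixed</code> and <code>fixed_shuffled</code> always contain exactly 100 ones, while <code>random</code> only lands near 100.</p>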