Trying to avoid loading a large dataset every time I run a .py file

Posted 2024-10-01 11:41:25


I'm stuck on what should be a simple piece of code.

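I want to load the GloVe model (a large dataset of word vectors) only if it has not already been loaded; that is, I don't want to wait for it to reload every time I run the .py file (each load can take a while). However, every time I run the script it loads again, as if the script had never been run before.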

I'm using Spyder 4 in a conda virtual environment, with Python 3.6.10 on a Mac.

Here are three things I've tried:

import numpy as np

# Check whether gloVe is already loaded. If so, skip loading it again.
try:
    gloveModel
except NameError:
    # Load gloVe pre-trained vectors.
    # Each key is a word; each value is a NumPy array of length 50.
    filename = 'glove_twitter_50d.txt'
    print("gloVe vectors loading . . .")
    gloveModel = {}
    with open(filename, 'r') as foo:  # the file is closed once parsing finishes
        for line in foo:
            splitLines = line.split()
            word = splitLines[0]
            wordEmbedding = np.array([float(value) for value in splitLines[1:]])
            gloveModel[word] = wordEmbedding
    print(len(gloveModel), "gloVe loaded.")
else:
    print("gloVe already loaded.")

And:

import numpy as np

# Check whether gloVe is already loaded. If so, skip loading it again.
try:
    assert 'gloveModel' in locals() or 'gloveModel' in globals()
except AssertionError:
    # Load gloVe pre-trained vectors.
    # Each key is a word; each value is a NumPy array of length 50.
    filename = 'glove_twitter_50d.txt'
    print("gloVe vectors loading . . .")
    gloveModel = {}
    with open(filename, 'r') as foo:  # the file is closed once parsing finishes
        for line in foo:
            splitLines = line.split()
            word = splitLines[0]
            wordEmbedding = np.array([float(value) for value in splitLines[1:]])
            gloveModel[word] = wordEmbedding
    print(len(gloveModel), "gloVe vectors loaded.")
else:
    print("gloVe already loaded.")

And also:

import numpy as np

# Check whether gloVe is already loaded. If so, skip loading it again.
try:
    present = 'gloveModel' in locals() or 'gloveModel' in globals()
    if not present:
        raise ValueError
except ValueError:
    # Load gloVe pre-trained vectors.
    # Each key is a word; each value is a NumPy array of length 50.
    filename = 'glove_twitter_50d.txt'
    print("gloVe vectors loading . . .")
    gloveModel = {}
    with open(filename, 'r') as foo:  # the file is closed once parsing finishes
        for line in foo:
            splitLines = line.split()
            word = splitLines[0]
            wordEmbedding = np.array([float(value) for value in splitLines[1:]])
            gloveModel[word] = wordEmbedding
    print(len(gloveModel), "gloVe vectors loaded.")
else:
    print("gloVe already loaded.")
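All three checks depend on gloveModel still existing in the namespace the script runs in. If each run starts from an empty namespace (an assumption about how the IDE is configured, not something confirmed in this post), the name is never found and the load repeats. Here is a minimal sketch of that effect using plain exec; SCRIPT, run_in_fresh_namespace and run_in_shared_namespace are hypothetical names made up for illustration, and the tiny dict stands in for the slow GloVe load.

```python
# Hypothetical stand-in for the script: load only if `model` is not defined yet.
SCRIPT = """
try:
    model
except NameError:
    model = {"hello": [0.1, 0.2]}  # stands in for the slow GloVe load
    loads.append("loaded")
else:
    loads.append("skipped")
"""


def run_in_fresh_namespace(loads):
    # Every call gets a brand-new dict, so `model` from an earlier run is gone.
    exec(SCRIPT, {"loads": loads})


def run_in_shared_namespace(namespace):
    # The same dict is reused, so `model` survives between runs.
    exec(SCRIPT, namespace)


fresh_log = []
run_in_fresh_namespace(fresh_log)
run_in_fresh_namespace(fresh_log)

shared = {"loads": []}
run_in_shared_namespace(shared)
run_in_shared_namespace(shared)
shared_log = shared["loads"]

print(fresh_log)   # ['loaded', 'loaded']
print(shared_log)  # ['loaded', 'skipped']
```

If this is what is happening, the try/except logic itself is fine; what matters is whether the run configuration reuses the console's namespace between runs.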

Thanks in advance, I appreciate it.
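A different way to cut the wait, independent of what stays in memory between runs, is to cache the parsed dictionary on disk and read the binary cache on later runs. This is only a sketch under assumptions: load_cached, slow_parse and the cache file name are hypothetical, the demo dict stands in for the real GloVe parse, and no timing has been measured here. Unlike the in-memory checks above, this still loads on every run, just from a pickle instead of re-parsing text.

```python
import os
import pickle
import tempfile


def load_cached(cache_path, build):
    """Return the object pickled at cache_path; build and save it first if absent."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    obj = build()
    with open(cache_path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return obj


# Demo: `slow_parse` is a tiny stand-in for parsing the GloVe text file.
calls = []


def slow_parse():
    calls.append(1)  # record each time the expensive parse actually runs
    return {"hello": [0.1, 0.2], "world": [0.3, 0.4]}


cache_file = os.path.join(tempfile.mkdtemp(), "glove_cache.pkl")
first = load_cached(cache_file, slow_parse)   # parses, then writes the cache
second = load_cached(cache_file, slow_parse)  # reads the cache, no parse

print(len(calls))       # 1
print(first == second)  # True
```

In the real script, the build argument would be a function wrapping the parsing loop shown in the attempts above. Only unpickle cache files you created yourself, since pickle can execute arbitrary code when loading.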


Tags: in, foo, is, np, filename, array, word, print