TPU problem: migrating from TF 1.3 to TF 2.1

Posted 2024-09-29 18:54:32


I am trying to port perfectly working code from TF 1.3 to TF 2.1.

I have simplified the model as much as possible, but it still doesn't work. When I run the code below in Jupyter, the kernel dies as soon as it reaches fit.

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
import tensorflow.keras as k

print('TF v:', tf.__version__, 'Keras v:', k.__version__)

# window_size, cats, X and y are defined earlier in the notebook
# (window_size=1280 and cats=4, judging by the summary below)
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://xx.xx.xx.xx:8470')
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

strategy = tf.distribute.experimental.TPUStrategy(resolver)
with strategy.scope():
    model = k.Sequential()
    model.add(k.layers.Conv1D(filters=16,  kernel_size=2, activation='relu', input_shape=(window_size, 1)))
    model.add(k.layers.Conv1D(filters=32,  kernel_size=2, activation='relu'))
    model.add(k.layers.Conv1D(filters=64,  kernel_size=2, activation='relu'))
    model.add(k.layers.Conv1D(filters=128, kernel_size=2, activation='relu'))
    model.add(k.layers.MaxPooling1D(pool_size=2))
    model.add(k.layers.Flatten())
    model.add(k.layers.Dense(cats, activation='softmax'))

    # summary
    print(model.metrics_names)
    print(model.summary())

    print('--')
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
                  metrics=['categorical_accuracy'])
    print('--')

model.fit(X, y, batch_size=window_size, shuffle=False, epochs=5)

Output:

TF v: 2.1.0 Keras v: 2.2.4-tf
INFO:tensorflow:Initializing the TPU system: xxxxxxxxxx:8470
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
['loss']
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d (Conv1D)              (None, 1279, 16)          48        
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 1278, 32)          1056      
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 1277, 64)          4160      
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 1276, 128)         16512     
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 638, 128)          0         
_________________________________________________________________
flatten (Flatten)            (None, 81664)             0         
_________________________________________________________________
dense (Dense)                (None, 4)                 326660    
=================================================================
Total params: 348,436
Trainable params: 348,436
Non-trainable params: 0
_________________________________________________________________
None
--
--

I can see this error in the console. I don't know where the proto-buf comes from, or why this worked in TF 1.3:

E0208 17:03:32.001652096    4567 proto_buffer_writer.h:83]   assertion failed: byte_count_ < total_size_

Any ideas?


1 Answer

#1 · Posted 2024-09-29 18:54:32

This seems to be mainly a ProtoBuf limitation rather than a TensorFlow one: ProtoBuf has a hard limit of 2 GB per call, and TensorFlow can only split data across multiple ProtoBuf messages for tf.data.Dataset objects. You should either make your dataset smaller than 2 GB or convert it into a TensorFlow Dataset. Sources: 1, 2, 3, 4
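A minimal sketch of the second option, reusing X, y, model and window_size from the question; the drop_remainder and prefetch settings are illustrative, not something the answer specifies:

import tensorflow as tf

# Stream the data to the TPU in batches instead of shipping it as one giant
# (2 GB-capped) ProtoBuf payload. X, y, model and window_size are the
# variables from the question above.
dataset = tf.data.Dataset.from_tensor_slices((X, y))
dataset = dataset.batch(window_size, drop_remainder=True)  # TPU requires static batch shapes
dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)

# Caveat: from_tensor_slices embeds the arrays as constants in the graph, so
# if the data itself approaches 2 GB, read it from TFRecord files instead.
model.fit(dataset, epochs=5)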
