Hadoop Gobblin: ERROR: java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStart

I am trying to ingest data from a Kafka topic into HDFS, following https://gobblin.readthedocs.io/en/latest/case-studies/Kafka-HDFS-Ingestion/

These are the steps I am following:

Start ZooKeeper
$ zookeeper-server-start.bat C:\Users\name\kafka_2.11-1.1.0\config\zookeeper.properties

Start Kafka
$ kafka-server-start.bat C:\Users\name\kafka_2.11-1.1.0\config\server.properties

Create the Kafka topic (if it does not exist)
$ kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Start Hadoop
$ C:\Users\name\hadoop-3.1.3\sbin\start-all.cmd

Create kafka-hdfs.pull in GOBBLIN_JOB_CONFIG_DIR with the following contents

job.group=GobblinKafka
job.description=Gobblin quick start job for Kafka
job.lock.enabled=false

kafka.brokers=localhost:9092

source.class=org.apache.gobblin.source.extractor.extract.kafka.KafkaSimpleSource
extract.namespace=org.apache.gobblin.extract.kafka

writer.builder.class=org.apache.gobblin.writer.SimpleDataWriterBuilder
writer.file.path.type=tablename
writer.destination.type=HDFS
writer.output.format=txt

data.publisher.type=org.apache.gobblin.publisher.BaseDataPublisher

mr.job.max.mappers=1

metrics.reporting.file.enabled=true
metrics.log.dir=/gobblin-kafka/metrics
metrics.reporting.file.suffix=txt

bootstrap.with.offset=earliest

fs.uri=hdfs://localhost:9000
writer.fs.uri=hdfs://localhost:9000
state.store.fs.uri=hdfs://localhost:9000

mr.job.root.dir=/gobblin-kafka/working
state.store.dir=/gobblin-kafka/state-store
task.data.root.dir=/jobs/kafkaetl/gobblin/gobblin-kafka/task-data
data.publisher.final.dir=/gobblintest/job-output

Set GOBBLIN_WORK_DIR
$ export GOBBLIN_WORK_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_WORK_DIR

Set GOBBLIN_JOB_CONFIG_DIR
$ export GOBBLIN_JOB_CONFIG_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_JOB_CONFIG_DIR

Start Gobblin in standalone mode
$ bin/gobblin.sh service standalone start
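The job configuration above names several directories which, with fs.uri set to hdfs://localhost:9000, are presumably resolved on HDFS. As a minimal sketch, the script below only prints the `hdfs dfs -ls -d` command for each of them so that ownership and mode can be inspected by hand; it does not contact HDFS itself. The file name kafka-hdfs.pull and the function names are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list the directories named in a Gobblin job file and print the
# command that would show each one's owner and mode. Nothing here runs hdfs.

# Emit the value of every property whose key ends in ".dir" and whose value
# is an absolute path.
list_job_dirs() {
  grep -E '^[A-Za-z0-9_.]+\.dir=/' "$1" | cut -d= -f2-
}

# Turn each directory into an inspection command to run manually.
print_check_commands() {
  list_job_dirs "$1" | while IFS= read -r dir; do
    printf 'hdfs dfs -ls -d %s\n' "$dir"
  done
}

# kafka-hdfs.pull is an assumed file name; pass another path as $1 if needed.
job_file="${1:-kafka-hdfs.pull}"
if [ -f "$job_file" ]; then
  print_check_commands "$job_file"
fi
```

Run against the configuration shown earlier, this would print one line per directory property, such as `hdfs dfs -ls -d /gobblin-kafka/state-store`.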

Here are some of the errors found in logs/standalone.out

[JobScheduler-0] org.apache.gobblin.scheduler.JobScheduler$NonScheduledJobRunner  637 - Failed to run job GobblinKafkaQuickStart
org.apache.gobblin.runtime.JobException: Failed to run job GobblinKafkaQuickStart

ERROR [ForkExecutor-0] org.apache.gobblin.runtime.fork.Fork  258 - Fork 0 of task task_GobblinKafkaQuickStart_1580883582897_0 failed to process data records. Set throwable in holder org.apache.gobblin.runtime.ForkThrowableHolder@721ea24d
java.lang.RuntimeException: Error creating writer

ERROR [TaskExecutor-0] org.apache.gobblin.runtime.Task  545 - Task task_GobblinKafkaQuickStart_1580883582897_0 failed
java.lang.RuntimeException: Some forks failed.

ERROR [Commit-thread-0] org.apache.gobblin.runtime.SafeDatasetCommit  196 - Failed to persist dataset state for dataset  of job job_GobblinKafkaQuickStart_1580883582897
org.apache.hadoop.security.AccessControlException: Permission denied: user=name, access=WRITE, inode="/":name:supergroup:drwxrwxr-x

ERROR [JobScheduler-0] org.apache.gobblin.util.executors.IteratorExecutor  163 - Iterator executor failure.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=name, access=WRITE, inode="/":name:supergroup:drwxrwxr-x

ERROR [JobScheduler-0] org.apache.gobblin.runtime.AbstractJobLauncher  521 - Failed to launch and run job job_GobblinKafkaQuickStart_1580883582897: java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStart_1580883582897
java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStart_1580883582897 
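The AccessControlException lines above pack several fields into one message: the requesting user, the access requested, the inode that was checked, and that inode's owner, group and mode. The sketch below splits such a message into labeled fields to make them easier to compare; the function name `parse_denied` is hypothetical, and the pattern assumes the message has exactly the shape shown in the log.

```shell
#!/usr/bin/env bash
# Sketch: split a Hadoop "Permission denied" message into its fields.

parse_denied() {
  # Expected shape: user=U, access=A, inode="PATH":OWNER:GROUP:MODE
  printf '%s\n' "$1" | sed -nE \
    's/.*user=([^,]+), access=([A-Z_]+), inode="([^"]*)":([^:]+):([^:]+):([-a-z]+).*/user=\1 access=\2 path=\3 owner=\4 group=\5 mode=\6/p'
}

# Message copied from the log above.
line='Permission denied: user=name, access=WRITE, inode="/":name:supergroup:drwxrwxr-x'
parse_denied "$line"
```

Which permission bits apply depends on whether the requesting user matches the owner or the group of the inode on the actual cluster, which `hdfs dfs -ls -d /` can show.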

Please tell me how to resolve this.
