How to stream only part of a file using Apache Spark

+-------+----+-------+-------------------+
| event | id | alert | datetime          |
+-------+----+-------+-------------------+
| reg   | 1  | def1  | 06.06.17-17.24.30 |
+-------+----+-------+-------------------+
| alt   | 2  | def2  | 06.06.17-17.25.11 |
+-------+----+-------+-------------------+
| mot   | 3  | def5  | 06.06.17-17.26.01 |
+-------+----+-------+-------------------+
| mot   | 4  | def5  | 06.06.17-17.26.01 |
+-------+----+-------+-------------------+

1 answer

User
#1 · Posted on 2024-09-29 21:29:45

Spark's file sources do not support this:
Reads files written in a directory as a stream of data. Supported file formats are text, csv, json, orc, parquet. See the docs of the DataStreamReader interface for a more up-to-date list, and supported options for each file format. Note that the files must be atomically placed in the given directory, which in most file systems, can be achieved by file move operations
The same is true for legacy streaming (note that this is from the 2.2 documentation, but the implementation has not changed):
The files must be created in the dataDirectory by atomically moving or renaming them into the data directory.
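
Since both quotes require files to appear in the watched directory atomically, one possible workaround is to write the portion of the data you want streamed to a separate file in a staging directory and then move it into the watched directory in a single step. The following is only a minimal sketch under that assumption, not something taken from the Spark docs: the function name `publish_atomically` and its parameters are hypothetical, and it assumes the staging and watched directories live on the same file system, so that the rename is atomic.

```python
import os


def publish_atomically(lines, staging_dir, watched_dir, name):
    """Write `lines` to a staging file, then move it into `watched_dir`.

    Hypothetical helper: assumes both directories are on the same
    file system, so the final rename happens in one step.
    """
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(watched_dir, exist_ok=True)

    staging_path = os.path.join(staging_dir, name)
    with open(staging_path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
        # Make sure the content is on disk before the file becomes visible.
        f.flush()
        os.fsync(f.fileno())

    final_path = os.path.join(watched_dir, name)
    # Rename into the watched directory; readers never see a half-written file.
    os.replace(staging_path, final_path)
    return final_path
```

For example, `publish_atomically(["mot,3,def5,06.06.17-17.26.01"], "/tmp/staging", "/tmp/watched", "part-0001.csv")` would publish one row as its own file (the paths and the CSV layout here are made up for illustration). A streaming job reading the watched directory would then see only complete files.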
