Downloading a JSON file from a Postgres server in RStudio

Posted 2024-10-01 11:21:04


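I am trying to download a JSON file from a Postgres server in RStudio. However, the fromJSON() function in the RJSONIO package only works when the file is 1) in a local directory or 2) at a URL. Does anyone know a way to do this in RStudio or Python?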


Tags: file, method, server, directory, json, url, postgres, studio
1 answer
User
#1 · Posted 2024-10-01 11:21:04

I'm going out on a limb to infer what you really need here. This is the question I'm going to answer:

Title: How do I access json-encoded data in a postgres database?

Body: I need to access the data within JSON data stored in a postgres database. Once I connect, I can see that the data is there, but I don't know how to get the individual elements either in SQL or in R.

Sample data

A well-crafted question should include sample data, so I'll add a sample table here. This data is adapted from http://www.postgresqltutorial.com/postgresql-json/

library(DBI)
con <- dbConnect(RPostgres::Postgres(), ...) # also works with `odbc::odbc()`, untested with `RODBC`
dbExecute(con, "
  create table mytable (
    id SERIAL8 PRIMARY KEY NOT NULL,
    valtext TEXT,
    jsontext JSON
  )")
d <- data.frame(
  valtext = c('{ "customer": "John Doe", "items": {"product": "Beer","qty": 6}}',
              '{ "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}',
              '{ "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}',
              '{ "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}')
 )
d$jsontext <- d$valtext
dbWriteTable(con, "mytable", d, append=TRUE)
dbGetQuery(con, "select * from mytable")
#   id                                                                 valtext
# 1  1        { "customer": "John Doe", "items": {"product": "Beer","qty": 6}}
# 2  2    { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
# 3  3 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 4  4 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}
#                                                                  jsontext
# 1        { "customer": "John Doe", "items": {"product": "Beer","qty": 6}}
# 2    { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
# 3 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 4 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}

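Extracting JSON in the database

If you continue with that tutorial, you'll see that Postgres includes a couple of operators that work only on fields of the JSON type: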

dbGetQuery(con, "select id from mytable
                 where cast(jsontext -> 'customer' as text) like '%William%'")

but not on TEXT or similar types:

dbGetQuery(con, "select id from mytable
                 where cast(valtext -> 'customer' as text) like '%William%'")
# Error in result_create(conn@ptr, statement) (from functions.R#284) : 
#   ERROR:  operator does not exist: text -> unknown
# LINE 2:                  where cast(valtext -> 'customer' as text) l...
#                                             ^
# HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

You can retrieve individual components in a similar way:

dbGetQuery(con, "select jsontext -> 'items' -> 'qty' as quantity from mytable
                 where cast(jsontext -> 'items' -> 'product' as text) like '%Toy%'")
#   quantity
# 1        1
# 2        2

Extracting JSON in R

ret <- dbGetQuery(con, "select jsontext from mytable
                        where cast(jsontext -> 'items' -> 'product' as text) like '%Toy%'")
ret
#                                                                  jsontext
# 1 { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
# 2 { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}

The brute-force (but still functional) approach is to apply a fromJSON function to each field. (I'm using jsonlite, but I think RJSONIO would work just as well here.)

lapply(ret$jsontext, jsonlite::fromJSON)
# [[1]]
# [[1]]$customer
# [1] "Josh William"
# [[1]]$items
# [[1]]$items$product
# [1] "Toy Car"
# [[1]]$items$qty
# [1] 1
# [[2]]
# [[2]]$customer
# [1] "Mary Clark"
# [[2]]$items
# [[2]]$items$product
# [1] "Toy Train"
# [[2]]$items$qty
# [1] 2
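If you want a flat table from that list of parsed records, one option is to pull out each field by hand. This is only a sketch: `json_strings` is a stand-in for `ret$jsontext` (so that the example runs without a database), and the names `parsed` and `toys` are my own.

```r
library(jsonlite)

# Stand-in for ret$jsontext: the same two rows returned by the query above.
json_strings <- c(
  '{ "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}',
  '{ "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}'
)

# Parse each JSON string into a nested R list.
parsed <- lapply(json_strings, fromJSON)

# Extract one value per record for each column; vapply enforces the type.
toys <- data.frame(
  customer = vapply(parsed, function(x) x$customer, character(1)),
  product  = vapply(parsed, function(x) x$items$product, character(1)),
  qty      = vapply(parsed, function(x) as.numeric(x$items$qty), numeric(1)),
  stringsAsFactors = FALSE
)
```

This is more typing than a streaming parser, but it gives explicit control over which fields end up in which columns.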

Another approach offered by jsonlite is jsonlite::stream_in. I tried RJSONIO::readJSONStream but could not get it to work; I didn't try very hard, and I'd expect it to be just as easy.

jsonlite::stream_in(textConnection(ret$jsontext))
#  Imported 2 records. Simplifying...
#       customer items.product items.qty
# 1 Josh William       Toy Car         1
# 2   Mary Clark     Toy Train         2

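There are also options for jsonlite::fromJSON that can be used with jsonlite::stream_in: when the data is nested in more complex ways than above, I often need simplifyDataFrame=FALSE. (Notice how the "items" dictionary with two elements was "flattened" into "items.product" and "items.qty"; this is a side effect of the default, simplifyDataFrame=TRUE.)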
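To illustrate what that option changes, here is a small sketch using a made-up record whose "items" field is an array of objects (the sample table above does not contain data like this, so `order_json` is purely hypothetical).

```r
library(jsonlite)

# Hypothetical record: "items" is an array of objects, not a single object.
order_json <- '{"customer": "John Doe",
                "items": [{"product": "Beer", "qty": 6},
                          {"product": "Toy Car", "qty": 1}]}'

# Default behaviour: the array of objects is simplified into a data frame.
simplified <- fromJSON(order_json)
is.data.frame(simplified$items)

# With simplifyDataFrame = FALSE the array stays a list of records.
nested <- fromJSON(order_json, simplifyDataFrame = FALSE)
is.data.frame(nested$items)
nested$items[[2]]$product
```

Keeping the nested list is often easier to work with when records have irregular or deeply nested fields, at the cost of doing the tabulation yourself.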

There is more in that tutorial, and there are probably countless other resources.
