
Getting imported JSON data into a data frame

纪秋月
2023-03-14
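Question

I have a file containing 1500 JSON objects that I want to work with in R. I have been able to import the data as a list, but I am having trouble coercing it into a useful structure. I want to create a data frame with one row per JSON object and one column per key:value pair.

I have recreated my situation with this small, fake data set: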

[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]

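Some features of the data:

  • All objects contain the same number of key:value pairs, although some of the values are null
  • Each object has two non-numeric columns (name and group)
  • The name is the unique identifier, and there are about 10 groups
  • Many of the names and groups contain spaces, commas and other punctuation.

Based on this question: R list(structure(list())) to data frame, I tried the following: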

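library(RJSONIO)  # provides fromJSON() (assumed; the answer below uses RJSONIO)
library(plyr)     # provides rbind.fill()

json_file <- "test.json"
json_data <- fromJSON(json_file)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))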

With both my real data and the fake data, the last line gives me this error:

Error in data.frame(name = "Doe, John", group = "Red", `age (y)` = 24,  : 
  arguments imply differing number of rows: 1, 0

Answer:

You just need to replace the NULLs with NAs:

library(RJSONIO)

json_file <-  '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
    {"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
    {"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
    {"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
    {"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
    {"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
    {"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'


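# Parse the JSON string into a list with one element per object
json_file <- fromJSON(json_file)

# Replace every NULL (JSON null) with NA so that unlist() does not drop the field
json_file <- lapply(json_file, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})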
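
To see why the replacement matters, here is a minimal base R sketch. The record `rec` is a hypothetical two-field stand-in for one parsed object, not part of the original data. It shows that unlist() silently drops NULL entries, so records with a null field would come out shorter than the others:

```r
# Hypothetical record with a NULL field, standing in for a JSON null
rec <- list(name = "Doe, John", score = NULL)

# unlist() drops NULL entries, so only "name" survives
len_before <- length(unlist(rec))

# Assigning NA keeps the element in place, so the field is preserved
rec[sapply(rec, is.null)] <- NA
len_after <- length(unlist(rec))

print(c(before = len_before, after = len_after))
```

With NA in place of NULL, every record yields a vector of the same length, which is what rbind needs in order to line the fields up as columns.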

Once every element has a non-NULL value, you can call rbind without getting an error:

do.call("rbind", json_file)
     name           group    age (y) height (cm) wieght (kg) score
[1,] "Doe, John"    "Red"    "24"    "182"       "74.8"      NA   
[2,] "Doe, Jane"    "Green"  "30"    "170"       "70.1"      "500"
[3,] "Smith, Joan"  "Yellow" "41"    "169"       "60"        NA   
[4,] "Brown, Sam"   "Green"  "22"    "183"       "75"        "865"
[5,] "Jones, Larry" "Green"  "31"    "178"       "83.9"      "221"
[6,] "Murray, Seth" "Red"    "35"    "172"       "76.2"      "413"
[7,] "Doe, Jane"    "Yellow" "22"    "164"       "68"        "902"
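
Note that the result above is a character matrix, not yet a data frame. The following base R sketch shows one possible way to finish the conversion. It assumes that every column other than name and group should be numeric. The two records in `records` are a hypothetical, shortened stand-in for the parsed JSON so that the example runs without RJSONIO:

```r
# Hypothetical stand-in for the parsed JSON; NULL plays the role of JSON null
records <- list(
  list(name = "Doe, John", group = "Red",   `age (y)` = 24, score = NULL),
  list(name = "Doe, Jane", group = "Green", `age (y)` = 30, score = 500)
)

# Same step as in the answer: NULL -> NA, then flatten each record
rows <- lapply(records, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})

# Character matrix with one row per record
mat <- do.call(rbind, rows)

# Convert to a data frame, keeping strings as character rather than factor
df <- as.data.frame(mat, stringsAsFactors = FALSE)
names(df) <- colnames(mat)  # keep the original keys, e.g. "age (y)"

# Assumption: everything except name and group is numeric
num_cols <- setdiff(names(df), c("name", "group"))
df[num_cols] <- lapply(df[num_cols], as.numeric)

str(df)
```

Converting column by column keeps the missing scores as NA while restoring numeric types, which a plain rbind cannot do because a matrix holds a single type.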

