Importing a Google Spreadsheet CSV into a Pandas DataFrame

邵浩大

2023-03-14

Question:

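I uploaded a file to Google Spreadsheets (to build a public example IPython Notebook with data). The original file on disk reads into a Pandas DataFrame without problems. I now read the spreadsheet with the code below, which works, but the result comes back only as a string, and I have had no luck turning it back into a DataFrame (I can get the data itself).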

import requests
r = requests.get('https://docs.google.com/spreadsheet/ccc?key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')
data = r.content

The data ends up looking like this (the first line is the header):

',City,region,Res_Comm,mkt_type,Quradate,National_exp,Alabama_exp,Sales_exp,Inventory_exp,Price_exp,Credit_exp\n0,Dothan,South_Central-Montgomery-Auburn-Wiregrass-Dothan,Residential,Rural,1/15/2010,2,2,3,2,3,3\n10,Foley,South_Mobile-Baldwin,Residential,Suburban_Urban,1/15/2010,4,4,4,4,4,3\n12,Birmingham,North_Central-Birmingham-Tuscaloosa-Anniston,Commercial,Suburban_Urban,1/15/2010,2,2,3,2,2,3\n

The native pandas code that reads the file from disk looks like this:

df = pd.io.parsers.read_csv('/home/tom/Dropbox/Projects/annonallanswerswithmaster1012013.csv',index_col=0,parse_dates=['Quradate'])

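A "clean" solution would help many people by providing an easy way to share datasets for use with pandas! I tried a bunch of alternatives without success, and I'm pretty sure I'm missing something obvious again.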

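Update: the new Google Sheets uses a different URL pattern. Just use it in place of the URL in the example above and/or in the answer below, and the examples here still work: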

https://docs.google.com/spreadsheets/d/177_dFZ0i-duGxLiyg6tnwNDKruAYE-_Dd8vAQziipJQ/export?format=csv&id

See @Max Ghenis's solution below, which uses only pd.read_csv and needs neither StringIO nor requests…
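As a minimal sketch of that requests-free approach: pd.read_csv accepts a URL directly, so for a publicly shared sheet it is enough to build the export URL and pass it in. The helper names csv_export_url and read_public_sheet are hypothetical, introduced only for illustration, and the optional gid parameter (which selects one tab of the spreadsheet) is an assumption that does not appear in the question.

```python
import pandas as pd


def csv_export_url(sheet_id, gid=None):
    """Build the CSV export URL for a publicly shared Google Sheet.

    Hypothetical helper. `gid` (assumed, not from the question) picks a
    specific tab; when omitted, no gid is added to the URL.
    """
    url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv"
    if gid is not None:
        url += f"&gid={gid}"
    return url


def read_public_sheet(sheet_id, gid=None, **read_csv_kwargs):
    """Read a public sheet into a DataFrame by handing the URL to pandas.

    Extra keyword arguments (index_col, parse_dates, ...) are forwarded
    to pd.read_csv unchanged.
    """
    return pd.read_csv(csv_export_url(sheet_id, gid), **read_csv_kwargs)
```

Calling read_public_sheet('177_dFZ0i-duGxLiyg6tnwNDKruAYE-_Dd8vAQziipJQ', index_col=0, parse_dates=['Quradate']) should then give the same DataFrame as the disk-based call, provided the sheet is still shared publicly and the machine has network access.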

Answer:

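You can use read_csv() on a file-like object. Because r.content is bytes, wrap it in BytesIO (use StringIO instead if you work with the decoded text):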

from io import BytesIO

import pandas as pd
import requests
r = requests.get('https://docs.google.com/spreadsheet/ccc?key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')
data = r.content

In [10]: df = pd.read_csv(BytesIO(data), index_col=0,parse_dates=['Quradate'])

In [11]: df.head()
Out[11]: 
          City                                            region     Res_Comm  \
0       Dothan  South_Central-Montgomery-Auburn-Wiregrass-Dothan  Residential   
10       Foley                              South_Mobile-Baldwin  Residential   
12  Birmingham      North_Central-Birmingham-Tuscaloosa-Anniston   Commercial   
38       Brent      North_Central-Birmingham-Tuscaloosa-Anniston  Residential   
44      Athens                 North_Huntsville-Decatur-Florence  Residential

          mkt_type            Quradate  National_exp  Alabama_exp  Sales_exp  \
0            Rural 2010-01-15 00:00:00             2            2          3   
10  Suburban_Urban 2010-01-15 00:00:00             4            4          4   
12  Suburban_Urban 2010-01-15 00:00:00             2            2          3   
38           Rural 2010-01-15 00:00:00             3            3          3   
44  Suburban_Urban 2010-01-15 00:00:00             4            5          4

    Inventory_exp  Price_exp  Credit_exp  
0               2          3           3  
10              4          4           3  
12              2          2           3  
38              3          3           2  
44              4          4           4
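The parsing step can also be tried without any network access. The sketch below assumes the three sample rows quoted in the question stand in for r.content, and shows that wrapping the bytes in BytesIO and wrapping the decoded text in StringIO give the same DataFrame. The variable names and the UTF-8 encoding are assumptions made for illustration.

```python
from io import BytesIO, StringIO

import pandas as pd

# The header and first three rows quoted in the question, as bytes
# (the type that requests returns in r.content).
data = (
    b",City,region,Res_Comm,mkt_type,Quradate,National_exp,Alabama_exp,"
    b"Sales_exp,Inventory_exp,Price_exp,Credit_exp\n"
    b"0,Dothan,South_Central-Montgomery-Auburn-Wiregrass-Dothan,"
    b"Residential,Rural,1/15/2010,2,2,3,2,3,3\n"
    b"10,Foley,South_Mobile-Baldwin,"
    b"Residential,Suburban_Urban,1/15/2010,4,4,4,4,4,3\n"
    b"12,Birmingham,North_Central-Birmingham-Tuscaloosa-Anniston,"
    b"Commercial,Suburban_Urban,1/15/2010,2,2,3,2,2,3\n"
)

# Option 1: hand pandas the raw bytes through a binary buffer.
df_from_bytes = pd.read_csv(BytesIO(data), index_col=0, parse_dates=["Quradate"])

# Option 2: decode first (UTF-8 assumed), then use a text buffer.
df_from_text = pd.read_csv(
    StringIO(data.decode("utf-8")), index_col=0, parse_dates=["Quradate"]
)

print(df_from_bytes[["City", "mkt_type", "Quradate"]])
```

Either buffer works here; BytesIO avoids having to pick an encoding yourself, while StringIO fits naturally if you already use r.text.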
