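Marketing toolkit using pandas and the Google Sheets API, plus classes for various other Google APIs
Detailed description of the prospecting Python project
Moves data between Google Sheets and pandas dataframes, useful when hacking on analysis work and "prospecting".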
Setup
- Create or select a project in Google’s developer console
- Also, you will need to enable the APIs you plan to use
- Get a ^{tt1}$ credentials file from the credentials section
- Select OAuth client ID from the dropdown in the API access pane
- Load the ^{tt2}$ module in a Python session to initialize the ^{tt3}$ folder in your home directory
- Place the ^{tt1}$ file in the ^{tt5}$ directory
- Load an API class in a Python session, then run apiclass.authenticate() and follow the steps
- You only need to set up authentication once per API, unless your credentials change
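Before calling authenticate(), it can help to confirm that the credentials file is where you expect it. The following is a minimal sketch using only the standard library; credentials_ready is a hypothetical helper, not part of prospecting, and both the directory and the file name are passed in because this page does not spell them out.

```python
from pathlib import Path


def credentials_ready(cred_dir, filename):
    """Return True if `filename` exists as a regular file inside `cred_dir`.

    Hypothetical helper: both arguments are supplied by the caller, since the
    actual directory and file name used by the library are not assumed here.
    """
    return (Path(cred_dir).expanduser() / filename).is_file()
```

If this returns False, re-check the "Place the file" step above before running authenticate().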
Examples:
import prospecting as p
-
Use a stats sheet to store stats and misc statistics (scopelist defaults to read-only, so pass scopes that allow writing):
ss_stats = p.SheetsApi(spreadsheetid='PASTE_GOOGLE_SHEETID_HERE',
                       scopelist=['https://www.googleapis.com/auth/spreadsheets',
                                  'https://www.googleapis.com/auth/drive.metadata'])
ss_stats.authenticate()
ss_stats.update('Sheet1', somedataframe)
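Because the scope list decides whether update() is allowed to write, one option is to build it in a single place. This is a sketch, not the library's own logic: build_scopelist is a hypothetical helper, and although spreadsheets.readonly is Google's documented read-only Sheets scope, whether it matches the library's actual default is an assumption.

```python
# Google OAuth scope URLs. The first two write-capable scopes appear in the
# example above; the read-only scope is Google's documented Sheets scope and is
# assumed (not confirmed) to resemble the library default.
SHEETS_READ_ONLY = "https://www.googleapis.com/auth/spreadsheets.readonly"
SHEETS_READ_WRITE = "https://www.googleapis.com/auth/spreadsheets"
DRIVE_METADATA = "https://www.googleapis.com/auth/drive.metadata"


def build_scopelist(write=False):
    """Return a scope list for read-only or read/write access (hypothetical helper)."""
    if write:
        return [SHEETS_READ_WRITE, DRIVE_METADATA]
    return [SHEETS_READ_ONLY]
```

The result of build_scopelist(write=True) could then be passed as the scopelist argument shown above.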
-
Use a reference sheet to supply a named-entity list (or stop words, or a vocabulary) for NLP preprocessing:
ss_reference = p.SheetsApi(spreadsheetid='PASTE_GOOGLE_SHEETID_HERE',
                           scopelist=['https://www.googleapis.com/auth/spreadsheets',
                                      'https://www.googleapis.com/auth/drive.metadata'])
ss_reference.authenticate()
named_entity_list = list(ss_reference.get('ne!A:B').iloc[:,0].values)
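The string 'ne!A:B' passed to get() is spreadsheet A1 notation: a tab name, an exclamation mark, then a cell range. The sketch below shows how such a string breaks down. split_a1 is a hypothetical helper for illustration only, and it treats a string without "!" as a bare tab name, matching how the examples on this page use it.

```python
def split_a1(range_name):
    """Split 'tab!A:B' into ('tab', 'A:B'); a bare tab name gives (name, None).

    Tab names containing spaces are quoted in A1 notation ('My Tab'!A1:B2),
    so surrounding single quotes are stripped from the tab name.
    """
    tab, sep, cells = range_name.rpartition("!")
    if not sep:
        # No '!' present: the whole string is taken to be a tab name.
        return range_name.strip("'"), None
    return tab.strip("'"), cells
```

For instance, split_a1('ne!A:B') separates the tab 'ne' from the columns 'A:B', which is why .iloc[:,0] in the example above selects column A.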
-
Get a keyword sheet as a dataframe, filter it, take a sampled subset, and upload the new dataframe to another tab in the spreadsheet:
ss_kw = p.SheetsApi(spreadsheetid='PASTE_GOOGLE_SHEETID_HERE',
                    scopelist=['https://www.googleapis.com/auth/spreadsheets',
                               'https://www.googleapis.com/auth/drive.metadata'])
ss_kw.authenticate()
# Get data using spreadsheet syntax like ('sheetname') or ('sheetname!A:B25')
df_query = ss_kw.get('queries')
df_query_subset = df_query[(df_query['raw_len'] > 1) & (df_query['reject'] != 1)]
# Take a subsample of data
df_query_subset_sample = df_query_subset.sample(frac=0.5)
df_query_subset_sample.reset_index(drop=True, inplace=True)
# Update 'sheetname' with dataframe object
ss_kw.update('queries_shuffled', df_query_subset_sample)
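The filter-and-sample step can be tried offline before touching a real spreadsheet. The sketch below uses a made-up stand-in dataframe in place of ss_kw.get('queries'). It also assumes, without confirmation from the library, that cell values may arrive as strings, so it coerces the two filter columns to numbers first.

```python
import pandas as pd

# Stand-in for ss_kw.get('queries'); these rows are invented for illustration.
df_query = pd.DataFrame({
    "query": ["a", "red shoes", "buy red shoes", "cheap shoes", "shoes online", "spam words"],
    "raw_len": ["1", "2", "3", "2", "2", "2"],
    "reject": ["0", "0", "0", "0", "0", "1"],
})

# Assumption: values may be strings, so convert before numeric comparison.
for col in ("raw_len", "reject"):
    df_query[col] = pd.to_numeric(df_query[col], errors="coerce")

# Keep rows with more than one token that have not been rejected.
subset = df_query[(df_query["raw_len"] > 1) & (df_query["reject"] != 1)]

# Take half of the remaining rows; random_state makes the draw repeatable.
sample = subset.sample(frac=0.5, random_state=0).reset_index(drop=True)
```

Chaining reset_index(drop=True) without inplace=True returns a new dataframe with a clean 0..n-1 index, which is the shape the upload step above expects.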
Key changes between 0.1.2 and 0.1.4:
Switched the order of the input arguments to the ss.update() function:
From ss.update(dataframe, 'sheetname')
To ss.update('sheetname', dataframe)
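Code written against the old argument order will pass the dataframe first. One way to guard call sites during migration is a small shim that accepts either order. This is a hypothetical helper, not part of prospecting, and it relies only on the sheet name being a string and the data not being one.

```python
def normalize_update_args(first, second):
    """Return (sheetname, data) regardless of the order the two were passed in.

    Hypothetical migration aid: exactly one argument must be a str (the sheet
    name); the other is treated as the data object.
    """
    if isinstance(first, str) and not isinstance(second, str):
        return first, second
    if isinstance(second, str) and not isinstance(first, str):
        return second, first
    raise TypeError("expected exactly one sheet name (str) and one data object")
```

A call site could then unpack the result, for example sheetname, df = normalize_update_args(a, b), and pass them to ss.update(sheetname, df) in the new order.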
Removed the Dockerfile to simplify the project