I am using the function below to send some standard-output log messages from Databricks to Azure Application Insights.

My function:
import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler
from opencensus.trace import config_integration
from opencensus.trace.samplers import AlwaysOnSampler
from opencensus.trace.tracer import Tracer

def custom_logging_function(log_type, instrumentation_key_value, input_x):
    """
    Purpose: Send the standard output to Application Insights logs.
    Inputs: -
    Return: -
    """
    config_integration.trace_integrations(['logging'])
    logging.basicConfig(format='%(asctime)s traceId=%(traceId)s spanId=%(spanId)s %(message)s')
    tracer = Tracer(sampler=AlwaysOnSampler())

    logger = logging.getLogger(__name__)
    logger.addHandler(AzureLogHandler(connection_string='InstrumentationKey={0}'.format(instrumentation_key_value)))

    if log_type == "INFO" or log_type == "SUCCESSFUL":
        # [UPDATE]
        logger.setLevel(logging.INFO)
        logger.info(input_x)
        # logging.info(input_x)
    elif log_type == "ERROR":
        # [UPDATE]
        logger.setLevel(logging.ERROR)
        logger.exception(input_x)
        # logging.exception(input_x)
    else:
        logger.warning(input_x)
[UPDATE] By setting the logging level to INFO, the different types of traces are now recorded.

Even though this function executes correctly, it misbehaves for the following reasons:

Cause 1

When I want to print a logger.info() message, it is never successfully recorded in Application Insights. For a reason I cannot explain, only logger.warning() messages reach the Application Insights logs.

For example:
custom_logging_function("INFO", instrumentation_key_value, "INFO: {0} chronological dates in the specified time-frame have been created!".format(len(date_list)))
# Uses logger.info() based on my function!

This is never logged. However, the following one does get logged:

custom_logging_function("WARNING", instrumentation_key_value, "INFO: {0} chronological dates in the specified time-frame have been created!".format(len(date_list)))
# Uses logger.warning() based on my function!

I have solved Cause 1 myself; please check the edits to my function above.
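For reference, the underlying cause is that a freshly created logger inherits the root logger's default level, WARNING, so INFO records are filtered out until setLevel is called. A minimal sketch with no Azure dependency (an in-memory handler stands in for AzureLogHandler) illustrating this:

```python
import logging

records = []  # capture emitted messages in memory instead of sending them anywhere


class ListHandler(logging.Handler):
    """Stand-in for AzureLogHandler so the sketch runs without Azure."""
    def emit(self, record):
        records.append(record.getMessage())


# A fresh logger has level NOTSET, so it inherits the root default, WARNING.
logger = logging.getLogger("level_demo")
logger.addHandler(ListHandler())

logger.info("dropped")         # filtered: INFO < effective level WARNING
logger.warning("kept")         # passes: WARNING >= WARNING

logger.setLevel(logging.INFO)  # the [UPDATE] fix from the function above
logger.info("now kept")        # passes: INFO >= INFO

print(records)  # -> ['kept', 'now kept']
```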
Cause 2

The same message is logged multiple times instead of only once.

Some code to illustrate the problem:
import math
from datetime import datetime, timedelta

# Set keyword parameters
time_scale = 12
time_frame_repetition = 1
timestamp_snapshot = datetime.utcnow()

round_up = math.ceil(time_frame_repetition * 365 / time_scale)

day_list = [(timestamp_snapshot - timedelta(days=x)).strftime("%d") for x in range(round_up)]
month_list = [(timestamp_snapshot - timedelta(days=x)).strftime("%m") for x in range(round_up)]
year_list = [(timestamp_snapshot - timedelta(days=x)).strftime("%Y") for x in range(round_up)]

date_list = [[day_list[i], month_list[i], year_list[i]] for i in range(0, len(day_list))]

custom_logging_function("INFO", instrumentation_key_value, "INFO: {0} chronological dates in the specified time-frame have been created!".format(len(date_list)))  # the function already written at the start of my post
The output of the snippet above is logged more than once in Application Insights, and I am trying to find out why.

Output logged in Application Insights:

As you can see from the query output, the same row is logged multiple times.

Now that the first issue is solved, what would you suggest for the second one?
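The duplication can be reproduced without Azure at all. Because logging.getLogger returns the same logger object for the same name, every call that does addHandler stacks one more handler onto that logger, and each attached handler emits the record once. A self-contained sketch (an in-memory handler stands in for AzureLogHandler):

```python
import logging

records = []


class ListHandler(logging.Handler):
    """Stand-in for AzureLogHandler that records messages in memory."""
    def emit(self, record):
        records.append(record.getMessage())


def custom_logging_function(msg):
    # getLogger returns the SAME logger object on every call ...
    logger = logging.getLogger("dup_demo")
    # ... so each call stacks one more handler onto it.
    logger.addHandler(ListHandler())
    logger.warning(msg)


custom_logging_function("first call")   # 1 handler attached -> emitted once
custom_logging_function("second call")  # 2 handlers attached -> emitted twice

print(records)  # -> ['first call', 'second call', 'second call']
```

This is exactly the pattern of the original function, which calls addHandler on every invocation.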
[UPDATE] Based on the answer below provided by @Izhen:
def instantiate_logger(instrumentation_key_value):
    config_integration.trace_integrations(['logging'])
    logging.basicConfig(format='%(asctime)s traceId=%(traceId)s spanId=%(spanId)s %(message)s')
    tracer = Tracer(sampler=AlwaysOnSampler())

    logger = logging.getLogger(__name__)
    logger.addHandler(AzureLogHandler(connection_string='InstrumentationKey={0}'.format(instrumentation_key_value)))
    return logger

logging_instance = instantiate_logger(instrumentation_key_value)

def custom_logging_function(logging_instance, disable_logging, log_type, input_x, *arguments):
    """
    Purpose: Send the standard output to Application Insights logs.
    Inputs: -
    Return: The logger object.
    """
    if disable_logging == 0:
        if log_type == "INFO" or log_type == "SUCCESSFUL":
            logging_instance.setLevel(logging.INFO)
            logging_instance.info(input_x)
            print(input_x, *arguments)
        elif log_type == "ERROR":
            logging_instance.setLevel(logging.ERROR)
            logging_instance.exception(input_x)
            print(input_x, *arguments)
        else:
            logging_instance.warning(input_x)
            print(input_x, *arguments)
    else:
        print(input_x, *arguments)
The code above still logs the output of this function call more than once:
date_list=merge_hierarchy_list(year_list, month_list, day_list, None, None)
custom_logging_function(logging_instance, disable_logging_value, "INFO", "INFO: {0} chronological dates in the specified time-frame have been created!".format(len(date_list)))
Output (recorded 2 times in the Application Insights log traces):
"INFO: 31 chronological dates in the specified time-frame have been created!"
Cause 2:

Are you running the Python file in a Databricks notebook? A notebook keeps the state of every object it instantiates, including the Python loggers you use. We have previously seen duplicate log entries when users ran code multiple times in a notebook, because an AzureLogHandler was added to the logger as a handler every time the code was executed again. Running the code as a plain Python module should not cause this behaviour, since state is not kept across runs.

If you are not using a notebook, then the problem still looks like the AzureLogHandler being added more than once. Do you have multiple workers executing the same logic in your Databricks pipeline?
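One common mitigation, assuming the notebook scenario described above, is to make the handler attachment idempotent: only add the handler if the logger does not already carry one of that type, so re-running a cell cannot stack duplicates. A sketch with a generic handler-type check (ListHandler stands in for AzureLogHandler so it runs without Azure):

```python
import logging

records = []


class ListHandler(logging.Handler):
    """Stand-in for AzureLogHandler so the sketch runs without Azure."""
    def emit(self, record):
        records.append(record.getMessage())


def instantiate_logger(name="guard_demo"):
    logger = logging.getLogger(name)
    # Idempotent attachment: skip if a handler of this type is already present,
    # so re-running a notebook cell cannot stack duplicate handlers.
    if not any(isinstance(h, ListHandler) for h in logger.handlers):
        logger.addHandler(ListHandler())
    return logger


# Simulate the same notebook cell being executed twice.
logger = instantiate_logger()
logger = instantiate_logger()

logger.warning("logged once")
print(records)  # -> ['logged once']
```

The same isinstance guard can be applied to AzureLogHandler inside instantiate_logger above.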