Regular Expressions 101

Python logs parser (Airflow)

Regular Expression
Python

r"
(?P<time_full>\[(?P<time_clear>\d{4}\-\d{2}\-\d{2}\s\d{2}\:\d{2}\:\d{2}\,\d{1,})\])\s(?P<module>\{(?P<module_name>\w+\.\w+):(?P<module_row>\d+)\})(?P<log_message>.*?)(?=\[\d{4}\-\d{2}\-\d{2}\s\d{2}\:\d{2}\:\d{2}\,\d{1,}\]|\Z)
"
gms
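
In Python, the regex101 flags above map roughly to re.finditer (g), re.MULTILINE (m) and re.DOTALL (s). The following is a minimal sketch of applying the pattern that way. It assumes the dot in module_name is escaped and drops the redundant backslashes. The second log row in SAMPLE is made up for illustration.

```python
import re

# The pattern from this entry, split across lines for readability.
# Assumption: the dot in module_name is escaped so it matches a literal ".".
LOG_ROW = re.compile(
    r"(?P<time_full>\[(?P<time_clear>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d+)\])\s"
    r"(?P<module>\{(?P<module_name>\w+\.\w+):(?P<module_row>\d+)\})"
    r"(?P<log_message>.*?)"
    r"(?=\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d+\]|\Z)",
    re.MULTILINE | re.DOTALL,  # the "m" and "s" flags
)

# Hypothetical input: the first row comes from the description, the rest is invented.
SAMPLE = (
    "[2023-03-11 00:00:07,377] {taskinstance.py:1088} INFO - Starting attempt 1\n"
    "[2023-03-11 00:00:08,012] {taskinstance.py:1107} ERROR - Task failed\n"
    "Traceback (most recent call last):\n"
)

# finditer plays the role of the "g" flag: one match per log row.
rows = [m.groupdict() for m in LOG_ROW.finditer(SAMPLE)]
for row in rows:
    print(row["time_clear"], row["module_name"], row["module_row"], repr(row["log_message"]))
```

Because of re.DOTALL and the lookahead, continuation lines such as a traceback stay attached to the row they follow.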

Description

Parser for an individual row of an Airflow log

# Pattern for parsing a log row in the format:
# [2023-03-11 00:00:07,377] {taskinstance.py:1088} INFO - Starting attempt 1
# where:
#   - time_full: [2023-03-11 00:00:07,377]
#   - time_clear: 2023-03-11 00:00:07,377
#   - module: {taskinstance.py:1088}
#   - module_name: taskinstance.py
#   - module_row: 1088
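#   - log_message: " INFO - Starting attempt 1" (everything after the module, up to the next timestamp or the end of input; the leading space is included)
# not captured as groups by this pattern, but derivable from log_message:
#   - log_level: INFO
#   - message: Starting attempt 1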
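
The pattern has no log_level or message group, so those two values have to be split out of log_message in a second step. Below is a minimal sketch of one way to do that. The helper name split_log_message and the assumed "LEVEL - text" layout are illustrative assumptions based on the sample row, not part of the original entry.

```python
import re

# Hypothetical second-stage pattern: optional leading whitespace, an upper-case
# level word, " - ", then the rest of the text (which may span several lines).
LEVEL_AND_TEXT = re.compile(
    r"\s*(?P<log_level>[A-Z]+)\s-\s(?P<message>.*)",
    re.DOTALL,
)


def split_log_message(log_message):
    """Return (log_level, message); log_level is None if no level is found."""
    m = LEVEL_AND_TEXT.match(log_message)
    if m is None:
        return None, log_message.strip()
    return m.group("log_level"), m.group("message").strip()


level, message = split_log_message(" INFO - Starting attempt 1\n")
print(level, message)
```

Keeping this as a separate step leaves the main pattern unchanged and still lets rows without a recognizable level pass through.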
Submitted by anonymous - 3 months ago