Detecting Jinja2 variables with if statements using Python regex

Posted 2024-10-04 11:30:37

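From the file below, I want to extract only the if-statement blocks and iterate over them. I also want to keep only the blocks that contain image: as a key.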

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ template "fullname" . }}
  labels:
    app: {{ template "fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
spec:
  replicas: {{ .Values.replicas }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        name: {{ template "fullname" . }}
        app: {{ template "fullname" . }}
    spec:
{{- if .Values.pvc.enabled }}
      volumes:
      - name: {{ template "fullname" . }}
        persistentVolumeClaim:
          claimName: {{ template "claimname" . }}
{{- end }}
{{- if .Values.k8swait.enabled }}
      serviceAccountName: {{ template "fullname" . }}-admin
      initContainers:
        - env:
            - name: CLUSTER
              value: "{{ .Values.k8swait.parameters.cluster}}"
            - name: NAMESPACE
              value: "{{ .Release.Namespace }}"
            - name: RESOURCE
              value: "{{ .Values.k8swait.parameters.resource}}"
            - name: RNAME
              value: "{{ .Values.k8swait.job.jobname }}"
            - name: TIMEOUT
              value: "{{ .Values.k8swait.parameters.timeout}}"
            - name: FREQUENCE
              value: "{{ .Values.k8swait.parameters.frequence}}"
          name: {{ .Values.k8swait.parameters.name}}
          image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
          resources:
            limits:
              cpu: "{{ .Values.resources.limits.cpu }}"
              memory: "{{ .Values.resources.limits.memory }}"
            requests:
              cpu: "{{ .Values.resources.requests.cpu }}"
              memory: "{{ .Values.resources.requests.memory }}"
{{- end }}
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
      containers:
      - name: {{ template "fullname" . }}
        image: "{{ .Values.global.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ default "" .Values.imagePullPolicy | quote }}
        ports:
        - name: http
          containerPort: 9000
{{- if .Values.pvc.enabled }}
        image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
        volumeMounts:
          - mountPath: /BACKUP
            name: "{{ template "fullname" . }}"

{{- end }}

Expected output:

{{- if .Values.k8swait.enabled }}
      serviceAccountName: {{ template "fullname" . }}-admin
      initContainers:
        - env:
            - name: CLUSTER
              value: "{{ .Values.k8swait.parameters.cluster}}"
            - name: NAMESPACE
              value: "{{ .Release.Namespace }}"
            - name: RESOURCE
              value: "{{ .Values.k8swait.parameters.resource}}"
            - name: RNAME
              value: "{{ .Values.k8swait.job.jobname }}"
            - name: TIMEOUT
              value: "{{ .Values.k8swait.parameters.timeout}}"
            - name: FREQUENCE
              value: "{{ .Values.k8swait.parameters.frequence}}"
          name: {{ .Values.k8swait.parameters.name}}
          image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
          resources:
            limits:
              cpu: "{{ .Values.resources.limits.cpu }}"
              memory: "{{ .Values.resources.limits.memory }}"
            requests:
              cpu: "{{ .Values.resources.requests.cpu }}"
              memory: "{{ .Values.resources.requests.memory }}"
{{- end }}

{{- if .Values.pvc.enabled }}
        image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
        volumeMounts:
          - mountPath: /BACKUP
            name: "{{ template "fullname" . }}"
{{- end }}

I tried the following code, but it does not work properly:

with open(args.dataFileName) as fd:
    data = fd.read()

match = re.findall(r'{{-?\s?if .+ end\s?}}', data, re.DOTALL)

As you can see, the desired output contains only the if-statement blocks that have image as a key. Any hints on how to achieve this with regex?


2 Answers

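One limitation of this regex is that it only works when the if blocks are not nested.

Also, I am only familiar with the {% if %} ... {% endif %} form that Jinja2 uses for if blocks, so I followed your lead and searched for {{-?\s*if\s*}} ... {{-?\s*end\s*}}. If that is not correct, it is easy to fix.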
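As a side note on why the attempt in the question returns too much: with re.DOTALL, the greedy `.+` runs on to the last "end" in the file, so everything from the first if to the final end comes back as a single match. The sketch below compares the question's pattern with a lazy variant; the `sample` string is a made-up stand-in, not the real template.

```python
import re

# Hypothetical miniature template, standing in for the file in the question.
sample = (
    "{{- if .Values.a }}\n"
    "  volumes: []\n"
    "{{- end }}\n"
    "replicas: 1\n"
    "{{- if .Values.b }}\n"
    "  image: demo\n"
    "{{- end }}\n"
)

# Pattern from the question: greedy `.+` with re.DOTALL reaches the LAST "end".
greedy = re.findall(r'{{-?\s?if .+ end\s?}}', sample, re.DOTALL)

# Same pattern with a lazy `.+?`: each match stops at the nearest "end".
lazy = re.findall(r'{{-?\s?if .+? end\s?}}', sample, re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

The greedy result also swallows the `replicas: 1` line that sits between the two blocks, which is why the lazy quantifier is used in the code that follows.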

import re

text = """apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ template "fullname" . }}
  labels:
    app: {{ template "fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
spec:
  replicas: {{ .Values.replicas }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        name: {{ template "fullname" . }}
        app: {{ template "fullname" . }}
    spec:
{{- if .Values.pvc.enabled }}
      volumes:
      - name: {{ template "fullname" . }}
        persistentVolumeClaim:
          claimName: {{ template "claimname" . }}
{{- end }}
{{- if .Values.k8swait.enabled }}
      serviceAccountName: {{ template "fullname" . }}-admin
      initContainers:
        - env:
            - name: CLUSTER
              value: "{{ .Values.k8swait.parameters.cluster}}"
            - name: NAMESPACE
              value: "{{ .Release.Namespace }}"
            - name: RESOURCE
              value: "{{ .Values.k8swait.parameters.resource}}"
            - name: RNAME
              value: "{{ .Values.k8swait.job.jobname }}"
            - name: TIMEOUT
              value: "{{ .Values.k8swait.parameters.timeout}}"
            - name: FREQUENCE
              value: "{{ .Values.k8swait.parameters.frequence}}"
          name: {{ .Values.k8swait.parameters.name}}
          image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
          resources:
            limits:
              cpu: "{{ .Values.resources.limits.cpu }}"
              memory: "{{ .Values.resources.limits.memory }}"
            requests:
              cpu: "{{ .Values.resources.requests.cpu }}"
              memory: "{{ .Values.resources.requests.memory }}"
{{- end }}
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
      containers:
      - name: {{ template "fullname" . }}
        image: "{{ .Values.global.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ default "" .Values.imagePullPolicy | quote }}
        ports:
        - name: http
          containerPort: 9000
{{- if .Values.pvc.enabled }}
        image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
        volumeMounts:
          - mountPath: /BACKUP
            name: "{{ template "fullname" . }}"

{{- end }}"""

start_if = r'{{-?\s*if\s*[^}]+}}' # {{- if }}
end_if = r'{{-?\s*end\s*}}' # {{- end }}
regex = re.compile(f'{start_if}(.*?){end_if}', flags=re.DOTALL)

matches = [m.group(0) for m in regex.finditer(text) if 'image: ' in m.group(1)]

for match in matches:
    print(match)
    print()

Prints:

{{- if .Values.k8swait.enabled }}
      serviceAccountName: {{ template "fullname" . }}-admin
      initContainers:
        - env:
            - name: CLUSTER
              value: "{{ .Values.k8swait.parameters.cluster}}"
            - name: NAMESPACE
              value: "{{ .Release.Namespace }}"
            - name: RESOURCE
              value: "{{ .Values.k8swait.parameters.resource}}"
            - name: RNAME
              value: "{{ .Values.k8swait.job.jobname }}"
            - name: TIMEOUT
              value: "{{ .Values.k8swait.parameters.timeout}}"
            - name: FREQUENCE
              value: "{{ .Values.k8swait.parameters.frequence}}"
          name: {{ .Values.k8swait.parameters.name}}
          image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
          resources:
            limits:
              cpu: "{{ .Values.resources.limits.cpu }}"
              memory: "{{ .Values.resources.limits.memory }}"
            requests:
              cpu: "{{ .Values.resources.requests.cpu }}"
              memory: "{{ .Values.resources.requests.memory }}"
{{- end }}

{{- if .Values.pvc.enabled }}
        image: "{{ .Values.global.registry1 }}/{{ .Values.k8swait.repo }}:{{ .Values.k8swait.tag }}"
        volumeMounts:
          - mountPath: /BACKUP
            name: "{{ template "fullname" . }}"

{{- end }}

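To make the nesting limitation concrete, here is a small sketch that reuses the same start_if / end_if patterns on a made-up nested block (the `nested` string is hypothetical). The lazy `(.*?)` stops at the first `{{- end }}` it meets, which belongs to the inner if, so the outer block is cut short.

```python
import re

start_if = r'{{-?\s*if\s*[^}]+}}'   # {{- if ... }}
end_if = r'{{-?\s*end\s*}}'         # {{- end }}
regex = re.compile(f'{start_if}(.*?){end_if}', flags=re.DOTALL)

# Hypothetical template with one if block nested inside another.
nested = (
    "{{- if .Values.outer }}\n"
    "  image: outer\n"
    "  {{- if .Values.inner }}\n"
    "  extra: true\n"
    "  {{- end }}\n"
    "  tail: kept-out\n"
    "{{- end }}\n"
)

m = regex.search(nested)
# The match ends at the inner "end", so the "tail" line
# and the outer "end" are missing from it.
print(m.group(0))
```

If nested blocks can occur in your templates, the match needs to track depth, which a single lazy regex cannot do.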

Even if you have nested if statements, you can still do this with regex, and parsing the text file will be fast:

import re

code = """
some text ....some text ....some text ....
some text ....some text ....some text ....
{{- if .Values.pvc.enabled [don't extract this 0]}}
     some text ....
{{- end }}
{{- if .Values.k8swait.enabled  [extract this 1}}
      some text ....
      image:
{{- end }}
{{- if .Values.k8swait.enabled [extract this 2]}}
          some text ....
          image: 000
         {{- if [extract this 3]}}
             image: 000 
         {{- end }}
{{- end }}
{{- if .Values.k8swait.enabled  [extract this 4}}
      some text ....
      image:
{{- end }}
{{- if .Values.k8swait.enabled [extract this 5]}}
          some text ....
          image: 000
         {{- if [don't extract this sub if 6 ]}}
         {{- end }}
{{- end }}
"""


def extract_image_if_statement(text):
    # Matches an if block that is nested in, or preceded by, other if statements
    sub_if = re.compile(r"((?:{{-\s*if.+?)+)({{-\s*if.+?end\s*}})", re.DOTALL)
    # Matches the if statement left over after the first pattern has run
    outer_if = re.compile(r"{{-\s*if.+?end\s*}}", re.DOTALL)
    # Finds the #index# placeholders that point back into the expression list
    # (\d+ rather than \d, so indexes of 10 and above are restored as well)
    get_if = re.compile(r"#(\d+)#")
    # used to build back full nested expression
    expression = []
    # to hold expression that contains image: word
    result = []
    index = 0

    def extract_if(pattern, repl, index_group):
        """
            extract the if statement to expression and replace it with special word in the text.
            #index_in_expression_list#.
            index_group is the position of the target if statement because we have two pattern.
            repl contains {} to format the current index of extract if statement
        """
        nonlocal text
        nonlocal index
        m = pattern.search(text)
        while m:
            expression.append(m.group(index_group))
            text = pattern.sub(repl.format(index), text)
            m = pattern.search(text)  # keep searching with the pattern passed in
            index += 1
        return index

    def build_if_statement(exp):
        """ we have the index of exp in expression so keep building back the statement, this is only for nested statements"""
        while get_if.search(exp):
            exp = get_if.sub(lambda m: expression[int(m.group(1))], exp)
        return exp

    # extract all if statements
    extract_if(sub_if, r'\1#{}#', 2)
    extract_if(outer_if, r'#{}#', 0)

    # for debugging
    # print('\n\n\n'.join(expression))

    result = [build_if_statement(exp) for exp in expression if 'image:' in exp]
    # for debugging
    # print('\n\n'.join(result))
    # print(text)  # if you need Order this will help with it just tell me so I can fix that.
    return result


# Note: this extracts both the nested if and the outer if when each of them contains "image:" (see blocks 2 and 3)
print(('\n'+'-'*100+'\n').join(extract_image_if_statement(code)))

Output:

{{- if .Values.k8swait.enabled [extract this 5]}}
          some text ....
          image: 000
         {{- if [don't extract this sub if 6 ]}}
         {{- end }}
{{- end }}
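----------------------------------------------------------------------------------------------------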
{{- if .Values.k8swait.enabled  [extract this 4}}
      some text ....
      image:
{{- end }}
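----------------------------------------------------------------------------------------------------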
{{- if [extract this 3]}}
             image: 000 
         {{- end }}
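----------------------------------------------------------------------------------------------------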
{{- if .Values.k8swait.enabled [extract this 2]}}
          some text ....
          image: 000
         {{- if [extract this 3]}}
             image: 000 
         {{- end }}
{{- end }}
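----------------------------------------------------------------------------------------------------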
{{- if .Values.k8swait.enabled  [extract this 1}}
      some text ....
      image:
{{- end }}

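If the order of the if statements matters to you, that can be solved as well; just leave a comment about how you want nested statements extracted. Also note that in the nested case, when both the outer expression and the sub-expression contain the word image:, I extract both as separate elements in the result. If you do not want that, leave a comment and I will fix it too.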
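For readers who need source order and nesting handled together, one alternative is a single pass over the control tags with a stack. This is only a sketch under stated assumptions, not the method used above. It assumes the file uses Go template syntax (the `.Values` and `.Release` references suggest a Helm chart), where `if`, `range`, `with`, `define` and `block` are all closed by `end`. The names `TAG`, `if_blocks_with_image` and `needle` are made up for this example.

```python
import re

# Matches only the control tags that open or close a block.
# Assumption: a tag contains no "}" before its closing "}}"; template
# comments and string literals containing "}" are not handled.
TAG = re.compile(r'{{-?\s*(if|range|with|define|block|end)\b[^}]*}}')


def if_blocks_with_image(text, needle="image:"):
    """Return every complete if block containing `needle`, in source order."""
    stack = []   # (keyword, start offset) of blocks that are still open
    found = []   # (start offset, block text) of matching if blocks
    for tag in TAG.finditer(text):
        keyword = tag.group(1)
        if keyword != "end":
            stack.append((keyword, tag.start()))
        elif stack:
            opener, start = stack.pop()
            block = text[start:tag.end()]
            if opener == "if" and needle in block:
                found.append((start, block))
    # Inner blocks close first, so sort by start offset to restore file order.
    return [block for _, block in sorted(found)]


# Hypothetical input: one nested pair with image: keys, one block without.
sample = (
    "{{- if .Values.a }}\n"
    "  image: outer\n"
    "  {{- if .Values.b }}\n"
    "  image: inner\n"
    "  {{- end }}\n"
    "{{- end }}\n"
    "{{- if .Values.c }}\n"
    "  volumes: []\n"
    "{{- end }}\n"
)

blocks = if_blocks_with_image(sample)
for block in blocks:
    print(block)
    print("-" * 40)
```

One difference from the function above: here an outer block is also returned when `image:` appears only inside one of its nested blocks, because the check runs on the full block text. Filtering on the text outside nested blocks would need an extra step.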

I hope this helps. Good luck!
