Question about a regular expression


Good evening, everyone.

I am looking for a regular expression that extracts the following information:

The number and the title of a chapter.

Example:

    1. INTRODUCTION

    number = 1.

    title = INTRODUCTION

    1.1. GENERAL OBJECTIVE

    number = 1.1.

    title = GENERAL OBJECTIVE

    1.2. SPECIFIC OBJECTIVE

    number = 1.2.

    title = SPECIFIC OBJECTIVE

    I need this to generate a table of contents in the following format:

    1. | INTRODUCTION | PAGE

    1.1. | GENERAL OBJECTIVE | PAGE

    1.2. | SPECIFIC OBJECTIVE | PAGE

    That is, the regular expression should be able to recognize numbers followed by periods in the following generic format:

    Primary Title => x.

    Secondary Title => x.x.

    Tertiary Title => x.x.x.

    Quaternary Title => x.x.x.x.

    And so on.

    Thank you for your attention. Best regards to all.

        
    asked by anonymous 25.01.2017 / 02:15

    1 answer


    This regular expression solves it:

    "^\s*?(?P<numero>(\d\.)+)\s*(?P<titulo>.*)$" 
    

    You did not say which tool you will use to apply the regular expression, so it may contain something specific to Python regular expressions, which is where I tested it. The syntax is described in the documentation for Python's re module.
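To make the pieces of the pattern easier to follow, here is a minimal sketch of the same expression written with re.VERBOSE, so that each part carries a comment. The name CHAPTER_RE and the sample line are only illustrative and are not part of the original answer.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
# CHAPTER_RE is an illustrative name, not something from the answer.
CHAPTER_RE = re.compile(
    r"""
    ^\s*?                  # start of a line, then optional leading whitespace
    (?P<numero>(\d\.)+)    # one or more digit-plus-period pairs: 1. or 1.2.1.
    \s*                    # optional whitespace between number and title
    (?P<titulo>.*)$        # the rest of the line is the title
    """,
    re.MULTILINE | re.VERBOSE,
)

match = CHAPTER_RE.search("1.2. SPECIFIC OBJECTIVE")
print(match.group("numero"), "|", match.group("titulo"))
```

In verbose mode the whitespace and the # comments inside the pattern are ignored, so this should behave like the one-line version.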

    At an interactive Python 3.5 prompt:

    In [47]: import re
    
    In [48]: a = """
        ...: 1.INTRODUÇÃO
        ...: número = 1.
        ...: 
        ...: título = INTRODUÇÃO
        ...: 
        ...: 1.1. OBJETIVO GERAL
        ...: 
        ...: número = 1.1.
        ...: título = OBJETIVO GERAL
        ...: 
        ...: 1.2. OBJETIVO ESPECÍFICO
        ...: 
        ...: 1.2.1. Detalhamento
        ...: 2. Outro Capítulo
        ...: """
    
    In [49]: [(m.group('numero'), m.group('titulo')) for m in re.finditer(r"^\s*?(?P<numero>(\d\.)+)\s*(?P<titulo>.*)$", a, re.MULTILINE) ]
    Out[49]: 
    [('1.', 'INTRODUÇÃO'),
     ('1.1.', 'OBJETIVO GERAL'),
     ('1.2.', 'OBJETIVO ESPECÍFICO'),
     ('1.2.1.', 'Detalhamento'),
     ('2.', 'Outro Capítulo')]
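
From these pairs, the table-of-contents lines asked for in the question can be built with plain string formatting. The following is only a sketch: toc_lines is a hypothetical helper, and "PAGE" is a placeholder, since the regular expression alone cannot know page numbers.

```python
import re

CHAPTER_RE = re.compile(r"^\s*?(?P<numero>(\d\.)+)\s*(?P<titulo>.*)$", re.MULTILINE)

def toc_lines(text, page="PAGE"):
    """Return one 'number | title | page' line per heading found in text.

    Hypothetical helper; the page value is a placeholder supplied by the caller.
    """
    return [
        "{} | {} | {}".format(m.group("numero"), m.group("titulo"), page)
        for m in CHAPTER_RE.finditer(text)
    ]

sample = "1. INTRODUCTION\n1.1. GENERAL OBJECTIVE\n1.2. SPECIFIC OBJECTIVE\n"
print("\n".join(toc_lines(sample)))
```

str.format is used instead of an f-string so that the sketch also runs on Python 3.5, the version used in the session above.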
    

    (The function re.finditer returns an iterator of match objects, each of which has a group method that can be called with the desired group name. The group names are defined within the regular expression itself, using the (?P<name>...) construct. The group-name syntax should be the only thing that changes if the regexp tool you use is different from Python's.)
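
One caveat: \d\. accepts exactly one digit before each period, so a heading numbered 10. or 1.12. would be skipped. A possible variant, shown here only as a sketch, uses \d+ instead. It also uses positional groups rather than named ones, which may be easier to carry over to other tools. The headings helper and the sample text are assumptions made for illustration.

```python
import re

# Variant of the pattern: \d+ allows several digits per level (10., 1.12.).
# Group 1 is the number and group 2 is the title; no named groups are used.
CHAPTER_RE = re.compile(r"^\s*?((?:\d+\.)+)\s*(.*)$", re.MULTILINE)

def headings(text):
    """Return (number, title, depth) tuples; depth 1 is a primary title.

    Hypothetical helper: the depth is simply the count of periods in the number.
    """
    return [
        (m.group(1), m.group(2), m.group(1).count("."))
        for m in CHAPTER_RE.finditer(text)
    ]

sample = "9. RESULTS\n10. CONCLUSION\n10.1. FUTURE WORK\n"
for number, title, depth in headings(sample):
    print(depth, number, title)
```

The depth value could be used to indent secondary and tertiary titles when the table of contents is printed.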

        
    25.01.2017 / 04:08