How to split an edital string in Excel

I need to separate this type of String (Example):

1 Estruturas lógicas. 2 Lógica de argumentação: analogias, inferências, deduções e conclusões. 3 Lógica sentencial (ou proposicional). 3.1 Proposições simples e compostas. 3.2 Tabelas-verdade. 3.3 Equivalências. 3.4 Leis de De Morgan. 3.5 Diagramas lógicos. 4 Lógica de primeira ordem. 5 Princípios de contagem e probabilidade. 6 Operações com conjuntos. 7 Raciocínio lógico envolvendo problemas aritméticos, geométricos e matriciais.

I want the result to look like this, one item per cell, each cell below the previous one:

1 Estruturas lógicas.  

2 Lógica de argumentação: analogias, inferências, deduções e conclusões.  

3 Lógica sentencial (ou proposicional). 

3.1 Proposições simples e compostas. 

3.2 Tabelas-verdade.

3.3 Equivalências.

And so on. However, I could not find a rule or function to do this in Excel VBA.

    
asked by anonymous 20.09.2017 / 20:13

1 answer

To solve the problem I studied RegEx (regular expressions), but since I was still learning, I could not create an expression complex enough to solve it. So I got the answer on the English-language Stack Overflow. Credits: Wiktor Stribiżew

Step 1: Enable Regex in Excel

  • RegEx needs to be enabled. First, enable the Developer tab.
  • In the 'Developer' tab, click 'Visual Basic' and the VBA window will open.
  • Go to 'Tools' -> 'References...' and a window will open.
  • Look for 'Microsoft VBScript Regular Expressions 5.5' and check this option.

    Step 2: Creating the RegEx

    A regex demo was prepared, in which the expression used is: \d+(?:\.\d+)*[\s\S]*?[\D]+\.(?=\s*(?:\d+|$))

    The developer's explanation of this expression is as follows:

    Details

    • \d+ - 1 or more digits
    • (?:\.\d+)* - zero or more sequences of:
      • \. - a literal dot
      • \d+ - 1 or more digits
    • [\s\S]*? - any 0 or more characters, as few as possible, up to the first ...
    • [\D]+\. - 1 or more non-digit characters followed by a literal dot
    • (?=\s*(?:\d+|$)) - a positive lookahead that requires zero or more whitespace characters (\s*) followed by one or more digits (\d+) or the end of the string ($) immediately to the right of the current location.
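
To make the breakdown above concrete, here is a minimal sketch that runs the final pattern from this answer against a short made-up sample. The function name CountListItems and the sample string are hypothetical and are not part of the original answer. It uses late binding (CreateObject), so it should not depend on the reference from Step 1.

```vba
' Hypothetical helper: returns how many numbered items the pattern finds
' and prints each one to the Immediate window.
Function CountListItems(ByVal sample As String) As Long
    Dim re As Object
    Dim matches As Object
    Dim m As Object

    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Pattern = "\d+(?:\.\d+)*[\s\S]*?\D\.(?=\s*(?:\d|$))"
    re.Global = True                           ' find all items, not only the first

    Set matches = re.Execute(sample)
    For Each m In matches
        Debug.Print m.Value
    Next m

    CountListItems = matches.Count
End Function

Sub DemoCountListItems()
    ' Made-up sample with a sub-item (1.1) and an item ending in ")."
    Debug.Print CountListItems("1 Intro. 1.1 Scope (draft). 2 End.")
End Sub
```

With this sample, the lookahead is what stops each match at the period that precedes the next item number (or the end of the string), so three items are expected: "1 Intro.", "1.1 Scope (draft)." and "2 End.".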

    Step 3: VBA Code

    Sub SplitEditalToImmediateWindow()
        Dim str As String
        Dim objRegExp As Object, objMatches As Object, m As Object
        str = "1 Estruturas lógicas. 2 Lógica de argumentação: analogias, inferências, deduções e conclusões. 3 Lógica sentencial (ou proposicional). 3.1 Proposições simples e compostas. 3.2 Tabelas-verdade. 3.3 Equivalências. 3.4 Leis de De Morgan. 3.5 Diagramas lógicos. 4 Lógica de primeira ordem. 5 Princípios de contagem e probabilidade. 6 Operações com conjuntos. 7 Raciocínio lógico envolvendo problemas aritméticos, geométricos e matriciais."
        ' Early binding: requires the reference enabled in Step 1
        Set objRegExp = New regexp ' CreateObject("VBScript.RegExp")
        objRegExp.Pattern = "\d+(?:\.\d+)*[\s\S]*?\D\.(?=\s*(?:\d|$))"
        objRegExp.Global = True   ' find every item, not just the first
        Set objMatches = objRegExp.Execute(str)
        If objMatches.Count <> 0 Then
            For Each m In objMatches
                Debug.Print m.Value   ' one list item per line in the Immediate window
            Next m
        End If
    End Sub
    

    The output (from Debug.Print) shows each item of the list on its own line.

    Note

    The RegEx was using any word character (\w+) followed by a period to determine the end of an item in the list. But an item ending in ). was not recognized, so it was changed from any word character to any non-digit (\D).

      

    Note that \D matches any non-digit symbol. You may want to use [a-zA-Z] instead. And as for ^, it can match the start of the string (with RegExp.Multiline = False) or the start of a line (with RegExp.Multiline = True) - Wiktor Stribiżew 5 hours ago

    And another tip: if there are no line breaks in the string, [\s\S] can be replaced by . - Wiktor Stribiżew 4 hours ago
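
The effect of the Multiline setting mentioned in the comments can be checked with a small sketch like the one below. The function name CountLineStarts and the two-line sample are hypothetical. The pattern ^\d+ is only an illustration of the ^ anchor and is not the pattern used in this answer.

```vba
' Hypothetical helper: counts matches of ^\d+ with Multiline on or off.
Function CountLineStarts(ByVal sample As String, ByVal useMultiline As Boolean) As Long
    Dim re As Object

    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Pattern = "^\d+"                        ' digits at the start of the string/line
    re.Global = True
    re.MultiLine = useMultiline

    CountLineStarts = re.Execute(sample).Count
End Function

Sub DemoMultiline()
    Dim sample As String
    sample = "1 First." & vbLf & "2 Second."   ' made-up two-line sample

    Debug.Print CountLineStarts(sample, False) ' ^ only matches at the start of the string
    Debug.Print CountLineStarts(sample, True)  ' ^ also matches after the line break
End Sub
```

For this sample the first call should find one match and the second call two, since the second line only counts as a "start" when Multiline is True.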

    • A new test was performed with the new expression, and the code in this answer was edited to use it:

      objRegExp.Pattern = "\d+(?:\.\d+)*[\s\S]*?\D\.(?=\s*(?:\d|$))"
      
    • If the error "Compile error: User-defined type not defined" occurs and this line is highlighted: Set objRegExp = New regexp ' CreateObject("VBScript.RegExp")

      Uncomment the CreateObject call and delete the New regexp code, like this: Set objRegExp = CreateObject("VBScript.RegExp")

    Writing to cells

    A brief piece of code to use in an Excel spreadsheet. To place these values in cells, just work with the value m.Value

    Assuming the string is in cell A1, the values will be written from cell B1 down to the last row needed in column B.

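    Sub SplitEditalToCells()
        Dim str As String
        Dim objRegExp As Object, objMatches As Object, m As Object
        Dim i As Long: i = 1                 ' first output row
        str = Cells(1, 1).Value              ' source string in A1
        Set objRegExp = CreateObject("VBScript.RegExp")
        objRegExp.Pattern = "\d+(?:\.\d+)*[\s\S]*?\D\.(?=\s*(?:\d|$))"
        objRegExp.Global = True              ' find every item, not just the first
        Set objMatches = objRegExp.Execute(str)
        If objMatches.Count <> 0 Then
            For Each m In objMatches
                Cells(i, 2).Value = m.Value  ' write each item to column B
                i = i + 1
            Next m
        End If
    End Sub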
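
If the string is not always in A1, the same idea can be wrapped in a reusable routine. The following is only a sketch under that assumption. The names SplitNumberedList, sourceCell and firstTarget are hypothetical and do not come from the original answer.

```vba
' Hypothetical helper: splits the numbered list found in sourceCell and
' writes one item per row, starting at firstTarget and going down.
' Returns the number of items written.
Function SplitNumberedList(ByVal sourceCell As Range, ByVal firstTarget As Range) As Long
    Dim re As Object
    Dim matches As Object
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    re.Pattern = "\d+(?:\.\d+)*[\s\S]*?\D\.(?=\s*(?:\d|$))"
    re.Global = True

    Set matches = re.Execute(CStr(sourceCell.Value))
    For i = 0 To matches.Count - 1             ' loop is skipped when nothing matches
        firstTarget.Offset(i, 0).Value = matches.Item(i).Value
    Next i

    SplitNumberedList = matches.Count
End Function

Sub ExampleSplitA1IntoColumnB()
    ' Same layout as in the answer: string in A1, items written from B1 down.
    Debug.Print SplitNumberedList(ActiveSheet.Range("A1"), ActiveSheet.Range("B1"))
End Sub
```

Passing the ranges as arguments keeps the pattern in one place while letting the source and target cells vary. Note that existing values below firstTarget are overwritten without warning.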
    
        
    21.09.2017 / 02:50