Regex: capture text from one point to another within a text

5

I have the following text:

From: .... blabla bla
Message: blablabalab

//linha em branco

From: .... blabla bla
Message: blablabalab

//linha em Branco

From: .... blabla bla
Message: blablabalab

How do I make my regex match from where a From starts up to just before the next From begins?

So far I have the following regex: From\s\-\s\w{3}\s\w{3}([^\n]*\n+)+ . What I want is for each piece of text, from one From up to the start of the next From, to be in its own group. However, my regex is matching everything up to the end of the text.

Does anyone know how I can do this?

    
asked by anonymous 16.03.2015 / 13:08

5 answers

5

I was able to solve it using Regex.Split() with the regex @"From\s\-\s\w{3}\s\w{3}[^\n]*\n+" .

I'm using a regex because I want to capture from the first From - Fri Mar 13 10:58:58 2015 up to the beginning of the next one. Within that range there are other occurrences of From in the middle of the message, but without the date.

So it looks like this:

string[] split = Regex.Split(texto, rgxSplit);

It returns an array with all the texts found between one From - ... header and the next.
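One thing to keep in mind is that Regex.Split discards whatever the pattern matches, so the dated header line itself is not part of the returned pieces. If the header should stay with its message, a zero-width lookahead is one possible variation. The sketch below is only an illustration: the SplitMessages helper and the sample text are made up, and only the header format comes from this answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class MboxSplitSketch
{
    // Hypothetical helper: splits right before each dated "From - Ddd Mmm ..." header.
    // The lookahead consumes no characters, so the header stays in its chunk.
    public static string[] SplitMessages(string text)
    {
        return Regex.Split(text, @"(?=^From\s-\s\w{3}\s\w{3}\s)", RegexOptions.Multiline)
                    .Where(chunk => chunk.Length > 0) // drop the empty piece before the first header
                    .ToArray();
    }

    static void Main()
    {
        // Sample data invented for illustration.
        string text =
            "From - Fri Mar 13 10:58:58 2015\n" +
            "Message: first\n" +
            "From my desk, hello\n" +
            "\n" +
            "From - Fri Mar 13 11:02:10 2015\n" +
            "Message: second\n";

        foreach (string message in SplitMessages(text))
        {
            Console.WriteLine("=== message ===");
            Console.Write(message);
        }
    }
}
```

The undated "From my desk" line stays inside the first message, because the lookahead only fires on a line that starts with From followed by a dash and two three-letter words.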

    
16.03.2015 / 15:13
6

I wrote another regular expression for you, which looks like this:

From:\s*([\.\w\d\s]*)\nMessage:\s*([\.\w\d\s]*)\n

I made a proof of concept here.
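For anyone who wants to try the expression locally, here is a rough sketch of how it could be used with Regex.Matches. The Extract helper is hypothetical, and the sample text is the one from the question with a trailing newline added, since the pattern requires a \n after the Message line.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FromMessageSketch
{
    // Pattern copied from the answer above.
    const string Pattern = @"From:\s*([\.\w\d\s]*)\nMessage:\s*([\.\w\d\s]*)\n";

    // Hypothetical helper: returns one (from, message) pair per match.
    public static List<KeyValuePair<string, string>> Extract(string text)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Regex.Matches(text, Pattern))
        {
            // The character classes also accept whitespace, so trim stray newlines.
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value.Trim(), m.Groups[2].Value.Trim()));
        }
        return pairs;
    }

    static void Main()
    {
        // Text from the question; the final "\n" is needed for the last block to match.
        string text =
            "From: .... blabla bla\nMessage: blablabalab\n\n//linha em branco\n\n" +
            "From: .... blabla bla\nMessage: blablabalab\n";

        foreach (var pair in Extract(text))
        {
            Console.WriteLine("From: " + pair.Key + " | Message: " + pair.Value);
        }
    }
}
```

Note that this captures only the From and Message lines, not any free text that follows them.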

    
16.03.2015 / 14:45
4

You can use the Regex.Split method for that.

using System;
using System.Text.RegularExpressions;
....

public static void Main() {
        string texto = @"From: .... blabla bla
        Message: blablabalab

        //linha em branco
        From: .... blabla bla
        Message: blablabalab

        //linha em Branco
        From: .... blabla bla
        Message: blablabalab";

        string[] pedacos = Regex.Split(texto, @"From:\s+");
        foreach (var pedaco in pedacos){
            Console.WriteLine(pedaco);
        }
        Console.ReadLine();
}

In this case, however, you do not need Regex.Split, because the delimiter is a fixed string rather than a real pattern, so the traditional String.Split may be a better fit here.
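As a minimal sketch of that idea, assuming the literal marker "From: " is enough to separate the blocks (the SplitOnFrom helper name is made up for illustration):

```csharp
using System;

static class PlainSplitSketch
{
    // Hypothetical helper: splits on the literal "From: " marker, no regex involved.
    public static string[] SplitOnFrom(string text)
    {
        return text.Split(new string[] { "From: " }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        string text =
            "From: .... blabla bla\nMessage: blablabalab\n\n" +
            "From: .... blabla bla\nMessage: blablabalab";

        foreach (string piece in SplitOnFrom(text))
        {
            // Trim only for display; the pieces keep their trailing blank lines.
            Console.WriteLine("[" + piece.Trim() + "]");
        }
    }
}
```

Keep in mind that a literal split also cuts wherever "From: " happens to appear inside a message body, and that the marker itself is removed from each piece.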

If multiple delimiters are required, the following can be done:

string[] pedacos = texto.Split(new string[] { "From: ", "Message: " }, StringSplitOptions.RemoveEmptyEntries);
foreach (var pedaco in pedacos){
    Console.WriteLine(pedaco);
}

Working example here.

    
16.03.2015 / 14:48
3

Folks, I decided it was a shame that I had never written a program in C#. So (this is a solemn moment!) I installed Mono and tried to see whether the Perl idea was applicable. It is. I was so happy that I decided to write a new answer!

Since these are my first 10 lines of C#, constructive suggestions are welcome.

To make it different, I enriched the regular expression so that it separates the components (from / message):

using System;
using System.Text.RegularExpressions;

class Program{
    static void Main(){
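       // "\n" sits outside the verbatim string, where escape sequences are not processed
       string text = @"From:.....(cut & paste the example from the question)....balab" + "\n";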

       Regex r = new Regex(@"(From:(.*)\n((.|\n)+?\n)(?=From:|$))");
       MatchCollection m= r.Matches(text);
       foreach (Match k in m) {
 //       Console.WriteLine("##Full# " + k.Groups[1].Value);
          Console.WriteLine("##From# " + k.Groups[2].Value);
          Console.WriteLine("##Mesg# " + k.Groups[3].Value);
       }
    }
 }

After running $ gmcs regexp.cs ; regexp.exe the output was:

##From#  .... blabla bla
##Mesg# Message: blablabalab

//linha em branco

##From#  .... blabla bla
##Mesg# Message: blablabalab

//linha em Branco

##From#  .... blabla bla
##Mesg# Message: blablabalab
    
20.03.2015 / 13:46
0

Okay, okay, it's not really C# or .NET, but here is a regular expression:

/(.+?\n)(?=From:|$)/s

I think this is within the spirit of the question. Example usage:

perl -n0E 'for $mail ( m/(.+?\n)(?=From:|$)/sg ){ 
                  print "===\n", $mail
           }' file

With the example from the question, it gives:

===
From: .... blabla bla
Message: blablabalab

//linha em branco

===
From: .... blabla bla
Message: blablabalab

//linha em Branco

===
From: .... blabla bla
Message: blablabalab
    
19.03.2015 / 19:47