using System;
using System.Text.RegularExpressions;
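
public class Example
{
    public static void Main()
    {
        // A block header is: optional spaces/tabs, an asterisk, 2-3 uppercase initials,
        // a comma, optional whitespace, a code (A, R, T, RS or RSS), an optional dot,
        // and a line break. The tempered negative lookahead then consumes every
        // character up to (but not including) the next block header.
        string pattern = @"^[ \t]*\*[A-Z]{2,3},\s*(?:[ART]|RSS?)\.?[\n\r](?:(?!^[ \t]*\*[A-Z]{2,3},\s*(?:[ART]|RSS?)\.?[\n\r])[\s\S])+";
        string input = @"*GW, A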
This is my very first line. The asterics defines a new block, followed by the initials (2-3 chars), a comma, a (possible) space and a code that could be A, R, T, RS or RSS. Followed by that is an optional dot. Linebreak afterwards, where the text comes.
*JP, R.
New block here, as the line (kind of) starts with an asterics. Indentations with 4 spaces or a tab means that it is a second level thing only, that does not need to be stripped away necessarily.
But as you can see, a block can be devided into several
lines,
even with multiple lines.
*GML, T.
And so we continue...
Let's just make sure that a line can start with an
*asterics, without breaking the whole thing.
*GW, RS
Yet another block here.
*GW, RSS.
And a very final one.
Spread over several lines.
*TA, RS.
First level all of a sudden again.
";
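        // Multiline makes ^ match at the start of every line, not only at the start of the input.
        RegexOptions options = RegexOptions.Multiline;

        // Each match is one complete block: the header line plus its body text.
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}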

// Please keep in mind that these code samples are automatically generated and are not guaranteed to work.
// If you find any syntax errors, feel free to submit a bug report.
// For a full regex reference for C#, please visit:
// https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex(v=vs.110).aspx