I'm using a list of regular expressions to parse FTP server directory listings. I'm not good with regex at all; this is a list of patterns I collected online to parse the output of various FTP servers:
private static readonly string[] DirectoryParseFormats =
{
"(?<dir>[\\-d])(?<permission>([\\-r][\\-w][\\-xs]){3})\\s+\\d+\\s+\\w+\\s+\\w+\\s+(?<size>\\d+)\\s+(?<timestamp>\\w+\\s+\\d+\\s+\\d{4})\\s+(?<name>.+)",
"(?<dir>[\\-d])(?<permission>([\\-r][\\-w][\\-xs]){3})\\s+\\d+\\s+\\d+\\s+(?<size>\\d+)\\s+(?<timestamp>\\w+\\s+\\d+\\s+\\d{4})\\s+(?<name>.+)",
"(?<dir>[\\-d])(?<permission>([\\-r][\\-w][\\-xs]){3})\\s+\\d+\\s+\\d+\\s+(?<size>\\d+)\\s+(?<timestamp>\\w+\\s+\\d+\\s+\\d{1,2}:\\d{2})\\s+(?<name>.+)",
"(?<dir>[\\-d])(?<permission>([\\-r][\\-w][\\-xs]){3})\\s+\\d+\\s+\\w+\\s+\\w+\\s+(?<size>\\d+)\\s+(?<timestamp>\\w+\\s+\\d+\\s+\\d{1,2}:\\d{2})\\s+(?<name>.+)",
"(?<dir>[\\-d])(?<permission>([\\-r][\\-w][\\-xs]){3})(\\s+)(?<size>(\\d+))(\\s+)(?<ctbit>(\\w+\\s\\w+))(\\s+)(?<size2>(\\d+))\\s+(?<timestamp>\\w+\\s+\\d+\\s+\\d{2}:\\d{2})\\s+(?<name>.+)",
"(?<timestamp>\\d{2}\\-\\d{2}\\-\\d{2}\\s+\\d{2}:\\d{2}[Aa|Pp][mM])\\s+(?<dir>\\<\\w+\\>){0,1}(?<size>\\d+){0,1}\\s+(?<name>.+)"
};
Now I've stumbled upon the following output from an odd FTP server. What's weird is that the server returns the file name together with the folder name for some reason.
Anyway, I'd like a similar regex for this string, ideally with a separate group for the folder
name so I can split it out. The string returned by the server is what's between the pipes (|):
|-rw-rw-rw- 1 generic 235 Mar 22 11:21 fromDoder/DOD997ABCD.20170322112114159.1961812284.txt|
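To make the request concrete, here is a minimal sketch of the kind of pattern I have in mind. It assumes this server prints a single owner column (here "generic") before the size, and that the folder is everything up to the last "/". The `folder` group and the class name `FolderListingSketch` are hypothetical names used only for illustration, and the pattern has only been checked against the one sample line above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FolderListingSketch
{
    // Hypothetical pattern: one owner column, HH:mm timestamp,
    // optional "folder/" prefix captured separately from the file name.
    private static readonly Regex SingleOwnerFormat = new Regex(
        @"^(?<dir>[-d])(?<permission>(?:[-r][-w][-xs]){3})" +
        @"\s+\d+\s+\w+\s+(?<size>\d+)" +
        @"\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})" +
        @"\s+(?:(?<folder>.+)/)?(?<name>[^/]+)$");

    public static void Main()
    {
        var raw = "-rw-rw-rw- 1 generic 235 Mar 22 11:21 fromDoder/DOD997ABCD.20170322112114159.1961812284.txt";
        var match = SingleOwnerFormat.Match(raw);

        Console.WriteLine(match.Success);                // whether the line was recognized
        Console.WriteLine(match.Groups["folder"].Value); // folder part, empty if there is no "/"
        Console.WriteLine(match.Groups["name"].Value);   // file name without the folder
    }
}
```

To add it to DirectoryParseFormats, the same pattern would have to be written as a regular string literal with doubled backslashes, like the other entries.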
EDIT:
Here is the C# code I use to iterate through the regular expressions and pick the first one that matches the FTP server output. I then use the match to extract the file name and the entry type:
// Use our regex library to parse
match = DirectoryParseFormats.Select(dpf => new Regex(dpf).Match(raw)).FirstOrDefault(m => m.Success);
if (match == null) throw new Exception($"Can't parse FTP directory list item. raw item: |{raw}|, whole response: |{response}|");
// If the entry is not a directory, treat it as a file
var dir = match.Groups["dir"].Value;
if (dir == string.Empty || dir == "-") list.Add(match.Groups["name"].Value);
EDIT 2:
total 0
drw-rw-rw- 1 user group 0 Apr 23 2016 .
drw-rw-rw- 1 user group 0 Apr 23 2016 ..
EDIT 3:
var hintRegex = @"^
(?<dir>[-d])
(?<permission>(?:[-r][-w][-xs]){3})
\s+\d+
\s+\w+
(?:\s+\w+)?
\s+(?<size>\d+)
\s+(?<timestamp>\w+\s+\d+(?:\s+\d+(?::\d+)?))
\s+(?!(?:\.|\.\.)\s*$)(?<name>.+?)\s*
$";
Match match = new Regex(hintRegex).Match("-rw-r--r-- 1 ftp ftp 1079 Apr 06 2017 LEANCOR_040617084839.txt");
if (!match.Success) Debug.WriteLine("Doesn't match");
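A likely reason the snippet above prints "Doesn't match": the pattern is a multi-line verbatim string, so the line breaks between its parts are treated as literal characters that the input would have to contain. A minimal sketch of the same pattern compiled with RegexOptions.IgnorePatternWhitespace, which tells the engine to ignore unescaped whitespace in the pattern, follows. The class name `HintRegexSketch` is hypothetical, and Console is used instead of Debug only to keep the example self-contained.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HintRegexSketch
{
    // Same pattern as above; the indentation is safe only because
    // IgnorePatternWhitespace is passed when the Regex is built.
    private const string HintRegex = @"^
        (?<dir>[-d])
        (?<permission>(?:[-r][-w][-xs]){3})
        \s+\d+
        \s+\w+
        (?:\s+\w+)?
        \s+(?<size>\d+)
        \s+(?<timestamp>\w+\s+\d+(?:\s+\d+(?::\d+)?))
        \s+(?!(?:\.|\.\.)\s*$)(?<name>.+?)\s*
        $";

    public static void Main()
    {
        var raw = "-rw-r--r-- 1 ftp ftp 1079 Apr 06 2017 LEANCOR_040617084839.txt";

        // Without the option, the newlines in the pattern must match literally.
        var match = new Regex(HintRegex, RegexOptions.IgnorePatternWhitespace).Match(raw);

        Console.WriteLine(match.Success ? match.Groups["name"].Value : "Doesn't match");
    }
}
```

The alternative is to write the pattern on a single line with no whitespace between its parts, as in the DirectoryParseFormats entries.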