When programmers need to match text against a pattern, I can’t think of a more powerful tool than regular expressions (regex). But regex can seem complicated and has a steep (but short) learning curve. As much as we sometimes want to, we can’t expect end-users to use (or know about) regex. A simple asterisk (*) wildcard, however, seems to be much more widely understood.
For example, each of the patterns below should match this subject:
Subject = testuser1@example.com
Pattern = testuser*@example.com
Pattern = *user*@example.com
Pattern = *user*@example.*
Pattern = testuser*@*.com
Pattern = testuser*@*.*
This is a walk-through of the way I implemented wildcard string matching using regular expressions.
1. Pre-flight check
It’s always worth checking whether the code is applicable before we get into it. Usually that’s a null-check or an item-count; in this case we check whether the pattern has any “*” characters.
If it doesn’t have wildcards, we’ll do a case-insensitive equals-match.
int wildcardCount = wildcardPattern.Count(x => x.Equals('*'));
if (wildcardCount <= 0)
{
    return subject.Equals(wildcardPattern, StringComparison.CurrentCultureIgnoreCase);
}
2. Plan your escape first
Regex uses certain characters (metacharacters) to set special behaviours, so we escape our pattern string first to stop it interfering with the expression.
There are a few more things going on here.
We want to match the whole string from beginning (“^”) to end (“$”). I’ll take the opportunity to add those now.
Any time the “*” is used, we want to match any run of characters (“.”) of any length (“*”), including zero length. In other words, it matches any text or nothing at all. Remember that we escaped the “*” along with everything else, so we have to search for “\*”; the doubled backslash in the code is just C# string escaping.
string regexPattern = string.Concat("^", Regex.Escape(wildcardPattern).Replace("\\*", ".*"), "$");
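To make the escape-and-replace step easier to follow, here is a small sketch that splits the one-liner into its two stages. The class name `WildcardPatternDemo` and the helper `ToRegexPattern` are made up for illustration; the expression itself is the same as above.

```csharp
using System.Text.RegularExpressions;

public static class WildcardPatternDemo
{
    // Hypothetical helper: wraps the one-liner so the result can be inspected.
    public static string ToRegexPattern(string wildcardPattern)
    {
        // Regex.Escape turns "*" into "\*" and "." into "\.",
        // so "testuser*@example.com" becomes "testuser\*@example\.com".
        string escaped = Regex.Escape(wildcardPattern);

        // Swap each escaped wildcard ("\*") for ".*" and anchor both ends.
        return string.Concat("^", escaped.Replace("\\*", ".*"), "$");
    }
}
```

Calling `WildcardPatternDemo.ToRegexPattern("testuser*@example.com")` gives `^testuser.*@example\.com$`. The dot in “.com” stays escaped, so it only matches a literal dot.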
3. Pure strings are faster
I’ve always assumed that regex is slower than the native string methods. I’m not sure if that’s true anymore, but we’ll continue regardless.
If there is just one “*” and it sits at the very beginning or end of the pattern, it’s a simple “EndsWith” or “StartsWith” match respectively.
Let’s deal with that separately. Remember that we need to remove the “*” before matching.
int wildcardCount = wildcardPattern.Count(x => x.Equals('*'));
if (wildcardCount <= 0)
{
    // No wildcards: already handled by the equals-match in step 1.
}
else if (wildcardCount == 1)
{
    string newWildcardPattern = wildcardPattern.Replace("*", "");
    if (wildcardPattern.StartsWith("*"))
    {
        // Leading wildcard: the subject only has to end with the rest.
        return subject.EndsWith(newWildcardPattern, StringComparison.CurrentCultureIgnoreCase);
    }
    else if (wildcardPattern.EndsWith("*"))
    {
        // Trailing wildcard: the subject only has to start with the rest.
        return subject.StartsWith(newWildcardPattern, StringComparison.CurrentCultureIgnoreCase);
    }
    else
    {
        // Wildcard in the middle: left for the regex in step 4.
    }
}
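The single-wildcard case where the “*” sits in the middle is left to the regex in this walk-through. If you wanted to keep that on plain strings too, one possible sketch is below. It is not part of the finished code: the class name `SingleWildcardSketch` is made up, and it assumes ordinal comparisons (`OrdinalIgnoreCase`) rather than the culture-sensitive ones used above, because the length check is only reliable when characters are compared one-to-one.

```csharp
using System;

public static class SingleWildcardSketch
{
    // Hypothetical helper: matches a pattern containing exactly one "*",
    // at any position, using only string operations.
    public static bool Matches(string wildcardPattern, string subject)
    {
        int star = wildcardPattern.IndexOf('*');
        if (star < 0 || wildcardPattern.IndexOf('*', star + 1) >= 0)
        {
            throw new ArgumentException("Pattern must contain exactly one '*'.", nameof(wildcardPattern));
        }

        string prefix = wildcardPattern.Substring(0, star);
        string suffix = wildcardPattern.Substring(star + 1);

        // The length check stops the prefix and suffix from overlapping,
        // e.g. pattern "ab*ba" must not match subject "aba".
        return subject.Length >= prefix.Length + suffix.Length
            && subject.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && subject.EndsWith(suffix, StringComparison.OrdinalIgnoreCase);
    }
}
```

A leading or trailing “*” simply produces an empty prefix or suffix, so the same method covers those cases as well.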
4. Regex it up
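We can finish up with the regex matching. The string comparisons above ignore case, so the regex gets RegexOptions.IgnoreCase to behave the same way. I’ve wrapped it in a try-catch for some extra stability.
try
{
    return Regex.IsMatch(subject, regexPattern, RegexOptions.IgnoreCase);
}
catch
{
    return false;
}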
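If the try-catch is there for stability, it may be worth being explicit about what can go wrong. The sketch below is one way to do that, not the approach used in the finished code: it uses the `Regex.IsMatch` overload that takes a match timeout, since a pattern with many wildcards can make the engine backtrack a lot. The class name `SafeRegexMatch` and the one-second limit are arbitrary choices for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeRegexMatch
{
    // Hypothetical wrapper; the one-second limit is an arbitrary example value.
    public static bool IsMatch(string subject, string regexPattern)
    {
        try
        {
            // IgnoreCase lines up with the case-insensitive string comparisons.
            return Regex.IsMatch(subject, regexPattern, RegexOptions.IgnoreCase, TimeSpan.FromSeconds(1));
        }
        catch (RegexMatchTimeoutException)
        {
            // Treat a match that runs too long as "no match".
            return false;
        }
        catch (ArgumentException)
        {
            // Thrown when the pattern cannot be parsed or an argument is null.
            return false;
        }
    }
}
```

Catching specific exception types keeps the “return false” behaviour while letting genuinely unexpected failures surface.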
5. Done
This is the finished code; it’s reasonably straightforward.
// Needs: using System; using System.Linq; using System.Text.RegularExpressions;
public bool IsWildcardMatch(string wildcardPattern, string subject)
{
    // Nothing sensible to match against.
    if (string.IsNullOrWhiteSpace(wildcardPattern) || subject == null)
    {
        return false;
    }
    int wildcardCount = wildcardPattern.Count(x => x.Equals('*'));
    if (wildcardCount <= 0)
    {
        // 1. No wildcards: plain case-insensitive equality.
        return subject.Equals(wildcardPattern, StringComparison.CurrentCultureIgnoreCase);
    }
    else if (wildcardCount == 1)
    {
        // 3. A single leading or trailing wildcard needs no regex.
        string newWildcardPattern = wildcardPattern.Replace("*", "");
        if (wildcardPattern.StartsWith("*"))
        {
            return subject.EndsWith(newWildcardPattern, StringComparison.CurrentCultureIgnoreCase);
        }
        else if (wildcardPattern.EndsWith("*"))
        {
            return subject.StartsWith(newWildcardPattern, StringComparison.CurrentCultureIgnoreCase);
        }
    }
    // 2. Escape the pattern, turn each escaped "*" into ".*" and anchor both ends.
    string regexPattern = string.Concat("^", Regex.Escape(wildcardPattern).Replace("\\*", ".*"), "$");
    try
    {
        // 4. Everything else goes through the regex, ignoring case like the string checks.
        return Regex.IsMatch(subject, regexPattern, RegexOptions.IgnoreCase);
    }
    catch
    {
        return false;
    }
}
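As a quick sanity check, the subject and patterns from the top of the post can be run through the method. The harness below is a hypothetical sketch: `WildcardExamples` and `FindFailures` are made-up names, and the matcher is passed in as a delegate with the same `(wildcardPattern, subject)` parameter order as `IsWildcardMatch`, so the block does not depend on where that method lives.

```csharp
using System;
using System.Collections.Generic;

public static class WildcardExamples
{
    // Subject and patterns taken from the examples at the top of the post.
    private const string Subject = "testuser1@example.com";

    private static readonly string[] Patterns =
    {
        "testuser*@example.com",
        "*user*@example.com",
        "*user*@example.*",
        "testuser*@*.com",
        "testuser*@*.*",
    };

    // Hypothetical harness: returns the patterns that did NOT match the subject.
    // An empty list means every example behaved as expected.
    public static List<string> FindFailures(Func<string, string, bool> isWildcardMatch)
    {
        var failures = new List<string>();
        foreach (string pattern in Patterns)
        {
            if (!isWildcardMatch(pattern, Subject))
            {
                failures.Add(pattern);
            }
        }
        return failures;
    }
}
```

With an instance of whatever class holds the method, `WildcardExamples.FindFailures(matcher.IsWildcardMatch)` should come back empty.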
That is it.
I hope someone finds this useful. If that someone is future-Ray then, “you’re welcome, nerd”.
Posted on Sat 18th Apr 2020
Modified on Sun 13th Mar 2022