Regex with SQL Server 2008 CLR performance issues
I am trying to understand why it takes so long to execute a simple query. On my local machine it takes 10 seconds; in production it takes 1 minute. (I imported the production database into my local database.)
    SELECT * FROM jobhistory WHERE dbo.LikeInList(instanceid, 'e218553d-aad1-47a8-931c-87b52e98a494') = 1

The table datahistory is not indexed and has 217,302 rows.
    using System;
    using System.Data.SqlTypes;
    using System.Text.RegularExpressions;
    using Microsoft.SqlServer.Server;

    public partial class UserDefinedFunctions
    {
        [SqlFunction]
        public static bool LikeInList(
            [SqlFacet(MaxSize = -1)] SqlString value,
            [SqlFacet(MaxSize = -1)] SqlString list)
        {
            // Test the value against each comma-separated entry in the list.
            foreach (string val in list.Value.Split(new char[] { ',' }, StringSplitOptions.None))
            {
                Regex re = new Regex("^.*" + val.Trim() + ".*$", RegexOptions.IgnoreCase);
                if (re.IsMatch(value.Value))
                {
                    return true;
                }
            }
            return false;
        }
    }

The issue is that if the table has 217k rows, the function is called 217,000 times! I am not sure how I can rewrite this.
Thank you.
There are several issues with this code:
- Missing (IsDeterministic = true, IsPrecise = true) in the [SqlFunction] attribute. Setting these (mainly the IsDeterministic = true part) allows the SQLCLR UDF to participate in parallel execution plans. Without IsDeterministic = true, the function will prevent parallel plans, just as T-SQL UDFs do.
- The return type is bool instead of SqlBoolean.
- The Regex call is inefficient: creating an instance and using it only once is expensive. Switch to the static Regex.IsMatch method instead.
- The Regex pattern is very inefficient: wrapping the search string in "^.*" and ".*$" requires the Regex engine to parse, and retain in memory as the "match", the entire contents of the value input parameter, for every single iteration of the foreach. Yet regular expressions already match anywhere in the input, so using val.Trim() as the entire pattern yields the same result (at least for input that contains no line breaks, since "." does not match a newline by default).
- (Optional) If neither input parameter will ever be over 4000 characters, specify a MaxSize of 4000 instead of -1, since NVARCHAR(4000) is faster than NVARCHAR(MAX) for passing data into, and out of, SQLCLR objects.
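Taken together, the points above could be applied roughly as follows. This is only a sketch, assuming the .NET Framework that hosts SQL Server's CLR and assuming both inputs fit in 4000 characters. The NULL check is an addition not discussed above: it assumes a NULL input should return NULL rather than throw. As in the original, each list entry is still treated as a regular expression pattern.

```csharp
using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // IsDeterministic = true lets the function take part in parallel plans.
    [SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlBoolean LikeInList(
        [SqlFacet(MaxSize = 4000)] SqlString value,  // assumes input <= 4000 chars
        [SqlFacet(MaxSize = 4000)] SqlString list)   // assumes input <= 4000 chars
    {
        // Assumption: NULL in, NULL out (the original would throw here).
        if (value.IsNull || list.IsNull)
        {
            return SqlBoolean.Null;
        }

        string input = value.Value;

        foreach (string val in list.Value.Split(new char[] { ',' }, StringSplitOptions.None))
        {
            // The static IsMatch reuses cached patterns; no "^.*" / ".*$"
            // wrapper is needed because a regex matches anywhere in the input.
            if (Regex.IsMatch(input, val.Trim(), RegexOptions.IgnoreCase))
            {
                return SqlBoolean.True;
            }
        }

        return SqlBoolean.False;
    }
}
```

The query from the question can call this version unchanged. Rows where the function returns NULL are filtered out by the = 1 comparison.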