C# - Regular expression to escape HTML entities
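I have a string that contains HTML formatting tags. Using Regex.Replace(), I want to convert the text within the tags to the character "x", but I want to leave the tags as they are. The string also contains escaped entities, which should be left alone.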
As far as I understood your question, the following code will solve your problem:
string str = @"hi, how are you. hope you are doing good";
string result = Regex.Replace(str, @"[a-zA-Z0-9]", "z");
Console.WriteLine(result);
Output:
zz, zzz zzz zzz. zzzz zzz zzz zzzzz zzzz
If you don't want digits occurring in a word to be replaced with z, remove 0-9 from the character class. If you want other characters in words to be replaced with z, include them there. E.g., [a-zA-Z\-] includes - in the regex, so it is replaced with z too. Note that in a regex, special characters must be preceded by \ to be matched literally. E.g., to match [, you should write \[.
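The character-class variations described above can be compared side by side. This is a minimal sketch: the class name CharClassDemo, the helper Mask, and the sample string are made up for illustration and are not part of the original answer. Regex.Escape is the built-in way to get the escaped form of a special character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Replaces every character matched by the given pattern with 'z'.
    public static string Mask(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "z");
    }

    public static void Main()
    {
        string str = "well-known item [42]";

        Console.WriteLine(Mask(str, @"[a-zA-Z0-9]"));      // zzzz-zzzzz zzzz [zz]
        Console.WriteLine(Mask(str, @"[a-zA-Z]"));         // zzzz-zzzzz zzzz [42]
        Console.WriteLine(Mask(str, @"[a-zA-Z\-]"));       // zzzzzzzzzz zzzz [42]
        Console.WriteLine(Mask(str, @"\["));               // well-known item z42]
        Console.WriteLine(Mask(str, Regex.Escape("[")));   // same as the line above
    }
}
```

Dropping 0-9 leaves the digits untouched, adding \- masks the hyphen as well, and an unescaped [ on its own would be rejected as an unterminated character class.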
Hope this helps.
Update:
I got you. The solution to your problem is:
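string str = @"hi, how are you. hope you are doing good & ¥ whatis goind on &lessthan;and";
// Match whole words that are not immediately followed by ';' (so entity names are skipped).
MatchCollection matches = Regex.Matches(str, @"\b(?!&)[a-zA-Z]+(?!;)\b");
foreach (Match m in matches)
{
    string oldWord = m.ToString();
    // Replace every occurrence of this word's text in str with z's of the same length.
    str = Regex.Replace(str, oldWord, Regex.Replace(oldWord, @".", "z"));
}
Console.WriteLine(str);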
And the output is:
zz, zzz zzz zzz. zzzz zzz zzz zzzzz zzzz & ¥ zzzzzz zzzzz zz &lessthan;zzz
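One caveat with the loop above: it replaces each matched word's text wherever it occurs in the string, so a short word such as "a" would also change the "a" inside &lessthan;. A possible alternative, sketched here under my own assumptions rather than taken from the answer, does the work in a single pass: one pattern matches a tag, an entity, or a single letter, and a MatchEvaluator decides what to return. The class name EntityAwareMask and the pattern are illustrative only, and the tag part assumes no attribute value contains a > character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EntityAwareMask
{
    // Alternation order matters: tags and entities are tried first,
    // so the single-letter branch only fires outside of them.
    private static readonly Regex Token =
        new Regex(@"<[^>]*>|&#?\w+;|[a-zA-Z]");

    public static string Mask(string input)
    {
        // Tags start with '<' and entities with '&': return those unchanged.
        // Anything else matched by the pattern is a single letter.
        return Token.Replace(input, m =>
            m.Value[0] == '<' || m.Value[0] == '&' ? m.Value : "z");
    }

    public static void Main()
    {
        Console.WriteLine(Mask("on &lessthan;and"));            // zz &lessthan;zzz
        Console.WriteLine(Mask("<b>bold</b> &amp; <i>it</i>")); // <b>zzzz</b> &amp; <i>zz</i>
    }
}
```

Because the string is scanned once and each match is replaced in place, a word can no longer affect text elsewhere in the string.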
Note: If you want to use regex to convert or extract characters or patterns from HTML, that's fine. But if you are planning to parse whole HTML documents using regex, don't try to do so, because HTML is a non-regular language and regex doesn't have the capability to parse it.