The re module provides regular expression support very similar to Perl's. It lets you specify a pattern and then check whether other strings match it.
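Python's re supports both Unicode strings (str) and 8-bit strings (bytes). However, the two cannot be mixed: in operations like search and replacement, the pattern, the string being searched, and any replacement string must all be of the same type.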
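As a small illustration of the rule above, the sketch below compiles one str pattern and one bytes pattern and then deliberately mixes the types. The variable names are ours, chosen only for this example.

```python
import re

text_pattern = re.compile(r"cat")     # str pattern, for str input
bytes_pattern = re.compile(rb"cat")   # bytes pattern, for bytes input

str_match = text_pattern.search("dog catches cat")
bytes_match = bytes_pattern.search(b"dog catches cat")
print(str_match.group(), bytes_match.group())

# Using a str pattern on bytes input raises TypeError.
try:
    text_pattern.search(b"dog catches cat")
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    print("TypeError:", exc)
```

If the data arrives as bytes, either decode it to str first or write the pattern as a bytes literal.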
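Regular expressions use the backslash ('\') both to introduce special sequences and to let special characters be used without their special meaning. This collides with Python's own use of the backslash in string literals, where sequences like '\\', '\n', and '\t' are interpreted before the regular expression engine ever sees them. Developers can use raw Python strings (prefix 'r') to avoid this interpretation of backslashes in a string.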
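To make the difference concrete, here is a minimal sketch comparing a normal string literal with a raw one. The sample path is made up for illustration.

```python
import re

# Matching one literal backslash: the regex engine needs the two
# characters \\ . A normal literal must double each of them.
normal = "\\\\"
raw = r"\\"
same_pattern = normal == raw   # both hold the same two characters

path = "C:\\temp"              # the 7-character string C:\temp
found = re.search(raw, path)
print(same_pattern, found.span())

# "\n" is a single newline character; r"\n" is a backslash followed by n.
print(len("\n"), len(r"\n"))
```

Both spellings produce the same pattern, but the raw string is much easier to read, which is why patterns are usually written with the r prefix.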
import re
re.compile(pattern, flags=0)
- It compiles pattern and returns a regular expression object (re.Pattern), which can then be used for matching, searching other strings, etc. It's a wise decision to compile a regular expression with this function if you are going to reuse it many times.
compiled_regex = re.compile(r'\W+') ## Matches one or more occurrences of any non-word character.
type(compiled_regex)
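The sketch below shows one way a compiled pattern might be reused across several strings. The sample lines are invented for this example.

```python
import re

non_word = re.compile(r"\W+")   # one or more non-word characters

lines = ["dog catches cat", "cat, rat & dog", "cats"]

# The same compiled object is reused for every input string.
tokens = [non_word.split(line) for line in lines]
print(tokens)

# The original pattern text stays available on the compiled object.
print(non_word.pattern)
```

The module-level functions also cache recently used patterns, so the gain from compiling is usually modest; the main benefit is often readability when one pattern is used in many places.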
re.search(pattern, string, flags=0)
- It scans through string looking for pattern and returns a match object for the first match, or None if no match is found.
## The line below looks for the first occurrence of cat in the supplied string and returns a match object, which holds the match location and other details.
match_obj = re.search(r'cat', 'dog catches cat, cat runs behind rat and it goes like cats & dogs')
print(type(match_obj))
match_obj.start(), match_obj.end()
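A short sketch of what can be read from the match object, and of how the None case is usually handled. The second pattern (bird) is ours, picked because it does not occur in the text.

```python
import re

text = "dog catches cat, cat runs behind rat and it goes like cats & dogs"

m = re.search(r"cat", text)
if m is not None:
    # group() is the matched text, span() is (start, end).
    print(m.group(), m.span())
    matched_slice = text[m.start():m.end()]

# A pattern that does not occur gives None, so check before using the result.
missing = re.search(r"bird", text)
print(missing)
```

Calling a method such as start() directly on the result of a failed search raises AttributeError, so the None check is worth keeping.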
re.match(pattern, string, flags=0)
- It matches pattern at the beginning of string and returns a match object if a match is found; otherwise it returns None.
## The first call returns None because cat is not at the beginning; the second returns a match object because the pattern occurs at the beginning.
match_obj1 = re.match(r'cat', 'dog catches cat, cat runs behind rat and it goes like cats & dogs')
match_obj2 = re.match(r'cat', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs')
match_obj1, match_obj2
re.split(pattern, string, maxsplit=0, flags=0)
- It splits string by the occurrences of pattern in it and returns a list. If maxsplit is nonzero, at most that many splits are made and the remainder of the string is returned as the last element of the list.
## Make a note of the last split below. It returns 3 elements (2 from the maxsplit splits and 1 more for the remaining string).
print(re.split(r'cat', 'dog catches cat, cat runs behind rat and it goes like cats & dogs'))
print(re.split(r'cat', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs'))
print(re.split(r'cat', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs', maxsplit=2))
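Here is a small sketch of the maxsplit behaviour described above, plus one more case: a capturing group in the pattern keeps the separators in the result. The short sample string "a, b,c" is invented for this example.

```python
import re

text = "cat runs behind rat, dog goes behind cat and it goes like cats & dogs"

# Two splits at most: the text starts with cat, so the first piece is empty,
# and everything after the second cat is kept as the last element.
parts = re.split(r"cat", text, maxsplit=2)
print(parts)

# With a capturing group, the matched separators are included in the list.
with_separators = re.split(r"(,)\s*", "a, b,c")
print(with_separators)
```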
re.fullmatch(pattern, string, flags=0)
- If the whole string matches pattern, it returns a match object; otherwise it returns None.
print(re.fullmatch(r'cat', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs'))
print(re.fullmatch(r'\w+', 'cat')) ## \w+ matches one or more word characters, so the whole string 'cat' matches.
print(re.fullmatch(r'\w+', '!all cat'))
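Because re.fullmatch requires the whole string to match, it is a natural fit for simple input validation. The helper below, is_word, is a hypothetical name used only for this sketch.

```python
import re


def is_word(candidate: str) -> bool:
    """Return True if candidate consists only of word characters (hypothetical helper)."""
    return re.fullmatch(r"\w+", candidate) is not None


print(is_word("cat"), is_word("!all cat"), is_word("cat "))

# match() only anchors at the start, so it accepts a string that
# fullmatch() rejects.
prefix_only = re.match(r"\w+", "cat runs") is not None
whole_string = re.fullmatch(r"\w+", "cat runs") is not None
print(prefix_only, whole_string)
```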
re.findall(pattern, string, flags=0)
- Returns a list of all non-overlapping matches of pattern in string.
re.finditer(pattern, string, flags=0)
- Same as the above method, but returns an iterator of match objects.
print(re.findall(r'cat', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs'))
print(re.findall(r'\w+', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs'))
## finditer returns an iterator of match objects, one for each match of the pattern in the string passed.
result = re.finditer(r'\w+', 'cat runs behind rat, dog goes behind cat and it goes like cats & dogs')
for obj in result:
    print(obj)
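One detail worth a quick sketch: when the pattern contains capturing groups, re.findall returns the groups rather than the whole match, while re.finditer always gives full match objects. The patterns below are our own examples on the same sentence.

```python
import re

text = "cat runs behind rat, dog goes behind cat and it goes like cats & dogs"

# No groups: a list of the matched strings.
plain = re.findall(r"cats?", text)

# Two groups: a list of tuples, one tuple per match.
pairs = re.findall(r"(\w+) behind (\w+)", text)
print(plain, pairs)

# finditer gives match objects, so positions are available as well.
spans = [m.span() for m in re.finditer(r"cat", text)]
print(spans)
```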
re.sub(pattern, repl, string, count=0, flags=0)
- Returns a new string in which the matches of pattern in string are replaced with repl. If count is nonzero, at most that many replacements are made. If there is no match, string is returned unchanged.
re.subn(pattern, repl, string, count=0, flags=0)
- Same as the above method, but returns a tuple whose first element is the modified string and whose second element is the number of replacements made.
print(re.sub(r'cat', 'kangaroo','cat runs behind rat, dog goes behind cat and it goes like cats & dogs'))
print(re.subn(r'cat', 'kangaroo','cat runs behind rat, dog goes behind cat and it goes like cats & dogs'))
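A few more ways re.sub might be used, sketched on the same sentence: limiting replacements with count, referring to groups in repl, and passing a function as repl. The shorter sample strings are ours.

```python
import re

text = "cat runs behind rat, dog goes behind cat and it goes like cats & dogs"

# count=1 replaces only the first match.
first_only = re.sub(r"cat", "kangaroo", text, count=1)
print(first_only)

# \1 and \2 in repl refer to the captured groups.
swapped = re.sub(r"(\w+) behind (\w+)", r"\2 ahead of \1", "dog goes behind cat")
print(swapped)

# repl can also be a function that receives each match object.
capitalised = re.sub(r"\b\w", lambda m: m.group().upper(), "cat and dog")
print(capitalised)

# subn also reports how many replacements were made.
new_text, replaced = re.subn(r"cat", "kangaroo", text)
print(replaced)
```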
re.escape(pattern)
- Escapes the special characters in pattern so that it can be used as a literal string inside the other functions mentioned above.
re.purge()
- Clears the regular expression cache.
re.error(msg, pattern=None, pos=None)
- The exception raised when a string passed to one of the functions above is not a valid regular expression, or when some other error occurs during compilation or matching. You can also create an instance yourself with a customised error msg.
re.escape('\n Lets check \t') ## It escapes special characters like space, \t, \n so that the user can avoid some work.
try:
    re.match('[+*', 'template matching')
except re.error as err:
    print('Invalid Regular Expression : ' + err.msg + '. Match operation failed.')
custom_err = re.error('Custom user-defined error message')
try:
    try:
        re.compile('*+')
    except re.error as err:
        raise custom_err
except re.error as err:
    print(err.msg)
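As a closing sketch, re.escape is most useful when literal text (for example, something a user typed) has to be searched for, and the re.error exception carries details about what went wrong. The user_input value below is a made-up example.

```python
import re

user_input = "1+1=2 (really?)"   # hypothetical text to be matched literally
haystack = "proof: 1+1=2 (really?)"

# Unescaped, + ? ( ) are treated as regex syntax and the literal text is not found.
unescaped = re.search(user_input, haystack)

# Escaped, every special character is matched literally.
escaped = re.search(re.escape(user_input), haystack)
print(unescaped, escaped.group())

# The exception object exposes the message, the pattern, and the position.
try:
    re.compile("[+*")
except re.error as err:
    failed_pattern = err.pattern
    print(err.msg, err.pattern, err.pos)
```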