How to Use Lookahead, Lookbehind, and Negative Lookahead in Python Regular Expressions, Explained

Date: 2021-05-22

Introduction

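In many cases, pieces of the text being matched must either appear together or not appear at all. Paired delimiters such as 《》 and < > are an example: using only one half is not valid. A regular expression can enforce this consistency, so that either both angle brackets are present or neither is. The syntax for this is the lookahead assertion (?=pattern). It checks whether pattern matches the text immediately following the current position; if it does not, the overall match fails. The assertion consumes no input characters: it only looks, and matching then continues from the same position. (The corresponding lookbehind assertion, which checks the text before the current position, is written (?<=pattern).)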

An example:

# python 3.6
# 蔡军生
# http://blog.csdn.net/caimouse/article/details/51749579
#
import re

# A raw string keeps escapes such as \w and \s intact for the regex engine.
address = re.compile(
    r'''
    # A name is made up of letters, and may include "."
    # for title abbreviations and middle initials.
    ((?P<name>
       ([\w.,]+\s+)*[\w.,]+
     )
     \s+
    ) # name is no longer optional

    # LOOKAHEAD
    # Email addresses are wrapped in angle brackets, but only
    # if both are present or neither is.
    (?= (<.*>$)       # remainder wrapped in angle brackets
        |
        ([^<].*[^>]$) # remainder *not* wrapped in angle brackets
      )

    <? # optional opening angle bracket

    # The address itself: username@domain.tld
    (?P<email>
      [\w\d.+-]+       # username
      @
      ([\w\d.]+\.)+    # domain name prefix
      (com|org|edu)    # limit the allowed top-level domains
    )

    >? # optional closing angle bracket
    ''',
    re.VERBOSE)

candidates = [
    'First Last <first.last@example.com>',
    'No Brackets first.last@example.com',
    'Open Bracket <first.last@example.com',
    'Close Bracket first.last@example.com>',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print(' Name :', match.groupdict()['name'])
        print(' Email:', match.groupdict()['email'])
    else:
        print(' No match')

The output is:

Candidate: First Last <first.last@example.com>
 Name : First Last
 Email: first.last@example.com
Candidate: No Brackets first.last@example.com
 Name : No Brackets
 Email: first.last@example.com
Candidate: Open Bracket <first.last@example.com
 No match
Candidate: Close Bracket first.last@example.com>
 No match
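
To see in isolation that a lookahead only checks the following text and does not consume it, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not part of the original example.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = 'price: 100USD'

# Without a lookahead, the unit becomes part of the match.
plain = re.search(r'\d+USD', text)

# With a lookahead, the unit must follow the digits,
# but it is not included in the match.
ahead = re.search(r'\d+(?=USD)', text)

print(plain.group())        # 100USD
print(ahead.group())        # 100
# The text after the match still starts with the unit,
# so a later pattern could match it again.
print(text[ahead.end():])   # USD
```

Because the assertion is zero-width, the address pattern above can inspect the whole remainder of the line for balanced angle brackets and then go on to match the same characters as the email address.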

Using the negative lookahead pattern in Python regular expressions

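The previous section covered the positive lookahead (?=pattern), where the equals sign means the pattern must match. Lookahead also has a negative form. Suppose you want to recognize addresses such as noreply@example.com, which usually do not accept replies, and discard them. For this you use the negative lookahead, written (?!pattern); the exclamation mark means "not". Given the string noreply@example.com, the assertion checks whether the text at the current position starts with noreply@; if it does, the assertion fails and the address is not matched.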

An example:

# python 3.6
# 蔡军生
# http://blog.csdn.net/caimouse/article/details/51749579
#
import re

# A raw string keeps escapes such as \w and \d intact for the regex engine.
address = re.compile(
    r'''
    ^

    # An address: username@domain.tld

    # Ignore noreply addresses
    (?!noreply@.*$)

    [\w\d.+-]+       # username
    @
    ([\w\d.]+\.)+    # domain name prefix
    (com|org|edu)    # limit the allowed top-level domains

    $
    ''',
    re.VERBOSE)

candidates = [
    'first.last@example.com',
    'noreply@example.com',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print(' Match:', candidate[match.start():match.end()])
    else:
        print(' No match')

The output is:

Candidate: first.last@example.com
 Match: first.last@example.com
Candidate: noreply@example.com
 No match
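
As a smaller illustration of the same idea, the sketch below uses a negative lookahead to filter a list of addresses. The address shape is deliberately loose, and the third sample address and all names are assumptions made for this sketch. It also suggests that the shorter assertion (?!noreply@) is enough here, since the trailing .*$ in the example above does not change which strings are rejected.

```python
import re

# Reject strings that start with "noreply@"; accept anything else that
# looks roughly like name@host. This loose shape is an assumption.
not_noreply = re.compile(r'^(?!noreply@)[\w.+-]+@[\w.-]+$')

# Hypothetical sample addresses.
addresses = [
    'first.last@example.com',
    'noreply@example.com',
    'noreply.archive@example.com',
]

# Keep only the addresses for which the negative lookahead succeeds.
kept = [a for a in addresses if not_noreply.search(a)]
print(kept)
```

Only noreply@example.com is dropped. The address noreply.archive@example.com is kept, because the assertion fails only when the text at that position is exactly noreply@.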

Summary

That is all for this article. I hope it serves as a useful reference for your study or work. If you have questions, feel free to leave a comment.
