Python: Notes on Processing Strings with Regex Groups

devtools/2024/10/21 3:35:58/

Matching strings and using groups with Python regular expressions

Importing the regular expression module

import re

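Basic matching

Use re.match(), re.search(), and re.findall() for basic string matching.

re.match() matches only at the beginning of the string.
re.search() scans the string and returns the first match at any position.
re.findall() returns a list of all non-overlapping matching substrings.

Example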

pattern = r'\d+'  # match one or more digits
text = "There are 123 apples and 456 oranges."

# Match at the beginning of the string
match = re.match(pattern, text)
if match:
    print("Match found:", match.group())
else:
    print("No match at the beginning of the string.")

# Match anywhere in the string
search = re.search(pattern, text)
if search:
    print("Search found:", search.group())

# Return all matching substrings
findall = re.findall(pattern, text)
print("Findall found:", findall)
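The example above only shows re.match() failing, because the text starts with a word. As a small complementary sketch (the two input strings are made up for illustration), the following contrasts the two functions on inputs that do and do not start with digits:

```python
import re

pattern = r'\d+'

# re.match() succeeds only when the pattern matches at position 0.
starts_with_digits = re.match(pattern, "123 apples")
starts_with_text = re.match(pattern, "apples 123")

# re.search() scans forward and returns the first match anywhere.
first_anywhere = re.search(pattern, "apples 123")

print(starts_with_digits.group())  # 123
print(starts_with_text)            # None
print(first_anywhere.group())      # 123
```

Both functions return None when nothing matches, which is why the result is checked before calling group().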

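Group matching

Use parentheses () in a regular expression to define groups. Each group captures the part of the match that its sub-pattern matched, and groups are numbered from 1 in the order of their opening parentheses.

Example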

pattern = r'(\d+) apples and (\d+) oranges'
text = "There are 123 apples and 456 oranges."

match = re.search(pattern, text)
if match:
    print("Group 1:", match.group(1))
    print("Group 2:", match.group(2))
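Two related behaviours are worth seeing next to the example above: group(0) is the whole match, groups() returns all captured groups as a tuple, and re.findall() returns tuples once a pattern has more than one group. A minimal sketch (the second pattern is an illustrative variation, not from the original example):

```python
import re

pattern = r'(\d+) apples and (\d+) oranges'
text = "There are 123 apples and 456 oranges."

match = re.search(pattern, text)
if match:
    print(match.group(0))  # 123 apples and 456 oranges
    print(match.groups())  # ('123', '456')

# With more than one group, re.findall() returns a list of tuples,
# one tuple of group values per match.
pairs = re.findall(r'(\d+) (apples|oranges)', text)
print(pairs)  # [('123', 'apples'), ('456', 'oranges')]
```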

Named groups

The (?P<name>...) syntax gives a group a name, so it can be retrieved by name instead of by number.

Example

pattern = r'(?P<apples>\d+) apples and (?P<oranges>\d+) oranges'
text = "There are 123 apples and 456 oranges."

match = re.search(pattern, text)
if match:
    print("Apples:", match.group('apples'))
    print("Oranges:", match.group('oranges'))
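When several named groups are needed at once, groupdict() returns them as a dictionary. The sketch below reuses the pattern above and converts the values to integers; the variable name counts is only illustrative:

```python
import re

pattern = r'(?P<apples>\d+) apples and (?P<oranges>\d+) oranges'
text = "There are 123 apples and 456 oranges."

match = re.search(pattern, text)
counts = {}
if match:
    # groupdict() maps each group name to the captured string.
    counts = {name: int(value) for name, value in match.groupdict().items()}
    # Named groups keep their positional number as well.
    same_value = match.group(1) == match.group('apples')

print(counts)  # {'apples': 123, 'oranges': 456}
```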

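Capturing groups for every match

re.finditer() returns an iterator of match objects, one per non-overlapping match, so the groups of every match can be accessed.

Example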

pattern = r'(\d+)'
text = "There are 123 apples and 456 oranges."

matches = re.finditer(pattern, text)
for match in matches:
    print("Match found:", match.group())
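Because each item is a full match object, the individual groups and the match position are available too. A small sketch, assuming a two-group pattern (quantity and fruit) chosen for illustration:

```python
import re

pattern = r'(\d+)\s+(apples|oranges)'
text = "There are 123 apples and 456 oranges."

found = []
for m in re.finditer(pattern, text):
    # group(1) is the number, group(2) the fruit, span() the (start, end) offsets.
    found.append((m.group(2), int(m.group(1)), m.span()))

for fruit, quantity, (start, end) in found:
    print(fruit, quantity, repr(text[start:end]))
```

Unlike re.findall(), this keeps the offsets of each match, which helps when the surrounding text needs to be inspected or edited.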

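Replacing matched substrings

Use re.sub() to replace matched substrings. In the replacement string, \1 refers to the text captured by group 1.

Example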

pattern = r'(\d+) apples'
text = "There are 123 apples and 456 apples."

# Replace 'apples' with 'pears' in every match, keeping the number via \1
result = re.sub(pattern, r'\1 pears', text)
print("Substitution result:", result)
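The same replacement can be written with a named group, referenced as \g<name> in the replacement string, and the optional count argument limits how many matches are replaced. A minimal sketch, where the group name qty is an assumption made for illustration:

```python
import re

pattern = r'(?P<qty>\d+) apples'
text = "There are 123 apples and 456 apples."

# \g<qty> inserts the text captured by the named group.
all_replaced = re.sub(pattern, r'\g<qty> pears', text)

# count=1 replaces only the first match.
first_only = re.sub(pattern, r'\g<qty> pears', text, count=1)

print(all_replaced)  # There are 123 pears and 456 pears.
print(first_only)    # There are 123 pears and 456 apples.
```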

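A more complex example

Groups and substitution can be combined to handle more complex text. When the replacement passed to re.sub() is a function, it is called with each match object and its return value is used as the replacement.

Example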

pattern = r'(\d+)\s+(apples|oranges)'
text = "There are 123 apples and 456 oranges."

def repl(match):
    # Double the captured quantity and keep the fruit name unchanged
    quantity = int(match.group(1))
    fruit = match.group(2)
    return f'{quantity * 2} {fruit}'

result = re.sub(pattern, repl, text)
print("Complex substitution result:", result)
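If the number of replacements matters as well, re.subn() takes the same arguments and returns both the new string and the count. The sketch below is a variation on the example above using named groups; the names quantity, fruit, and double_quantity are illustrative choices:

```python
import re

pattern = r'(?P<quantity>\d+)\s+(?P<fruit>apples|oranges)'
text = "There are 123 apples and 456 oranges."

def double_quantity(match):
    # Called once per match; the returned string replaces that match.
    quantity = int(match.group('quantity'))
    return f"{quantity * 2} {match.group('fruit')}"

result, replacements = re.subn(pattern, double_quantity, text)
print(result)        # There are 246 apples and 912 oranges.
print(replacements)  # 2
```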