std::match_results
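(Match results are stored in this object.)
result[0] is the entire matched text, and result[1] is the text matched by the first capture group. If the regular expression has n capture groups, a successful match gives match_results a size of n+1.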
This is a specialized allocator-aware container. It can only be default created, obtained from std::regex_iterator, or modified by std::regex_search or std::regex_match. Because std::match_results holds std::sub_matches, each of which is a pair of iterators into the original character sequence that was matched, it’s undefined behavior to examine std::match_results if the original character sequence was destroyed or iterators to it were invalidated for other reasons.
Type | Definition |
---|---|
std::cmatch | std::match_results<const char*> |
std::wcmatch | std::match_results<const wchar_t*> |
std::smatch | std::match_results<std::string::const_iterator> |
std::wsmatch | std::match_results<std::wstring::const_iterator> |
std::pmr::cmatch (C++17) | std::pmr::match_results<const char*> |
std::pmr::wcmatch (C++17) | std::pmr::match_results<const wchar_t*> |
std::pmr::smatch (C++17) | std::pmr::match_results<std::string::const_iterator> |
std::pmr::wsmatch (C++17) | std::pmr::match_results<std::wstring::const_iterator> |
std::sub_match
Used to inspect the individual results stored in match_results.
The class template std::sub_match is used by the regular expression engine to denote sequences of characters matched by marked sub-expressions.
regex_match
Returns true if a match exists, false otherwise.
#include <iostream>
#include <regex>
#include <string>

int main()
{
    // Simple regular expression matching
    const std::string fnames[] = {"foo.txt", "bar.txt", "baz.dat", "zoidberg"};
    const std::regex txt_regex("[a-z]+\\.txt");

    for (const auto &fname : fnames)
        std::cout << fname << ": " << std::regex_match(fname, txt_regex) << '\n';
    /*
    foo.txt: 1
    bar.txt: 1
    baz.dat: 0
    zoidberg: 0
    */

    // Extraction of a sub-match
    const std::regex base_regex("([a-z]+)\\.txt");
    std::smatch base_match;

    for (const auto &fname : fnames)
    {
        if (std::regex_match(fname, base_match, base_regex))
        {
            // The first sub_match is the whole string; the next
            // sub_match is the first parenthesized expression.
            if (base_match.size() == 2)
            {
                std::ssub_match base_sub_match = base_match[1];
                std::string base = base_sub_match.str();
                std::cout << fname << " has a base of " << base << '\n';
            }
        }
    }
    /*
    foo.txt has a base of foo
    bar.txt has a base of bar
    */

    // Extraction of several sub-matches
    const std::regex pieces_regex("([a-z]+)\\.([a-z]+)");
    std::smatch pieces_match;

    for (const auto &fname : fnames)
    {
        if (std::regex_match(fname, pieces_match, pieces_regex))
        {
            std::cout << fname << '\n';
            for (size_t i = 0; i < pieces_match.size(); ++i)
            {
                std::ssub_match sub_match = pieces_match[i];
                std::string piece = sub_match.str();
                std::cout << " submatch " << i << ": " << piece << '\n';
            }
        }
    }
}
/*
foo.txt
 submatch 0: foo.txt
 submatch 1: foo
 submatch 2: txt
bar.txt
 submatch 0: bar.txt
 submatch 1: bar
 submatch 2: txt
baz.dat
 submatch 0: baz.dat
 submatch 1: baz
 submatch 2: dat
*/
regex_search
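std::regex_search searches the target for the regular expression, but it does not require the entire character sequence to match. It also performs only a single search: it stops at the first match found and does not continue searching for further matches.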
Determines if there is a match between the regular expression e and some subsequence in the target character sequence.
1- Analyzes generic range [first, last). Match results are returned in m.
2- Analyzes a null-terminated string pointed to by str. Match results are returned in m.
3- Analyzes a string s. Match results are returned in m.
4-6- Equivalent to (1-3), just omits the match results.
7- Deleted overload: overload (3) is prohibited from accepting temporary strings, because otherwise this function would populate match_results m with string iterators that become invalid immediately.
regex_search will successfully match any subsequence of the given sequence, whereas std::regex_match will only return true if the regular expression matches the entire sequence.
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string lines[] = {"Roses are #ff0000",
                           "violets are #0000ff",
                           "all of my base are belong to you"};

    std::regex color_regex("#([a-f0-9]{2})"
                           "([a-f0-9]{2})"
                           "([a-f0-9]{2})");

    // simple match
    for (const auto &line : lines) {
        std::cout << line << ": " << std::boolalpha
                  << std::regex_search(line, color_regex) << '\n';
    }
    std::cout << '\n';

    // show contents of marked subexpressions within each match
    std::smatch color_match;
    for (const auto& line : lines) {
        if (std::regex_search(line, color_match, color_regex)) {
            std::cout << "matches for '" << line << "'\n";
            std::cout << "Prefix: '" << color_match.prefix() << "'\n";
            for (size_t i = 0; i < color_match.size(); ++i)
                std::cout << i << ": " << color_match[i] << '\n';
            std::cout << "Suffix: '" << color_match.suffix() << "\'\n\n";
        }
    }

    // repeated search (see also std::regex_iterator)
    // The fields are tab-separated, which is what the \t in the pattern expects.
    std::string log("Speed:\t366\nMass:\t35\nSpeed:\t378\nMass:\t32\nSpeed:\t400\nMass:\t30");
    std::regex r(R"(Speed:\t\d*)");
    std::smatch sm;
    while (std::regex_search(log, sm, r)) {
        std::cout << sm.str() << '\n';
        log = sm.suffix();   // continue searching in the unmatched remainder
    }

    // C-style string demo
    std::cmatch cm;
    if (std::regex_search("this is a test", cm, std::regex("test")))
        std::cout << "\nFound " << cm[0] << " at position " << cm.prefix().length();
}
std::regex_replace
1) Copies characters in the range [first, last) to out, replacing any sequences that match re with characters formatted by fmt. In other words:
- Constructs a std::regex_iterator object i as if by std::regex_iterator<BidirIt, CharT, traits> i(first, last, re, flags), and uses it to step through every match of re within the sequence [first, last).
- For each such match m, copies the non-matched subsequence (m.prefix()) into out as if by out = std::copy(m.prefix().first, m.prefix().second, out) and then replaces the matched subsequence with the formatted replacement string as if by calling out = m.format(out, fmt, flags).
- When no more matches are found, copies the remaining non-matched characters to out as if by out = std::copy(last_m.suffix().first, last_m.suffix().second, out), where last_m is a copy of the last match found.
- If there are no matches, copies the entire sequence into out as-is, by out = std::copy(first, last, out).
- If flags contains std::regex_constants::format_no_copy, the non-matched subsequences are not copied into out.
- If flags contains std::regex_constants::format_first_only, only the first match is replaced.
2) Same as 1), but the formatted replacement is performed as if by calling out = m.format(out, fmt, fmt + std::char_traits<CharT>::length(fmt), flags).
3-4) Constructs an empty string result of type std::basic_string<CharT, ST, SA> and calls std::regex_replace(std::back_inserter(result), s.begin(), s.end(), re, fmt, flags).
5-6) Constructs an empty string result of type std::basic_string<CharT> and calls std::regex_replace(std::back_inserter(result), s, s + std::char_traits<CharT>::length(s), re, fmt, flags).
Return value
1-2) Returns a copy of the output iterator out after all the insertions.
3-6) Returns the string result which contains the output.
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
    std::string text = "Quick brown fox";
    std::regex vowel_re("a|e|i|o|u");

    // write the results to an output iterator
    std::regex_replace(std::ostreambuf_iterator<char>(std::cout),
                       text.begin(), text.end(), vowel_re, "*");

    // construct a string holding the results
    std::cout << '\n' << std::regex_replace(text, vowel_re, "[$&]") << '\n';
}
std::regex_iterator
It is the programmer’s responsibility to ensure that the std::basic_regex object passed to the iterator’s constructor outlives the iterator. Because the iterator stores a pointer to the regex, incrementing the iterator after the regex was destroyed accesses a dangling pointer.
If the part of the regular expression that matched is just an assertion (^, $, \b, \B), the match stored in the iterator is a zero-length match, that is, match[0].first == match[0].second.
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
    const std::string s = "Quick brown fox.";

    std::regex words_regex("[^\\s]+");
    auto words_begin =
        std::sregex_iterator(s.begin(), s.end(), words_regex);
    auto words_end = std::sregex_iterator();

    std::cout << "Found "
              << std::distance(words_begin, words_end)
              << " words:\n";

    for (std::sregex_iterator i = words_begin; i != words_end; ++i)
    {
        std::smatch match = *i;
        std::string match_str = match.str();
        std::cout << match_str << '\n';
    }
}