In a couple recent versions of Python (including 3.8 and 3.10), the following code:
import re
print(re.sub(".*", "replacement", "pattern"))
yields the output "replacementreplacement".
This behavior does not occur in 3.6.
Which behavior is the desired one? Perhaps relatedly, I noticed that even
in 3.6, the code
print(re.findall(".*","pattern"))
yields ['pattern',''] which is not what I was expecting.
Alexander Richert - NOAA Affiliate via Python-list wrote on 28/12/2022 at 19:42:
[...]
print(re.sub(".*", "replacement", "pattern"))
yields the output "replacementreplacement".
[...]
The documentation for re.sub() and re.findall() has these notes:
"Changed in version 3.7: Empty matches for the pattern are replaced
when adjacent to a previous non-empty match." and "Changed in version
3.7: Non-empty matches can now start just after a previous empty match."
That probably describes the behavior you're seeing. ".*" first
matches "pattern", which is a non-empty match; then it matches the
empty string at the end, which is an empty match but is replaced
because it is adjacent to a non-empty match.
Seems somewhat counter-intuitive to me, but AFAICS it's the intended behavior.
For what it's worth, there's some discussion about this in this Github [...]
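To make the two matches visible, here is a small sketch that lists the span of every match of ".*" in the string. It only uses re.finditer(), re.findall() and re.sub() from the standard library; the variable names are just for illustration, and the results noted in the comments assume Python 3.7 or later.

```python
import re

text = "pattern"

# Collect the (start, end) span of every match of ".*" in the text.
# On Python 3.7+ this gives [(0, 7), (7, 7)]: the whole string,
# then an empty match at the very end.
spans = [m.span() for m in re.finditer(".*", text)]
print(spans)

# findall() reports the same two matches as strings: ['pattern', '']
found = re.findall(".*", text)
print(found)

# sub() replaces both matches, so the replacement appears twice.
result = re.sub(".*", "replacement", text)
print(result)
```

The second span, (7, 7), is the empty match that sits right after the non-empty one, which is the case the 3.7 documentation note is talking about.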
On 2022-12-28 18:42, Alexander Richert - NOAA Affiliate via Python-list wrote:
[...]
print(re.sub(".*", "replacement", "pattern"))
yields the output "replacementreplacement".
It's not a bug, it's a change in behaviour to bring it more into line with other regex implementations in other languages.
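If a single replacement is what you actually want, a few possible ways to get it are sketched below. None of these come from the thread itself; they are just options that follow from how the empty match at the end arises, and which one fits depends on what the pattern is meant to do.

```python
import re

text = "pattern"

# Option 1: require at least one character, so the empty match
# at the end of the string cannot occur.
with_plus = re.sub(".+", "replacement", text)

# Option 2: anchor the pattern at the start of the string; the
# empty match at the end is not at the start, so it is skipped.
anchored = re.sub("^.*", "replacement", text)

# Option 3: keep ".*" but allow only the first replacement.
first_only = re.sub(".*", "replacement", text, count=1)

print(with_plus, anchored, first_only)
```

Note that the options are not equivalent for an empty input string: ".+" does not match it at all, while "^.*" and ".*" with count=1 still replace the one empty match.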