Jon Ribbens wrote:
On 2021-05-25, Michael Haufe (TNO) <tno@thenewobjective.com> wrote:
Since you are supporting comments, it seems like you could support
named matches without too much additional effort.
I'm not sure what you mean - named capture groups are already a standard
part of JavaScript, so I don't need to add them.
But only because you implicitly limit your runtime environment and target implementation; i.e. regardless of whether the feature is standardized, what matters
here is that you are targeting *only Google V8* JavaScript, thanks to
Node.js being the *only* runtime environment :)
I've just published an npm package: https://www.npmjs.com/package/verbose-regexp
It provides a way to use verbose regular expressions in JavaScript and TypeScript, similar to re.VERBOSE in Python. White-space
at the start and end of each line is ignored, as are newlines and anything following // to the end of the line.
It allows you to easily write multi-line regular expressions, and to make your regular expressions more self-documenting using formatting and
comments.
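For readers who want to see the idea in code, here is a minimal sketch of how such a template tag could work. It is not the package's actual source: the name `verbose` is hypothetical, and the comment stripping is deliberately naive (it would also cut a pattern that legitimately contains `//`).

```javascript
// Hypothetical sketch only; the real verbose-regexp package may work differently.
function verbose(strings, ...values) {
  // String.raw rebuilds the text from strings.raw, so \d stays \d
  // instead of being turned into d by escape processing.
  const raw = String.raw(strings, ...values)
  const source = raw
    .split('\n')
    // Drop everything from // to the end of the line, then trim white-space.
    .map(line => line.replace(/\/\/.*$/, '').trim())
    // Joining without a separator discards the newlines.
    .join('')
  return new RegExp(source)
}

const dateTime = verbose`
  (\d{4}-\d{2}-\d{2}) // date
  T                   // time separator
  (\d{2}:\d{2}:\d{2}) // time
`
console.log(dateTime.test('2021-05-25T12:34:56')) // true
```

Because the function is used as a tag, it has access to the raw text of the literal, which is the property the later replies in this thread turn on.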
[…]
Any comments or thoughts would be appreciated.
On 2021-05-25, Michael Haufe (TNO) <tno@thenewobjective.com> wrote:
Since you are supporting comments, it seems like you could support
named matches without too much additional effort.
I'm not sure what you mean - named capture groups are already a standard
part of JavaScript, so I don't need to add them.
[…] adding named capture groups to JavaScript implementations
that don't already have them would transform the project from a neat
9-line template-string trick into a complete re-implementation of
the JavaScript regular expression engine, which would be a whole
different project...
I've just published an npm package: https://www.npmjs.com/package/verbose-regexp
[ ... ]
It allows you to easily write multi-line regular expressions, and to make your regular expressions more self-documenting using formatting and comments.
[ ... ]
You can use regular expression flags by accessing them as a property of rx, e.g.:
const alpha = rx.i`[a-z]+`
I don't see a great advantage to
```
const dateTime = rx.gi`
(\d{4}-\d{2}-\d{2}) // date
T // time separator
(\d{2}:\d{2}:\d{2}) // time
`
```
over
```
const dateTime = rx (`
(\d{4}-\d{2}-\d{2}) // date
T // time separator
(\d{2}:\d{2}:\d{2}) // time
`, 'gi')
```
especially when I have to remember to include the flags in alphabetic
order and when it won't automatically update if and when the underlying
regex engine includes new flags. Is there a compelling advantage to
this?
Scott Sauyet wrote:
An ill-considered alternative.
The function would receive the string after
escape processing, i.e. given rx(`\d{4}`, 'g') it would receive 'd{4}'.
You'd have to do rx(String.raw`\d{4}`, 'g') which is starting to become
very ugly and verbose.
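The escape-processing point can be checked directly. This is a small demonstration using only standard JavaScript; the variable names are just for illustration.

```javascript
// An untagged template literal is escape-processed before any function sees it,
// so the backslash in \d is dropped.
const cooked = `\d{4}`
// String.raw (like any tag function reading strings.raw) keeps the backslash.
const raw = String.raw`\d{4}`

console.log(cooked) // d{4}
console.log(raw)    // \d{4}

// The cooked pattern now means "four letter d's", not "four digits".
console.log(new RegExp(cooked).test('2021')) // false
console.log(new RegExp(cooked).test('dddd')) // true
console.log(new RegExp(raw).test('2021'))    // true
```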
As an aside, personally I prefer the flags being up front anyway
- it's annoying reading a long regular expression, reaching the
end, and finding that you now need to go back and read it all again,
because there's a modifier flag appended that changes its meaning.
Jon Ribbens wrote:
Scott Sauyet wrote:
An ill-considered alternative.
The function would receive the string after
escape processing, i.e. given rx(`\d{4}`, 'g') it would receive 'd{4}'.
You'd have to do rx(String.raw`\d{4}`, 'g') which is starting to become
very ugly and verbose.
Ah yes. Obviously I had not considered the problem thoroughly. This
does feel like an elegant solution... except for those 127 functions!
As an aside, personally I prefer the flags being up front anyway
- it's annoying reading a long regular expression, reaching the
end, and finding that you now need to go back and read it all again,
because there's a modifier flag appended that changes its meaning.
Agreed, although I don't feel it to be a big deal. I usually scan for
the end of the regex before I even start to analyze it. But they would definitely be better up front.
If I have some extra time in the next few days, I'll spend part of it
trying to create an alternative to the way flags are handled.
Scott Sauyet wrote:
Jon Ribbens wrote:
Ah yes. Obviously I had not considered the problem thoroughly. This
does feel like an elegant solution... except for those 127 functions!
I mean I'm with you on that to an extent, but as overheads go,
given the general massive overhead of using JavaScript rather than,
say, C, it's unnoticeable in the end.
[…] sort of thing that I would probably do voluntarily even if not forced
to ;-)
Obviously if JavaScript does add further flags to RegExps then there
are downsides including (a) needing to release a new version of the
module and (b) O(2^n) in memory and load time. But it seems unlikely
this would be a weekly occurrence...
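To put numbers on the growth, here is a rough sketch that enumerates every non-empty flag combination with the letters in alphabetical order. It assumes the seven flags d, g, i, m, s, u and y, which is consistent with the 127 figure mentioned above; `flagCombinations` is a hypothetical helper, not part of the package.

```javascript
// Enumerate all non-empty subsets of the given flag letters,
// each written with its letters in alphabetical order.
function flagCombinations(flags) {
  const letters = [...flags].sort()
  const combos = []
  // Each bit of mask decides whether the corresponding letter is included.
  for (let mask = 1; mask < (1 << letters.length); mask++) {
    combos.push(letters.filter((_, i) => mask & (1 << i)).join(''))
  }
  return combos
}

const combos = flagCombinations('dgimsuy')
console.log(combos.length)         // 127
console.log(combos.includes('gi')) // true

// One more flag (written x here purely as a placeholder) doubles the count.
console.log(flagCombinations('dgimsuyx').length) // 255
```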
[ ... ]
If I have some extra time in the next few days, I'll spend part of it
trying to create an alternative to the way flags are handled.
The only options I can think of off the top of my head are:
(a) properties (like it is now)
(b) a function (e.g. rx('gi')`foo`)
(c) including the flags in the string parameter
Whatever you suggest I'd want it to be backwards compatible with the
current release, which I think all of the above could be. In principle
it could support all of (a), (b), and (c) at once ;-)
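As a rough illustration of how (a) and (b) might coexist, here is a sketch under stated assumptions. It is not the package's implementation: `build` and `tagOrFlags` are hypothetical names, the comment stripping is naive, and the use of a Proxy to resolve flag properties on demand (instead of predefining every combination) is my own substitution rather than anything proposed in this thread. Option (c) is not covered.

```javascript
// Strip // comments and surrounding white-space, then compile with the given flags.
function build(flags, strings, values) {
  const source = String.raw(strings, ...values)
    .split('\n')
    .map(line => line.replace(/\/\/.*$/, '').trim())
    .join('')
  return new RegExp(source, flags)
}

function tagOrFlags(first, ...rest) {
  // (b) rx('gi')`...`: a plain string is treated as a flag list and a tag is returned.
  if (typeof first === 'string') {
    return (strings, ...values) => build(first, strings, values)
  }
  // rx`...`: called directly as a tag, with no flags.
  return build('', first, rest)
}

// (a) rx.gi`...`: flag-like property names are resolved lazily.
const rx = new Proxy(tagOrFlags, {
  get(target, prop, receiver) {
    if (typeof prop === 'string' && /^[dgimsuy]+$/.test(prop)) {
      return tagOrFlags(prop)
    }
    return Reflect.get(target, prop, receiver)
  }
})

const viaProperty = rx.gi`
  [a-z]+ // letters
`
const viaFunction = rx('gi')`[a-z]+`
console.log(viaProperty.flags, viaFunction.flags) // gi gi
```

With this approach the flag letters need not be typed in alphabetical order, since the RegExp constructor accepts them in any order.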