Hello all,
I'm trying to match a group of characters, delimited by anything that does not belong in that group.
Case in point: "Couldn't understand a g**d*** word she was saying!"
I would like to match the "g**d***" sequence of letters.
For that I've tried to use a RegEx :
RegExp("(?<=[^a-zA-Z\*])"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
-or-
RegExp("(?<![a-zA-Z\*])"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
("escapeRegExp()" escapes the "*" to "\*")
The problem is that neither of the above seems to match anything.
When I use the following
RegExp("\ \b"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
(no space between the first two slashes. Had to insert it otherwise my newsclient turns it into a link)
all goes well - but for the problem that the partial "d***" is then matched before "g**d***", throwing everything off.
Question: if not the above, what /am/ I supposed to use as the "look behind assertion" ?
Regards,
Rudy Wieser
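For reference, a minimal sketch of the two patterns from the post above, run in an engine that supports look-behind assertions (an ES2018 feature). The post does not show `escapeRegExp`, so the version here is an assumed stand-in, and `buildPositive` / `buildNegative` are hypothetical names used only for illustration.

```javascript
// Assumed stand-in for the poster's escapeRegExp(): escapes regex metacharacters.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const text = "Couldn't understand a g**d*** word she was saying!";

// Positive look-behind: requires a non-group character *before* the match,
// so it can never match at the very start of the string.
function buildPositive(find) {
  return new RegExp("(?<=[^a-zA-Z*])" + escapeRegExp(find) + "(?=[^a-zA-Z*])", "gi");
}

// Negative look-behind: only requires that no group character precedes the match.
function buildNegative(find) {
  return new RegExp("(?<![a-zA-Z*])" + escapeRegExp(find) + "(?=[^a-zA-Z*])", "gi");
}

const fullPos = text.match(buildPositive("g**d***")); // ["g**d***"]
const fullNeg = text.match(buildNegative("g**d***")); // ["g**d***"]
const partial = text.match(buildNegative("d***"));    // null: "d***" is preceded by "*"

console.log(fullPos, fullNeg, partial);
```

In an engine without look-behind support, the `RegExp` constructor rejects both patterns instead of returning a regex that silently matches nothing.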
Look-behind is used to match patternA only where it follows (or does not
follow) patternB, without including patternB in the match result.
Your condition is simply to match "g**d***". You do not have a condition
that "g**d***" must follow, or must not follow, some other pattern.
For matching just
"alpha beta g****d*** eps".replace( /.*?\b([\w*]*\*[\w*]).*/, '$1' )
result: "g****d***"
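If more than one censored word can occur in a string, a related sketch (not taken from the post above) is to collect every run of letters and asterisks that contains at least one asterisk. The pattern and the name `findCensored` are assumptions for illustration only.

```javascript
// Hypothetical helper: return every run of letters/asterisks that
// contains at least one asterisk.
function findCensored(text) {
  return text.match(/[a-z*]*\*[a-z*]*/gi) || [];
}

console.log(findCensored("alpha beta g****d*** eps d*** zeta"));
// -> [ 'g****d***', 'd***' ]
```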
TNP,
In the time you have wasted trying to learn enough regexp, failing, and
posting here, you could have written it in C three times over...
Good idea ! Now all you have to do is tell me how I get that C code loaded and running in a browser's webpage ...
... Idiot.
Regards,
Rudy Wieser
In the time you have wasted trying to learn enough regexp, failing, and posting here, you could have written it in C three times over...
Google cgi-bin and AJAX
TNP,
Google cgi-bin and AJAX
Good idea ! Now a simple, locally run JS command is translated to
something *way* heavier and more complex, needing to communicate with a server
and going over two different frameworks to do the work. Oh wait, three including the actual search-and-replace program.
I can see it now : the server sends a full webpage to the browser, then the browser extracts all text parts from it one by one, sends each of them back
to the server and asks it to do a search-and-replace, after which the server sends that part back and has Ajax replace the involved text part in the browser. Rinse and repeat for a list of to-be-replaced words. One webpage, going around at least thrice.
Nahhh, I don't think so. It would be *much* simpler to just send the whole list of words and then have the server do everything at once and send the full updated webpage back. Using PHP for the search-and-replace of course.
... if the JS I posted would be coming from the/a server to begin with.
Which it doesn't. IOW, the above doesn't even apply.
Besides, JS itself is a good enough language to create a non-regexp solution in.
Regards,
Rudy Wieser
On Sat, 21 May 2022 11:20:22 +0200, R.Wieser wrote:
For that I've tried to use a RegEx :
RegExp("(?<=[^a-zA-Z\*])"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
-or-
RegExp("(?<![a-zA-Z\*])"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
RegExp("\ \b"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
RegExp("(?<=[^a-zA-Z\*])"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
There is a conceptual problem in this line near the beginning and the end.
The same problem leads to a program logic problem (although not a syntax error, which is why it is not so easily spotted) in the following line:
RegExp("\ \b"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
JJ wrote:
On Sat, 21 May 2022 11:20:22 +0200, R.Wieser wrote:
[…]
The same problem leads to a program logic problem (although not a syntax error, which is why it is not so easily spotted) in the following line:
RegExp("\ \b"+escapeRegExp(find)+"(?=[^a-zA-Z\*])",'gi')
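One possible reading of the point above (an assumption, since the post does not spell it out) is that escape sequences inside a string literal are processed before `RegExp` ever sees the pattern. The original post notes that the real code has two adjacent backslashes, i.e. `"\\b"`, which is the form that works. A short sketch of the difference:

```javascript
// In a string literal, "\*" is just "*", because \* is not a string escape.
console.log("\*" === "*");            // true

// In a string literal, "\b" is a backspace character (U+0008), not a word boundary.
console.log("\b" === "\u0008");       // true

// To hand the regex engine a word-boundary assertion, double the backslash.
const wrong = new RegExp("\bword", "i");   // looks for backspace + "word"
const right = new RegExp("\\bword", "i");  // looks for "word" at a word boundary

console.log(wrong.test("a word"));    // false
console.log(right.test("a word"));    // true
```

Inside a character class an unescaped `*` is literal anyway, so `"[^a-zA-Z\*]"` and `"[^a-zA-Z*]"` produce the same regex.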
I'm trying to match group of characters, delimited by anything that does not belong in that group.
Case in point: "Couldn't understand a g**d*** word she was saying!"
I would like to match the "g**d***" sequence of letters.
"Couldn't understand a 🔵🟡🔴🟢🔵🟡 word she was saying!"
I don't understand your look-behind discussion at all. If all you
want to do is to replace some string in another, it's mostly a
simple problem, with the only significant complexity coming from
needing to escape certain characters for a regular expression.
("g**d***", "🔵🟡🔴🟢🔵🟡")
("Couldn't understand a g**d*** word she was saying!")
Ahh, so this isn't about replacing the regex at all, but how to
organize a collection of search strings. Is that right?
Scott Sauyet wrote:
I don't understand your look-behind discussion at all. If all you
want to do is to replace some string in another, it's mostly a
simple problem, with the only significant complexity coming from
needing to escape certain characters for a regular expression.
Google "clbuttic mistake" and you'll know what the problem with that is.
:-)
("g**d***", "🔵🟡🔴🟢🔵🟡")
("Couldn't understand a g**d*** word she was saying!")
Now imagine that ("d***", "damn") is another replacement, which might be executed before the ("g**d***", "goddamn") one. I would be left with "g**damn". Not good.
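A sketch of one way to sidestep the ordering problem described here without any look-behind: walk over every maximal run of letters and asterisks and replace it only when the whole run is a known search string. The `replacements` table and the name `replaceTokens` are hypothetical; the character class `[a-zA-Z*]` is taken from the patterns in the first post.

```javascript
// Hypothetical replacement table; keys are stored in lower case.
const replacements = new Map([
  ["d***", "damn"],
  ["g**d***", "goddamn"],
]);

// Replace a run of letters/asterisks only when the *whole* run is a key
// in the table, so "d***" can never fire inside "g**d***".
function replaceTokens(text, table) {
  return text.replace(/[a-zA-Z*]+/g, (token) => {
    const hit = table.get(token.toLowerCase());
    return hit === undefined ? token : hit;
  });
}

console.log(replaceTokens("Couldn't understand a g**d*** word, d*** it!", replacements));
// -> "Couldn't understand a goddamn word, damn it!"
```

Because each run is looked up as a whole, the order of the entries does not matter, and a run such as "g**d***" with no entry of its own is left untouched even when "d***" is in the table.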
Imagine that I have "g**d***" somewhere in the text, but no replacement for it yet. But I already have "d***" as a searchstring. In that case that
"g**d***" should stay as it is.
I'm not sure what is meant here by "yet" and "already". Is there some temporal change at play?
Or possibly a few such examples?
I'm now getting intrigued, but I still don't really understand the
problem.
Also, is this an attempt to solve a real-world problem or is it mostly a way
to noodle with more advanced regexes?
Scott Sauyet wrote:
Or possibly a few such examples?
Of what ? I've given several examples (of the basic problem and how your approach does not quite work), but those are not the ones you seem to need.
I'm now getting intrigued, but I still don't really understand the
problem.
Ask yourself : does your approach solve the "this is a clbuttic mistake" problem ?
'Cause *that* is what I am after - just not for regular words as recognised by a RegExp (see "\w", "\W", "\b")
Also, is this an attempt to solve a real-world problem or is it mostly a way
to noodle with more advanced regexes?
Yes ...
https://notalwaysright.com/some-people-shouldnt-be-allowed-to-drive-or-go-out-in-public/258702/
... and yes
Discovering/learning how to solve certain problems is never wasted time to me.
Also, do notice that I provided the correct solution in my first post.
Only later I found out that the "look behind assertion" method itself isn't recognised by my FF v52 browser's JS. :-\
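For engines without look-behind, one common workaround is to capture the preceding delimiter (or the start of the string) in a group and write it back in the replacement. Look-ahead has been part of JavaScript regexes for much longer, so it can still guard the end of the match. A minimal sketch, with `replaceWithoutLookbehind` as a hypothetical name and `escapeRegExp` as an assumed stand-in for the poster's helper:

```javascript
// Assumed stand-in for the poster's escapeRegExp().
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Emulate (?<![a-zA-Z*]) by capturing either the start of the string or one
// delimiter character, and putting that capture back in front of the replacement.
function replaceWithoutLookbehind(text, find, replacement) {
  const re = new RegExp("(^|[^a-zA-Z*])" + escapeRegExp(find) + "(?![a-zA-Z*])", "gi");
  return text.replace(re, (match, before) => before + replacement);
}

const input = "Couldn't understand a g**d*** word she was saying!";
console.log(replaceWithoutLookbehind(input, "d***", "damn"));       // unchanged
console.log(replaceWithoutLookbehind(input, "g**d***", "goddamn")); // "... a goddamn word ..."
```

The trailing check stays a look-ahead so that it consumes nothing; two hits separated by a single delimiter, as in "d*** d***", are then both found.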
...Ask yourself : does your approach solve the "this is a clbuttic mistake"
problem ?
I don't think that anything can comfortably reverse the clbuttic
mistake.
What I think you're doing is trying to find a way to reverse the manual censoring of, say, `g**d***`, which is too unlikely to be intended text, and restore `goddamn`.
No, but there are different ways to answer the questions, "How do I
solve this real world problem?" and "How can I best use regex to solve
this problem?"
Finally, here's another attempt, using a combination of a negative look-behind before the target and a negative look-ahead after it.
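The code for this attempt is not preserved in this copy of the thread. A minimal sketch of what such a combination could look like (an assumption, not the poster's actual code; `makeMatcher` is a hypothetical name and look-behind support is required):

```javascript
// Assumed stand-in for an escaping helper.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Negative look-behind and negative look-ahead: the target may not be preceded
// or followed by a letter or an asterisk. Unlike (?=[^a-zA-Z*]), the negative
// look-ahead also succeeds at the very end of the string.
function makeMatcher(find) {
  return new RegExp("(?<![a-zA-Z*])" + escapeRegExp(find) + "(?![a-zA-Z*])", "gi");
}

console.log("g**d***".replace(makeMatcher("g**d***"), "goddamn")); // "goddamn"
console.log("g**d***".replace(makeMatcher("d***"), "damn"));       // "g**d***"
console.log("d***!".replace(makeMatcher("d***"), "damn"));         // "damn!"
```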
Scott Sauyet wrote:
I don't think that anything can comfortably reverse the clbuttic
mistake.
Then start by thinking how *not* to make it. 'Cause /that is all/ my problem is about.
I've already tried to explain it, but here it goes again :
When I'm replacing all instances of "ass" in a piece of text I DO NOT WANT
to see "classic" being converted into "clbuttic".
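For plain words, the difference can be sketched as follows (the sample sentence is made up for illustration):

```javascript
const text = "This ass is a classic example.";

// Naive substring replacement also hits the "ass" inside "classic".
const naive = text.split("ass").join("butt");

// \b restricts the match to whole words, which works here because
// "ass" consists only of \w characters.
const bounded = text.replace(/\bass\b/gi, "butt");

console.log(naive);   // "This butt is a clbuttic example."
console.log(bounded); // "This butt is a classic example."
```

`\b` does not help for a search string such as "d***", because `*` is not a `\w` character; that is why the patterns earlier in the thread define their own character class instead.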
No, but there are different ways to answer the questions, "How do I
solve this real world problem?" and "How can I best use regex to solve
this problem?"
The problem with the above is that you are second-guessing what I "really" want to know.
Don't.
Finally, here's another attempt, using a combination of a negative
look-behind before the target and a negative look-ahead after it.
Sigh.
Look at the subjectline. What does that tell you ?
) about the state of the world.
Also take a quick peek at the last few lines of my second message.