Is there a systematic way to discard the extra noise that can occur
when using parser combinators? Take, for example, the `many` combinator,
which matches zero or more instances of its argument parser.
In the case of zero matches, it still needs to return a value.
In my situation, I'm simulating everything in PostScript because it's
my favorite language. I'm simulating Lisp cons cells as 2-element
arrays. So for this JSON string,
( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
if I make no special effort, I get a resulting value that looks like this:
OK
[[3 [[4 [[5 [[] []]] []]] []]] []]
remainder:[]
All those little empty arrays need to just go away, but not any of the important array structure.
`many` and `maybe` seem to be the chief
culprits, but then their results are propagated back by `alt`s and
`then`s all the way back to the top.
Do I need to make some kind of out-of-band signal for these "zeros"
that I can filter out later? The obvious problem here is that the array
type is being used for too many things. But there's a paucity of
types in PostScript, sigh. For the JSON application, I have nametype
objects available that don't have a JSON counterpart.
Do I need to rewrite all the combinators to filter out noise values at
every turn?
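
A minimal sketch of the out-of-band idea, assuming the "zero" results are marked with a name object. The sentinel `/zero` and the helpers `is-zero` and `filter-zeros` are hypothetical names chosen for illustration and are not taken from the posted code. It shows a single cleanup pass over the finished result rather than a change to every combinator.

```postscript
% Hypothetical sentinel: the name /zero stands for "matched nothing".
% The type check matters because `eq` treats a string and a name with
% the same characters as equal, so (zero) /zero eq would be true.
/is-zero {                  % any  ->  bool
    dup type /nametype eq
    { /zero eq } { pop false } ifelse
} def

% Copy an array, dropping every /zero and recursing into nested arrays.
/filter-zeros {             % array  ->  array'
    [ exch                  % mark array
    {                       % mark ... elem
        dup is-zero
        { pop }
        { dup type /arraytype eq { filter-zeros } if }
        ifelse
    } forall
    ]
} def

[ 3 /zero [ 4 /zero [ 5 ] ] /zero ] filter-zeros   % leaves [ 3 [ 4 [ 5 ] ] ]
```

Because JSON has no name type, a sentinel like this cannot collide with a parsed value, provided the filter checks the type before comparing.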
luserdroog <mij...@yahoo.com> writes:
> Is there a systematic way to discard the extra noise that can occur
> when using parser combinators? For example, the `many` combinator
> which matches zero or more instances of its argument parser.
> In the case of zero matches, it still needs to return a value.
> In my situation, I'm simulating everything in PostScript because it's
> my favorite language. I'm simulating Lisp cons cells as 2-element
> arrays. So for this JSON string,
> ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
> if I make no special effort, I get a resulting value that looks like this:
> OK
> [[3 [[4 [[5 [[] []]] []]] []]] []]
> remainder:[]
> All those little empty arrays need to just go away, but not any of the
> important array structure.

So you want
[[3 [[4 [[5 []]]]]]]
?

> `many` and `maybe` seem to be the chief
> culprits, but then their results are propagated back by `alt`s and
> `then`s all the way back to the top.
> Do I need to make some kind of out-of-band signal for these "zeros"
> that I can filter out later? The obvious problem here is that the array
> type is being used for too many things. But there's a paucity of
> types in PostScript, sigh. For the JSON application, I have nametype
> objects available that don't have a JSON corollary.
> Do I need to rewrite all the combinators to filter out noise values at
> every turn?.

It's odd to call something that you are returning (presumably) as
noise. Are you using lists as a sort of Maybe monad with [] as Nothing?
I think you'd have to show the code to get anything more concrete as a
reply.
--
Ben.
luserdroog <mij...@yahoo.com> writes:
> Is there a systematic way to discard the extra noise that can occur
> when using parser combinators? For example, the `many` combinator
> which matches zero or more instances of its argument parser.
> In the case of zero matches, it still needs to return a value.

I'd expect 'many' to return a list, an empty list in the case of zero
matches. What is the extra noise? Your PostScript example is
confusing. I'd expect ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse to give
something like [3, [4, [5]]], using square brackets to denote lists.
I didn't know parser combinators were even a thing in PostScript: or are
you trying to implement them? You could look at the Parsec paper to see
how they traditionally worked:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf
On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
> luserdroog <mij...@yahoo.com> writes:
> > Is there a systematic way to discard the extra noise that can occur
> > when using parser combinators? For example, the `many` combinator
> > which matches zero or more instances of its argument parser.
> > In the case of zero matches, it still needs to return a value.
> > In my situation, I'm simulating everything in PostScript because it's
> > my favorite language. I'm simulating Lisp cons cells as 2-element
> > arrays. So for this JSON string,
> > ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
> > if I make no special effort, I get a resulting value that looks like this:
> > OK
> > [[3 [[4 [[5 [[] []]] []]] []]] []]
> > remainder:[]
> > All those little empty arrays need to just go away, but not any of the
> > important array structure.
>
> So you want
> [[3 [[4 [[5 []]]]]]]
> ?

I guess that's the big problem here. I'm not sure what I want. I keep having to add extra code to clean up and delete the extra stuff. Ultimately the result should be
[ 3 [ 4 [ 5 ] ] ]
The parser for arrays looks for the left bracket, then ...
/Jarray //begin-array
//value executeonly xthen
//value-separator //value executeonly xthen many then %{ps flatten ps} using
maybe
//end-array thenx
{ %filter-zeros first %ps
} using def
The `executeonly` are in there to prevent infinite recursion if the expanded code ever gets printed (like in a stack dump while debugging). The /Jarray parser is one of the components of the //value parser.
Hmm. Initially I had the `then` combinator doing a Lisp-style (append) operation on my simulated lists, so something like
(a) char (b) char then
would -- if matched by the input -- return
[ (a) [ (b) [] ] ]
which I could then easily massage into
[ (a) (b) ]
luserdroog <mij...@yahoo.com> writes:
> On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
[snip]
> I guess that's the big problem here. I'm not sure what I want. I keep having
> to add extra code to clean up and delete the extra stuff. Ultimately the
> result should be
> [ 3 [ 4 [ 5 ] ] ]
[snip]
> Hmm. Initially I had the `then` combinator doing a Lisp-style (append)
> operation on my simulated lists, so something like
> (a) char (b) char then
> would -- if matched by the input -- return
> [ (a) [ (b) [] ] ]
> which I could then easily massage into
> [ (a) (b) ]

I think you need to pin down what you want. You made a remark about
using two-element arrays as cons cells. In that case, parsing (a) and
(b) in sequence /should/ give [ (a) [ (b) [] ] ]. Massaging that into
something else seems like the wrong strategy.
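
As a minimal sketch of that representation, assuming 2-element arrays as cons cells with the empty array as the list terminator: the helper names `cons`, `car`, `cdr` and `nil` below are illustrative and do not come from the posted code.

```postscript
% Hypothetical helpers for 2-element arrays used as cons cells.
/cons { 2 array astore } def    % car cdr  ->  [car cdr]
/car  { 0 get } def             % cell  ->  car
/cdr  { 1 get } def             % cell  ->  cdr
/nil  [] def                    % the empty array marks the end of a list

% Matching (a) and then (b) builds the list from the right:
(a) (b) nil cons cons           % stack: [ (a) [ (b) [] ] ]
dup car                         % stack: list (a)
exch cdr car                    % stack: (a) (b)
```

With this convention the trailing `[]` is the list terminator rather than noise, so a consumer that walks the list with `car` and `cdr` never sees it as an element.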
On Monday, November 15, 2021 at 4:45:12 AM UTC-6, Ben Bacarisse wrote:
[snip]
> I think you need to pin down what you want. You made a remark about
> using two-element arrays as cons cells. In that case, parsing (a) and
> (b) in sequence /should/ give [ (a) [ (b) [] ] ]. Massaging that into
> something else seems like the wrong strategy.

Yes, I think I may have presented an X/Y problem or started the story
from the middle. In all of my test cases for this version (regexes,
PostScript scanner, JSON parser) I had to write a `fix` function
to convert these lists into arrays that I can work with more simply.
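
The `fix` function itself is not shown in the thread. As a rough sketch of what such a conversion could look like, assuming the cons-style lists described earlier (2-element arrays terminated by an empty array), a hypothetical `list-to-array` might be written as:

```postscript
% A guess at the shape of such a conversion; not the author's `fix`.
% Walk a cons-style list [ a [ b [ c [] ] ] ] and collect the cars
% into one flat array [ a b c ].  The empty array ends the list.
/list-to-array {            % list  ->  array
    [ exch                  % mark list
    {
        dup length 0 eq { pop exit } if    % reached [] : drop it and stop
        aload pop           % mark ... car cdr   (cdr is the rest of the list)
    } loop
    ]
} def

[ (a) [ (b) [] ] ] list-to-array    % leaves [ (a) (b) ]
```

This version only flattens the top-level list. Elements that are themselves lists are copied through unchanged, so a nested result would need a recursive variant and some way to tell a list element apart from an array value.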
On Sunday, November 14, 2021 at 5:11:37 PM UTC-6, Paul Rubin wrote:
> luserdroog <mij...@yahoo.com> writes:
> > Is there a systematic way to discard the extra noise that can occur
> > when using parser combinators? For example, the `many` combinator
> > which matches zero or more instances of its argument parser.
> > In the case of zero matches, it still needs to return a value.
>
> I'd expect 'many' to return a list, an empty list in the case of zero
> matches. What is the extra noise? Your PostScript example is
> confusing. I'd expect ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse to give
> something like [3, [4, [5]]], using square brackets to denote lists.
> I didn't know parser combinators were even a thing in PostScript: or are
> you trying to implement them? You could look at the Parsec paper to see
> how they traditionally worked:
> https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf

It's something I've been trying to do in PostScript for a while now.
A lot of the saga is detailed in comp.lang.postscript.
Code is at github.com/luser-dr00g/pcomb/ps
On Saturday, November 13, 2021 at 10:54:09 PM UTC-6, luserdroog wrote:
> Is there a systematic way to discard the extra noise that can occur
> when using parser combinators? For example, the `many` combinator
> which matches zero or more instances of its argument parser.
> In the case of zero matches, it still needs to return a value.
[snip]
Sigh. I already had this same problem before. It came up when I googled it. [sad trombone]
https://stackoverflow.com/q/55346600/733077