• webecv4 questions

    From Hemo@1:103/705 to All on Wed Apr 1 15:45:00 2020
    I've looked for a while but my google-foo is failing me.

    I am wanting to have the BBS web pages present, but not allow anyone to browse the message areas unless logged in. Perhaps allow one or two areas like a local/main, if possible. I want to stop the network areas from being web crawling/indexing targets.

    I am also wanting to, if possible, display the file areas and files, but not allow download. Or just not display file areas at all over http if that's not possible.

    It doesn't seem to matter what restrictions I apply in SCFG, any non-logged in user is able to browse the message areas ( but not post! ), and is able to browse and download from all file areas over http.

    I'm not getting hammered, but there are constant connections downloading random files.

    I'd really like to keep the http up, but restrict things until someone logs in.
    For now, I've basically resorted to requiring a login for anything, via the webctrl.ini file in /sbbs/webv4/root.
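
    The blanket lockdown looks roughly like this (the AccessRequirements key is what the stock Synchronet web server reads from webctrl.ini; worth double-checking the exact spelling against the docs, and LEVEL 50 is just an example threshold above the Guest account):

    ; /sbbs/webv4/root/webctrl.ini -- rough sketch; applies to everything under root/
    AccessRequirements=LEVEL 50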

    Can someone point me in the right direction?

    --
    Hemo
    ... Heisenberg may have slept here.
    --- MultiMail/Win32 v0.49
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Al@1:153/757 to Hemo on Wed Apr 1 14:28:08 2020
    I've looked for a while but my google-foo is failing me.

    I am wanting to have the BBS web pages present, but not allow anyone to browse the message areas unless logged in. Perhaps allow one or two areas like a local/main, if possible. I want to shutdown the network areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure but I think the default robots.txt that comes with Synchronet will do this. My own robots.txt looks like this..

    User-agent: *
    Disallow: /bbbs

    That stops the web crawlers (at least the honourable ones) from crawling whatever parts of your BBS you don't want crawled.

    --- BBBS/Li6 v4.10 Toy-4
    * Origin: The Rusty MailBox - Penticton, BC Canada (1:153/757)
  • From echicken@1:103/705 to Hemo on Wed Apr 1 17:59:52 2020
    Re: webecv4 questions
    By: Hemo to All on Wed Apr 01 2020 15:45:00

    It doesn't seem to matter what restrictions I apply in SCFG, any non-logged in user is able to browse the message areas ( but not post! ), and is able to

    What restrictions have you tried? I'm able to obscure sysop-only message areas from guest/non-sysop web users and so on. Not aware of any problems with this.

    browse and download from all file areas over http.

    Should be the same, but I'll have to take another look; it's been a while.

    I'd really like to keep the http up, but restrict things until someone logs in.
    For now, I've basically resorted to require a login for anything using the webctrl.ini file in /sbbs/webv4/root

    You should be able to set restrictions on specific pages by putting a webctrl.ini file inside your webv4/pages/ directory. If you really wanted to hide the "forum" page from guests, you should be able to do it that way.
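
    Something like this, for example (sketch only -- the AccessRequirements key and the per-file [sections] come from the stock web server's webctrl.ini handling, and the page filename here is made up, so match it to whatever your forum page is actually called under webv4/pages/):

    ; webv4/pages/webctrl.ini -- hypothetical example
    [*forum*]
    AccessRequirements=LEVEL 50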

    ---
    echicken
    electronic chicken bbs - bbs.electronicchicken.com
    þ Synchronet þ electronic chicken bbs - bbs.electronicchicken.com
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From poindexter FORTRAN@1:103/705 to Hemo on Wed Apr 1 15:21:59 2020
    Re: webecv4 questions
    By: Hemo to All on Wed Apr 01 2020 03:45 pm


    I am wanting to have the BBS web pages present, but not allow anyone to browse the message areas unless logged in. Perhaps allow one or two areas like a local/main, if possible. I want to shutdown the network areas from being web crawling/indexing targets.

    The security levels of the groups determine what can be seen on the web.
    The guest user's security level controls what un-authenticated users can see from the web.

    I have one message group, Local Messages, that I don't mind being seen by guest. I have the group access requirements set to LEVEL 40, and for each message sub-board, the access, reading, and posting requirements set to LEVEL 40.

    All of my network boards are set to LEVEL 50, and sysop-only boards/areas set to LEVEL 90.

    That way, crawlers can only get to the message areas you choose to expose.
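
    Roughly, the scheme looks like this in SCFG terms (group and sub-board names here are just placeholders):

    Local Messages (group)      Access Requirements: LEVEL 40
      General (sub-board)       Access/Read/Post: LEVEL 40
    Network echoes (group)      Access Requirements: LEVEL 50
    Sysop areas (group)         Access Requirements: LEVEL 90
    Guest user account          Security Level: 40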

    I'm not sure what ROBOTS.TXT should look like with ecweb4.

    ---
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Rampage@1:103/705 to Al on Wed Apr 1 18:14:28 2020
    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 14:28:08


    You can stop the web crawlers with your robots.txt.

    you can control good spiders/crawlers with robots.txt... bad ones will ignore it or use it to specifically target the listed areas...


    )\/(ark

    ---
    þ Synchronet þ The SouthEast Star Mail HUB - SESTAR
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Hemo@1:103/705 to Al on Wed Apr 1 18:39:33 2020
    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 02:28 pm

    I've looked for a while but my google-foo is failing me.

    I am wanting to have the BBS web pages present, but not allow anyone
    to browse the message areas unless logged in. Perhaps allow one or two
    areas like a local/main, if possible. I want to shutdown the network
    areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure but I think the default robots.txt that comes with Synchronet will do this. My own robots.txt looks like this..

    User-agent: *
    Disallow: /bbbs


    I've got this:
    User-agent: *
    Disallow: /

    It's not stopping things that are not identifying as a crawler, I think. A legitimate crawler starts by looking for the robots.txt file, and I see some of those too.

    Here are snips of what I see in the log:

    Apr 1 12:31:32 bbs synchronet: web 0045 HTTP connection accepted from: 52.82.96.27 port 49946
    Apr 1 12:31:32 bbs synchronet: web 0045 Hostname: ec2-52-82-96-27.cn-northwest-1.compute.amazonaws.com.cn [52.82.96.27]
    Apr 1 12:31:32 bbs synchronet: web 0045 Request: GET /api/files.ssjs?call=download-file&dir=sndmodv1mod_hl&file=INFLNCIA.MOD HTTP/1.1
    Apr 1 12:31:32 bbs synchronet: web 0045 Unable to send to peer
    Apr 1 12:31:32 bbs synchronet: web 0045 Sending file: /sbbs/tmp/SBBS_SSJS.31685.45.html (0 bytes)
    Apr 1 12:31:33 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 219 served)
    Apr 1 12:32:16 bbs synchronet: web 0045 HTTPS connection accepted from: 111.225.148.163 port 55238
    Apr 1 12:32:17 bbs synchronet: web 0045 Hostname: bytespider-111-225-148-163.crawl.bytedance.com [111.225.148.163]
    Apr 1 12:32:17 bbs synchronet: web 0045 Request: GET /robots.txt HTTP/1.1
    Apr 1 12:32:17 bbs synchronet: web 0045 Sending file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:17 bbs synchronet: web 0045 Sent file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:18 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 220 served)
    Apr 1 12:32:58 bbs synchronet: web 0045 HTTP connection accepted from: 111.225.148.177 port 46388
    Apr 1 12:32:58 bbs synchronet: web 0045 Hostname: bytespider-111-225-148-177.crawl.bytedance.com [111.225.148.177]
    Apr 1 12:32:58 bbs synchronet: web 0045 Request: GET /robots.txt HTTP/1.1
    Apr 1 12:32:58 bbs synchronet: web 0045 Sending file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:58 bbs synchronet: web 0045 Sent file: /sbbs/webv4/root/robots.txt (2076 bytes)
    Apr 1 12:32:59 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 221 served)
    Apr 1 12:33:42 bbs synchronet: web 0045 HTTPS connection accepted from: 52.83.249.124 port 52734
    Apr 1 12:33:42 bbs synchronet: web 0045 Hostname: ec2-52-83-249-124.cn-northwest-1.compute.amazonaws.com.cn [52.83.249.124]
    Apr 1 12:33:43 bbs synchronet: web 0045 Request: GET /api/files.ssjs?call=download-file&dir=st20s92msdosc&file=CNEWS003.ARC HTTP/1.1
    Apr 1 12:33:43 bbs synchronet: web 0045 Sending file: /sbbs/tmp/SBBS_SSJS.31685.45.html (0 bytes)
    Apr 1 12:33:44 bbs synchronet: web 0045 Session thread terminated (0 clients, 3 threads remain, 222 served)


    Every minute or so, something comes in and goes directly to a specific file and tries to download it. Most of these seem to come from cn-northwest-1.compute.amazonaws.com.cn
    --
    H

    ... It is impossible to please the whole world and your mother-in-law.

    ---
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Hemo@1:103/705 to poindexter FORTRAN on Wed Apr 1 18:49:38 2020
    Re: webecv4 questions
    By: poindexter FORTRAN to Hemo on Wed Apr 01 2020 03:21 pm

    Re: webecv4 questions
    By: Hemo to All on Wed Apr 01 2020 03:45 pm

    I am wanting to have the BBS web pages present, but not allow anyone
    to browse the message areas unless logged in. Perhaps allow one or
    two areas like a local/main, if possible. I want to shutdown the
    network areas from being web crawling/indexing targets.

    The security levels of the groups determine what can be seen on the web. The guest user's security level controls what un-authenticated users can see from the web.

    Boom. That was it, thank you. I wasn't picking up that the 'non-logged-in' access to the web was controlled by the security level of the Guest account, and somehow my Guest account got 'validated' ( well.. I'm sure I either did that not realizing the implications, or it was a mistyped key ). Validation cranked up the security level and opened up the forums and files on the web pages to the public.

    I got my Guest account security back where it should be and that sealed up the web portion back to where I wanted it. whew!

    thanks!

    --
    Hemo

    ... I love criticism just so long as it's unqualified praise.

    ---
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Rampage@1:103/705 to Hemo on Thu Apr 2 07:27:16 2020
    Re: webecv4 questions
    By: Hemo to Al on Wed Apr 01 2020 18:39:33


    Hemo> It's not stopping things that are not identifying as a crawler. I think.

    robots.txt cannot stop anything... it is only a guide from the site operator to the spider operator indicating the areas the spider is allowed to crawl or not...

    Hemo> I think a legitimate crawler starts by looking for the robots.txt file,
    Hemo> I see some of those too.

    close... robots.txt may or may not be gathered on each visit by a spider... if it is gathered, it may not be taken into account until later visits...


    Hemo> every minute or so, something comes in and goes directly to a specific
    Hemo> file and tries to download it. Most of these seem to come from
    Hemo> cn-northwest-1.compute.amazonaws.com.cn

    look in your /sbbs/data/logs directory for the http logs (if you have them enabled) and you will see a traditional apache-style log format... the last field contains the user agent, which will generally tell you if the visitor really is a spider or not... what you're seeing from that amazon cloud domain may be a spider, or it may be someone's file getter, or possibly even an indexer (which is like a spider or crawler)...
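
    a quick way to see who's hitting you is to tally the user-agent field out of one of those logs with a short jsexec script, something like the sketch below... the log path and the assumption that the agent is the last double-quoted field are guesses, so eyeball a line from your own log first...

    // tally_agents.js -- hedged sketch: count user agents in an apache-style log
    // usage: jsexec tally_agents.js /sbbs/data/logs/<your-http-log-file>
    // (the path and the "agent is the last quoted field" layout are assumptions)
    var f = new File(argv[0]);
    if (!f.open("r")) {
        writeln("can't open " + argv[0]);
        exit(1);
    }
    var counts = {};
    f.readAll().forEach(function (line) {
        var m = line.match(/"([^"]*)"\s*$/);   // last double-quoted field
        if (m) counts[m[1]] = (counts[m[1]] || 0) + 1;
    });
    f.close();
    Object.keys(counts).sort(function (a, b) { return counts[b] - counts[a]; })
        .forEach(function (ua) { writeln(counts[ua] + "  " + ua); });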


    )\/(ark

    ---
    þ Synchronet þ The SouthEast Star Mail HUB - SESTAR
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Hemo@1:103/705 to Rampage on Thu Apr 2 13:17:26 2020
    Re: webecv4 questions
    By: Rampage to Hemo on Thu Apr 02 2020 07:27 am

    Re: webecv4 questions
    By: Hemo to Al on Wed Apr 01 2020 18:39:33
    Hemo> every minute or so, something comes in and goes directly to a specific
    Hemo> file and tries to download it. Most of these seem to come from
    Hemo> cn-northwest-1.compute.amazonaws.com.cn

    look in your /sbbs/data/logs directory for the http logs (if you have them enabled) and you will see a traditional apache-style log format... the last field contains the user agent, which will generally tell you if the visitor really is a spider or not... what you're seeing from that amazon cloud domain may be a spider, or it may be someone's file getter, or possibly even an indexer (which is like a spider or crawler)...

    Interesting. I need to spend more time reading logs, I think.
    The lines in question all show this:
    "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.9740.1940 Mobile Safari/537.36"

    I also see some that are not even trying to hide anything. polaris botnet, ZmEu, zgrab, The Knowledge AI, and so forth. Even some from this fella: "masscan/1.0 (https://github.com/robertdavidgraham/masscan)"


    Coincidence or not, about an hour after closing down reading of files and forums to anyone/thing not logged in, I was slammed for a couple of hours from no-reverse-dns-configured.com with what looks like attempted php exploits.

    I see the php exploit attempts randomly here and there in all the log files, but this period was nonstop for about 2 hours, 2-5 attempts every second. The log file is huge.


    Man.. this stuff felt simpler when just dealing with a modem and baud rates.

    ... Buy Land Now. It's Not Being Made Any More.

    ---
    þ Synchronet þ - Running madly into the wind and screaming - bbs.ujoint.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From poindexter FORTRAN@1:103/705 to Rampage on Thu Apr 2 09:09:00 2020
    Rampage wrote to Hemo <=-

    look in your /sbbs/data/logs directory for the http logs (if you have them enabled) and you will see a traditional apache-style log format... the last field contains the user agent which will generally tell you if the visitor really is a spider or not... what you're seeing from that amazon cloud domain may be a spider or it may be someone's file getter or possibly even an indexer (which is like a spider or crawler)...

    That's a good point - ROBOTS.TXT can block by *user agent*, so if you have a particularly annoying web crawler, you can block that user agent from getting to anything instead of trying to block specific areas to all crawlers.

    This is all voluntary; a badly behaving crawler can just ignore your ROBOTS.TXT file.
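
    For example, something like this should wave off one specific crawler (Bytespider is the ByteDance agent that shows up in Hemo's log; match whatever agent string you actually see in yours):

    User-agent: Bytespider
    Disallow: /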


    ... What do you think management's real interests are?
    --- MultiMail/XT v0.52
    þ Synchronet þ realitycheckBBS -- http://realitycheckBBS.org
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Mortifis@1:103/705 to Hemo on Fri Apr 3 15:49:07 2020
    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 02:28 pm

    I've looked for a while but my google-foo is failing me.

    I am wanting to have the BBS web pages present, but not allow anyone
    to browse the message areas unless logged in. Perhaps allow one or two
    areas like a local/main, if possible. I want to shutdown the network
    areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure but I think the default robots.txt that comes with Synchronet will do this. My own robots.txt looks like this..

    User-agent: *
    Disallow: /bbbs


    I've got this:
    User-agent: *
    Disallow: /

    It's not stopping things that are not identifying as a crawler, I think. I think a legitimate crawler starts by looking for the robots.txt file; I see some of those too.

    Here are snips of what I see in the log:

    every minute or so, something comes in and goes directly to a specific file and tries to download it. Most of these seem to come from cn-northwest-1.compute.amazonaws.com.cn
    --
    H

    I wonder if adding if(user.alias === 'Guest') { writeln('You must be logged in to view files!'); exit(); } to /sbbs/webv4/root/api/files.ssjs would help? Or something like that.
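
    Something along these lines, maybe (untested sketch only; the stock files.ssjs may already cover this, and treating user number 0 or the GUEST restriction as "not logged in" is just a guess at how the web session user shows up):

    // hypothetical guard near the top of webv4/root/api/files.ssjs -- untested sketch
    // user.number 0 assumed to mean an unauthenticated session;
    // compare_ars('GUEST') matches a guest-flagged account regardless of its alias
    if (user.number === 0 || user.compare_ars('GUEST')) {
        writeln('You must be logged in to view files!');
        exit();
    }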

    ---
    þ Synchronet þ AlleyCat! BBS Lake Echo, NS Canada
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From Digital Man@1:103/705 to Mortifis on Fri Apr 3 12:11:03 2020
    Re: Re: webecv4 questions
    By: Mortifis to Hemo on Fri Apr 03 2020 03:49 pm

    Re: webecv4 questions
    By: Al to Hemo on Wed Apr 01 2020 02:28 pm

    I've looked for a while but my google-foo is failing me.

    I am wanting to have the BBS web pages present, but not allow anyone to browse the message areas unless logged in. Perhaps allow one or two areas like a local/main, if possible. I want to shutdown the network areas from being web crawling/indexing targets.

    You can stop the web crawlers with your robots.txt.

    I'm not sure but I think the default robots.txt that comes with Synchronet will do this. My own robots.txt looks like this..

    User-agent: *
    Disallow: /bbbs


    I've got this:
    User-agent: *
    Disallow: /

    It's not stopping things that are not identifying as a crawler. I think.
    I think a legitimate crawler starts by looking for the robots.txt file; I see some of those too.

    Here are snips of what I see in the log:

    every minute or so, something comes in and goes directly to a specific file and tries to download it. Most of these seem to come from cn-northwest-1.compute.amazonaws.com.cn
    --
    H

    I wonder if adding if(user.alias === 'Guest') { writeln('You must be logged in to view files!'); exit(); } to /sbbs/webv4/root/api/files.ssjs would help? Or something like that.

    I don't think bots are logging in as Guest, but ecweb might do an auto-login-as-guest thing.

    digital man

    Synchronet "Real Fact" #4:
    Synchronet version 3 is written mostly in C, with some C++, x86 ASM, and Pascal.
    Norco, CA WX: 63.1°F, 58.0% humidity, 3 mph E wind, 0.00 inches rain/24hrs
    --- SBBSecho 3.10-Linux
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)
  • From echicken@1:103/705 to Mortifis on Fri Apr 3 15:28:05 2020
    Re: Re: webecv4 questions
    By: Mortifis to Hemo on Fri Apr 03 2020 15:49:07

    I wonder if adding if(user.alias === 'Guest') { writeln('You must be logged in to view files!'); exit(); } to /sbbs/webv4/root/api/files.ssjs would help? Or something like that

    That script already checks if the current user has the ability to download, so this shouldn't be necessary.

    Likewise I think all of the file stuff uses 'file_area.lib_list', which is:

    "File Transfer Libraries (current user has access to) - introduced in v3.10"

    So I would expect it not to include areas that the current user isn't supposed to be able to see. Maybe I'm wrong or maybe that isn't working as expected.
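
    A quick way to eyeball that would be something like this dropped into an .ssjs page (sketch only; property names per the JS object model docs). It should list only the libraries and dirs the current session's user can access:

    // dump what the current user can see via file_area.lib_list
    file_area.lib_list.forEach(function (lib) {
        writeln(lib.name);
        lib.dir_list.forEach(function (dir) {
            writeln('  ' + dir.code + ' - ' + dir.name);
        });
    });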

    I suspect OP needs to tweak the guest account in use, along with settings on file and message areas.

    ---
    echicken
    electronic chicken bbs - bbs.electronicchicken.com
    þ Synchronet þ electronic chicken bbs - bbs.electronicchicken.com
    * Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)