• /ads.txt

    From Eli the Bearded@21:1/5 to All on Fri Aug 24 23:14:05 2018
    Anyone here?

    I'm curious about /ads.txt. I've read some background material on it
    that outlines how it is supposed to be used for validating ad sales
    inventory or something like that.

    https://digiday.com/marketing/wtf-ads-txt/

    I do not put any ads on my site, I do not run any ads for my site, and I
    do not sell or host ads for anyone else.

    So why am I seeing so many hits to ads.txt?

    Some are from Google, some are from who knows where (35.224.0.0/12 is
    "Google Cloud"; 165.227.0.0/16 is Digital Ocean):

    35.229.103.78 - - [24/Aug/2018:12:29:22 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "bidswitchbot/1.0"
    66.249.70.24 - - [24/Aug/2018:12:52:38 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    165.227.100.219 - - [24/Aug/2018:13:22:09 -0400] "GET http://qaz.wtf/ads.txt HTTP/1.1" 404 398 "-" "lua-resty-http/0.11 (Lua) ngx_lua/10010"
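
    For anyone wanting to pull the same kind of entries out of their own
    logs, here is a minimal sketch. It assumes the log is in the combined
    format shown above. The names LOG_RE, NETWORKS, label_for and
    ads_txt_hits are made up for illustration, and the two network labels
    are just the ones mentioned in this post, not an authoritative list.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Combined log format, as in the sample lines:
# host ident user [time] "method target protocol" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: only the two ranges named in the post are labeled.
NETWORKS = {
    ipaddress.ip_network("35.224.0.0/12"): "Google Cloud",
    ipaddress.ip_network("165.227.0.0/16"): "Digital Ocean",
}


def label_for(ip):
    """Return the label of the first listed network containing ip."""
    addr = ipaddress.ip_address(ip)
    for net, label in NETWORKS.items():
        if addr in net:
            return label
    return "unlisted"


def ads_txt_hits(lines):
    """Return (ip, network label, user agent) for each /ads.txt request."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # not a combined-format line
        # Some clients send an absolute URL as the request target,
        # so compare only the path component.
        if urlsplit(m.group("target")).path == "/ads.txt":
            ip = m.group("ip")
            hits.append((ip, label_for(ip), m.group("agent")))
    return hits
```

    Feeding it an open log file, as in ads_txt_hits(open("access.log")),
    gives one tuple per matching request; "access.log" is a placeholder
    file name.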

    I've created a blank ads.txt now. Will that work to tell all of these
    bots that no ads should exist for my site?

    Elijah
    ------
    organic traffic only

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Doc O'Leary@21:1/5 to Eli the Bearded on Sat Aug 25 18:45:57 2018
    For your reference, records indicate that
    Eli the Bearded <*@eli.users.panix.com> wrote:

    So why am I seeing so many hits to ads.txt?

    Malicious scans. Do you also see a lot of bogus WordPress URLs?
    Same thing.

    I've created a blank ads.txt now. Will that work to tell all of these
    bots that no ads should exist for my site?

    It’s better to block their IP address completely. Even better, block
    entire ranges by those “cloud” providers. Stop the abuse rather than
    the notification of the problem.

    --
    "Also . . . I can kill you with my brain."
    River Tam, Trash, Firefly

  • From Eli the Bearded@21:1/5 to droleary@2017usenet1.subsume.com on Sun Aug 26 02:47:14 2018
    In comp.infosystems.www.misc,
    Doc O'Leary <droleary@2017usenet1.subsume.com> wrote:
    For your reference, records indicate that
    Eli the Bearded <*@eli.users.panix.com> wrote:
    So why am I seeing so many hits to ads.txt?
    Malicious scans. Do you also see a lot of bogus WordPress URLs?
    Same thing.

    These days the biggest malicious scan offender is the D-Link one
    (tries to use /login.cgi to wget and run a shell script). I don't
    have any reason to think Googlebot doing a GET on a .txt file is
    a malicious scan.

    It’s better to block their IP address completely. Even better, block
    entire ranges by those “cloud” providers. Stop the abuse rather than
    the notification of the problem.

    Advice like this I can get from any hypochondriac webmaster forum.
    I'm perfectly capable of deciding what to block or not block on my own.
    My question was just about how ad agencies use ads.txt.

    Elijah
    ------
    doesn't think the dlink scan has yet repeated a source IP address

  • From Doc O'Leary@21:1/5 to Eli the Bearded on Sun Aug 26 16:20:48 2018
    For your reference, records indicate that
    Eli the Bearded <*@eli.users.panix.com> wrote:

    I don't
    have any reason to think Googlebot doing a GET on a .txt file is
    a malicious scan.

    Since you aren’t in a business relationship with them for AdSense or
    any other advertising service, there’s really no legitimate reason for
    them to be scanning unpublished URLs like that. Save, of course, for
    the fact that they’re looking to hoover up any and all information
    about everyone they can get their hands on. I see them probing under
    /.well-known/ and random 404 URLs as well. Google stopped playing
    nice a long time ago.

    It’s better to block their IP address completely. Even better, block
    entire ranges by those “cloud” providers. Stop the abuse rather than
    the notification of the problem.

    Advice like this I can get from any hypochondriac webmaster forum.
    I'm perfectly capable of deciding what to block or not block on my own.
    My question was just about how ad agencies use ads.txt.

    There’s plenty of information online about legitimate uses for that
    file. What should concern you, since you apparently don’t buy or show
    ads, are the improper uses. Same as for any other scans for invalid
    URLs on your site. Serve up a blank file if you like. I personally
    issue a 204 response for things like that, saving the bans for probes
    that are directly going after exploit URLs.
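
    As a rough illustration of the 204 approach (not necessarily how it
    is configured on any real server), here is a minimal sketch using
    Python's standard http.server. QUIET_PATHS, status_for, QuietHandler
    and serve are hypothetical names, and the address and port are
    arbitrary choices.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Assumption: paths to answer with an empty 204 instead of a 404.
QUIET_PATHS = {"/ads.txt"}


def status_for(target):
    """Pick a status code for a request target (path or absolute URL)."""
    path = urlsplit(target).path
    return 204 if path in QUIET_PATHS else 404


class QuietHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        code = status_for(self.path)
        self.send_response(code)
        # A 204 response carries no body and no Content-Length header;
        # the 404 here is sent with an explicitly empty body.
        if code != 204:
            self.send_header("Content-Length", "0")
        self.end_headers()


def serve(host="127.0.0.1", port=8080):
    """Run the sketch server until interrupted."""
    HTTPServer((host, port), QuietHandler).serve_forever()
```

    Calling serve() starts it locally; a production setup would more
    likely do the equivalent in the web server's own configuration.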

    --
    "Also . . . I can kill you with my brain."
    River Tam, Trash, Firefly
