• Accessing a password-protected page via wget

    From Harriet Bazley@21:1/5 to All on Fri Jan 21 13:43:58 2022
    I've been trying to use wget to retrieve a page that is only accessible
    to logged-in users (user stats - so that I can analyse them and keep a
    running record of changes).
    Basically, I can't seem to get the correct syntax for the site to
    receive/recognise my name and password in the first place, let alone to
    serve up the stats page requested....


    I don't really know how to use the relevant features of
    wget and have been flailing around rather at random. Simply using

    wget --ask-password STATS_URL

    doesn't produce the desired result; it prompts for the password all
    right, but when I supply it the fetch then gets redirected to retrieve
    the log-in page instead, just as if I had supplied no password or the
    wrong one.

    wget --user=USERNAME --ask-password STATS_URL

    prompts "Password for user" instead of just "Password", but still
    doesn't seem to pass the required data.

    Same result from

    wget --user=USERNAME --password=PASS STATS_URL

    (the retrieved page states 'sorry, you don't have access to view the
    page you were trying to reach, please log in')


    After looking for advice on the Web I tried fetching the log-in page
    directly using the same methods and using --keep-session-cookies before
    running a second command to fetch the stats page immediately afterwards,
    but that didn't work. It fetches the login page, then redirects and
    fetches it again under a different name, the only difference being the
    error:

    <div class="flash error">Sorry, you don&#39;t have permission to access the page you were trying to reach. Please log in.</div>


    I then tried using --save-cookies followed by --load-cookies for the
    second request, but that didn't work, doubtless because the resulting
    'cookies' file had no content:

    # HTTP cookie file.
    # Generated by Wget on 2022-01-21 13:37:33.
    # Edit at your own risk.


    I then tried

    wget --post-data 'user_login=USERNAME&user_password=PASS' LOGIN_URL

    where the relevant form reads

    <dt><label for="user_login">User name or email:</label></dt>
    <dd><input type="text" name="user[login]" id="user_login"/></dd>
    <dt><label for="user_password">Password:</label></dt>
    <dd><input type="password" name="user[password]" id="user_password"/></dd>
    <dt><label for="user_remember_me">Remember me</label></dt>
    <dd><input name="user[remember_me]" type="hidden" value="0"/><input type="checkbox" value="1" name="user[remember_me]" id="user_remember_me"/></dd>
    <dt class="landmark">Submit</dt>
    <dd class="submit actions">
    <input type="submit" name="commit" value="Log in" class="submit"/>
    </dd>

    but still had no luck.

    I'm simply not managing to submit the name/password combination in any
    way that the site will acknowledge.
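
One thing that stands out in the attempt above: --post-data used the inputs' id values (user_login, user_password), whereas a browser submits each input under its name attribute (user[login], user[password]). A minimal Python sketch of building the request body from the names in the quoted form follows. login_post_data is a hypothetical helper, and sending the commit field is an assumption about what the server expects; the form may also need hidden fields not shown in this excerpt.

```python
from urllib.parse import urlencode


def login_post_data(username, password):
    """Build a form-encoded body keyed by the inputs' name attributes."""
    return urlencode({
        "user[login]": username,        # name="user[login]", not id="user_login"
        "user[password]": password,     # name="user[password]"
        "commit": "Log in",             # the submit button's name/value (assumed needed)
    })


if __name__ == "__main__":
    # Square brackets and spaces are percent-encoded for the request body.
    print(login_post_data("USERNAME", "PASS"))
```

The resulting string could be passed to wget's --post-data in place of the one tried above.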

    --
    Harriet Bazley == Loyaulte me lie ==

    We are not punished for our sins, but by them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harriet Bazley@21:1/5 to Harriet Bazley on Fri Jan 21 15:11:42 2022
    On 21 Jan 2022 as I do recall,
    Harriet Bazley wrote:

    I've been trying to use wget to retrieve a page that is only accessible
    to logged-in users (user stats - so that I can analyse them and keep a running record of changes).
    Basically, I can't seem to get the correct syntax for the site to receive/recognise my name and password in the first place, let alone to
    serve up the stats page requested....



    I suspect this may have something to do with it:

    <div id="loginform">
    <form class="new_user" id="new_user" action="/users/login" accept-charset="UTF-8" method="post">
    <input name="utf8" type="hidden" value="&#x2713;"/>
    <input type="hidden" name="authenticity_token" value="VfGGu3jwjsf6xNQmlmuu3Qkgc1BsZzgu0ikhluwqmVHU9RFVQQUUANuaza9HFgXr_c71SiKwBLz8XA8bQ4hSOA"/>

    Unfortunately reading and submitting the 'authenticity token' remotely
    might be a bit tricky, as I assume it's intended to prevent precisely
    that!
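
Reading the token is mostly a matter of fetching the login page first and picking the hidden inputs out of its HTML before posting. A minimal sketch using Python's standard html.parser follows; hidden_fields is a hypothetical helper. With tokens of this kind the value is usually tied to the session cookie issued with the page, so the cookie from that same fetch would need to be sent back with the POST. That is an assumption about this site, not something confirmed in the thread.

```python
from html.parser import HTMLParser


class HiddenInputs(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> also arrive here by default.
        attr = dict(attrs)
        if tag == "input" and attr.get("type") == "hidden" and "name" in attr:
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of every hidden input found in the given HTML text."""
    parser = HiddenInputs()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    page = ('<form action="/users/login" method="post">'
            '<input type="hidden" name="authenticity_token" value="EXAMPLE"/>'
            '</form>')
    print(hidden_fields(page)["authenticity_token"])
```

The value found under authenticity_token would then be included in the POST data alongside the user name and password.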

    I've tried pointing the --load-cookies option at the cookie file from a logged-in copy of Netsurf, but the cookie format is evidently not
    compatible.


    --
    Harriet Bazley == Loyaulte me lie ==

    Lies, damned lies and user documentation.

  • From Kevin Wells@21:1/5 to Harriet Bazley on Fri Jan 21 16:54:24 2022
    In message <dd8749ae59.harriet@bazleyfamily.co.uk>
    Harriet Bazley <harriet@bazleyfamily.co.uk> wrote:

    I've been trying to use wget to retrieve a page that is only accessible
    to logged-in users (user stats - so that I can analyse them and keep a
    running record of changes).
    Basically, I can't seem to get the correct syntax for the site to
    receive/recognise my name and password in the first place, let alone to
    serve up the stats page requested....


    I don't really know how to use the relevant features of
    wget and have been flailing around rather at random. Simply using

    If you use the -S option you get the server response; used together with
    the -o option, you can save it to a log file and see what the server is saying.

    wget --ask-password STATS_URL

    doesn't produce the desired result; it prompts for the password all
    right, but when I supply it the fetch then gets redirected to retrieve
    the log-in page instead, just as if I had supplied no password or the
    wrong one.

    wget --user=USERNAME --ask-password STATS_URL

    prompts "Password for user" instead of just "Password", but still
    doesn't seem to pass the required data.

    Same result from

    wget --user=USERNAME --password=PASS STATS_URL

    (the retrieved page states 'sorry, you don't have access to view the
    page you were trying to reach, please log in')


    After looking for advice on the Web I tried fetching the log-in page
    directly using the same methods and using --keep-session-cookies before
    running a second command to fetch the stats page immediately afterwards,
    but that didn't work. It fetches the login page, then redirects and
    fetches it again under a different name, the only difference being the
    error:

    <div class="flash error">Sorry, you don&#39;t have permission to access the page you were trying to reach. Please log in.</div>


    I then tried using --save-cookies followed by --load-cookies for the
    second request, but that didn't work, doubtless because the resulting
    'cookies' file had no content:

    # HTTP cookie file.
    # Generated by Wget on 2022-01-21 13:37:33.
    # Edit at your own risk.


    I then tried

    wget --post-data 'user_login=USERNAME&user_password=PASS' LOGIN_URL

    where the relevant form reads

    <dt><label for="user_login">User name or email:</label></dt>
    <dd><input type="text" name="user[login]" id="user_login"/></dd>
    <dt><label for="user_password">Password:</label></dt>
    <dd><input type="password" name="user[password]" id="user_password"/></dd>
    <dt><label for="user_remember_me">Remember me</label></dt>
    <dd><input name="user[remember_me]" type="hidden" value="0"/><input type="checkbox" value="1" name="user[remember_me]" id="user_remember_me"/></dd>
    <dt class="landmark">Submit</dt>
    <dd class="submit actions">
    <input type="submit" name="commit" value="Log in" class="submit"/>
    </dd>

    but still had no luck.

    I'm simply not managing to submit the name/password combination in any
    way that the site will acknowledge.



    --
    Kev Wells
    http://kevsoft.co.uk/ https://ko-fi.com/kevsoft
    carpe cervisium
    I went into a theatre as sober as could be,

  • From Harriet Bazley@21:1/5 to Kevin Wells on Fri Jan 21 17:45:16 2022
    On 21 Jan 2022 as I do recall,
    Kevin Wells wrote:

    In message <dd8749ae59.harriet@bazleyfamily.co.uk>
    Harriet Bazley <harriet@bazleyfamily.co.uk> wrote:

    I've been trying to use wget to retrieve a page that is only accessible
    to logged-in users (user stats - so that I can analyse them and keep a
    running record of changes).
    Basically, I can't seem to get the correct syntax for the site to
    receive/recognise my name and password in the first place, let alone to
    serve up the stats page requested....

    If you use the -S option you get the server response; used together with
    the -o option, you can save it to a log file and see what the server is saying.

    Well, that's interesting - it's doing a 'set-cookie', but no cookies are
    being stored by wget....


    HTTP request sent, awaiting response...
    HTTP/1.1 200 OK
    Server: nginx/1.19.6
    Content-Type: text/html; charset=utf-8
    Transfer-Encoding: chunked
    Connection: close
    referrer-policy: strict-origin-when-cross-origin
    x-frame-options: SAMEORIGIN
    x-xss-protection: 1; mode=block
    x-content-type-options: nosniff
    x-download-options: noopen
    x-permitted-cross-domain-policies: none
    set-cookie: _otwarchive_session= WVRjTGJ4U2dHck5NWXIrRlZXaGhHZmpZSjRxODZCa2hodDRwTWFlQ0VUOVJiVmcxVEtDaGNSaU9XRmthWjNRL3ljeXlnY1dCc0F5Q2pCblZNbE9mWk5obVNreC9PT1JVU2Y1YmY1Rkd1OWVxSVlVYkxnSlVDY3FrMlZrNmhVZVB3QVFRdGVGTU9ETk5ZalFEWGVqeDZudllHYUJ5R3VIUTV4OUU0RkZTVkFpZ1ZBd2E2SDJ2a3JvZkdxbkZJYWp
    CLS1CeTB2dFpxV2kzbUdWYXFpZGpxbTFBPT0%3D--55a268d9001c5202764dff147620a50e8e226676; path=/; expires=Fri, 04 Feb 2022 17:43:11 GMT; HttpOnly
    x-request-id: 484a28ab-af27-47b6-bc77-b96615c72331
    x-runtime: 0.029676

    [snip]
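
The header dump shows the server offering a cookie named _otwarchive_session. When comparing several -S logs it can help to pull the cookie names out mechanically. A minimal sketch, assuming the log was written with -S and -o and that header lines look like those above; cookie_names is a hypothetical helper:

```python
import re

# Matches "set-cookie: NAME=" at the start of a (possibly indented) line.
SET_COOKIE = re.compile(r"^\s*set-cookie:\s*([^=\s;]+)=",
                        re.IGNORECASE | re.MULTILINE)


def cookie_names(log_text):
    """Return the names of all cookies the server tried to set, in order."""
    return SET_COOKIE.findall(log_text)


if __name__ == "__main__":
    sample = ("HTTP/1.1 200 OK\n"
              "  set-cookie: _otwarchive_session=VALUE; path=/; HttpOnly\n")
    print(cookie_names(sample))
```

If a name appears here but never in the file written by --save-cookies, the problem lies in how the cookie is being stored, not in what the server sends.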

    --
    Harriet Bazley == Loyaulte me lie ==

    It is better to have loved and lost than just to have lost.

  • From Steve Fryatt@21:1/5 to Harriet Bazley on Fri Jan 21 18:57:36 2022
    On 21 Jan, Harriet Bazley wrote in message
    <569f5fae59.harriet@bazleyfamily.co.uk>:

    Well, that's interesting - it's doing a 'set-cookie', but no cookies are being stored by wget....

    Are you using wget's --save-cookies and --keep-session-cookies options? You then load them with --load-cookies on subsequent calls.

    I /assume/ that you would do this on each call to wget, passing the cookies from call to call in that way, but I've just skimmed the man page on Linux
    so a) I've not tried it, and b) I've no idea how the current version on an Ubuntu box relates to what we have on RISC OS.

    The Google search that led me to the above also mentioned loading in cookies saved from Firefox as you describe, and suggested that care needs to be
    taken so as not to include any cookies from other sites in the process. I'd suggest that using wget for the whole thing, and not trying to apply cookies saved from a browser, might be a safer option.
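
As a sketch of the sequence described above, the three wget calls could be assembled like this. The URLs, file names and helper names are placeholders; the commands are only built, not run, and the behaviour of the RISC OS build is untested here.

```python
COOKIE_FILE = "cookies.txt"  # placeholder name for the shared cookie file


def fetch_login_cmd(login_url):
    """Step 1: fetch the login page, saving any session cookie it sets."""
    return ["wget", "-S", "--keep-session-cookies",
            "--save-cookies", COOKIE_FILE,
            "-O", "login.html", login_url]


def submit_login_cmd(post_url, post_data):
    """Step 2: post the form, sending the saved cookies and storing new ones."""
    return ["wget", "-S", "--keep-session-cookies",
            "--load-cookies", COOKIE_FILE,
            "--save-cookies", COOKIE_FILE,
            "--post-data", post_data,
            "-O", "after-login.html", post_url]


def fetch_stats_cmd(stats_url):
    """Step 3: fetch the protected page with the logged-in cookies."""
    return ["wget", "-S", "--load-cookies", COOKIE_FILE,
            "-O", "stats.html", stats_url]


if __name__ == "__main__":
    for cmd in (fetch_login_cmd("LOGIN_URL"),
                submit_login_cmd("POST_URL", "FIELD=VALUE"),
                fetch_stats_cmd("STATS_URL")):
        print(" ".join(cmd))
```

If the form carries a per-session token, the post data for step 2 can only be put together after step 1 has run, since the token has to be read from the page that step fetched.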

    --
    Steve Fryatt - Leeds, England

    http://www.stevefryatt.org.uk/

  • From Harriet Bazley@21:1/5 to Steve Fryatt on Fri Jan 21 20:28:42 2022
    On 21 Jan 2022 as I do recall,
    Steve Fryatt wrote:

    On 21 Jan, Harriet Bazley wrote in message
    <569f5fae59.harriet@bazleyfamily.co.uk>:

    Well, that's interesting - it's doing a 'set-cookie', but no cookies are being stored by wget....

    Are you using wget's --save-cookies and --keep-session-cookies options? You then load them with --load-cookies on subsequent calls.


    Yes -- as I mentioned in my original post, I end up with a blank cookies
    file when I use the save-cookies option.

    ------------------------------------------------------------------------------
    # HTTP cookie file.
    # Generated by Wget on 2022-01-20 22:47:54.
    # Edit at your own risk.

    ------------------------------------------------------------------------------


    --
    Harriet Bazley == Loyaulte me lie ==

    Down with categorical imperatives!

  • From druck@21:1/5 to Harriet Bazley on Sat Jan 22 10:25:36 2022
    On 21/01/2022 13:43, Harriet Bazley wrote:
    I've been trying to use wget to retrieve a page that is only accessible
    to logged-in users (user stats - so that I can analyse them and keep a running record of changes).
    Basically, I can't seem to get the correct syntax for the site to receive/recognise my name and password in the first place, let alone to
    serve up the stats page requested....

    It will be possible to do this with wget (or curl), but as you can see
    from the other responses, it involves sprinkling fairy dust over the
    correct magic runes. It may be easier to use Python with the requests
    module for this, as it can set up auth headers and suchlike.
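
A minimal sketch of that approach, using only Python's standard library (urllib with a cookie jar) in place of the requests module, so nothing extra needs installing. The URLs are placeholders, the field names are taken from the form quoted earlier in the thread, and extract_token is a hypothetical callable standing for whatever code pulls the authenticity_token value out of the login page.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login_and_fetch(opener, login_url, post_url, stats_url,
                    username, password, extract_token):
    """Log in through the form, then return the protected page as text."""
    # 1. GET the login page: this sets the session cookie and carries the token.
    login_html = opener.open(login_url).read().decode("utf-8", "replace")
    token = extract_token(login_html)

    # 2. POST the form fields under their name attributes, token included.
    body = urllib.parse.urlencode({
        "authenticity_token": token,
        "user[login]": username,
        "user[password]": password,
        "commit": "Log in",
    }).encode("ascii")
    opener.open(post_url, data=body).read()

    # 3. The same opener now resends whatever cookies the login set.
    return opener.open(stats_url).read().decode("utf-8", "replace")
```

Because a single opener is used for all three requests, the cookie handling that is proving awkward with separate wget calls happens in memory, with no cookie file involved.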

    ---druck

  • From Harriet Bazley@21:1/5 to druck on Sat Jan 22 12:17:13 2022
    On 22 Jan 2022 as I do recall,
    druck wrote:

    On 21/01/2022 13:43, Harriet Bazley wrote:
    I've been trying to use wget to retrieve a page that is only accessible
    to logged-in users (user stats - so that I can analyse them and keep a running record of changes).
    Basically, I can't seem to get the correct syntax for the site to receive/recognise my name and password in the first place, let alone to serve up the stats page requested....

    It will be possible to do this with wget (or curl), but as you can see
    from the other responses, it involves sprinkling fairy dust over the
    correct magic runes. It may be easier to use Python with the requests
    module for this, as it can set up auth headers and suchlike.

    Given that I can *see* the relevant cookies in Netsurf (and can copy
    them from the text file they're stored in), it might be easier just to
    find the format used by Wget and manually construct a file to be used
    via --load-cookies.
    If I were actually certain that I've got the cookie-handling sections of
    Wget at all working. Can anyone suggest a test page that *should* work without requiring hidden hashed magic values?
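
For reference, wget's manual describes its cookie file as the Netscape cookies.txt format: one cookie per line, seven tab-separated fields (domain, include-subdomains flag, path, secure flag, expiry as a Unix timestamp, name, value). A minimal sketch of building such a line from values visible in a browser follows. example.org is a placeholder domain and cookie_file_line is a hypothetical helper; whether the RISC OS build accepts the result is untested.

```python
from email.utils import parsedate_to_datetime


def cookie_file_line(domain, name, value, expires_http_date,
                     path="/", secure=False):
    """Format one cookie as a Netscape cookies.txt line (tab-separated)."""
    # Convert an HTTP date such as "Fri, 04 Feb 2022 17:43:11 GMT" to Unix time.
    expires = int(parsedate_to_datetime(expires_http_date).timestamp())
    # A leading dot on the domain conventionally means "include subdomains".
    include_subdomains = "TRUE" if domain.startswith(".") else "FALSE"
    return "\t".join([domain, include_subdomains, path,
                      "TRUE" if secure else "FALSE",
                      str(expires), name, value])


if __name__ == "__main__":
    print(cookie_file_line("example.org", "_otwarchive_session", "VALUE",
                           "Fri, 04 Feb 2022 17:43:11 GMT"))
```

A file containing one such line per cookie could then be tried with --load-cookies.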

    --
    Harriet Bazley == Loyaulte me lie ==

    What's the point in being grown up if you can't be childish sometimes?
