How does one find out which network calls a browser makes to load a web page?
The simple method: download the HTML page, parse it, and extract every referenced URL using an HTML parser like BeautifulSoup.
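As a rough sketch of the simple method, the snippet below collects the URLs a page references. It swaps in Python's standard-library html.parser in place of BeautifulSoup so it runs without extra installs; the sample HTML, the example.com URLs, and the names ResourceCollector and referenced_urls are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of resources a browser would likely fetch."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # src on any tag (script, img, iframe, ...) and href on <link>.
            is_resource = name == "src" or (tag == "link" and name == "href")
            if is_resource and value:
                self.urls.append(urljoin(self.base_url, value))


def referenced_urls(html, base_url):
    """Return the resource URLs found in the given HTML, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.urls


# Hypothetical page, used only for illustration.
sample = (
    '<html><head><script src="/app.js"></script></head>'
    '<body><img src="https://cdn.example.com/a.png"></body></html>'
)
print(referenced_urls(sample, "https://example.com/index.html"))
```

This only sees URLs written in the markup; requests triggered by JavaScript, or by the browser itself, never show up.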
The shortcoming of this method: what about the network calls your browser makes before requesting the web page? For example, Firefox makes a call to ocsp.digicert.com to obtain the revocation status of digital certificates. The protocol is the Online Certificate Status Protocol (OCSP).
The network tab in the browser's developer tools doesn’t display the network call to ocsp.digicert.com.
One way to find all the requests originating from the machine is to use a proxy such as mitmproxy (a man-in-the-middle, or MITM, proxy).
After installing the proxy, you can run the command mitmweb or mitmdump. Both run the proxy on localhost at port 8080; with mitmweb, you also get an intuitive web UI to browse the request/response cycle and the order of the web requests.
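To check that traffic really passes through the proxy, any client can be pointed at localhost:8080. The snippet below is a minimal sketch using only the standard library; the helper names build_proxied_opener and fetch_status are illustrative, and HTTPS URLs would additionally require trusting the proxy's CA certificate.

```python
import urllib.request

# Address the proxy listens on, as described above.
PROXY_URL = "http://127.0.0.1:8080"


def build_proxied_opener(proxy_url=PROXY_URL):
    """Return an opener that sends HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(url, proxy_url=PROXY_URL):
    """Fetch a URL through the proxy and return the HTTP status code.

    Needs the proxy to be running; otherwise the connection is refused.
    """
    opener = build_proxied_opener(proxy_url)
    with opener.open(url, timeout=10) as response:
        return response.status
```

With the proxy running, calling fetch_status("http://example.com") should make the request appear in the mitmweb UI or the mitmdump output.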
As the name “man in the middle” indicates, the proxy gives the ability to modify the request and response. To do so, a simple Python script with custom functions can do the trick. The script accepts two Python functions, request and response. After every request, the request function is called, and after every response, the response function is called. There are other supported concepts like addons.
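A minimal sketch of such a script is shown below. The two function names are the hooks the proxy looks for; the header names and values are made up for illustration, and the file name modify.py is an assumption.

```python
def request(flow):
    """Hook called by the proxy for each client request."""
    # Tag the outgoing request with an illustrative marker header.
    flow.request.headers["X-Proxied-By"] = "mitm-demo"


def response(flow):
    """Hook called by the proxy for each server response."""
    # Tag the response before it is handed back to the client.
    flow.response.headers["X-Inspected"] = "true"
```

Assuming the file is saved as modify.py, it can be loaded with mitmdump -s modify.py (or the same -s option with mitmweb).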
To extract the details of the request and response, here is a small script: https://gitlab.com/snippets/1933443.
After receiving a successful response, the MITM proxy invokes the function response; the function collects the details and dumps them to a JSON file. uuid4 ensures a unique file name for every function call.
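The linked snippet is not reproduced here; the following is only a sketch of the same idea under some assumptions. The output directory name, the helper dump_flow, and the choice of recorded fields are illustrative and may differ from the original script.

```python
import json
from pathlib import Path
from uuid import uuid4

# Hypothetical output location; the original script may use another.
OUTPUT_DIR = Path("output")


def dump_flow(flow, output_dir=OUTPUT_DIR):
    """Write one request/response pair to a uniquely named JSON file."""
    record = {
        "request": {
            "method": flow.request.method,
            "url": flow.request.pretty_url,
            "headers": dict(flow.request.headers),
        },
        "response": {
            "status_code": flow.response.status_code,
            "headers": dict(flow.response.headers),
        },
    }
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # uuid4 gives every call its own file name, so nothing is overwritten.
    path = output_dir / f"{uuid4()}.json"
    path.write_text(json.dumps(record, indent=2))
    return path


def response(flow):
    """Hook called by the proxy once a response has been received."""
    dump_flow(flow)
```

Keeping the file writing in a separate helper makes it possible to exercise the logic without a running proxy.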
Even though the MITM proxy provides an option to dump the response, the dump is in a binary format, which is less convenient for this kind of data analysis than JSON.
Next is to simulate the browser request.
Selenium is one of the tools web developers and testers use for end-to-end web testing. Selenium helps developers test their code in the browser and assert on the HTML elements of the page.
To get the Selenium Python web driver working, one needs to install Java first, followed by geckodriver, and finally the Python Selenium driver (pip install selenium). Don’t forget to place geckodriver in
Run the MITM proxy in one terminal, then the Selenium script in another terminal; the output directory should then be filled with JSON files.
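The Selenium side might look roughly like the sketch below, which points Firefox at the local proxy through its network.proxy.* preferences. The helper names proxy_preferences and load_page are illustrative, and accepting insecure certificates is an assumption made here so that HTTPS pages load without installing the proxy's CA certificate.

```python
def proxy_preferences(host="127.0.0.1", port=8080):
    """Firefox preferences that route HTTP and HTTPS through a manual proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": port,
    }


def load_page(url, host="127.0.0.1", port=8080):
    """Open the URL in Firefox with all traffic sent through the proxy."""
    # Imported here so the helper above works even without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in proxy_preferences(host, port).items():
        options.set_preference(name, value)
    # Assumption: skip certificate errors caused by the intercepting proxy.
    options.accept_insecure_certs = True

    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
    finally:
        driver.quit()
```

With the proxy already running, calling load_page on a URL of your choice should leave one JSON file per response in the output directory.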
Here is how the directory looks
The sample JSON files output
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.