Parameterize Python Tests

Introduction A single test case follows a pattern: set up the data, invoke the function or method with the arguments, and assert on the return data or the state changes. A function will have one or more test cases covering various success and failure cases. Here is an example implementation of the wc command for a single file that returns the number of words, lines, and characters for an ASCII text file. [Read More]
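The excerpt is cut off before the post's own code, so the following is only a minimal sketch of the idea, not the author's implementation. It assumes a hypothetical `wc(path)` helper and uses `unittest`'s `subTest` from the standard library as a stand-in for whichever parameterization tool the post goes on to use. Each entry in the hypothetical `CASES` table runs the same setup, invoke, assert pattern.

```python
import os
import tempfile
import unittest


def wc(path):
    """Return (words, lines, characters) for an ASCII text file."""
    # newline="" keeps line endings untranslated so characters are counted as stored.
    with open(path, encoding="ascii", newline="") as handle:
        text = handle.read()
    return len(text.split()), text.count("\n"), len(text)


# Each case is (file content, expected (words, lines, characters)).
CASES = [
    ("", (0, 0, 0)),
    ("hello world\n", (2, 1, 12)),
    ("one\ntwo three\n", (3, 2, 14)),
]


class WcTest(unittest.TestCase):
    def test_counts(self):
        for content, expected in CASES:
            # subTest reports each failing case separately instead of stopping at the first.
            with self.subTest(content=content):
                with tempfile.TemporaryDirectory() as tmp:
                    path = os.path.join(tmp, "sample.txt")
                    with open(path, "w", encoding="ascii", newline="") as handle:
                        handle.write(content)
                    self.assertEqual(wc(path), expected)
```

Saved as a module, the cases can be run with `python -m unittest`. Adding a new scenario then means adding one row to `CASES` rather than writing another test function.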

“Don’t touch your face” - Neural Network will warn you

A few days back, Keras creator Francois Chollet tweeted: “A Keras/TF.js challenge: make a model that processes a webcam feed and detects when someone touches their face (triggering a loud beep).” The very next day, I tried the Keras yolov3 model available on GitHub. It was trained on the coco80 dataset and could detect a person but not a face touch. V1 Training The original Keras implementation lacked documentation on training from scratch or with transfer learning. [Read More]

Capture all browser HTTP[s] calls to load a web page

How does one find out which network calls a browser makes to load a web page? The simple method: download the HTML page, parse it, and find all the network calls using a web parser like BeautifulSoup. The shortcoming of this method: what about the network calls the browser makes before requesting the web page? For example, Firefox makes a call to obtain the revocation status of digital certificates. The protocol is the Online Certificate Status Protocol. [Read More]
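The "simple method" can be sketched roughly as follows. This is an illustration under assumptions, not the post's code: it swaps BeautifulSoup for the standard library's `html.parser`, the names `ResourceCollector` and `find_resource_urls` are hypothetical, and the tag/attribute list is a small sample rather than everything a browser fetches. As the excerpt points out, a static parse like this cannot see calls the browser makes outside the page itself, such as OCSP lookups.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect URLs that a browser would commonly request while rendering a page."""

    # (tag, attribute) pairs that usually trigger a fetch; a sample, not a full list.
    FETCHING_ATTRS = {
        ("script", "src"),
        ("img", "src"),
        ("iframe", "src"),
        ("link", "href"),
    }

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.FETCHING_ATTRS:
                # Resolve relative references against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def find_resource_urls(html, base_url):
    """Return resource URLs referenced by the given HTML, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.urls
```

Plain anchors (`<a href>`) are deliberately ignored because the browser does not fetch them until the user clicks.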
python  proxy  HTTP 

How long do Python Postgres tools take to load data?

Data is crucial for all applications. When an application fetches a significant amount of data from the database multiple times, faster data load times improve performance. The post measures the latency of SQLAlchemy statements, the SQLAlchemy ORM, Psycopg2, and psql. Jupyter notebook’s timeit measures the timing of the Python tools, and psql serves as the lowest-time reference.

Table Structure

    annotation=> \d data;
    Table ""
     Column |   Type    | Modifiers
    --------+-----------+---------------------------------------------------
     id     | integer   | not null default nextval('data_id_seq'::regclass)
     value  | integer   |
     label  | integer   |
     x      | integer[] |
     y      | integer[] |
    Indexes:
        "data_pkey" PRIMARY KEY, btree (id)
        "ix_data_label" btree (label)

    annotation=> select count(*) from data;
      count
    ---------
     1050475
    (1 row)

SQLAlchemy ORM Declaration

    class Data(Base):
        __tablename__ = 'data'
        id = Column(Integer, primary_key=True)
        value = Column(Integer)
        # 0 => Training, 1 => test
        label = Column(Integer, default=0, index=True)
        x = Column(postgresql. [Read More]

Debugging Python multiprocessing program with strace

Debugging is a time-consuming and brain-draining process. It’s an essential part of learning and writing maintainable code. Every person has their own approaches and tools for debugging. Sometimes you can view the traceback, pull the code from memory, and find a quick fix. Other times, you opt for different tricks like the print statement, a debugger, or the rubber duck method. Debugging a multiprocessing bug in Python is hard for various reasons. [Read More]

Book Review: Software Architecture with Python

The book Software Architecture with Python is by Anand B Pillai. The book explains various aspects of software architecture like testability, performance, scaling, concurrency, and design patterns. The book has ten chapters. The first chapter covers different architect roles like solution architect, enterprise architect, and technical architect, what the role of an architect is, and the difference between design and architecture. The book covers two less-discussed topics, debugging and code security, which I liked. [Read More]

Return Postgres data as JSON in Python

Postgres has supported JSON and JSONB for a couple of years now. Support for JSON functions landed in version 9.2. These functions let the Postgres server return JSON-serialized data. This is a handy feature. Consider a case: a Python client fetches 20 records from Postgres. The client converts the data returned by the server to a tuple/dict/proxy. The application or web server then converts the tuple back to JSON and sends it to the client. [Read More]
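The double conversion described above can be pictured with a small sketch. It does not talk to a database: the hypothetical `COLUMNS` and `ROWS` stand in for the tuples a driver would hand back, and `rows_to_json` is an illustrative name for the extra work the application layer does before responding.

```python
import json

# Hypothetical column names and rows, standing in for tuples returned by a database driver.
COLUMNS = ("id", "name")
ROWS = [(1, "alice"), (2, "bob")]


def rows_to_json(columns, rows):
    """Convert driver-style tuples to dicts, then serialize them to a JSON string."""
    records = [dict(zip(columns, row)) for row in rows]
    return json.dumps(records)
```

With the server-side JSON functions, a query along the lines of `SELECT row_to_json(t) FROM some_table t` (table name hypothetical) asks Postgres to do the serialization, so the application can skip the tuple-to-dict step.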

Expose jupyter notebook over the network

What is the Jupyter notebook? The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualisations and explanatory text. The definition is from the official site. I use the IPython/Jupyter shell all the time. If you haven’t tried it, spend 30 minutes and witness the power! At times, I want to share some code snippets with folks in the same building during work, a workshop, or training. [Read More]

RC Week 0010

This week has been a mixed ride with the torrent client. I completed the two pending features, seeding and the UDP tracker. The torrent client has a major issue with downloading larger torrent files like an Ubuntu ISO file. The client starts downloading from a set of peers and slowly halts at sock.recv after exchanging a handful of packets. At this juncture, the CPU spikes to 100% when sock.recv blocks. Initially, the code relied on asyncio-only features; now the code uses the curio library. [Read More]

RC week 0001

This week, I made considerable progress on the BitTorrent client which I started a week back. The client is in a usable state to download data from the swarm. The source code is available on GitHub. The project uses Python 3.5 async/await and asyncio. I presented the torrent client in the RC Thursday five-minute presentation evening slot. Here is the link to the slides. Here is a quick video demo recorded with asciinema. [Read More]