ndjson

curl http://localhost:11434/api/generate -d '{
                                                                                                                            "model": "llama3.2",
                                                                                                                            "prompt": "Where is Dublin? Answer in a six words"
                                                                                                                          }'
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.15898Z","response":"Located","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.183229Z","response":" on","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.206942Z","response":" the","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.230918Z","response":" east","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.254533Z","response":" coast","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.278113Z","response":" Ireland","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.301689Z","response":".","done":false}
{"model":"llama3.2","created_at":"2025-01-14T17:48:33.3255Z","response":"","done":true,"done_reason":"stop","context":[128006,9125,128007,271,38766,1303,33025,2696,25,6790,220,2366,18,271,128009,128006,882,128007,271,9241,374,33977,30,22559,304,264,4848,4339,128009,128006,78191,128007,271,48852,389,279,11226,13962,14990,13],"total_duration":2392671125,"load_duration":575523041,"prompt_eval_count":34,"prompt_eval_duration":1649000000,"eval_count":8,"eval_duration":167000000}

I was playing around with the Ollama API to explore its capabilities and noticed that the HTTP response was streaming JSON, which prompted me to look into the response headers.

curl -v http://localhost:11434/api/generate -d '{
                                                                                                                            "model": "llama3.2",
                                                                                                                            "prompt": "Where is Dublin? Answer in a six words"
                                                                                                                          }'
* Host localhost:11434 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:11434...
* connect to ::1 port 11434 from ::1 port 49217 failed: Connection refused
*   Trying 127.0.0.1:11434...
* Connected to localhost (127.0.0.1) port 11434
> POST /api/generate HTTP/1.1
> Host: localhost:11434
> User-Agent: curl/8.7.1
> Accept: */*
> Content-Length: 250
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 250 bytes
< HTTP/1.1 200 OK
< Content-Type: application/x-ndjson
< Date: Tue, 14 Jan 2025 17:49:29 GMT
< Transfer-Encoding: chunked
<
...

The content type is application/x-ndjson, and a quick search hinted that it’s newline-delimited JSON, which suits streaming protocols. Also, the Transfer-Encoding is chunked, which fits well for sending LLM responses over the wire.
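To make the format concrete, here is a minimal sketch of consuming such a stream in Python. It assumes the lines have already been read from the response; iter_ndjson and collect_response are hypothetical helper names, and the sample lines are trimmed copies of the output shown earlier.

```python
import json


def iter_ndjson(lines):
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def collect_response(lines):
    """Join the "response" fragments until an object with done=true arrives."""
    parts = []
    for obj in iter_ndjson(lines):
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)


# A few lines from the output above, trimmed to the relevant keys.
sample = [
    '{"model":"llama3.2","response":"Located","done":false}',
    '{"model":"llama3.2","response":" on","done":false}',
    '{"model":"llama3.2","response":" the","done":false}',
    '{"model":"llama3.2","response":" east","done":false}',
    '{"model":"llama3.2","response":"","done":true,"done_reason":"stop"}',
]
print(collect_response(sample))
```

Because every line is a complete JSON document, the client can act on each fragment as soon as its line arrives instead of waiting for the whole body.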

Subtitle Generator Using Whisper

I want to generate subtitles for the Normal People TV series on my laptop using an LLM. After searching a bit, whisper from OpenAI turned out to be a proper fit.

Step 1: Extracting Audio from Video

The first step is to extract the audio from the video file using ffmpeg and store it separately.

ffmpeg -i /Users/kracekumar/Movies/TV/Normal.People.S01/Normal.People.S01E01.mp4 -vn -acodec copy /Users/kracekumar/Movies/TV/Normal.People.S01/audio/Normal.People.S01E01.aac

Step 2: Converting Audio to Text

The second step is to run the audio file through the whisper model from OpenAI. I use uv to install and run it inside a project.

Open-webui in personal laptop

In 2024, Large Language Models (LLMs) and Generative AI (GenAI) exploded at an unimaginable rate. I didn’t follow the trend. Currently, there is news every day about new models. Also, the explosion of models reached a stage where local MacBooks can run a decent enough model. I want to have a local model with a clean user interface, either through the web or the terminal.

I stumbled upon open-webui.

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted WebUI designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs. For more information, be sure to check out our Open WebUI Documentation.

Chatgpt Generate Ruby Code to Check User Exists in Github

On Saturday night, I wanted to work on a side project. To find a name for the project, I wanted to create a GitHub organization with the same name. I started trying out names one after the other; all of them were taken, so I thought about writing a small script in Ruby. Then, out of nowhere, I decided to let ChatGPT write the code for me.

In this blog post, I’ll share some of the code generated by ChatGPT, with increasing complexity, for checking whether a profile name exists on GitHub.

Python 3.11 micro-benchmark

speed.python.org tracks Python performance improvements against several modules across Python versions. In the real world, module-level speed improvements don’t directly translate to application performance improvements. An application is composed of several hundred dependencies, so the performance of one specific module doesn’t improve total application performance. Nonetheless, it can improve the performance of parts of the API or certain flows.

When I first heard about the Faster CPython initiative, I was intrigued to find out how it translates to small application performance across various versions, since a lot of critical components, like the PostgreSQL driver, are already in C. The Faster CPython presentation clearly states that the performance boost is only guaranteed for pure Python code and not C extensions.

Bazel Build System Introduction for Java

You can find the source code of the tutorial in the bazel-101 branch.

What will you learn?

  • Introduction to bazel build system
  • How to build and run Java package?
  • How to add maven dependency to bazel build files?
  • How to add protobuf compiler to bazel build?

Introduction

Bazel is an imperative build system that can build packages for Java, C++, Python, Ruby, Go, etc. The two main advantages of Bazel are:

  1. One build tool can build packages for a variety of languages, which makes it easier for platform teams to build packages across languages. Consider the alternative of learning many different build systems: pip, bundle, maven, etc.
  2. Bazel can cache already built packages in a remote or local environment and reuse them without recompiling, be it for a binary, a library, or tests.

The main difference between Bazel and other build/dependency management systems is the imperative vs declarative approach.

Notes from Tail Latency Aware Caching Paper by RobinHood

The problem

Application

When web service latency increases, the first suggested technique is to cache. A cache is a good solution when your system is read-heavy.

The common technique is to cache the frequently used objects. This generally reduces the latency but doesn’t help much with tail latency (p99). The paper “Tail Latency Aware caching - Dynamically Reallocating from cache rich to cache poor” proposes a novel solution for maintaining low request tail latency.
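To see why an average can hide the tail, here is a small sketch with made-up numbers. The latencies are hypothetical and not taken from the paper, and percentile is an illustrative nearest-rank helper.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast cache hits and 2 slow backend calls.
latencies_ms = [10] * 98 + [500] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

In this made-up sample the mean and median stay low while p99 is dominated by the two slow requests, which is the part of the distribution a frequency-based cache does not address.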

Dia Duit Dublin, Bye Bengaluru

TL;DR: After working for a decade and a year in Bengaluru, I decided to join Stripe in Dublin, Ireland as a software engineer.

Bangalore Collage

When I was in the final year of college, I had to choose a job location preference between Chennai and Bengaluru. I chose Bengaluru for two reasons: startups and weather. After working a decade and a year in Bengaluru across seven companies, I decided to leave the startup scene, the city, and the country.

Rafting Raft Workshop

Last week, May 2-6, 2022, I attended Rafting Raft workshop by David Beazley. The workshop focussed on building a raft implementation and it was intense and exhausting.

A few folks had asked me about the workshop and how it works. The post focuses on what happens before and during the workshop, so future attendees can decide.

Day -43

Someday when you’re contemplating attending the workshop, you register on the website. You get a confirmation email from David Beazley to confirm your availability to attend the workshop.

Profiling Django App

TL;DR

  • Pyinstrument is a call stack sampling profiler with low overhead to find out time spent in your Django application.
  • QueryCount is a simplistic ORM query count middleware that counts the number of ORM queries, finds duplicate queries, and prints them in the terminal.
  • Django Silk is an extensive Django profiler that records the entire execution, SQL queries, source of origin, and persists the recordings. The complete Django profiler.

🔬 What’s Profiling? 🔬

Profiling is a dynamic program analysis that measures a running program’s time and/or memory consumption. The profiler can instrument the entire running program or record samples for a fixed duration of time.
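As a small illustration of the instrumenting approach, here is a sketch using cProfile and pstats from the standard library rather than the Django tools listed above. slow_sum is a hypothetical workload standing in for application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload standing in for application code."""
    return sum(i * i for i in range(n))


# Instrument every function call made between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

A sampling profiler such as Pyinstrument takes a different route: it periodically records the call stack instead of hooking every call, which is what keeps its overhead low.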

HTTPie and Print HTTP Request

HTTPie is a command-line utility for making HTTP requests with a more straightforward syntax (controversial, I agree). The interesting feature is the --offline flag, which prints the raw HTTP request text. The client sends the HTTP request to the server, and the server responds to the request. It’s an alternative to curl.

HTTP Syntax

HTTP flow and syntax from Wikipedia.

A client sends request messages to the server, which consist of

  • a request line, consisting of the case-sensitive request method, a space, the request target, another space, the protocol version, a carriage return, and a line feed (e.g. GET /images/logo.png HTTP/1.1)
  • zero or more request header fields, each consisting of the case-insensitive field name, a colon, optional leading whitespace, the field value, and optional trailing whitespace (e.g. Accept-Language: en), and ending with a carriage return and a line feed.
  • an empty line, consisting of a carriage return and a line feed;
  • an optional message body.
  • In the HTTP/1.1 protocol, all header fields except Host are optional.
  • A request line containing only the path name is accepted by servers to maintain compatibility with HTTP clients before the HTTP/1.0 specification in RFC 1945.

Throughout the post, I’ll use the --offline feature to understand, for educational purposes, how the HTTP request structure looks.
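The message layout described in the list above can be sketched in a few lines of Python. This is only an illustration of the wire format, not how HTTPie builds requests; build_request is a hypothetical helper and example.org is a placeholder host.

```python
def build_request(method, target, headers, body=b""):
    """Assemble raw HTTP/1.1 request bytes: request line, header fields, empty line, body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF; an extra CRLF marks the end of the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request(
    "GET",
    "/images/logo.png",
    {"Host": "example.org", "Accept-Language": "en"},
)
print(raw.decode("ascii"))
```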

Type Check Your Django Application

Recently, I gave a talk, Type Check your Django app, at two conferences: Euro Python 2021 and PyCon India 2021. The talk was about adding Python gradual typing to Django using the third-party package django-stubs, and focussed heavily on Django models. The blog post is the write-up of the talk. Here is the unofficial link to the recorded video of the PyCon India talk.

Here is the link to the PyCon India slides and to the slides of the Euro Python talk (both sets of slides are similar).

Pulse Plus

PhonePe recently released the Pulse repo built from their payment data. It was hard to get an overview of the data without doing some data transformation.

The data is nested eight levels deep and spread over multiple files for similar-purpose data, which makes it hard to run any command-line aggregate queries for data exploration.

It’s hard to do any analysis with 2000+ files. So I created an SQLite database of the data using python sqlite-utils.

The SQLite database holds the aggregated data and top data in 5 tables - aggregated_user, aggregated_user_device, aggregated_transaction, top_user, top_transaction. Link to the schema - https://github.com/kracekumar/pulse-plus#all-tables-schema.

TIL - A new site

Quite often, as a programmer, I learn something new. Some are utilitarian; some are philosophical; some are opinions in programming. I want to document these learnings for later use and also to remember. So I’m starting a new site, til.kracekumar.com, to document these learnings. So far, there are six posts.

The inspiration comes from Simon Willison’s TIL website.

Why the new site?

Two reasons

  1. I’m planning to write often; sometimes, the post will fit in just two tweets.
  2. I don’t want the existing followers of the site to see a lot of small and new content.

In case you’d like to follow the learnings and tips, you can subscribe to the RSS feed.

Python Typing Koans

Python 3 introduced type annotation syntax. PEP 484 introduced a provisional module to provide these standard definitions and tools, along with some conventions for situations where annotations are not available.

Python is a dynamic language and follows gradual typing. When a static type checker runs on Python code, the type checker treats code without type hints as Any.

def print_name(name):
    print(name)

planet: str = "earth"

In the above example, the type of the name argument is Any since its type hint is missing, while the type of the planet variable is str.
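The difference is also visible at run time. As a small sketch, typing.get_type_hints reports only the annotations that were actually written, so the unannotated function has nothing recorded for a checker to work with; print_planet is a hypothetical annotated counterpart added for comparison.

```python
from typing import get_type_hints


def print_name(name):
    print(name)


def print_planet(planet: str) -> None:
    print(planet)


# No annotations recorded: a static checker falls back to Any for `name`.
untyped_hints = get_type_hints(print_name)
# Annotations recorded: `planet` is a str and the function returns None.
typed_hints = get_type_hints(print_planet)

print(untyped_hints)
print(typed_hints)
```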

Model Field - Django ORM Working - Part 2

The last post covered the structure of the Django Model. This post covers how a model field works and some of its important methods, functionality, and properties.

An Object-Relational Mapper is a technique for declaring and querying database tables using objects and their relationships in the programming language. Here is a sample model declaration in Django.


class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')

Each class that inherits from models.Model becomes a table inside the SQL database unless explicitly marked as abstract. The Question model becomes the <app_name>_question table in the database. question_text and pub_date become columns in the table. The properties of each field are declared by instantiating the respective class. Below is the method resolution order for CharField.

Structure - Django ORM Working - Part 1

Django ORM hides a lot of complexity while developing a web application. The data model declaration and querying pattern are simplified, whereas they’re structured differently behind the scenes. This series of blog posts will explain how the Django ORM works (not just converting Python code to SQL): model declaration, querying (manager, queryset), supporting multiple drivers, writing custom queries, migrations, etc.

Consider a model definition from the Django tutorial.

from django.db import models


class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')


class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

The Question and Choice model classes derive from models.Model. Inheriting from Model signals to Django at run-time that the class is a database model. The Question model (later converted to a table) contains two extra class variables, question_text and pub_date, which will be two columns in the table. Their type is indicated by creating an instance of the respective field type, here models.CharField and models.DateTimeField. The same applies to the Choice model.
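To illustrate the idea of class attributes turning into columns, here is a toy sketch in plain Python. It is not Django's implementation: Field, Model, table_name, columns, and the "polls" app label are all hypothetical stand-ins chosen for the example.

```python
class Field:
    """Toy stand-in for a model field; remembers the attribute name it is bound to."""

    def __init__(self, sql_type):
        self.sql_type = sql_type
        self.name = None

    def __set_name__(self, owner, name):
        # Called by Python when the class body assigns this object to an attribute.
        self.name = name


class Model:
    """Toy base class; subclasses expose their Field attributes as columns."""

    app_label = "polls"  # hypothetical app name

    @classmethod
    def table_name(cls):
        return f"{cls.app_label}_{cls.__name__.lower()}"

    @classmethod
    def columns(cls):
        return {
            name: attr.sql_type
            for name, attr in vars(cls).items()
            if isinstance(attr, Field)
        }


class Question(Model):
    question_text = Field("varchar(200)")
    pub_date = Field("datetime")


print(Question.table_name(), Question.columns())
```

The point of the sketch is only that a base class can inspect its subclasses' attributes; the real ORM does considerably more work behind the scenes.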

jut - render jupyter notebook in the terminal

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. The definition is copied from the official website.

It’s becoming common to use Jupyter notebooks to write books, do data analysis, run reproducible experiments, etc. The file produced by a notebook follows a JSON Schema. Yet to view the file, the user needs a web application, a local notebook instance, or a browser.
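Since a notebook file is plain JSON, its cells can be read with the standard library alone. The sketch below assumes the version 4 notebook layout (a top-level "cells" list whose items carry "cell_type" and "source"); the inline notebook and summarize_cells are made up for the example and this is not how jut itself is implemented.

```python
import json

# A tiny hand-written notebook in the nbformat 4 layout.
notebook_json = """
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    {"cell_type": "code", "metadata": {}, "execution_count": 1,
     "source": ["print('hello')"], "outputs": []}
  ]
}
"""


def summarize_cells(raw):
    """Return (cell_type, source_text) pairs for every cell in the notebook."""
    notebook = json.loads(raw)
    return [(cell["cell_type"], "".join(cell["source"])) for cell in notebook["cells"]]


for cell_type, source in summarize_cells(notebook_json):
    print(f"[{cell_type}] {source}")
```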

Five reasons to use Py.test

The pytest library provides a better way to write tests, run the tests, and report the test results. This post is a comparison between the unittest module from the Python standard library and pytest features, and leaves out other libraries like nose2.

TL;DR

  • A single assert statement (assert a == b) over 40 different assert methods (self.assertEqual, self.assertIsInstance, self.assertDictEqual).
  • Better and detailed error messages on failure.
  • Useful command line options for discovering, running, and reporting tests like --last-failed, --collect-only.
  • Pytest plugins for extending the pytest functionalities and modifying default behavior: pytest-mon, pytest-clarity, pytest-cov.
  • Pytest fixtures for seed data and implementing custom test behaviors.

1. Single assert statement over 40 different assert methods

Here is sample unittest code.
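A minimal sketch of the contrast, assuming a hypothetical add function: the unittest version reaches for a different assert method per check, while the pytest-style function uses the plain assert statement throughout.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


class TestAdd(unittest.TestCase):
    # unittest style: a dedicated assert method for each kind of check.
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
        self.assertIsInstance(add(2, 3), int)
        self.assertDictEqual({"total": add(1, 1)}, {"total": 2})


# pytest style: a plain function and the built-in assert statement.
def test_add_plain():
    assert add(2, 3) == 5
    assert isinstance(add(2, 3), int)
    assert {"total": add(1, 1)} == {"total": 2}
```

pytest can collect and run both forms, so an existing unittest suite does not need to be rewritten in one go.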

Build Plugins with Pluggy

Introduction

The blog post is a write up of my two talks from PyGotham and PyCon India titled, Build Plugins with Pluggy. The write-up covers a trivial use-case, discusses why a plugin-based architecture is a good fit, what is plugin-based architecture, how to develop plugin-based architecture using pluggy, and how pluggy works.

Link to PyCon India 2020 Talk

Trivial Use Case

For the scope of the blog post, consider a command-line application that queries the Gutenberg service, processes the data, and displays the relevant information. Let’s see how to build such an application using pluggy.

Render local images in datasette using datasette-render-local-images

The Datasette Python library lets you publish a dataset as a SQLite database and inspect it using a web browser. The database can store text, numbers, arrays, images, binary data, and all possible formats. datasette-render-images lets you save the image in the database and render the image using data-uris.
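For readers unfamiliar with data URIs, here is a minimal sketch of how binary image data becomes one. It illustrates the general data: URI scheme, not the plugin's code; to_data_uri is a hypothetical helper and the bytes are only the PNG file signature, not a full image.

```python
import base64


def to_data_uri(data, mime_type="image/png"):
    """Encode raw bytes as a base64 data URI that an <img src="..."> can embed."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG signature stands in for real image bytes read from a database column.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)
print(uri)
```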

Sometimes, while publishing the dataset, you may want to store the images in a separate directory for various reasons and include the relative path to the images in the database table.

Tamil 1k Tweets For Binary Sentiment Analysis

Finding labeled data for a Tamil NLP task is difficult. Some papers talk about Tamil Neural Translation, but they don’t release code. If you’re working part-time on Tamil NLP or have an interest in it, you’ll have a tough time finding data.

When I was looking for labeled data for simple sentiment analysis, I couldn’t find any. It’s understandable because there is no one working on it. So I decided to build my own dataset. Twitter seemed a perfect place with lots of data. I scraped data using the Twint Python library.

About

I’m Kracekumar, a Software Engineer based out of Dublin. Currently, I work at Stripe building Local Payment Methods in Europe.

I have deep expertise in backend technologies and a proven track record of tech leadership. In the past, I have spoken at various technical conferences like PyCon India, Euro Python, PyGotham, etc.

I was the organizer of the Bangalore Python User Group and volunteered for PyCon India from 2012 to 2016. In 2017, I founded the RFCs We Love Bangalore Meetup, which is run by various other volunteers. The meetup discusses the content of published RFCs.

Parameterize Python Tests

Introduction

A single test case follows a pattern: set up the data, invoke the function or method with the arguments, and assert on the return data or the state changes. A function will have one or more test cases covering various success and failure cases.

Here is an example implementation of the wc command for a single file that returns the number of words, lines, and characters for an ASCII text file.
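One possible sketch of such a function, together with a table of cases that exercises it. The names wc, CASES, and run_cases are illustrative, and the loop over the table is a plain-Python stand-in for what pytest.mark.parametrize expresses declaratively.

```python
import tempfile
from pathlib import Path


def wc(path):
    """Return (words, lines, characters) for an ASCII text file."""
    text = Path(path).read_text(encoding="ascii")
    return len(text.split()), text.count("\n"), len(text)


# Each case pairs the file content with the expected (words, lines, characters).
CASES = [
    ("", (0, 0, 0)),
    ("hello world\n", (2, 1, 12)),
    ("hello world\nbye\n", (3, 2, 16)),
]


def run_cases():
    """Write each case to a temporary file and collect (actual, expected) pairs."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        for index, (content, expected) in enumerate(CASES):
            path = Path(tmp) / f"case_{index}.txt"
            path.write_text(content, encoding="ascii")
            results.append((wc(path), expected))
    return results


for actual, expected in run_cases():
    assert actual == expected
```

Keeping inputs and expected values in one table means a new scenario is a one-line addition rather than another copy of the same test body.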

Incomplete data is useless - COVID-19 India data

Data is a representation of reality. When a value is missing in a piece of data, the data becomes less useful and reliable. Every day, articles and news reports about COVID-19 discuss the new cases, recovered cases, and deceased cases. This information gives you a sense of hope or reality or confusion.

Regarding COVID-19, everyone believes or accepts specific details as fact, like the mortality rate is 2 to 3 percent, or over the age of fifty, the chance of death is 30 to 50 percent. These are established based on previously affected places, and some details come out of simulation. The mortality rate, deceased age distribution, patient age distribution, and mode of spread differ from region to region and country to country. With accurate and complete data, one can understand the situation, make decisions, and update the facts.

“Don’t touch your face” - Neural Network will warn you

A few days back, Keras creator Francois Chollet tweeted:

“A Keras/TF.js challenge: make a model that processes a webcam feed and detects when someone touches their face (triggering a loud beep).”

The very next day, I tried the Keras yolov3 model available on GitHub. It was trained on the coco80 dataset and could detect a person but not a face touch.

V1 Training

The original Keras implementation lacked documentation for training from scratch or transfer learning. While looking for an alternative implementation, I came across the PyTorch implementation with complete documentation.

1000 more whitelist sites in Kashmir, yet no Internet

Kashmir is under lockdown for more than 200 days.

Last Friday (14th Feb, 2020), the Government released a document with a list of whitelisted sites allowed at 2G speed. The text file with URLs extracted from the PDF - https://gitlab.com/snippets/1943725

In case you’re not interested in tech details, analysis, or short of time, look at the summary section at the bottom.

Capture all browser HTTP[s] calls to load a web page

How does one find out what network calls a browser makes to load a web page?

The simple method - download the HTML page, parse the page, and find all the network calls using web parsers like beautifulsoup.

The shortcoming of the method: what about the network calls made by your browser before requesting the web page? For example, Firefox makes a call to ocsp.digicert.com to obtain the revocation status of digital certificates. The protocol is the Online Certificate Status Protocol.
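The simple method can be sketched as follows. This version swaps in html.parser from the standard library instead of beautifulsoup, and the HTML page and ResourceCollector class are made up for the example. It lists only resources referenced in the markup, so it shares the shortcoming described above.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect URLs of sub-resources that a browser would fetch for a page."""

    # Tag -> attribute that carries the resource URL.
    URL_ATTRS = {"script": "src", "img": "src", "link": "href", "iframe": "src"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


# Hypothetical page; plain <a> links are navigation, not resources, and are skipped.
page = """
<html><head>
<link rel="stylesheet" href="https://cdn.example.com/site.css">
<script src="/static/app.js"></script>
</head><body><img src="logo.png"><a href="/about">About</a></body></html>
"""

collector = ResourceCollector()
collector.feed(page)
print(collector.urls)
```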

153 sites allowed in Kashmir but no internet

Kashmir has been locked down without the internet for more than 167 days as of 19th Jan 2020, since 5th Aug 2019. The Wire recently published an article wherein the Government of India whitelisted access to 153 websites in Kashmir. Below is the list extracted from the document. Internet shutdowns are becoming common in recent days during protests.

Anyone with a little knowledge of creating or working on a web application can say that every web application will make network calls to other sites to load JavaScript, style sheets, maps, videos, images, etc.

How long do Python Postgres tools take to load data?

Data is crucial for all applications. When fetching a significant amount of data from the database multiple times, faster data load times improve performance.

The post considers tools like SQLAlchemy statement, SQLAlchemy ORM, Psycopg2, and psql for measuring latency. To measure the timing of the Python tools, Jupyter notebook’s timeit is used. psql serves as the reference for the lowest time taken.

Table Structure

annotation=> \d data;
                      Table "public.data"
 Column |   Type    |                     Modifiers
--------+-----------+---------------------------------------------------
 id     | integer   | not null default nextval('data_id_seq'::regclass)
 value  | integer   |
 label  | integer   |
 x      | integer[] |
 y      | integer[] |
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "ix_data_label" btree (label)

annotation=> select count(*) from data;
  count
---------
 1050475
(1 row)

SQLAlchemy ORM Declaration

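# Assumed imports and Base; the snippet itself starts at the class.
from sqlalchemy import Column, Integer
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Data(Base):
    __tablename__ = 'data'
    id = Column(Integer, primary_key=True)
    value = Column(Integer)
    # 0 => Training, 1 => test
    label = Column(Integer, default=0, index=True)
    x = Column(postgresql.ARRAY(Integer))
    y = Column(postgresql.ARRAY(Integer))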

SQLAlchemy ORM

def sa_orm(limit=20):
    sess = create_session()
    try:
        return sess.query(Data.value, Data.label).limit(limit).all()
    finally:
        sess.close()

Time taken

%timeit sa_orm(1)
28.9 ms ± 4.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

The time taken, in milliseconds, to fetch 1, 20, 100, 1000, and 10000 rows.
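Outside a notebook, the same kind of measurement can be scripted with the timeit module. The sketch below uses an in-memory SQLite table as a stand-in for the Postgres table, so the numbers it prints are not comparable to the ones in this post; the fetch helper and the 10,000 generated rows are assumptions for the example.

```python
import sqlite3
import timeit

# In-memory SQLite table standing in for the Postgres "data" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value INTEGER, label INTEGER)")
conn.executemany(
    "INSERT INTO data (value, label) VALUES (?, ?)",
    [(i, i % 2) for i in range(10_000)],
)


def fetch(limit):
    """Fetch `limit` rows, mirroring the shape of the sa_orm query."""
    return conn.execute("SELECT value, label FROM data LIMIT ?", (limit,)).fetchall()


timings_ms = {}
for limit in (1, 20, 100, 1000, 10000):
    # Best of 3 repeats, 10 calls each, converted to milliseconds per call.
    runs = timeit.repeat(lambda: fetch(limit), number=10, repeat=3)
    timings_ms[limit] = min(runs) / 10 * 1000

for limit, ms in timings_ms.items():
    print(f"limit={limit:>5}: {ms:.3f} ms per call")
```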

Debugging Python multiprocessing program with strace

Debugging is a time-consuming and brain-draining process. It’s an essential part of learning and writing maintainable code. Every person has their own way of debugging, approaches, and tools. Sometimes you can view the traceback, pull the code from memory, and find a quick fix. Other times, you opt for different tricks like the print statement, debugger, and rubber duck method.

Debugging a multiprocessing bug in Python is hard for various reasons.

Notes from Root Conf Day 2 - 2017

On day 2, I spent a considerable amount of time networking and attended only four sessions.

Spotswap: running production APIs on Spot instance

  • Amazon EC2 spot instances are cheaper than on-demand server costs. Spot instances run when the bid price is greater than market/spot instance price.
  • Mapbox API server uses spot instances which are part of auto-scaling server
  • Auto scaling group is configured with min, desired, max parameters.
  • Latency should be low and cost effective
  • EC2 has three types of instances: on-demand, reserved, and spot. Spot instances come from unused capacity and have unstable pricing.
  • Spot market starts with bid price and market price.
  • In winter 2015, traffic increased and the price also increased.
  • Spinning up a new machine with code takes almost two minutes.
  • Our machine fleet consists of spot and on-demand instances.
  • When one spot machine from the fleet goes down, the auto scaling group spins up an on-demand machine.
  • Race condition: several instances go down at the same time.
  • Aggressive spin up in on-demand machines when the market is volatile.
  • Tag EC2 machines going down and then spin up AWS lambda. When the spot instance returns, shut down the lambda or on-demand instance. Auto Scaling group can take care of this.
  • Savings 50% to 80%
  • Source code: https://github.com/mapbox/spotswap
  • No latency because over-provisioned
  • Set bid price as on-demand price.
  • Didn’t try to increase spot instance before going on-demand
  • Cfconfig to deploy and Cloud formation template from AWS

Adventures with Postgres

Notes from Root Conf Day 1 - 2017

Root Conf is a conference on DevOps and Cloud Infrastructure. The 2017 edition’s theme is service reliability. Following are my notes from Day 1.

  1. State of the open source monitoring landscape

    • The speaker of the session is the co-founder of Icinga monitoring system. I missed the first ten minutes of the talk.
    • The talk is a comparison of all available OSS options for monitoring and visualization.
    • Auto-discovery is hard.
    • As per 2015 monitoring tool usage survey, Nagios is widely used.
    • Nagios is reliable and stable.
    • Icinga 2 is a fork of Nagios, rewritten in C++. It’s modern, web 2.0 with APIs, extensions and multiple backends.
    • Sensu has limited features on the OSS side and a lot of features in the enterprise version. The OSS version isn’t very useful.
    • Zabbix is a full-featured, out-of-the-box monitoring system written in C. It provides logging and graphing features. Scaling is hard since all writes go to a single Postgres DB.
    • Riemann is a stream processor written in Clojure. The DSL stream processing language needs knowledge of Clojure. The system is stateless.
    • OpenNMS is a network monitoring tool written in Java and good for auto discovery. Using plugins for a non-Java environment is slow.
    • Graphite is a flexible and popular monitoring tool with a time series database.
    • Prometheus offers flexible rule-based alerting and a time series database for metrics.
    • Elastic comes with Elasticsearch, Logstash, and Kibana. It’s picking up a lot of traction. Elastic Stack is extensible using the X-Pack feature.
    • Grafana is best for visualizing time series data. It’s easy to get started and combine multiple backends.
    • Grafana annotations are easy to use and tag the events.
    • There is no one tool which fits everyone’s case. You have to start somewhere. So pick up a monitoring tool, see if it works for you else try the next one til you settle down.
<a href="https://rootconf.talkfunnel.com/2017/17-deployment-strategies-with-kubernetes" target="_blank">Deployment strategies with Kubernetes</a>



*   This was talk with a live demo.
*   Canary deployment: Route a small amount of traffic to a new host to test functioning.
*   If new hosts don’t act normal roll back the deployment.
*   <a href="https://www.martinfowler.com/bliki/BlueGreenDeployment.html" target="_blank">Blue Green Deployment</a> is a procedure to minimize the downtime of the deployment. The idea is to have two set of machines with identical configuration but one with the latest code, rev 2 and other with rev 1. Once the machines with latest code act correctly, spin down the machines with rev 1 code.
*   Then a demo of `` kubectl `` with adding a new host to the cluster and roll back.
<a href="https://rootconf.talkfunnel.com/2017/7-a-little-bot-for-big-cause" target="_blank">A little bot for big cause</a>



*   The talk is on a story on developing, push to GitHub, merge and release. And shit hits the fan. Now, what to do?
*   The problem is developer didn’t get the code reviewed.
*   How can automation help here?
*   Enforcing standard like I unreviewed merge is reverted using GitHub API, Slack Bot, Hubot.
*   As soon as developer opens a PR, <a href="https://github.com/moengage/alice" target="_blank">alice</a>, the bot adds a comment to the PR with the checklist. When the code is merged, bot verifies the checklist, if items are unchecked, the bot reverts the merge.
*   The bot can do more work. DM the bot in the slack to issue commands and bot can interact with Jenkins to roll back the deployed code.
*   The bot can receive commands via slack personal message.
<a href="https://rootconf.talkfunnel.com/2017/18-necessary-tooling-and-monitoring-for-performance-c" target="_blank">Necessary tooling and monitoring for performance critical applications</a>



*   The talk is about collecting metrics for German E-commerce company Otto.
*   The company receives two orders/sec, million visitors per day.On an average, it takes eight clicks/pages to complete an order.
*   Monitor database, response time, throughput, requests/second, and measure state of the system
*   Metrics everywhere! We talk about metrics to decide and diagnose the problem.
*   <a href="http://metrics-clojure.readthedocs.io/en/latest/" target="_blank">Metrics</a> is a Clojure library to measure and record the data to the external system.
*   The library offers various features like Counter, gauges, meters, timers, histogram percentile.
*   Rather than extracting data from the log file, measure details from the code and write to the data store.
*   Third party libraries are available for visualization.
*   The demo used d3.js application for annotation and visualization. In-house solution.
*   While measuring the metrics, measure from all possible places and store separately. If the web application makes a call to the recommendation engine, collect the metrics from the web application and recommendation for a single task and push to the data store.
<a href="https://rootconf.talkfunnel.com/2017/51-what-should-be-pid-1-in-a-container" target="_blank">What should be PID 1 in a container?</a>



*   In older versions of Docker, child processes weren’t reaped correctly. As a result, for every request, Docker spawned a new application process that was never reaped. This is called the <a href="https://rootconf.talkfunnel.com/2017/51-what-should-be-pid-1-in-a-container" target="_blank">PID 1 zombie problem</a>.
*   This will eat all available PIDs in the container.
*   Use `` sysctl -a | grep pid_max `` to find the maximum number of PIDs available in the container.
*   In the bare metal machine, PID 1 is `` systemd `` or any init program.
*   If the first process in the container is bash, the PID 1 zombie problem doesn’t occur.
*   Using bash to handle all the signals is messy.
*   Yelp came up with <a href="https://github.com/Yelp/dumb-init" target="_blank">Yelp/dumb-init</a>. Now, `` dumb-init `` is PID 1 and no more zombie processes.
*   Docker 1.13 introduced the `` --init `` flag.
*   Another solution uses `` systemd `` as PID 1.
*   Docker allows running `` systemd `` without privileged mode.
*   Running systemd as PID 1 has other useful features like managing logs.
<a href="https://rootconf.talkfunnel.com/2017/9-razor-sharp-provisioning-for-baremetal-servers" target="_blank">‘Razor’ sharp provision for bare metal servers</a>



*   I attended only the first half of the talk, fifteen minutes.
*   When you buy physical rack space in a data center, how will you install the OS? You’re in Bangalore and the server is in Amsterdam.
*   First OS installation on bare metal is hard.
*   There comes Network boot!
*   <a href="http://www.syslinux.org/wiki/index.php?title=PXELINUX" target="_blank">PXELinux</a> is a Syslinux derivative for booting an OS over the network from the NIC.
*   Once the machine comes up, it broadcasts a DHCP request, and the DHCP server responds.
*   <a href="https://cobbler.github.io/" target="_blank">Cobbler</a> helps in managing all the services running on the network.
*   DHCP server, TFTP server, and config are required to complete the installation.
*   A microkernel is placed on the TFTP server.
*   <a href="https://puppet.com/blog/introducing-razor-a-next-generation-provisioning-solution" target="_blank">Razor</a> is a tool to automate bare metal provisioning.
*   Razor philosophy: consume hardware resources like virtual resources.
*   Razor components - Nodes, Tags, Repository, policy, Brokers, Hooks
<a href="https://rootconf.talkfunnel.com/2017/77-freebsd-is-not-a-linux-distribution" target="_blank">FreeBSD is not a Linux distribution</a>



*   FreeBSD is a complete OS, not a distribution
*   Who uses it? Netflix, WhatsApp, Yahoo!, NetApp and more.
*   Great tools, mature release model, excellent documentation, friendly license.
*   There are now a lot of forks: NetBSD, FreeBSD, OpenBSD, and a few more.
*   Good file systems: UFS and ZFS. UFS is high performance and reliable. If you don’t want to lose data, use ZFS!
*   Jails - GNU/Linux copied this and called containers!
*   No GCC only llvm/clang.
*   FreeBSD is forefront in developing next generation tools.
*   Pluggable TCP stacks - BBR, RACK, CUBIC, NewReno
*   Firewalls - Ipfw , PF
*   Dummynet - live network emulation tool
*   FreeBSD can run Linux binaries in userspace. It maps GNU/Linux system call with FreeBSD.
*   It can run on 256 cores machine.
*   Hardware - <a href="https://en.wikipedia.org/wiki/Non-uniform_memory_access" target="_blank">NUMA</a>, ARM64, Secure boot/UEFI
*   Politics - Democratically elected core team
*   Join the Mailing list and send patches, you will get a commit bit.
*   Excellent mentor program - GSoC copied our idea.
*   FreeBSD uses SVN and Git revision control.
*   The speaker took a dig at GPLv2 as not being a business-friendly license.
*   Read out BSD license on the stage.

Book Review: The Culture Map

The Culture Map: Breaking Through the Invisible Boundaries of Global Business is a book on cultural differences in communication by Erin Meyer.

Last year, I spent three months in NYC. Whenever I entered a food outlet or made eye contact, every conversation started with “How are you doing?”. I replied, “I’m good, and how are you doing?”. Most of the time, I didn’t receive a response. It was a sign. Pedestrians smile at you if you make eye contact in the US. This doesn’t happen back in India, nor would I dare to do it. That’s considered crazy. You smile at someone you know or are about to talk to. Anything else is inviting trouble.

Book Review: Software Architecture with Python

The book Software Architecture with Python is by Anand B Pillai. The book explains various aspects of software architecture like testability, performance, scaling, concurrency and design patterns.

The book has ten chapters. The first chapter covers different architect roles such as solution architect, enterprise architect, and technical architect, what the role of an architect is, and the difference between design and architecture. The book covers two less-discussed topics, debugging and code security, which I liked. There is very little literature available on debugging. The author provides real use cases for different debugging tips and tools without picking sides. The book has some good examples of OverflowError in Python.

RC checklist for Indian Applicants

One sunny Sunday morning, one can get up and question one’s own existence, or one can ask every few days or months: what am I doing at the current job? The answer will push you to a place you have never been.

One can meticulously plan an extravagant programming tour or programmer’s pilgrimage for three months. Yes, that is my outlook on RC! RC is different from a usual workplace, meetup, college, or any educational institute. The two striking reasons are the peers and the social rules. If you haven’t thought of attending, give it a thought. I am jotting down a list of steps to ease the planning.

Return Postgres data as JSON in Python

Postgres has supported JSON and JSONB for a couple of years now. Support for JSON functions landed in version 9.2. These functions let the Postgres server return JSON-serialized data. This is a handy feature. Consider a case: a Python client fetches 20 records from Postgres. The client converts the data returned by the server to a tuple/dict/proxy. The application or web server then converts the tuples back to JSON and sends it to the client. This case is common in web applications. Not all APIs fit this pattern, but there is a use case.
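To make the idea concrete, here is a minimal sketch of letting the server do the serialization. It assumes a DB-API cursor that uses the `%s` placeholder style (as psycopg2 does) and a hypothetical `users` table with `id` and `name` columns; the table, the columns, and the function name are illustrative only.

```python
# Hypothetical table and column names, used only for illustration.
# row_to_json and array_to_json are Postgres JSON functions; coalesce covers
# the case where the subquery returns no rows.
QUERY = """
SELECT coalesce(array_to_json(array_agg(row_to_json(t))), '[]'::json)::text
FROM (SELECT id, name FROM users ORDER BY id LIMIT %s) AS t
"""


def fetch_users_as_json(cursor, limit=20):
    """Return the JSON text built by Postgres, without creating dicts in Python."""
    cursor.execute(QUERY, (limit,))
    (payload,) = cursor.fetchone()
    return payload
```

The returned string can be written to the HTTP response as is, which skips the tuple-to-dict-to-JSON round trip in the application.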

RFCS We Love

A simple question can open a door for new exploration. While I was at RC, I tweeted “Is there any meetup group similar to papers we love for discussing RFCS?”. Within nine minutes, Jaseem replied: “Let’s start one :)”

The above discussion on Twitter led to a new focus group, RFCS We Love, in Bangalore, started by Nemo, Jaseem, Avinash and Kracekumar. The first meetup was held today, 28-01-2017. 15 interested and enthusiastic people attended the meetup. Anand and Govind presented RFC 3629 (UTF-8) and RFC 6238 (TOTP) respectively. Links to the slides and videos are available in the GitHub README.

Expose jupyter notebook over the network

What is the Jupyter notebook?

The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualisations and explanatory text.

The definition is from the official site. I use the IPython/Jupyter shell all the time. If you haven’t tried it, spend 30 minutes and witness the power!

At times, I want to share a code snippet with folks in the same building during work, a workshop, or a training. The Jupyter notebook configuration allows the user to expose the notebook cluster and terminal over the network or the internet. The notebook becomes available over the network with two changes in the configuration file. The first config value is an IP other than the default localhost. The second one is the password for users to connect to the notebook.

RC Return Statement

The Recurse Center is a free, three-month self-directed program for people who want to get better at programming. I attended fall batch 2 in 2016 from September to December. If you aren’t aware of the Recurse Center, take a couple of minutes to go through recurse.com and then read the rest of the piece.

Every day we open a significant amount of doors to get to desired places. Some doors are unique but look dull and plain, but the incredible universe behind the door hides excitement, adventures, gems, friends, and insights.

Grokking algorithm: Book Review

Grokking Algorithms: An illustrated guide for programmers and other curious people is a book by Aditya Y. Bhargava about understanding algorithms. The book is different from other algorithm books. It explains the chosen concepts with illustrations like cartoons and xkcd-style drawings, and doesn’t let readers get lost in a sea of mathematical formulas or procedures.

The book discusses a handful of foundational topics on algorithms like Big O notation, binary search, sorting, recursion, hash tables, graphs, greedy algorithms, and dynamic programming. Here is the link to the Table of Contents. The book is straightforward and easy to follow. The pictures with detailed walkthroughs make learning algorithms fun, lively, and joyous. The book covers multiple worked-out (pictures!) problems for harder concepts like breadth-first search and the knapsack problem.

2016 Books

I was watching a video about books by S Ramakrishnan, one of my favorite Tamil writers. Somewhere in the middle of the talk, he suggests that the audience write down the list of books they read this year and share it with others. Irrespective of how short or tall the list is, people will undoubtedly pick up something from the list someday. I bought the idea and am documenting the book list.

English

Read

Currently reading

Dropped the ball

Tamil

Read

If you read so far, you care for books. Write a blog post and share with the world. I’d be happy to see your reading list. Happy reading!

RC Week 1011

EDHT

I continued to work on the project and added a few features.

  • Data replication - One of the main features of a DHT is to replicate the data across a subset of nodes in the cluster. Remember, not all nodes! Depending on the number of copies to store, say N, the data is stored in N - 1 nodes, starting from the primary node and moving in the anti-clockwise direction.
  • Routing - Every node in the cluster has equal responsibility, and there is no master. For the key metamorphosis, the data is stored in the nodes n1, n2, n3 as per consistent hashing. The node n4 receives the GET request for the key metamorphosis. Node n4 doesn’t have the data and acts as the coordinator. The coordinator forwards the request to any one of the nodes n1, n2, n3. Depending on the configured minimum number of successful responses, the receiver queries the other nodes, collates the responses, and sends the result back to the coordinator.
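The replication and routing rules above can be sketched in a few lines. This is an illustration in Python rather than the project's Erlang code, and it makes several assumptions: MD5 decides ring positions, `copies` counts the primary node, walking the sorted ring backwards stands in for "anti-clockwise", and the coordinator always forwards to the first owner. The `Ring` class and its method names are hypothetical.

```python
import bisect
import hashlib


def ring_position(value):
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, copies=3):
        self.copies = copies
        # Nodes sorted by their position on the ring.
        self.ring = sorted((ring_position(node), node) for node in nodes)
        self.positions = [position for position, _ in self.ring]

    def preference_list(self, key):
        """Return the primary node followed by the nodes holding the other copies."""
        size = len(self.ring)
        start = bisect.bisect_left(self.positions, ring_position(key)) % size
        count = min(self.copies, size)
        # Walk backwards around the ring, starting at the primary node.
        return [self.ring[(start - step) % size][1] for step in range(count)]

    def route(self, key, receiver):
        """Return the node that should serve a request received by `receiver`."""
        owners = self.preference_list(key)
        if receiver in owners:
            return receiver  # the receiver holds a copy and can answer itself
        return owners[0]  # otherwise it coordinates and forwards to an owner
```

With five nodes and `copies=3`, `Ring(["n1", "n2", "n3", "n4", "n5"]).preference_list("metamorphosis")` returns three distinct nodes, and any node outside that list acts as the coordinator.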

I read multiple articles about vector clocks: vector clocks are easy, vector clocks are hard, and why Cassandra doesn’t need vector clocks. The vector clock is the next feature to implement in the project. The vector clock’s primary usage is conflict resolution. The Dynamo paper suggests offloading the conflict resolution to the client.

RC Week 1010

This week has mostly been calm and cold in New York.

EDHT

Distributed Hash Table implementation in Erlang is slowly coming along. The project now supports multi-node communication.

The project uses bitcask, which Riak uses. Erlang’s key/value store data is local to a single process. Building a key/value store from the ground up means reinventing the wheel and is time consuming, so leveraging an existing library made sense. Bitcask takes care of persisting the data to disk, and so far it has accessed data without race conditions.

RC week 1001

This week I spent most of the time working with Zulip, Erlang and EDHT.

Zulip

This week, we (Stan, Jennifer and I) continued efforts on implementing message reactions. We made decent progress and had a good create-reaction frontend. During the journey, we discovered an interesting bug. Debugging it on production minified JS with the debugger was fun. We are close to completing the initial version. It’s still a few days away from closure.

RC week 1000

This has been a quiet, bikeshedding, unproductive holiday week so far at RC.

Zulip

Zulip is a Python-based open source group chat. RC uses Zulip for the internal chat. Unlike GitHub, Zulip doesn’t support inline emoji reactions for messages. I am collaborating with Arpith and Stan Zheng to add the missing feature. For the first time, I came across a limitation of Django’s values method. If you’re considering setting up a group chat for your community or company, try out Zulip!

RC Week 0111

This week I spent most of the time learning about Distributed Hash Tables and Erlang. That means I didn’t write a significant amount of code for the project.

I am reading the Dynamo paper, which describes a Distributed Hash Table and distributed key/value store. Some implementations like etcd and consul provide a distributed key/value store but replicate all data across nodes, so those aren’t Distributed Hash Tables. I am reading one section at a time. The Dynamo paper is my first technical paper reading session, and I am on the fourth section. There are a lot of new concepts for me like vector clocks, internode communication, consistent hashing, etc. I will be implementing a DHT in Erlang.

Side Project Feasibility

It’s common for developers to work on side projects. Developers work on side projects for a long list of imaginable and unimaginable reasons. My main reason to work on a side project is to learn how things work or to build a utility program for my own use.

One of my recent projects is to monitor internet traffic and aggregate it by domain name: a bandwidth monitor. Before jumping into the code, I read a little bit about OSI layers, TCP, UDP, IP and Ethernet packets. The project revolves around the “domain name.” The assumption was that the domain name would be available in the packet at some layer. After capturing the packets and decoding them up to the TCP layer, I realized the domain name is present in the HTTP header. I was happy with the approach. Everything worked for HTTP traffic. While discussing the project with James J Porter, I mentioned that parsing the HTTP header is the way to retrieve the domain name and aggregate traffic information. He suggested an alternative idea: caching DNS requests. But I was stuck on the notion of parsing the HTTP request.
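As an illustration of the header-parsing approach, here is a minimal sketch in Python (not the project's own code) that pulls the Host header out of the raw bytes of a TCP payload. The function name is hypothetical, and the sketch assumes the whole header block arrives in a single payload.

```python
def host_from_http_request(payload):
    """Return the Host header of a raw HTTP/1.x request, or None if absent."""
    # Headers end at the first blank line; anything after it is the body.
    head, _, _ = payload.partition(b"\r\n\r\n")
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", errors="replace")
    return None
```

The function returns None for payloads that are not plain-text HTTP, such as encrypted traffic. That is where watching DNS responses and caching the name-to-address mapping can still attribute traffic to a domain.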

RC Week 0110

This week, I reached the first milestone of the project imon, which I have worked on for the past couple of weeks.

imon is a command-line utility to record all network traffic and classify the data according to the domain name.

Here is a high-level presentation of the project. I am looking for code review on the project and am happy to answer any questions.

I wrote a blog post about my experience with Rust. I am excited about the project and hope to keep working on it. If you have any comments on Rust or have any experience using it, please share.

My Experience With Rust

A few weeks before I left for RC, I wrote an e-mail to Puneeth asking for Do's and Don'ts at RC. One of the lines in the reply said,

Since you are a Python guy, don’t write any Python code while you are there. Do something completely different.

I contemplated which language to choose. Other than Python, I knew a decent amount of Go-lang and Javascript. I previously attempted to learn rust but never dived deep into it. I reconsidered learning it and came up with the project idea.

RC week 0101

Time flies faster than you can perceive.

The first half of twelve weeks experimental journey ends this week.

This week was the last week for the Fall 01, 2016 batch, and my batch ends in the next six weeks. I wish everyone in the Fall 01, 2016 batch good luck for their future endeavors. I made a few friends and helped a few people with their projects; a few helped me with my projects, and there were a lot of unnoticed learnings.

Man's Search for Meaning - Book Review

What can you do during acute inhuman soul-crushing conditions in life? Hope. This book is about “Hope.” - dopamine for life.

Psychiatrist Viktor Frankl presents his life in Nazi death camps and its lessons for spiritual survival. The narrative is based on his experience and the experiences of others he treated later in his practice. The author aptly reiterates throughout the book: “We cannot avoid suffering, but we can choose how to cope with it, and find a meaning to live.”

RC Week 0100

I made decent progress with my Rust HTTP traffic monitor. I read a lot about different types of packets like Ethernet, IPv4, TCP, UDP, and DNS packets. I wrote a parser for Physical Layer, TCP, UDP, and DNS packets. The parser helped me understand a lot about Rust functions and data types like arrays, vectors, strings and string literals. I am not enjoying the relationship with Rust lifetimes. Lifetimes are getting harder for me to grok with nested structs.

RC week 0011

This week commenced with the second project at RC in rust. I am building a command line utility to monitor internet bandwidth consumption categorized by websites. I had a general idea of how to go about it and drew rough sketches.

As a lot of people warned me, I am fighting battles with the Rust compiler over lifetimes and borrowing. I hit my first road block with the pcap library, and here is the issue. I am using threads and sockets together in the application (tough battles at the same time!). So far progress is slow because of my unfamiliarity with the language. In a week’s time, I will have a useful piece which does one part.

Quiet - Book Review

Quiet: The Power of Introverts in a World That Can’t Stop Talking is an excellent book about introverts and what makes an introvert an introvert. The book dives deep into how different cultures view introverts, and analyses what intrigues introverts and in what circumstances they work effectively. The book draws a lot of parallel comparisons with extroverts: what intrigues extroverts, and why introverts are good at solving complex problems.

The book accounts for various behavior patterns, like how introverts behave at social events and how to raise introverted kids without nonsense advice. The author draws on a lot of case studies from researchers about extrovert and introvert behavior: what happens when an introvert and an extrovert talk to each other, what sort of topics come up, and when an introvert acts like an extrovert.

RC Week 0010

This week has been a mixed ride with the torrent client. I completed the two pending features, seeding and the UDP tracker. The torrent client has a major issue with downloading larger torrents like the Ubuntu ISO file. The client starts the download from a set of peers and slowly halts at sock.recv after exchanging a handful of packets. At this juncture, CPU spikes to 100% when sock.recv blocks. Initially, the code relied on asyncio-only features; now the code uses the curio library. Next time you write async code in Python 3, I would suggest using curio. Curio’s single feature of tracking the state of all tasks is a magic wand for debugging. The live debugging facility helped me track down the blocking part of my code. Here is how curio’s debug monitor looks

RC week 0001

This week, I made considerable progress on the BitTorrent client which I started a week back. The client is in a usable state to download the data from the swarm. The source code is available on GitHub. The project uses Python 3.5 async/await and asyncio. I presented the torrent client in RC Thursday five minute presentation evening slot. Here is the link to the slides.

Here is quick video demo recorded with asciinema.

RC week 0000

The long awaited Recurse Center debut day, 26th Sep 2016 kick started with a welcome note by Nicholas Bergson-Shilcock and David Albert ; decorated by other events and activities to get to know the batchmates; the culture of RC and ended with closing note by Sonali Sridhar.

At the end of the day, I had decided to build a BitTorrent client as a first project. I was at the crossroad to choose Python or Rust or Go for the project. After a quick chat with batch mate, I decided to write the BitTorrent client in Python. I neither knew Rust well nor wrote a BitTorrent client in the past. Fighting two battles at the same time is hard.

None

I’m attending the Recurse Center fall'02, 2016 batch. I’m excited (yes, without emoticons) to be in RC and New York. RC is different from schools, boot camps, and universities for various reasons. The two reasons I like the most are “Never Graduate” and “self-directed learning.”

There is a lot to learn from peers, experiments, and discussions. The next 12 weeks will be intense, and what more can a programmer ask for than time to solve problems and understand how things work? Now, I’m afraid 12 weeks may not be enough to learn what I have in mind, which perfectly justifies “Never Graduate.”

Language is power

Road transport between Coimbatore and Bangalore is hit badly by fear, tension, security and protest. As of 18th Sep 2016, the current news is that only two-wheelers are allowed to cross the Hosur border. Tamil Nadu vehicles carrying vegetables, bananas, and other commodities are prohibited from entering Karnataka. Ordinary people travel till Hosur, cross the border, and walk for one or two kilometers to get public transport and reach Bangalore. The other option is to take the train. Luck was not on my side, and two of my tickets were on wait list numbers 15 and 3. The only alternative left was to take a flight. I booked a flight from Coimbatore to Bengaluru which departs on 19th Sep at 10.45 AM.

Flowers of Bangalore

Bangalore is called the Garden City of India. How often do you walk into a garden? I visit often. I envy you if you own a garden at home!

Every day you can spot a lot of flowers and trees during the commute to work, a casual walk, etc. In the evening, trees welcome all passers-by with fallen flowers on the pathway, especially golden flowers from the Copper pod. Trees and flowers are predominant in the buzzing parts of the city.

Real time mobile app failure

August 11th to 15th is a long weekend, and I decided to leave Bangalore on August 11th. I booked KSRTC to Coimbatore. The boarding point, the St John’s office, is a few kilometers away from my office in Teachers Colony. The pickup time was 9:45 PM. I booked an Uber pool at 8:35. The first driver forced me to cancel the ride with the notorious usual reason: traffic. It took fifteen minutes to find another car. The driver picked me up at 8:50 and started driving through the mushrooming traffic. I sat in the front seat. He asked for my destination and told me the co-rider’s pickup point. After battling traffic for fifteen minutes, the driver picked up the second rider. As soon as the co-rider stepped in, the driver tapped the unresponsive app on his Android phone. The app wasn’t loading the information of the second rider. The loader was rotating in its galactic path. The driver killed the app and opened it again. He restarted the app multiple times; nothing happened; finally, he restarted the phone. He started the car, moved a few meters, stopped the car, and opened the app. As expected, the co-rider information showed up on the screen. We lost 5 to 10 minutes, and now the time was 9:15.

HTTP Exception as control flow

As per Wikipedia, exception handling is the process of responding to the occurrence, during computation, of exceptions – anomalous or exceptional conditions requiring special processing – often changing the flow of program execution.

In Python, errors like SyntaxError and ZeroDivisionError are exceptions. An exception paves the way to alter the normal execution path.

While working with an API, a web request goes through the following process: authentication, authorization, input validation, business logic, and finally, the response is given out. Depending on complexity, a few more steps can be involved.
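A minimal, framework-free sketch of the idea: each step raises an exception that carries an HTTP status code, and a single handler turns it into the response. The class names, the dict-based request, and the checks are all hypothetical and only for illustration.

```python
class HTTPError(Exception):
    """Base class: every subclass carries the status code to respond with."""
    status_code = 500

    def __init__(self, message):
        super().__init__(message)
        self.message = message


class BadRequest(HTTPError):
    status_code = 400


class Unauthorized(HTTPError):
    status_code = 401


class Forbidden(HTTPError):
    status_code = 403


def authenticate(request):
    if "user" not in request:
        raise Unauthorized("login required")


def authorize(request):
    if request["user"] != "admin":
        raise Forbidden("admin only")


def validate(request):
    if "name" not in request:
        raise BadRequest("name is required")


def handle(request):
    """Run the steps in order; any step can cut the flow short by raising."""
    try:
        authenticate(request)
        authorize(request)
        validate(request)
        return 200, {"message": "hello " + request["name"]}
    except HTTPError as exc:
        return exc.status_code, {"error": exc.message}
```

The business logic stays free of status-code bookkeeping, because the first failing step decides the response.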

State machine in DB model

A state machine is an abstract machine that can be in one of a finite number of states. The machine is in only one state at a time; the state it is in at any given time is called the current state.

While using a database, individual records should be in allowed states. The database or the application stores the rules for the states. There are many ways to design the database schema to achieve this. The most followed method is using an int or string field, where each value represents a state. That is the direct approach. An indirect, and often unnoticed, method is the use of multiple boolean values to calculate the state, like is_archived and is_published. The combination of the two fields gives four different states.
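Here is a minimal sketch of the direct approach, with the transition rules kept in the application. The state names, the `Post` class, and the allowed transitions are assumptions made for illustration; in a real model, `state` would be the single column stored in the database.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    ARCHIVED = "archived"


# Allowed transitions: current state -> states it may move to.
ALLOWED = {
    State.DRAFT: {State.PUBLISHED},
    State.PUBLISHED: {State.ARCHIVED},
    State.ARCHIVED: set(),
}


class Post:
    def __init__(self):
        self.state = State.DRAFT  # a single field holds the current state

    def move_to(self, new_state):
        """Change state only if the transition is allowed."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                "cannot move from {} to {}".format(self.state.value, new_state.value)
            )
        self.state = new_state
```

With two booleans, a combination such as archived-but-never-published has to be ruled out by extra checks; with a single state field, that combination simply has no value to represent it.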

Animal Farm Review

Animal Farm’s satire makes the novella interesting to read. It was written after the Bolshevik revolution, and it is no wonder a lot of people go gaga over it. Confining it to the Bolshevik revolution is tunnel vision. The core of the novella resembles a lot of today’s politics and ancient kingdoms. For a moment, forget about the Bolshevik revolution. A rebellion breaks out, the leader comes to power and slowly vanquishes those who don’t believe in his ideas or who raise their voices. He slowly becomes a benevolent dictator for life and turns corrupt, and power moves from the people to a few hands. Change the word rebellion to whatever it is or was called during the time period. Isn’t this the perfect picture of politics all over the world? Think of Indian political parties, Tamil Nadu parties. You got it, right? It raises a few questions: can someone have too much power? Rather than a human, an organization can have (some say it is already in place) all power at its disposal, controlled by a few.

Asyncio and uvloop

Today, I read an article about uvloop. I am aware of libuv and that it is behind Node.js. What caught my attention was “In fact, it is at least 2x faster than any other Python asynchronous framework.” So I decided to give it a try with aiohttp.

The test program was simple websocket code which receives a text message, doubles the content, and echoes it back. Here is the complete snippet with uvloop.

I ran a naive benchmark using thor, and the results favoured uvloop.

Permissions in Django Admin

The admin dashboard is one of Django’s useful features. It allows super users to create, read, update, and delete database objects. Super users have full control over the data. A staff user can log into the admin dashboard but can’t access data. In a few cases, staff users need restricted access. A super user can access all data from the various built-in and third-party apps. Here is a screenshot of the super user admin interface after login.

Testing Django Views

The majority of web frameworks promote the MVC/MTV software pattern. The way web applications are designed today isn’t the same as 5-6 years back. Back then, it was server-side templates and HTML; APIs weren’t widespread, and mobile apps were just becoming popular. The rise of mobile and Single Page Applications shifted the majority of web development towards API-centric development. Testing an API is super simple, with data in and data out, but testing a Django view in a classic web application is difficult since HTML is returned. REST semantics and status codes help to distinguish responses without inspecting the body.

Simple Json Response basic test between Flask and Django

Django and Flask are two well known Python web frameworks. There are a lot of benchmarks claiming Flask is 2x faster for a simple JSON response; one such is TechEmpower. After looking into the source, it struck me that Django can do better!

I will compare Flask and Django for a simple JSON response. The machine used is a MacBook Pro, Intel Core i5-4258U CPU @ 2.40GHz, with 8 GB memory on OS X 10.10.3. gunicorn==19.3.0 will be used for serving the WSGI application.

django print exception to console

Django has a very good debug toolbar for debugging SQL. While working with a Single Page Application and an API, exceptions can’t be displayed in the browser; the exception is sent to the front end. What if the exception could be printed to the console?

Django middleware gets called for every request/response. The small helper class looks like

Add the filename and class name to MIDDLEWARE_CLASSES in the settings file like

This is how the exceptions look

Check for custom objects in Python set.

The Python set data structure is commonly used for removing duplicate entries and making lookups faster (O(1)). Any hashable object can be stored in a set. Unhashable objects such as list and dict can’t be stored.

User-defined objects can be stored. Here is how it looks.

class Person(object):
    def __init__(self, name, age):
        self.name, self.age = name, age


In [25]: s = set()
In [26]: s.add(Person('kracekumar', 25))

In [27]: s
Out[27]: set([<__main__.Person at 0x1033c5e10>])

In [29]: Person('kracekumar', 25) in s
Out[29]: False

Implement equality check

Even though a Person object with the same values is present, the check failed. This is because the default Python __eq__ compares by reference (object identity).
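A minimal sketch of one way to make the membership check pass is to define `__eq__` together with `__hash__`, since a set looks up the hash first and only then compares for equality. Treating `name` and `age` as the identity of a person is an assumption of this sketch.

```python
class Person(object):
    def __init__(self, name, age):
        self.name, self.age = name, age

    def __eq__(self, other):
        # Two persons are equal when their name and age match.
        if not isinstance(other, Person):
            return NotImplemented
        return (self.name, self.age) == (other.name, other.age)

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must have equal hashes for set lookups to work.
        return hash((self.name, self.age))


s = set()
s.add(Person('kracekumar', 25))
print(Person('kracekumar', 25) in s)
```

Because the hash is derived from `name` and `age`, changing those attributes after the object is added to a set makes the object unfindable, so such objects are best treated as immutable.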

class as decorator

Decorator

A decorator is a callable which can modify a function, method, or class at runtime. Most decorators use a closure, but it is possible to use a class.

Closure

import functools


def cache(f):
    storage = {}

    @functools.wraps(f)
    def inner(n):
        value = storage.get(n)
        if value is not None:
            print("Returning value from cache")
            return value
        value = f(n)
        storage[n] = value
        return value
    return inner


@cache
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

>>> factorial(20)
2432902008176640000
>>> factorial(20)
Returning value from cache
2432902008176640000

cache is a function which takes a function as an argument and returns a function. factorial is a function which calculates the factorial of a given number, decorated by cache. If the factorial of a number has already been calculated, or the number is smaller than one already calculated, the value is retrieved from storage.
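The same cache can be written as a class. This is a minimal sketch, and the class name `Cache` is hypothetical: the instance stores the wrapped function in `__init__` and does the lookup in `__call__`, which is what makes the instance usable as a decorator.

```python
import functools


class Cache(object):
    def __init__(self, f):
        self.f = f
        self.storage = {}
        # Copy the name and docstring of f onto the instance.
        functools.update_wrapper(self, f)

    def __call__(self, n):
        if n in self.storage:
            return self.storage[n]
        value = self.f(n)
        self.storage[n] = value
        return value


@Cache
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)


print(factorial(20))
```

Here the storage lives on the instance instead of in a closure variable, so it can be inspected or cleared from outside through `factorial.storage`.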

Fluent interface in python

Fluent Interface is an implementation of API which improves readability.

Example

Poem('The Road Not Taken').indent(4).suffix('Robert Frost').

Fluent Interface is similar to method chaining. I was wondering how to implement this in Python. Returning self from each method call seemed like a good idea.

class Poem(object):
    def __init__(self, content):
        self.content = content

    def indent(self, spaces=4):
        self.content = " " * spaces + self.content
        return self

    def suffix(self, content):
        self.content += " - {}".format(content)
        return self

    def __str__(self):
        return self.content

>>> print(Poem('Road Not Taken').indent(4).suffix('Robert Frost').content)
    Road Not Taken - Robert Frost

Everything seems to be ok here.

Python global keyword

Python’s global keyword allows modifying a variable which is outside the current scope.

In [13]: bar = 1

In [14]: def foo():
....:     global bar
....:     bar = 2
....:

In [15]: bar
Out[15]: 1

In [16]: foo()

In [17]: bar
Out[17]: 2

In the above example, bar was declared before the foo function. global bar refers to the bar variable which is outside the foo scope. When foo is invoked, bar is modified inside foo, and the value changes globally.

python source file encoding

Some Python source files start with -*- coding: utf-8 -*-. This particular line tells the Python interpreter that all the content (byte string) is utf-8 encoded. Let’s see how it affects the code.

uni1.py:

# -*- coding: utf-8 -*-
print("welcome")
print("animé")

output:

➜  code$ python2 uni1.py
   welcome
   animé

The third line had an accented character, and it wasn’t explicitly stated as unicode. The print call succeeded: since the first line instructed the interpreter that all the sequences from here on follow utf-8, it worked.

How to install externally hosted files using pip

As of writing (12 May 2014), the latest version of pip is 1.5.1. pip doesn’t allow installing packages from non-PyPI urls. It is possible to upload a tar, zip or tar.gz file to PyPI, or to specify a download url which points to another site (Example: pyPdf points to http://pybrary.net/pyPdf/pyPdf-1.13.tar.gz). pip considers externally hosted packages insecure. Agreed.

This is one of the reasons why I kept using pip 1.4.1. I finally decided to fix this issue. Below is a sample of the error pip throws.

Bus journey

I am a big fan of bus travel. It is still my only mode of transportation. The two reasons I love it are the wind and the sightseeing. Whenever the wind kisses me I forget myself and start thinking about memories.

The best part of the wind (Thendral) is that it kindles happiness, sad moments, memorable ones, wishes and missing. Thendral can completely change my mood and mode.

I don’t think only thendral has this effect. Trees, plants, flowers and water also produce the same effect. A bus journey sows a lot of peace in me.

How to learn Python ?

Over a period of time, a few people have asked me at meetups and online: “I want to learn python. Suggest a few ways to learn.” Everyone who asked had a different background and different intentions. Before answering the question I try to collect more information about their interests and their previous approaches. Some learnt the basics from codecademy, some attended a beginners session at a Bangpypers meetup. In this post I will cover the general questions asked and my suggested approach.

Stop iteration when condition is met while iterating

We are writing a small utility function called is_valid_mimetype. The function takes a mime_type as an argument and checks if the mime type is one of the allowed types. The code looks like

ALLOWED_MIME_TYPE = ('application/json', 'text/plain', 'text/html')

def is_valid_mimetype(mime_type):
    """Returns True or False.

    :param mime_type string or unicode: HTTP header mime type
    """
    for item in ALLOWED_MIME_TYPE:
        if mime_type.startswith(item):
            return True
    return False

The above code can be refactored into a single line using any.
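A minimal sketch of that refactoring, written for Python 3 and reusing the names from the snippet above. The two sample mime types at the end are made up for illustration.

```python
ALLOWED_MIME_TYPE = ('application/json', 'text/plain', 'text/html')


def is_valid_mimetype(mime_type):
    """Return True if mime_type starts with one of the allowed types."""
    # any() stops at the first truthy item, just like the early return in the loop.
    return any(mime_type.startswith(item) for item in ALLOWED_MIME_TYPE)


print(is_valid_mimetype('text/html; charset=utf-8'))
print(is_valid_mimetype('image/png'))
```

Since str.startswith also accepts a tuple of prefixes, mime_type.startswith(ALLOWED_MIME_TYPE) would be another way to write the same check.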

Best weekend in recent times

Normally I don’t plan weekends. I code and watch movies. This weekend (8th March) was different though. Friday evening, March 7th, wasn’t good. I was banging my head at work to get an api working. Then I came back home and relaxed for an hour with Facebook and Youtube. Then I opened emacs and started to play Raja sir’s music. Stared at the code, walked along the execution, figured out the issue. Can’t ask for more. Calm and code. Slept at 3.00 AM.

Find n largest and smallest number in an iterable

Python has a sorted function which sorts an iterable in ascending or descending order.

# Sort descending
In [95]: sorted([1, 2, 3, 4], reverse=True)
Out[95]: [4, 3, 2, 1]

# Sort ascending
In [96]: sorted([1, 2, 3, 4], reverse=False)
Out[96]: [1, 2, 3, 4]

sorted(iterable, reverse=True)[:n] will yield the n largest numbers. There is an alternate way.

Python has heapq, which implements the heap data structure. heapq has the functions nlargest and nsmallest, which take as arguments n (the number of elements), an iterable such as a list, dict, tuple or generator, and an optional key argument.
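A short sketch of both functions, written for Python 3. The numbers and the people records are made-up sample data.

```python
import heapq

numbers = [5, 1, 9, 3, 7, 2]

largest_three = heapq.nlargest(3, numbers)    # biggest first
smallest_three = heapq.nsmallest(3, numbers)  # smallest first
print(largest_three, smallest_three)

# key works like the key argument of sorted().
people = [
    {"name": "a", "age": 31},
    {"name": "b", "age": 25},
    {"name": "c", "age": 40},
]
oldest_two = heapq.nlargest(2, people, key=lambda person: person["age"])
oldest_names = [person["name"] for person in oldest_two]
print(oldest_names)
```

Both calls return the same elements as the sorted(...)[:n] approach; they just avoid sorting the whole iterable when n is small.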

Counting elements with dictionary

Let’s say you want to find how many times each element is present in the list or tuple.

Normal approach

words = ['a', 'the', 'an', 'a', 'an', 'the']
d = {}
for word in words:
    if word in d:
        d[word] += 1
    else:
        d[word] = 1
print d
{'a': 2, 'the': 2, 'an': 2} 

Better approach

words = ['a', 'the', 'an', 'a', 'an', 'the']
d = {}
for word in words:
    d[word] = d.get(word, 0) + 1

print d
{'a': 2, 'the': 2, 'an': 2}

Both approaches returned the same values. The first has 6 lines of logic and the second has 3 lines of logic (less code, less management).
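For comparison, here is a third variant that is not one of the two approaches above: the standard library’s collections.Counter does the counting itself. A minimal sketch in Python 3, reusing the same words list:

```python
from collections import Counter

words = ['a', 'the', 'an', 'a', 'an', 'the']
counts = Counter(words)

print(counts['a'])          # count for one word
print(counts['missing'])    # words never seen count as 0, no KeyError
print(dict(counts))
```

Counter is a dict subclass, so the result can be used wherever the dictionary d from the earlier snippets was used.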

Two scoops of django

Two Scoops of Django - 1.5 is a book by Pydanny and Audrey Roy that focuses on writing clean and better Django applications.

If you are using Django in production, this is a must-read book.

Q: I have been using django since 0.8, do I need this book?

A: Yes, consider the book as a starting point to validate your assumptions.

Q: I just started using django, should I read this?

A: Yes. I started to use django in production last month. Sometimes I felt I should finish this book before pushing any more code. Every two or three chapters, I could clearly find mistakes in my code and fix them.

Updating model instance attribute in django

It is very common to update a single attribute of a model instance (say, update the first name in a user profile) and save it to the db.

In [18]: u = User.objects.get(id=1)

In [19]: u.first_name = u"kracekumar"

In [20]: u.save()

A very straightforward approach. How does django send the sql query to the database?

In [21]: from django.db import connection

In [22]: connection.queries
Out[22]: 
[... 
{u'sql': u'UPDATE "auth_user" SET "password" = \'pbkdf2_sha256$12000$vsHWOlo1ZhZg$DrC46wq+a2jEtEzxmUEw4vQw8oV/rxEK7zVi30QLGF4=\', "last_login" = \'2014-02-01 06:55:44.741284+00:00\', "is_superuser" = true, "username" = \'kracekumar\', "first_name" = \'kracekumar\', "last_name" = \'\', "email" = \'me@kracekumar.com\', "is_staff" = true, "is_active" = true, "date_joined" = \'2014-01-30 18:41:18.174353+00:00\' WHERE "auth_user"."id" = 1 ', u'time': u'0.001'}]

Not happy. Honestly, it should be UPDATE auth_user SET first_name = 'kracekumar' WHERE id = 1. Django should ideally update only the modified fields.

how not to insult developer while hiring

In December I was looking for a new job. I came across one and applied. After a couple of rounds of interviews, the co-founder told me “we will get back to you”, but that never happened. This sounds normal but it is not.

Interview

The first round of interviews started with online pair programming with one of the co-founders, X. After that I went to their office, met the product manager, and had a discussion with co-founder X for an hour. Then we decided to meet again. After a couple of days I solved two more problems. Then I pair programmed again, with the other co-founder, Y. Then I discussed company style, roles and expectations with co-founder X; before leaving I was interviewed by another team member over a cup of coffee for half an hour. Co-founder X said, “I will get back to you tonight.” Three days passed and nothing happened; I sent a thank-you email, still no response.

On leaving HasGeek

Today (31-12-2013) is my last working day at HasGeek. I joined HasGeek in July 2012. It was a fabulous journey for the past 18 months: meeting a lot of new people, being part of events, writing a lot of code. I was part of large, medium and small conferences like Fifth Elephant (2012, 2013), Cartonama (2012), JSFoo (2012, 2013), Droidcon (2012, 2013), Metarefresh (2013), and various hacknights and geekups.

I will be joining Aplopio Technology Inc, whose flagship product is recruiterbox, on 16 January 2014.

introduction to python

This is the material which I use for teaching python to beginners.

tl;dr: Very minimal explanation, more code.

Python?

  • Interpreted language
  • Multiparadigm

Introduction

hasgeek@hasgeek-MacBook:~/codes/python/hacknight$ python
Python 2.7.3 (default, Aug  1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>


>>> print "Let's learn Python"
Let's learn Python

Numbers

>>> 23 + 43
66
>>> 23 - 45
-22
>>> 23 * 45
1035
>>> 23 ** 4
279841
>>> 23 / 4
5
>>> 23 / 4.0
5.75
>>> 7 % 2
1

Expressions

Deploying full fledged flask app in production

This article focuses on deploying a flask app starting from scratch: creating a separate linux user, installing the database and the web server. The web server will be nginx, the database will be postgres, the python 2.7 middleware will be uwsgi, and the server is ubuntu 13.10 x64. The flask app name is fido. The demo is carried out on Digital Ocean.

Step 1 - Installation

Python header

root@fido:~# apt-get install -y build-essential python-dev

Install uwsgi dependencies

root@fido:~# apt-get install -y libxml2-dev libxslt1-dev

Nginx, uwsgi

ipynb2viewer - Afternoon hack

ipython nbconvert has a lot of handy options to convert ipynb to markdown, html etc. But I wanted to upload ipynb files to gist.github.com and create a link on nbviewer.ipython.org. I started with curl and soon realized it was getting messy. So I wrote a small python program, ipynb2viewer. Source code.

Install

  • pip install ipynb2viewer

Usage

Upload all ipynb files in the given path to gist.github.com and return nbviewer urls.

  • ipynb2viewer all <path>

Upload mentioned file to gist.github.com and return nbviewer url.

Autogenerate Dockerfile from ubuntu image

I was learning docker to use in one of my projects. I kept installing packages and finally created the required Ubuntu image. Suddenly I thought it would be cool if I could generate a Dockerfile, like pip freeze > requirements.txt.

sudo docker run ubuntu dpkg --get-selections | awk '{print $1}' > base.list && sudo docker run pyextra dpkg --get-selections | awk '{print $1}' > pyextra.list && sort base.list pyextra.list | uniq -u > op.list && python -c "f = open('Dockerfile', 'w'); f.write('FROM ubuntu\n'); f.write('RUN apt-get update\n'); f.writelines('RUN apt-get install -y {0}'.format(line) for line in open('op.list')); f.close()"

Breakdown

Start docker in daemon mode, then execute the following commands.

easy_install broken in os x mavericks

I hardly use easy_install. Nowadays all the python requirements are installed via pip.

IPython is my primary python console. After installing mavericks, I installed IPython and fired up the IPython console. The following warning message appeared

➜  ~  ipython
/Library/Python/2.7/site-packages/IPython/utils/rlineimpl.py:94: RuntimeWarning:
******************************************************************************
libedit detected - readline will not be well behaved, including but not limited to:
   * crashes on tab completion
   * incorrect history navigation
   * corrupting long-lines
   * failure to wrap or indent lines properly
It is highly recommended that you install readline, which is easy_installable:
     easy_install readline
Note that `pip install readline` generally DOES NOT WORK, because
it installs to site-packages, which come *after* lib-dynload in sys.path,
where readline is located.  It must be `easy_install readline`, or to a custom
location on your PYTHONPATH (even --user comes after lib-dyload).
******************************************************************************
  RuntimeWarning)
Python 2.7.5 (default, Aug 25 2013, 00:04:04)
Type "copyright", "credits" or "license" for more information.

IPython complains that readline is missing and insists on using easy_install. Then I tried

Why programmers should love to read and write

Every day as programmers we solve problems and introduce new problems. Most of the time is spent reading other people’s source code and library documentation, and replying to developers’ email. Communication is what programmers do all the time, with computers and humans.

Programmers around the globe suggest books like SICP. I have never come across people who suggest books like On Writing Well for programmers. Though I haven’t read the book myself, the point is: why aren’t people recommending books on how to write well?

taking rest

On my way back home from work I was thinking I should take rest. In my world, rest always meant working on some non-work codebase, reading a book or watching a movie. But I didn’t want to do any of these. Then I thought I would just listen to songs, but quickly remembered that would not work out because I would start coding.

Finally I started to ask myself: is it possible for any human being to sit idle and take rest? Immediately I gave up. The brain will start thinking about something or other, and you will be pulled into it.

Check Tamil word or sentence is palindrome

How to check whether a given text is a palindrome or not

def sanitize(text):
    for char in [" ", ".", ",", ";", "\n"]:
        text = text.replace(char, "")
    return text

def palindrome(word):
    # Slicing reverses the whole string, so all n characters are compared;
    # checking only the first n/2 pairs would be enough. Kept for brevity.
    return word == word[::-1]

palindrome(sanitize("madam")) # True
palindrome(sanitize(u"விகடகவி")) # False

Here is the hand made version for Tamil

# hex values from க..வ
def sanitize(text):
    for char in [" ", ".", ",", ";", "\n"]:
        text = text.replace(char, "")
    return text

dependent_vowel_range = range(0xbbe, 0xbce)

def palindrome_tamil(text):
    front, rear = 0, len(text) - 1
    while True:
        #We will start checking from both ends
        #If code reached centre exit
        if front == rear or abs(front - rear) == 1:
            return True
        else:
            if ord(text[front+1]) in dependent_vowel_range and ord(text[rear]) in dependent_vowel_range:
                if text[front] == text[rear-1] and text[front+1] == text[rear]:
                    front += 2
                    rear -= 2
                else:
                    return False
            else:
                if text[front] == text[rear]:
                    front += 1
                    rear -= 1
                else:
                    return False




print palindrome_tamil(sanitize(u"விகடகவி")) == True
text = u"""
யாமாமாநீ யாமாமா யாழீகாமா காணாகா
காணாகாமா காழீயா மாமாயாநீ மாமாயா
"""
print palindrome_tamil(sanitize(text)) == True

#output
True
True

How Tamil Unicode works

Tamil has 247 characters. No panic. It is simple: 12 uyir eluthu (அ, ஆ .. ஔ), 18 mei eluthu (க், ங் ..), 216 uyirmei eluthu (12 * 18: க, ங ..) and 1 ayutham (ஃ).

I assume you know what unicode is. If not, read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets and then read the wikipedia page. You will understand most of it. Back to the post.

Every character or letter in unicode has a value called a code point. This is similar to ASCII, where the value of a is 97. Code point values are written in hexadecimal. The Tamil unicode block ranges from 0B80 to 0BFF. The Unicode consortium has the complete mapping. So the value of அ is 0B85.
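The code point values can be checked from Python itself. A minimal sketch, written for Python 3 and using escapes so it does not depend on the file encoding; the variable names are only for illustration:

```python
import unicodedata

a = "\u0b85"  # the letter quoted above with code point 0B85
print(hex(ord(a)))           # code point as hexadecimal
print(unicodedata.name(a))   # official character name

# A uyirmei letter is stored as two code points:
# a consonant followed by a dependent vowel sign.
vi = "\u0bb5\u0bbf"
print(len(vi))
print([unicodedata.name(ch) for ch in vi])

# All of them fall inside the Tamil block 0B80..0BFF.
in_tamil_block = all(0x0B80 <= ord(ch) <= 0x0BFF for ch in a + vi)

# The 247 count: 12 uyir + 18 mei + 12 * 18 uyirmei + 1 ayutham
total_letters = 12 + 18 + 12 * 18 + 1
print(in_tamil_block, total_letters)
```

Because one visible letter can span two code points, reversing a Tamil string code point by code point can split a letter apart.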

coverage.py to test web application coverage without writing tests

Tests are mandatory for packages, modules and web apps. If you are too lazy to write unit tests but want to see the coverage for a web app, here is the shortcut.

Let’s consider a Python web app that uses Flask, where runserver.py runs the development web server. Normally the server is started with the command python runserver.py. Now use coverage run runserver.py.

The coverage python package will run the python program until the program exits or the user halts it. Once testing of the web app via the browser is complete, run coverage report -m; this will produce a per-module report of the lines covered during execution, including the line numbers that were missed.

Observations from handling python workshop in engineering colleges

I handled 5 python workshops/sessions for novices from 8th March to 20th August, each one stretching from 1 hour to 2 days. It was worth the time.

Students who participated in the workshops were from Computer Science and Electronics backgrounds at Under Graduate and Post Graduate level. Minimum strength was 60 and maximum was close to 100.

  1. When handling a strength of 60 students in a lab, remember everyone has their own pace of picking things up.
  2. Distribute the material in html or pdf format so students can look into it. Some students start doing the examples as soon as they receive the material and will learn on their own; this separates out the students who need instructor support. These students also help their friends when stuck.
  3. Go slow and repeat every concept twice. Students from engineering colleges use C and C++. Python’s ease is difficult to digest at first.
  4. Make sure you don’t stand near your laptop for more than 20 minutes. Teach the topic, show the example and move around so the students can approach you and you get to know their difficulties. Not everyone will ask questions.
  5. Give them problems to solve, and don’t give problems which take more than 10 minutes to solve. Spend the time in front of the students’ terminals and help the struggling students. At the end of 10 minutes you will know how the students approached the problem and get an insight into how much they grasped. This gives you a signal whether your explanation was understandable or not.
  6. Write the code for the problem in front of the students; please don’t show already written code. Discuss the approaches to the same problem and how other students solved it.
  7. Having a few volunteers to help during workshops is great. Students will start approaching them with hurdles.
  8. Don’t flood students with too much data in a single day. Make sure the workshop runs for only 6 - 7 hours per day. They need time to digest.
  9. When you are teaching list comprehensions, make sure students write the same example using a for loop and then show them the one-liner. Here most students get confused by the syntax. Now give them more problems to solve using list comprehensions.
  10. Be careful with variable names; students will use the same variable names in their code.
  11. Don’t teach classes to beginners; you will waste a lot of time explaining public and private methods, __init__ and self. Instead use the time to solve problems.
  12. Spend enough time writing small programs (use a text editor) using if, else, elif and for so that they get used to indentation.
  13. Give problems like greatest of three numbers to show them the use case of a > b > c rather than a > b and b > c.
  14. Give problems like finding the total number of lines or words in a file. This helps in getting rid of the for loop with a counter, and encourages the use of len(f.readlines()).
  15. Don’t teach *args, **kwargs, but spend time making them understand that a function can accept the result of another function call as a parameter, so it becomes easy for them to digest len(f.readlines()).
  16. Make sure to teach dir and help; this helps people who are interested in exploring further.
  17. If you want to encourage the pythonic way of writing code, like list comprehensions or passing a function to a function, show a few examples comparing the pythonic and non-pythonic ways. Advocate the advantages.
  18. Leave your email id with students and collect feedback via google forms or a physical form; make sure it is anonymous.
  19. Students will ask for recommendations for books, projects etc. Be prepared to handle them.
  20. Don’t teach raw_input or input; teach them how to accept command line parameters.
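Points 13 and 14 above can be sketched in a few lines. This is only an illustration in Python 3, not the material used in the workshops; the function name and the sample file contents are made up.

```python
import os
import tempfile


def is_strictly_descending(a, b, c):
    # Chained comparison: equivalent to (a > b) and (b > c)
    return a > b > c


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("first line\nsecond line\nthird\n")
    path = handle.name

# Count lines and words without a manual counter variable.
with open(path) as f:
    lines = f.readlines()

line_count = len(lines)
word_count = sum(len(line.split()) for line in lines)
os.remove(path)

print(is_strictly_descending(3, 2, 1), line_count, word_count)
```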

There's no such thing as a bad student, only a bad teacher - Unknown

avoid accessing wrong column in csv

I was parsing a few csv files. All were from the same provider. The format is Date, Name, Email, Company, City etc. in one file. I assumed all the downloaded files were in the same format. To my surprise, a few files had the same format and others didn’t.

with open(filename, 'rb') as f:
    reader = csv.reader(f)
    reader.next() # first row contains column names
    for row in reader:
        name, email, company = row[1], row[2], row[3]
        #save to db

In the above code the fundamental mistake is fixing the position of the columns in the csv file. What happens if the email and company positions are interchanged? In simple terms: screwed.
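One way to avoid depending on column positions is to look columns up by their header names with csv.DictReader. A minimal sketch in Python 3; the two in-memory files and the read_contacts helper are made up for illustration and assume the header names stay the same across files.

```python
import csv
import io

# Two made-up files from the "same provider" with columns in a different order.
file_a = "Date,Name,Email,Company\n2013-08-01,Asha,asha@example.com,Acme\n"
file_b = "Date,Name,Company,Email\n2013-08-02,Ravi,Initech,ravi@example.com\n"


def read_contacts(fileobj):
    """Yield (name, email, company) by header name, not by position."""
    for row in csv.DictReader(fileobj):
        yield row["Name"], row["Email"], row["Company"]


contacts = list(read_contacts(io.StringIO(file_a)))
contacts += list(read_contacts(io.StringIO(file_b)))
print(contacts)
```

If a file lacks one of the expected headers, row["Email"] raises a KeyError, which is at least a loud failure instead of silently saving the wrong column.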

Funny experience of using trace module to trace function call

I came across this issue in httpie and started my investigation.

The problem is that while pretty printing json, the output is alphabetically sorted because keys are hashed, and the user wanted to preserve the order. I made 3 comments on the issue. The first comment was half correct and explained why it isn’t possible to get the desired output; I quickly figured out my assumptions were wrong, and the second comment explained what was actually happening; finally I proposed a solution. Since I had made wrong assumptions, and to make further debugging easy, I wanted to find the easiest way to trace all function/method invocations.
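As a rough idea of what tracing invocations involves, here is a minimal sketch in Python 3 that uses sys.settrace directly, the hook that the trace module builds on. It is not the approach taken in the post; record_calls, helper and main are hypothetical names.

```python
import sys

called = []


def record_calls(frame, event, arg):
    # The global trace function is invoked with event == "call" for every new frame.
    if event == "call":
        called.append(frame.f_code.co_name)
    return None  # no line-by-line tracing inside the frame


def helper(x):
    return x * 2


def main():
    return helper(1) + helper(2)


sys.settrace(record_calls)
try:
    result = main()
finally:
    sys.settrace(None)  # always switch tracing off again

print(result, called)
```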

http request examples for luasocket

I was looking for an http library in lua and landed on the luasocket.http page. It isn’t well documented; I sent a few GET, POST and PUT requests and figured out a few bits. This blog post aims at bridging the gap (code examples).

In this example, I will use httpbin as the target site. The complete code is available as a gist.

The following code should be executed as a standalone lua file (lua lua_httpbin.lua); when executing the code in the interpreter, please make the local variables http, ltn12 and base_url global.

Why This Kolaveri Di song words language

I was wondering how many words in the Why This Kolaveri Di song belong to english. So I wrote this code to evaluate.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

lyrics = """
yo boys i am singing song
soup song
flop song
why this kolaveri kolaveri kolaveri di
why this kolaveri kolaveri kolaveri di
rhythm correct
why this kolaveri kolaveri kolaveri di
maintain please
why this kolaveri di

distance la moon-u moon-u
moon-u color-u white-u
white background night-u night-u
night-u color-u black-u

why this kolaveri kolaveri kolaveri di
why this kolaveri kolaveri kolaveri di

white skin-u girl-u girl-u
girl-u heart-u black-u
eyes-u eyes-u meet-u meet-u
my future dark

why this kolaveri kolaveri kolaveri di
why this kolaveri kolaveri kolaveri di

maama notes eduthuko
apdiye kaila snacks eduthuko
pa pa paan pa pa paan pa pa paa pa pa paan
sariya vaasi
super maama ready
ready 1 2 3 4

whaa wat a change over maama

ok maama now tune change-u

kaila glass
only english

hand la glass
glass la scotch
eyes-u full-a tear-u
empty life-u
girl-u come-u
life reverse gear-u
love-u love-u
oh my love-u
you showed me bouv-u
cow-u cow-u holy cow-u
i want you hear now-u
god i am dying now-u
she is happy how-u

this song for soup boys-u
we dont have choice-u

why this kolaveri kolaveri kolaveri di
why this kolaveri kolaveri kolaveri di
why this kolaveri kolaveri kolaveri di
why this kolaveri kolaveri kolaveri di

flop song
"""
dict_file_path = "/usr/share/dict/words"


def sanitize(words):
    for index, word in enumerate(words):
        if word.endswith("-u") or word.endswith("-a"):
            words[index] = word[:-2]


if __name__ == "__main__":
    # Get all words
    words = [word for line in lyrics.split("\n") for word in line.split(" ") if word != ""]
    # Load english words
    dictionary_words = open(dict_file_path).readlines()
    # Remove \n in dictionary words
    dictionary_words = [word.split("\n")[0] for word in dictionary_words]
    # Add missing words
    dictionary_words.append("boys")
    dictionary_words.append("snacks")
    dictionary_words.append("eyes")
    dictionary_words.append("english")
    dictionary_words.append("1")
    dictionary_words.append("2")
    dictionary_words.append("3")
    dictionary_words.append("4")
    dictionary_words.append("notes")
    dictionary_words.append("ok")
    dictionary_words.append("showed")
    # Remove -u which sounds like Tamil words
    sanitize(words)
    # Find unique words
    unique_words = set(words)
    # Find english words
    eng_words = [word for word in unique_words if word in dictionary_words]
    non_eng_words = unique_words - set(eng_words)
    # Remove empty element
    non_eng_words = [word for word in non_eng_words if word != ""]
    print("==English Words==")
    print(eng_words)
    print("==Non English Words==")
    print(non_eng_words)
    print("Total unique words: %d,\n English words: %d,\n Non English words: %d,\n percentage of english words: %f" % (len(unique_words), len(eng_words), len(non_eng_words), float(len(eng_words))/len(unique_words) * 100))

Output

SSL for flask local development

Recently at HasGeek we moved all our web applications to https. So I wanted all my development environment urls to use https.

How to have https in flask app

Method 1

from flask import Flask
app = Flask(__name__)
app.run('0.0.0.0', debug=True, port=8100, ssl_context='adhoc')

In the above piece of code, the ssl_context argument is passed to werkzeug.run_simple, which creates SSL certificates using OpenSSL; you may need to install pyopenssl. I had issues with this method, so I generated a self-signed certificate.

cp command implementation and benchmark in python, go, lua

I was wondering how big the speed difference would be between the cp command, rsync and implementations in python, go and lua, so I wrote this code.

Background

  1. python has two versions, one with gevent and one without. Both versions use shutil for copying files and the directory tree.
  2. go uses https://github.com/opesun/copyrecur for copying recursively.
  3. lua uses the lfs (LuaFileSystem) module. lfs has support for creating directories but not files, so in order to copy files, low level file opening and writing is used.
  4. rsync --progress -ah -R was also added to the test.

Code

Reliance filed 420 case against me in Delhi consumer court

22, June 2013

I woke up like any other morning, thinking of closing a github issue. I went out to pick up food items for the weekend; once I was back my friend said, “You got a call on Tamil Nadu number”.

I returned the call and the person said, “I am calling from Delhi consumer court”, confirmed my name and the address where I stayed in Feb 2009, added that RC Gowtham has filed a 420 case against me (dc801/2013, couldn’t find the status at http://164.100.72.12/ncdrcusersWeb/login.do?method=caseStatus) and gave his phone number.

Little spoof of Kannukku Mai Azhagu lyrics

Today my sister was cleaning her contact lens in front of the mirror. I suddenly remembered the Tamil song Kannukku Mai Azhagu and its lyrics, and the little poet in me whispered Kannukku Contact lens Azhagu. This induced me to spoof a little bit of the lyrics to match the current trend. I will spoof the lyrics which I could.

Kannukku **Contact lens** Azhagu, Kavithaikku Poi Azhagu
Kannathil Kuzhi Azhagu, Kaar Koonthal **Kandalae** Azhagu
Ilamaiku **photo** Azhagu, Muthumaikku Narai Azhagu
Kalvarkku **phone number** Azhagu, Kaathalarkku **facebook** azhagu

Words marked with ** are the spoofed ones.

Quora : I hate you for this

I have been using quora for almost two years. I have connected my Facebook, twitter, wordpress and tumblr accounts.

Unless you are logged into quora you cannot read a post; on the surface this is true. If you are a web geek, you know you can look into the source code, read the page content and bypass this.

The Problem

Now I am logged into facebook in tab 1 and tab 2 is loading a quora url; after a few seconds a dialog box appears and I am automatically logged into my quora account. Now I log out of quora and visit the quora url again: I am logged back into quora. I deleted the cookies and tried again, still the same. Unless I am logged out of facebook I can’t log out of quora.

How much does it cost to spend 10 days in Mcleodganj

At HasGeek we decided to spend part of the summer in McleodGanj. The trip’s main focus was to code and enjoy the beauteous surroundings of McleodGanj.

Trip

We (Kiran, Supreeth, Haris, Praseetha, me) started from Bangalore on 13 April 2013 and returned on 1 May 2013. I have uploaded the photos to Facebook, along with notes on the food items I had on the trip and a list of places visited in McleodGanj.

Following is the breakup of the trip cost.

13, April 10:00PM, Yeshwantpur Railway Station, Bangalore,

  • Dinner : 60
  • Taxi fare: 50
  • Train Ticket cost: 1750

14, April - In Train crossing central India

  • Breakfast: 40
  • lunch: 100
  • Dinner: 80
  • Ice Cream: 20

15, April: Delhi

Jama Masjid

hardest feature request

I was working on an Hgtv feature: syncing slides and videos, so that when a video is viewed the slide changes automatically. Seems easy, but guess what: someone has to take the pain of watching the entire video and collecting the timings of the video and the slide numbers, then pass the info on to presentz.js, which syncs video and slides.

Iframe

If you look into the source code at how the slide show and video are inserted, it is an iframe. All videos are from youtube; slides are from speakerdeck and slideshare. To sync video and slides, I need to fetch the slide number and the current time of the video. Youtube has a js api, which was easy to figure out, but speakerdeck and slideshare insert images into the iframe. When the next button is clicked, the image is changed. If I could access the DOM I would be done, but unfortunately you cannot access the DOM of a cross-origin iframe. I found this out after one whole day of tinkering and trying all the answers on stackoverflow. Then I looked into how presentz JS handles slide changes. Speakerdeck receives postMessage; it accepts nextSlide, previousSlide and goToSlide messages. Once speakerdeck has processed a message, it sends a message back to the originator, and the received message has to be processed (window.addEventListener). Before figuring out these messages I was brute forcing to figure out how to get the current slide. Once I figured out it only accepts three messages, it was easy. Below is the code.

coding from balcony

I live on the 5th floor and my table is near the window. I spent Saturday watching 3 documentaries and 1 tamil movie, Sindhu Bhairavi; yes, not a single line of code. Once I was done with the movie it was 00:30 AM; I stepped onto the balcony and was mesmerized by the breeze. I felt like a poet with casual thoughts.

Yes, the breeze indeed brought a new thought seed from a distant place: “How about coding from the balcony?” No second thoughts; cleaned up, set up done. All ready now. Truly great to sit in a bean bag hearing Tamil songs with a laptop, a random stray dog barking, car sounds and a cool breeze.

Avvaiyar now International Icon

கற்றது கைமண் அளவு, கல்லாதது உலகளவு - ஔவையார்.

Yes, you must have read this in the tamil text book in standard 1. Now it is translated into english and referenced by NASA. From NASA - Cosmic Questions Exhibit

What we have learned
Is like a handful of earth;
What we have yet to learn
Is like the whole world
    - Auvaiyar, 4th C poet, India  

Wikipedia has an article about Avvai paatti.

How much will it cost to attend Hacker School ?

Hacker School is a three-month, full-time school in New York for becoming a better programmer. It is free, but stay and travel are on you.

I had no idea how much it would cost for travel from India, stay, food, internet and transit, so I asked the question on Quora. I got pretty good answers.

Then I started to do my lame math.

Monthly Expense(USD)

 Rent = 1000
 Phone = 80
 Transit = 100
 Internet = 40
 Electricity = 40
 Food = 250
 Snacks = 100
 Outing = 120
 Misc = 100

Total = 1830.
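The same sum as a small Python 3 sketch, using the figures listed above. The three-month total is an extrapolation that simply multiplies the monthly figure by the three-month length of the programme; it leaves out travel to and from New York.

```python
monthly_expense_usd = {
    "Rent": 1000,
    "Phone": 80,
    "Transit": 100,
    "Internet": 40,
    "Electricity": 40,
    "Food": 250,
    "Snacks": 100,
    "Outing": 120,
    "Misc": 100,
}

monthly_total = sum(monthly_expense_usd.values())
three_month_total = monthly_total * 3  # assumes the same spend every month

print(monthly_total, three_month_total)
```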

Evaluate python code using client side javascript

Now Python code can be evaluated using client side Javascript with the help of the empythoned project. empythoned uses emscripten, which converts LLVM bitcode to javascript.

What is empythoned?

Empythoned is a project which has converted CPython to javascript. I have created a demo project to test how to use empythoned; have a look.

Python parallel assignment

Python supports parallel assignment like

>>> lang, version = "python", 2.7
>>> print lang, version
python 2.7

Values are assigned to each variable without any issues.

>>> x, y, z = 1, 2, x + y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined

First, python tries to evaluate the expression x + y. Since x and y are only being defined on the same line, python is unable to access the variables x and y, so a NameError is raised.
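A small sketch of the evaluation order, written for Python 3. The whole right-hand side is evaluated before any name on the left is bound, which is also why the swap idiom works. The variable names here are only for illustration.

```python
# The right-hand side tuple is built first, then unpacked into the names on the left.
a, b = 1, 2
a, b = b, a  # swap works because (b, a) is evaluated before assignment
print(a, b)

name_error_raised = False
try:
    p, q, r = 1, 2, p + q  # p and q do not exist yet when p + q is evaluated
except NameError:
    name_error_raised = True
print(name_error_raised)

# Assign in two steps when a later value depends on earlier ones.
x, y = 1, 2
z = x + y
print(x, y, z)
```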

Setting up privoxy proxy server for browsing

I wanted to set up a proxy server for browsing. I tried http://www.squid-cache.org/ but felt it was cumbersome to configure, though it has advanced features.

Finally I decided to set up http://www.privoxy.org/. I assume you have a personal server to which all the requests are forwarded.

Installation

Server Config

sudo apt-get install privoxy

sudo vim /etc/privoxy/config

Look for listen-address and add ip:port, for example listen-address 78.12.204.2:8118

To enable logging for all requests, uncomment debug 1 (you will need to rotate the log file using a cronjob). Done with server config.

2 weeks after installing Ubuntu 12.04 64bit in MacBook

It's been two weeks since I installed 64-bit Ubuntu on my MacBook. Let me say how I feel about it.

My setup isn’t complicated. I use sublime2, terminator, gnome-terminal, ipython, pypy-2.0beta, postgres, mosh, firefox-nightly, chromium, chrome, clementine, xchat-gnome, hotot (a recent addition).

I also installed the NVIDIA drivers for Ubuntu. I am happy I installed Ubuntu, but I had an issue with right click and figured out that Shift + F10 is the shortcut.

Hate Hate Hate

If you are using a data card, you must have noticed that the default modem-manager is buggy. If I suspend my system for 10 hours and then try to activate the device, there is no response. Killing the process and restarting it doesn’t work, so I need to restart the machine. I really didn’t find a solution for it :-(.

Stable Browser vs Fastest Browser

Firefox Vs Chrome Vs Chromium

Chrome & Firefox are my current favourite browsers. I was previously using Chromium but stopped because of this bug.

Since then I switch between Chrome and Firefox. Chrome is the fastest and has better UI & UX. Firefox is stable.

Today I faced an issue with the Flash player in Chrome (Ubuntu 64 bit OS). Every YouTube video starts playing at 2x - 3x speed, and after a few minutes it plays at normal speed. This has been really annoying, whereas there is no such issue with Firefox.

Can PyPy be used for web application deployment?

What is PyPy?

  • PyPy is an implementation of Python in Python which uses JIT (Just In Time) compilation.

Why use PyPy?

  • According to benchmarks “It depends greatly on the type of task being performed. The geometric average of all benchmarks is 0.18 or 5.6 times faster than CPython”.

Experience?

I have used PyPy for sandboxing in my project pylive, tested flask with pypy, ported brubeck to work on pypy, and tested lastuser in PyPy 2.0beta1. In these experiments pypy-sandbox, requests, flask, werkzeug, jinja2, SQLAlchemy (postgres + sqlite), cython, greenlet, eventlet, markdown, gunicorn, dictshield, json, and zmq were tested.

How to run Python Linux commands in PyPy?

I have been using PyPy since 1.6. Now PyPy 2.0beta1 is out. Most Python libraries which don’t have C extensions as dependencies work (there are exceptions).

E.g: requests, fabric, gunicorn

Install PyPy2.0beta1

  1. Grab the 32 or 64 bit pypy2.0 beta1. If you are using Ubuntu, grab the libc2.15 build.

bzip2 -d pypy-2.0-beta1-linux64-libc2.15.tar.bz2
tar -xvf pypy-2.0-beta1-linux64-libc2.15.tar
curl -O http://python-distribute.org/distribute_setup.py
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
./pypy-2.0-beta1/bin/pypy distribute_setup.py
./pypy-2.0-beta1/bin/pypy get-pip.py
./pypy-2.0-beta1/bin/pip install virtualenv

Please feel free to create aliases. Here are sample aliases.



alias pypy2b="/home/kracekumar/downloads/pypy-2.0-beta1/bin/pypy"
alias pypy2b-pip='/home/kracekumar/downloads/pypy-2.0-beta1/bin/pip'
alias pypy2b-venv='/home/kracekumar/downloads/pypy-2.0-beta1/bin/virtualenv'

Next?

  1. pypy2b-pip install fabric

Why did I install Ubuntu on a MacBook?

Reasons

  • Mac doesn’t have a package manager like Synaptic, though it does have the App Store for apps.
  • You need to set up Xcode or gcc-installer to install any Cocoa app.
  • I use a USB data card, so I don’t have the privilege of a high speed broadband connection, and I have noticed OS X apps are heavy to download.
  • Linux fanboy :-)

Post PC OS Hate

Smartphones & tablets are the new plastic papers. Android and iOS dominate the market. I own an Android phone because it's cheap. When it comes to gadgets I am conservative. I really don’t like Android much because it's written in Java and needs better hardware to run interpreted languages and JS. Android runs on everything from a low end, cheap 4000 Rs phone to a Nexus phone, so the wide variety of devices is a headache for developers.

Every Android phone is coated with one layer of the device maker's code; it is the same with Linux distros. On a desktop/laptop I really want to use Linux: give me a MacBook and the first thing I will do is (try to) install a Linux distro.

Microframeworks produces micro level progress in project

I created my first web app in PHP 5.2 with *no frameworks*, then learned Drupal and tried CodeIgniter and Joomla. Then I learned Rails for the HappySchool project and learned Django since I am a Python lover. I tried Pylons, settled on Flask, and am experimenting with brubeck.

Flask is a microframework built around Werkzeug data structures.

Advantages Vs Disadvantages

  • Learn in depth working of HTTP vs Time consuming
  • Opportunity to create library vs Time consuming
  • Less batteries available vs More development time

It is very loosely coupled, which makes it easy to replace parts with the best tools available. It is not suited for everyone. Unless you are ready to explore/headdesk/discover/ship/reship/learn/hack, DON’T use a microframework; choose a full blown framework like Django/Rails.

Hackathons are to hack and *headdesk* moments

I feel the whole idea of a hackathon is to build stuff overnight or in a day or two. Every outcome of a hackathon is for good. For a beginner it can be a useful way to get started with a technology/language. I am biased towards prizes in hackathons.

What GNU/Linux Operating System lacks?

I have been a GNU/Linux user for almost 4 years and I don’t switch back to Windows for any day to day activities. That said, the GNU/Linux operating system lacks a lot of *applications*. There are a lot of superior command line applications, but those are intended for religious command line humans.

I have been using Mac OS X 10.6 for 3-4 weeks. It seems to have all the applications, but I don’t feel at ~.

What I like about Python

Lynn Root asked on Twitter what you like and would like to improve in Python: https://twitter.com/roguelynn/status/259338864664125440. Following are my observations.

Likes

  1. Importance given to documentation.
  2. Clean syntax.
  3. Easy to get started for people from a non-CS background.
  4. Lots of smart programmers.
  5. Libraries like IPython, requests, flask.
  6. Creating libraries like pygments, sphinx, readthedocs to solve *REAL* problems.

I would like/want to improve

  1. python.org site
  2. While programmers are reading docs.python.org, code snippets should be executable right from the page (I have plans to start this as a personal project).
  3. Ship a real documentation server like godoc (I will tentatively start this work in the second week of November).
  4. Solve concurrency in the CPython core.
  5. Make Python a first class programming language on Windows.
  6. Better support for tablet and smartphone application development.

Apart from all the above, the Python community is awesome (yes, my English is poor, you know that).

How Python makes learning simpler

Python is a simple language, developed with programmer productivity and code readability in mind. Learning a new language can be anywhere from simple to complicated, depending on the language's syntax, weirdness, and many other factors.

It's universally accepted that the best way to learn a programming language is to write programs and rewrite them again.

Python makes the learning curve easier. It has certain features which make it easier to learn inside the interpreter.

  • help(object): help is a function which takes an object or function and shows its documentation. help(2) will print the integer class's properties and methods in a nicer format.

How I got into HasGeek Crew

Background about me

I am kracekumar. I graduated from Amrita School of Engineering, Coimbatore with a B.Tech in IT (2007-2011). I worked with IBM India Pvt Ltd, Bangalore as an Associate System Engineer from 14th July, 2011 till 16th July, 2012 (as a C# developer, but I never wrote a single line of C# code at IBM).

I have been a GNU/Linux user for 3 years and have developed applications in PHP, Rails, and Flask (all hobby projects).

Scene

I was not happy with my job at IBM and I had a training bond for one year (14th July, 2011 to 13th July, 2012), so I decided to resign once the bond period was over, whether I had a new job offer or not. I usually looked at job postings on the HasGeek job board from time to time. I was very interested in working with Linux, Python or Rails and wasn’t interested in Java, C# or Windows technology.

Fake Python switch statement

Python has no switch statement.

What is a switch statement? A switch statement is an alternative to an if - else if - else chain.

Example in C


int payment_status = 1;
switch (payment_status) {
    case 1:
        process_pending_payment();
        break;
    case 2:
        process_paid();
        break;
    case 3:
        process_trans_failure();
        break;
    default:
        process_default();
}

In python we can achieve same behaviour using dict.

Fake switch statement in python

payment_functions = {
    1: process_pending_payment,
    2: process_paid,
    3: process_trans_failure
}
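status = 2
try:
    # look up the function for this status and call it
    payment_functions[status]()
except KeyError:
    # no entry for this status, same as the default case in C
    process_default()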

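In the above code payment_functions is a dict where each key is one of the possible values of status and the corresponding value is the function to be invoked. The functions are stored without (), so they are not called when the dict is built; the () after the lookup calls the selected one.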
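A variation on the same idea, as a sketch only: dict.get with a fallback replaces the try/except, so a KeyError raised inside one of the handlers is not mistaken for a missing status. The four process_* bodies are placeholder stubs and handle_payment is a hypothetical helper name, neither comes from a real payment library.

```python
# Placeholder handlers, only here so the example runs.
def process_pending_payment():
    return "pending"

def process_paid():
    return "paid"

def process_trans_failure():
    return "failed"

def process_default():
    return "default"

payment_functions = {
    1: process_pending_payment,
    2: process_paid,
    3: process_trans_failure,
}

def handle_payment(status):
    # dict.get returns process_default when status is not a key,
    # which plays the role of the default: branch in C.
    handler = payment_functions.get(status, process_default)
    return handler()

print(handle_payment(2))   # paid
print(handle_payment(42))  # default
```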

python `in` operator use cases

Python's *in* operator is the membership test operator.

*Examples:*

List


In [1]: python_webframeworks = ['flask', 'django', 'pylons', 'pyramid', 'brubeck']

In [2]: 'flask' in python_webframeworks

Out[2]: True

In [3]: 'web.py' in python_webframeworks

Out[3]: False

The in operator iterates over the elements of the list and returns True or False.

What about nested list?



In [4]: webframeworks = [['flask', 'django', 'pyramid'],['rails', 'sinatra'],['zend', 'symfony']]

In [5]: 'flask' in webframeworks
Out[5]: False

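in isn't handy for a nested list: it compares 'flask' only against the top-level elements, which here are the inner lists. That holds unless __contains__ is overridden.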
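Two ways around this, as a rough sketch: check every inner list with any(), or override __contains__ in a list subclass. The names in_nested and NestedList are made up for this example.

```python
webframeworks = [['flask', 'django', 'pyramid'],
                 ['rails', 'sinatra'],
                 ['zend', 'symfony']]

def in_nested(item, groups):
    # True as soon as one inner list contains the item.
    return any(item in group for group in groups)

class NestedList(list):
    # Overriding __contains__ changes what the in operator does.
    def __contains__(self, item):
        return any(item in group for group in self)

print('flask' in webframeworks)              # False
print(in_nested('flask', webframeworks))     # True
print(in_nested('web.py', webframeworks))    # False
print('flask' in NestedList(webframeworks))  # True
```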

Stats

The site uses Plausible.io to track user analytics.

The site receives low traffic of 50 to 100 visitors every day. I have no intention of partnering with any commercial content creators or publishing companies to display or cross-link their posts.

You can see the complete site statistics at https://plausible.io/kracekumar.com or view them in the section below.

Stats powered by Plausible Analytics