python source fileencoding

Some of the python source file starts with -*- coding: utf-8 -*-. This particular linetells python interpreter all the content (byte string) is utf-8 encoded. Lets see how it affects the code. uni1.py: # -*- coding: utf-8 -*- print("welcome") print("animé") output: ➜ code$ python2 uni1.py welcome animé Third line had a accented character and it wasn’t explictly stated as unicode. print function passed successfully.Since first line instructed interpreter all the sequences from here on will follow utf-8, so it worked. [Read More]

How Tamil Unicode works

Tamil has 247 characters. No panic. It is simple. 12 uyir eluthu(அ,ஆ..ஔ), 18 mei eluthu(க்,ங்..) , 216 uyirmei eluthu(12 * 18 க,ங ).1 ayutham(ஃ). I assume you know what is unicode. If not read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets and then read wikipedia page. You will understand most of it. Back to the post. Every character or letter in unicode has value called code point. [Read More]