Kushal Das

FOSS and life. Kushal Das talks here.

kushal76uaid62oup5774umh654scnu5dwzh4u2534qxhcbi4wbab3ad.onion

ujson, a fast json implementation

ujson is an ultra fast JSON encoder and decoder written in pure C (as described by upstream). It can be used as a drop-in replacement for most of the well-known JSON implementations.

To install you just do:

$ pip install ujson

To test the speed I wrote two simple test scripts with help from the timeit module.
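The two scripts themselves are not reproduced in this post. As a rough idea of what such a comparison could look like, here is a minimal sketch; the sample data, the helper name time_roundtrip and the repeat count are assumptions for illustration, not the scripts that produced the numbers below.

```python
import json
import timeit

try:
    import ujson  # third-party; installed with "pip install ujson"
except ImportError:
    ujson = None  # the sketch still runs with the stdlib json only

# Hypothetical sample payload, not the data used for the timings in this post
DATA = [{"id": i, "name": "item %d" % i, "tags": ["a", "b", "c"]}
        for i in range(1000)]


def time_roundtrip(module, number=100):
    """Return the seconds taken to encode and decode DATA `number` times."""
    return timeit.timeit(lambda: module.loads(module.dumps(DATA)),
                         number=number)


if __name__ == "__main__":
    print("json : %.4f" % time_roundtrip(json))
    if ujson is not None:
        print("ujson: %.4f" % time_roundtrip(ujson))
```

Because both modules expose dumps() and loads(), the same helper can time either of them, which is what makes the drop-in claim easy to check.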

The execution:

$ python json_test.py
5.39651107788

$ python ujson_test.py 
1.09844493866

Review request for the Fedora package is already filed.

Python powered :)

Today I moved my site from WordPress to a static one generated by Nikola. Writing the blog posts in reStructuredText is much easier than writing them in HTML.

This also has Disqus-based comment support. Upstream is very supportive and resolved many of my queries.

I still need to import many old posts into this one, but that will take time. Nikola has a WordPress importer plugin, which I am yet to check. For my move I wrote a Python script which fetched all my posts from WordPress, fixed the HTML links inside them and added the categories as they were in the previous site.


Job/Task queue quick tutorial using retask

Here is a quick tutorial on building a job/task queue using retask.

First you need to install retask and redis-py from PyPI.

$ pip install retask redis

Then start the redis server (if it is not already running, in my case I installed it from Fedora package repository).

$ service redis start

In our producer code we will create tasks to get the current stock values of a few companies. The workers will get the company symbols, fetch the values and pass them back to the producer.

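from retask.task import Task
from retask.queue import Queue

# Connect to the 'sharevalue' queue on the local redis server
queue = Queue('sharevalue')
queue.connect()
symbols = ['AAPL', 'ORCL', 'MSFT']
jobs = []
for sym in symbols:
    info = {'name': sym}
    task = Task(info)
    # enqueue() returns a Job object which we can wait on later
    jobs.append(queue.enqueue(task))

# Block until the worker replies for each job, then print the results
print([job.result for job in jobs if job.wait()])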

Here we have a list of symbols and we create one task for each of them. As each task is enqueued, a Job object is returned for it.

In the workers we will use requests, so let us install it first.

$ pip install requests

Then the actual worker code:

import requests
from retask.queue import Queue

queue = Queue('sharevalue')
queue.connect()
while True:
    # Block until a new task arrives in the queue
    task = queue.wait()
    print("Received %s" % task.data['name'])
    url = 'http://download.finance.yahoo.com/d/quotes.csv?s=%s&f=l1' % task.data['name']
    r = requests.get(url)
    # Send the fetched value back to the producer
    queue.send(task, r.content.strip())

Here we wait for new tasks in the queue. After getting a new task, we find the latest share value and then return it to the producer. Please go through the documentation for more examples.

retask development update

At this foss.in I managed to have long chats with Jace, Tarique and Mahendra about the design of the retask project, a job/task queue for Python.

We discussed the various alternatives available and how they use those in their work. This led to a clearer picture of the requirements beyond my primary use cases.

Now we have asynchronous result communication between the job producer and the workers. We can also make a blocking wait call on a job object (with an optional timeout value) until the worker returns a result.
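To illustrate the general pattern of a blocking wait with an optional timeout, here is a small sketch. It does not use retask or redis at all: it relies on concurrent.futures from the standard library as a stand-in, so the names demo, worker and the 0.05 second timeout are only assumptions for illustration, and the retask calls themselves are described in its documentation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def demo():
    """Wait for a job with a timeout first, then wait without one."""
    release = threading.Event()

    def worker(symbol):
        # The worker stays busy until the producer lets it finish
        release.wait()
        return "%s done" % symbol

    with ThreadPoolExecutor(max_workers=1) as pool:
        job = pool.submit(worker, "AAPL")
        try:
            # Blocking wait with a timeout: the worker is still busy
            job.result(timeout=0.05)
            timed_out = False
        except TimeoutError:
            timed_out = True
        finally:
            release.set()
        # Blocking wait without a timeout: returns once the result exists
        return timed_out, job.result()


if __name__ == "__main__":
    print(demo())
```

The producer stays free to do other work between the two calls, which is the point of having the result communicated asynchronously.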

You can find the code on GitHub. Please feel free to play with the examples (tutorials page) and suggest improvements.

ipython qtconsole in Fedora

If you want to use ipython qtconsole in Fedora 16 or Fedora 17, remember to install the ipython-gui package. This will be handled automatically from Fedora 18 onwards.

Python development on 8th October

Use setUpClass() and tearDownClass() in test_multiprocessing.

Each manager test class now uses a separate manager. Also, process pools are no longer created before starting any tests.

Note that warnings are written if the manager for a test case still has live objects when it is shut down. This is true for a few test cases which fail to wait for all child processes to end. Files changed:

  • Lib/test/test_multiprocessing.py

Make __mp_main__ an alias for __main__ in all processes to simplify pickling of classes defined in the main module. Files changed:

  • Lib/multiprocessing/__init__.py

Issue #14783: Improve int() docstring and also str(), range(), and slice().

This commit rewrites the docstring for int() to incorporate the documentation changes made in issue #16036. It also switches the docstrings for int(), str(), range(), and slice() to use multi-line signatures. Files changed:

  • Doc/library/functions.rst
  • Objects/longobject.c
  • Objects/rangeobject.c
  • Objects/sliceobject.c
  • Objects/unicodeobject.c

Issue #16120: Use yield from in the stdlib. One more patch to use yield from. Files changed:

  • Lib/pkgutil.py
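For readers who have not met yield from yet, here is a minimal sketch of the kind of rewrite such a patch makes. The function names and sample data are made up for illustration and are not taken from Lib/pkgutil.py.

```python
def walk_names_old(groups):
    """Pre-3.3 style: re-yield every item of each inner iterable by hand."""
    for group in groups:
        for name in group:
            yield name


def walk_names_new(groups):
    """Python 3.3+ style: delegate to the inner iterable with yield from."""
    for group in groups:
        yield from group


if __name__ == "__main__":
    sample = [["os", "sys"], ["json"]]
    print(list(walk_names_old(sample)))
    print(list(walk_names_new(sample)))
```

Both generators produce the same sequence; the second one simply hands the inner loop over to the language.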

About the Python development blog posts series

I am writing these summary posts about the code committed to CPython every day by the core developers. I group the check-ins by day in IST, so a post may also contain patches from the previous day.

I might skip a few days if I am travelling or without a proper network.

Python development on 5th October

Pushed PEP 428 - object-oriented filesystem paths.

This PEP proposes the inclusion of a third-party module, pathlib (http://pypi.python.org/pypi/pathlib/), in the standard library. The aim of this library is to provide a simple hierarchy of classes to handle filesystem paths and the common operations users do over them.
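As a small taste of the object-oriented style, here is a sketch using pathlib as it exists in the standard library of Python 3.4 and later. The API of the PyPI package at the time of the PEP may differ in details, and the helper name describe and the sample path are assumptions for illustration.

```python
from pathlib import PurePosixPath


def describe(path_text):
    """Split a POSIX-style path into a few commonly needed parts."""
    path = PurePosixPath(path_text)
    return {
        "name": path.name,                            # last component
        "suffix": path.suffix,                        # file extension
        "parent": str(path.parent),                   # containing directory
        "sibling": str(path.parent / "README.rst"),   # join with "/"
    }


if __name__ == "__main__":
    print(describe("/home/user/project/setup.py"))
```

PurePosixPath only manipulates the path text, so the sketch never touches the real filesystem.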

Updated ftplib documentation: the file object passed to ftp.storbinary/storlines must be opened in binary mode. Files changed:

  • Doc/library/ftplib.rst
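In practice the documentation note means opening the local file with "rb" before handing it over. Here is a minimal sketch; the helper name upload_file is hypothetical, and the ftp argument is assumed to be an already connected ftplib.FTP object.

```python
from ftplib import FTP  # the ftp argument below is expected to be an FTP instance


def upload_file(ftp, local_path, remote_name):
    """Upload local_path to the server under the name remote_name."""
    # "rb" is the important part: the file must be opened in binary mode
    with open(local_path, "rb") as handle:
        ftp.storbinary("STOR " + remote_name, handle)
```

The same rule applies to storlines(): pass it a file opened with "rb", not a text-mode file.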

Issue #16138: Some typos got fixed. Files changed:

  • Doc/glossary.rst

Issue #14446: Remove deprecated tkinter functions. Removed functions:

  • static char * Merge(PyObject *args)
  • static char * AsString(PyObject *value, PyObject *tmp)

Files changed:

  • Modules/_tkinter.c

Issue #16112: platform.architecture does not correctly escape argument to /usr/bin/file. Fix original patch. Files changed:

  • Lib/platform.py

Went in on 3.3, 3.2, 3.1 and on 2.7.
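The underlying problem is a general one: a file name that contains spaces or shell metacharacters must not be pasted into a command string as it is. Here is a sketch of two common ways to stay safe. This is not the actual patch in Lib/platform.py, and both helper names are made up for illustration.

```python
import shlex


def build_shell_command(path):
    """Build a command string for a shell, with the path quoted."""
    return "file " + shlex.quote(path)


def build_argv(path):
    """Build an argument list; no shell is involved, so no quoting is needed."""
    return ["file", path]


if __name__ == "__main__":
    tricky = "my file; echo oops"
    print(build_shell_command(tricky))
    print(build_argv(tricky))
```

The argument list form can be passed directly to the subprocess functions, which is usually the simpler choice.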

Issue #16135: Removal of OS/2 support. Files changed:

  • Lib/test/regrtest.py
  • Lib/test/test_bz2.py
  • Lib/test/test_mailbox.py
  • Lib/test/test_select.py
  • Lib/test/test_signal.py
  • Lib/test/test_site.py
  • Lib/test/test_sundry.py
  • Lib/test/test_sys.py
  • Lib/test/test_sysconfig.py
  • Lib/test/test_thread.py
  • Lib/test/test_threadsignals.py
  • Python/importlib.h

In Lib/test/test_select.py SelectTestCase is now properly skipped in riscos.

Fix PyUnicode_Format(): return NULL if PyUnicode_READY(uformat) failed. This error cannot occur in practice: PyUnicode_FromObject() always returns a "ready" string.

Optimize unicode_compare(): use memcmp() when comparing two UCS1 strings. Also enable the ptr==ptr optimization in PyUnicode_Compare(); it was already implemented in PyUnicode_RichCompare(). Files changed:

  • Objects/unicodeobject.c
  • Include/unicodeobject.h

Issue #14997: disable in idle shell window. Files changed:

  • Lib/idlelib/config-extensions.def

Issue #15417: Add support for csh and fish in venv activation scripts. Files changed:

  • Doc/using/venv-create.inc
  • Lib/venv/scripts/posix/activate.csh
  • Lib/venv/scripts/posix/activate.fish

Python development on 4th October

Issue #16130: Typo fixed in What's New in 3.4.

Issue #16126: PyErr_Format format mismatch in _testcapimodule.c. PyErr_Format in Modules/_testcapimodule.c used an illegal format specifier (%s) for a Py_ssize_t argument, which caused a crash.

Issue #16112: platform.architecture does not correctly escape argument to /usr/bin/file. Fix original patch. Files changed:

  • Lib/platform.py

Issue #15488: Buffered IO now frees the buffer when closed, instead of when deallocating. So closed files no longer keep their buffer alive. Files changed:

  • Modules/_io/bufferedio.c
  • Lib/test/test_io.py

long_to_decimal_string_internal() doesn't need to write the final NULL character. Files changed:

  • Objects/longobject.c

unicode_result_wchar(): move the assert() to the "#ifdef Py_DEBUG" block. Files changed:

  • Objects/unicodeobject.c

Split the huge PyUnicode_Format() function (+540 lines) into subfunctions. Files changed:

  • Objects/unicodeobject.c

PyUnicode_Format(): disable overallocation when we are writing the last part of the output string. Files changed:

  • Objects/unicodeobject.c

Unicode: resize_compact() and resize_inplace() now also fill the Unicode strings with invalid bytes in debug mode, as done by PyUnicode_New(). Files changed:

  • Objects/unicodeobject.c

Python development on 3rd October

Issue #12947: Better workaround for the problem with doctest directives being stripped from code examples that are intended to illustrate those directives. Concrete examples can be seen in the section on option flags and directives (http://docs.python.org/library/doctest.html#option-flags-and-directives), for instance at http://docs.python.org/library/doctest.html#doctest.IGNORE_EXCEPTION_DETAIL. The doctest flags present in the sources (http://docs.python.org/_sources/library/doctest.txt) are all stripped.
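For context, a doctest directive is the special comment at the end of an example line. Here is a minimal sketch of one in use, which shows why stripping it from the documentation makes the example confusing. The function parse_positive and the helper run_examples are made up for illustration.

```python
import doctest


def parse_positive(text):
    """Convert text to a positive integer.

    >>> parse_positive("7")
    7
    >>> parse_positive("-3")  # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
    ValueError: any message is accepted here
    """
    value = int(text)
    if value <= 0:
        raise ValueError("expected a positive number, got %r" % text)
    return value


def run_examples():
    """Run the docstring examples above and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        parse_positive.__doc__, {"parse_positive": parse_positive},
        "parse_positive", None, 0)
    return doctest.DocTestRunner(verbose=False).run(test)


if __name__ == "__main__":
    print(run_examples())
```

Without the directive the second example would fail, because the real exception message differs from the one written in the docstring.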

Issue #15897: zipimport.c doesn't check the return value of fseek(). The code in zipimport.c didn't check the return value of fseek() in at least four places. The missing checks could hide issues with the file or the file system. It now properly prints "can't read Zip file: filename". This patch was applied to the 3.3, 3.2 and 2.7 branches. Files changed:

  • Modules/zipimport.c

Issue #9650: List commonly used format codes in the time.strftime and time.strptime docstrings. The time.strptime and time.strftime documentation now contains the following commonly used format codes:

Commonly used format codes:

%Y  Year with century as a decimal number.
%m  Month as a decimal number [01,12].
%d  Day of the month as a decimal number [01,31].
%H  Hour (24-hour clock) as a decimal number [00,23].
%M  Minute as a decimal number [00,59].
%S  Second as a decimal number [00,61].
%z  Time zone offset from UTC.
%a  Locale's abbreviated weekday name.
%A  Locale's full weekday name.
%b  Locale's abbreviated month name.
%B  Locale's full month name.
%c  Locale's appropriate date and time representation.
%I  Hour (12-hour clock) as a decimal number [01,12].
%p  Locale's equivalent of either AM or PM.
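Here is a short sketch of a few of these codes in use with time.strptime and time.strftime. Only the numeric codes are used so that the result does not depend on the locale; the FORMAT string, the helper name roundtrip and the sample timestamp are assumptions for illustration.

```python
import time

# Year-month-day hour:minute:second, built from the numeric codes above
FORMAT = "%Y-%m-%d %H:%M:%S"


def roundtrip(text):
    """Parse text with FORMAT, then format the parsed time back to a string."""
    parsed = time.strptime(text, FORMAT)
    return parsed, time.strftime(FORMAT, parsed)


if __name__ == "__main__":
    parsed, formatted = roundtrip("2012-10-03 14:05:09")
    print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
    print(formatted)
```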

The Doc/whatsnew/3.3.rst file also got more updates, in places such as email.parser.BytesHeaderParser, logging.basicConfig, mmap.mmap.read and smtpd.