Python in programming contests

Thanks to recent addition of Python to the Topcoder python is the first language besides C++, Java and C# available on all major programming competitions (I did’t count ACM, because I can’t compete in it after finishing university).
I’m going to write about my perspective on using Python in programming contests.
Some of the points are targeted at people who didn’t use python at all yet, so for people knowing python they may sound basic.
Vexorian already had written a good piece about python in programming contests.
I am going to extend some arguments, disagree with some and provide some new ones. I’ll start with the things I like and end on the things I don’t like.

Better standard library and shorter code

Python have Bigints and Regexps support, so I no longer have to drop to java if I need these.
There are a lot of little handy utilities in python that are not available in C++ that make code much more easy to read and write. There have been written much on this subject elsewhere, so I am not going to dive into details.

Speed and memory

This is the biggest problem with python in Topcoder.
Some problems are really hard to fit into time limit using python .
Some problems are almost impossible .
My rule of thumb is that I shouldn’t try to do Topcoder problem that require >10^6 operations in python.
After doing some practice problems in python I would say that around 3/4 problems can be solved in python.

The root of the problem is not python itself, but topcoder rules, which does not allow to use Numpy or Cython. I personally think that topcoder should allow Cython. Cython is alternative way of running python programs, which instead of interpreting them translates python code to C. What’s more it allows you to add optional type information to Python code, what allows Cython to use this information to speed things up. With numpy and cython it’s possible to bring majority of programs to “orders of magnitude” of C++ speed.
You can use Cython in many popular contests such as Code Jam or IPSC.

What’s more, some programs take surprisingly a lot of memory in python. I ended up converting some practice problems from list of tuples to list of lists and get program accepted. As a rule of thumb, if I would see that I need to allocate at least 4mb in Top Coder problem I would go with C++ or Java.

Anyway, in any case you should read python docs entry about performance .

Better support for functional programming

Out of the Topcoder languages python currently have the best support for functional programming.
Although C++ got lambdas in C++11 the standard library didn’t catch up yet.
For example filter or map function in C++ are awkward.
Instead of taking a sequence and returning a sequence, as python is doing it, they are operating on iterators, what causes that one can’t easily chain function calls.
As example of what I mean:
Python:

map(lambda x: x*2, filter(lambda x: x%2==0, x))

C++11 (normally I would just write this functionality with just for loops):

vector  filtered;
copy_if(begin(v), end(v), back_inserter(filtered), [](int x) { return x % 2 == 0;});
vector  transformed;
transform(begin(filtered), end(filtered), back_inserter(transformed), [](int x) { return x * 2;});

There is handy Boost library that packs two iterators into an object, but sadly boost is not supported on Topcoder

Topcoder is good place to learn python

I prefer learning new languages by real work instead of reading tutorials.
By doing Topcoder problems in python you will be able to look at how other people approached the problem, what can be a good learning experience.
What’s more, because of tight memory and time limits on the Topcoder you will learn how to write well optimised python code.

Better debugging experience

There is one thing I love about debugging python – ability to breakpoint at some place of my program and play with current state of my program using real python code.
Although I have seen some tools that claim to offer similar functionality for C++ I never seen one work good enough to be usable.
When I go back to C++ from python I see that finding bugs take me longer than python, even if I had much more experience in debugging contest problems in C++.

Applicable to more problems

Since python is more general purpose language it can be used in wider range of contests.
It can be used to solve some sound/image analysis problems from contests like challenge24.
It can be used to solve hacking (networking etc.) problems from CTF contests.
It’s also useful for machine learning at contests like Kaggle.

Lack of static typing

Besides the speed issues I don’t think the lack of static typing is big problem for small programs, such as ones you find in programming competitions.
When writing Topcoder problems in C++ I usually have one big writing session and then one big debugging/testing session.
After watching some screencats this is what Petr tends to do. ( he usually just writes correct code anyway without debugging session ).
If you don’t use IDE you will find about type problems in python at about the same point as you would when writing C++.
What’s more, I started using PyCharm and it finds all syntax errors and it infers some obvious types. It sometimes finds basic typing errors or it finds out when I am using unused variable. What’s more you can tell PyCharm to collect type information when you run python program and you can use this information for code completion later.
It’s funny, because I have more “Staticish” experience in Python with PyCharm than I used to have in C++ with Vim and Emacs.

Recursion limit

There is default recursion limit in python of 2500 . You can change it with sys.setrecursionlimit(n) .
Problem here is that python uses much more memory per function call than C++ , so even with setrecursionlimit it’s quite limited.
Testng on topcoder following snippet gives segmentation fault at MAX_REC = 27000, but works for 26000.

MAX_REC = 26000
sys.setrecursionlimit(MAX_REC+10)
def recursion(n):
    if n < 0:
        return n
    return recursion(n-1)
return recursion(MAX_REC)

If you use Cython, you can make C functions using cdef, so it’s again problem with topcoder only.

Multi-dimensional arrays

This is again problem only in topcoder. As Vexorian said in his blog post:
What in c++ looks like :

  int dp[w][h][t][n][2][3];    // Just a typical day in topcoder land
  memset(dp, 0, sizeof(dp));
  dp[x][y][f][p][1][0] = 6;

In python it looks like:

  dp = [[[[[ [0]*3 for i in xrange(2) ] for j in xrange(n)] for k in xrange(t)] for a in xrange(h)] for b in xrange(w)]
  dp[x][y][f][p][1][0] = 6

However, in numpy it looks like:

dp = numpy.zeros([w,h,t,n,2,3])
dp[x][y][f][p][1][0] = 6;

What’s more, numpy have many utility features that could make your programs easier to read and write in Python than in C++. Sadly, numpy is not allowed on topcoder :(.

Lack of BST trees

There are only hash tables in python.
There is no standard library implementation of BST tree.
There are some problems, when you need a BST tree, for example when you need to add/remove elements and ask for lower bounds at any point.
Theoretically you could implement one in python and paste the code as you need it.
Sadly it would be pretty inefficient.
A lot of python built in structures like hash tables are implemented in C underneath, what causes any data structure implemented on python side to look very inefficient comparing to standard library implementations.

Python code during programming competitions resembles “real world” better

Common complaint about programming contests is that people write throwaway C++ code, while outside of programming competitions C++ is used to write highly polished things in highly constrained environments.
During programming competition you are writing C++ in “write and forget” mode, while majority of “real world” C++ is written in “built to last” mode.
On the other hand python is used much more for prototyping and scripting, so it is more frequent to write python in “write and forget” mode, so experience from programming competitions to spew code fast is going to be more applicable.

Conclusion

I like python a lot.
It is one of the languages every programmer should know and out of the mainstream languages it is my favorite.
I would like to write all algorithmic problems in python instead of C++.
Sadly, due to tight limits on Topcoder and Codeforces when I suspect python might be too slow I am still using C++.
On google code jam, Facebook hacker cup, IPSC or challenge24 you can use Numpy and Cython, so all problems are solvable in python.
If you look for place to practice your NumPy and Cython skills Kaggle (machine learning) or Project Euler seems like a good option.

Advertisements

5 Comments

  1. Topcoder’s restriction to only using the standard library for Python makes it pointless. Why bother contributing to the site if you have to rewrite all the libraries from scratch??

  2. Great analysis. 🙂 While I agree that Topcoder should support “standardish” librarries like numpy, there is an alternative in your example.
    np.zeros([a, b, c, d])
    can be rewritten in standard python as
    [[[[[0]*d]*c]*b]*a]
    Or if you really want to preserve your list of indexes you could use
    reduce(lambda xs, n:[[xs]*n], [d, c, b, a], 0)

  3. Hi, Great article, just the right amount of analysis which I was trying to find.

    Thanks for this, Kindly link your github profile if you have some solution to competitive programming problem, It will be a great learning experience for nOOb like me.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s