Kushal Das

FOSS and life. Kushal Das talks here.


More on thumbnailing and optimization

Following my last post, I spent most of the time trying different imaging libraries to find a faster way of generating thumbnails, and on optimization in general.

I tested the following libraries to create image thumbnails: GdkPixbuf, imlib2, ImageMagick and epeg. Pixbuf gives reasonably good timings; imlib2 is fast but was leaking too much memory. ImageMagick seems to be the slowest among them. The last try was with epeg, which can only handle JPEGs, and it came out as the fastest. So I wrote a C function and I am using it from inside the Vala code through extern.

The next target was to find a better way to get thumbnails from RAW images. I tried libopenraw and LibRaw for that, but with help from the yorba developers I found a way to do it using gexiv2 only.

In between I tried a few tools for profiling the application. sayamindu told me about sysprof, which seems to be the easiest for my purpose. Using it I found that gexiv2_metadata_open_path is taking around 67% of the time, and inside it Exiv2::TiffImage::readMetadata is taking 51% of the time.

Now coming to the point of speed. The 1st run is on 1GB of RAW files:

real	0m2.946s
user	0m2.542s
sys	0m0.116s

The 2nd run is on the same 36GB of images as in my last post, of which around half are RAW:

real	4m0.807s
user	0m54.283s
sys	1m24.789s

Now this is fast in my textbook :D I should not forget to mention the great help I got from #vala and Adrien Bustany throughout this work.
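To put the numbers side by side, here is a small sketch in Python (not the Vala code of the indexer) that converts the "real" figures printed by time into seconds and compares the new full run against the 6th run of the earlier post. It assumes the folder still holds the same 5044 images; the helper name parse_time is made up for this example.

```python
import re

def parse_time(value: str) -> float:
    """Convert a `time`-style duration such as '4m0.807s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Wall-clock ("real") figures quoted in the two posts.
old_full_run = parse_time("16m47.241s")  # Gdk.Pixbuf thumbnails, single transaction
new_full_run = parse_time("4m0.807s")    # epeg and gexiv2 thumbnails

speedup = old_full_run / new_full_run
# Assumption: the image count is unchanged from the earlier post.
images_per_second = 5044 / new_full_run

print(f"speedup: {speedup:.1f}x, throughput: {images_per_second:.1f} images/s")
```

Only the wall-clock time is compared here; user and sys time would need the same treatment to say where the saving comes from.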

Speed, Vala, Sqlite3 and optimization

It all started as just another stupid idea: writing an image indexer (reinventing the wheel). The code should do the following things:

  • Index images for any given folder with user provided tags
  • Extract and keep EXIF information from the files
  • Extract or generate image thumbnails for all files
  • Should be able to provide the thumbnail even if the original file is missing (it may be on a USB HDD)
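One way such an index could be laid out is sketched below with Python's built-in sqlite3 module. This is not the schema of the actual Vala code: the table names, column names and helper functions are all hypothetical, and the sample path, tag and EXIF value are placeholders. The point it shows is that keeping the thumbnail inside the index lets a lookup succeed without touching the original file.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE images (
    id        INTEGER PRIMARY KEY,
    path      TEXT UNIQUE NOT NULL,
    thumbnail TEXT              -- encoded thumbnail kept inside the index
);
CREATE TABLE tags (
    image_id INTEGER NOT NULL REFERENCES images(id),
    tag      TEXT NOT NULL
);
CREATE TABLE exif (
    image_id INTEGER NOT NULL REFERENCES images(id),
    key      TEXT NOT NULL,
    value    TEXT
);
"""

def open_index(db_path=":memory:"):
    """Create the index database and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn

def add_image(conn, path, thumbnail, tags=(), exif=None):
    """Store one image with its user-provided tags and EXIF key/value pairs."""
    cur = conn.execute(
        "INSERT INTO images (path, thumbnail) VALUES (?, ?)", (path, thumbnail)
    )
    image_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tags VALUES (?, ?)", [(image_id, tag) for tag in tags]
    )
    conn.executemany(
        "INSERT INTO exif VALUES (?, ?, ?)",
        [(image_id, key, value) for key, value in (exif or {}).items()],
    )
    return image_id

def thumbnail_for(conn, path):
    """Return the stored thumbnail; the file at `path` is never opened."""
    row = conn.execute(
        "SELECT thumbnail FROM images WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None

conn = open_index()
# Placeholder values, not real data from the test folder.
add_image(conn, "/media/usb/DSC_0001.NEF", "<thumbnail data>",
          tags=["holiday"], exif={"Exif.Image.Model": "example camera"})
conn.commit()
print(thumbnail_for(conn, "/media/usb/DSC_0001.NEF"))
```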

I wrote some test code to see how much time it takes to do the above for a folder of 36GB+ in size with 5044 images (both JPEG and NEF).

1st run 

real 30m39.178s
user 0m45.561s
sys 3m42.557s

2nd run without thumbnail generation but with EXIF information

real 10m14.281s
user 0m58.208s
sys 0m45.969s

3rd run without thumbnail generation and without EXIF information

real 14m40.503s
user 0m1.585s
sys 0m7.535s

Here I was confused. I managed to find out that transactions in sqlite can cause delays, so I changed the code to do everything in a single transaction.

4th run without thumbnail generation and without EXIF information

real 0m1.032s
user 0m0.216s
sys 0m0.134s
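The difference between the two approaches can be sketched with Python's built-in sqlite3 module (again, not the Vala code used for the runs above). In autocommit mode every INSERT is its own transaction, and with default settings SQLite waits for the data to reach the disk on each commit, while an explicit BEGIN/COMMIT pays that cost once. The function name, table name and paths are made up for this example, and the timings will vary a lot between machines and disks.

```python
import os
import sqlite3
import tempfile
import time

def index_paths(db_path, paths, single_transaction):
    """Insert all paths and return (row count, seconds spent inserting)."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled only by the explicit BEGIN/COMMIT below.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS images (path TEXT)")
    conn.execute("DELETE FROM images")
    start = time.perf_counter()
    if single_transaction:
        conn.execute("BEGIN")
    for path in paths:
        conn.execute("INSERT INTO images (path) VALUES (?)", (path,))
    if single_transaction:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
    conn.close()
    return count, elapsed

# Made-up file names standing in for the indexed folder.
paths = [f"/photos/img_{i:04d}.jpg" for i in range(200)]
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "index.db")
    rows_auto, t_auto = index_paths(db_path, paths, single_transaction=False)
    rows_batch, t_batch = index_paths(db_path, paths, single_transaction=True)

print(f"one transaction per insert: {rows_auto} rows in {t_auto:.3f}s")
print(f"single transaction:         {rows_batch} rows in {t_batch:.3f}s")
```

The database is written to a temporary file on purpose: an in-memory database would hide the per-commit disk sync that makes the first variant slow.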

5th run without thumbnail generation but with EXIF information

real 3m1.191s
user 1m2.525s
sys 0m34.524s

6th run with everything 

real 16m47.241s
user 0m43.652s
sys 3m21.640s

So the major bottleneck is thumbnail generation, which I am currently doing with Gdk.Pixbuf. For EXIF information I am using the beautiful gexiv2 from the awesome yorba guys.

Now, to optimize this I have to use some other library to generate thumbnails. Which other libraries can I use?

On a side note, I am not saving the thumbnails on disk but creating base64 encoded strings of them (I know I am bad, not following the thumbnail spec).
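For illustration, the round trip between thumbnail bytes and a base64 string looks roughly like this in Python; the function names are hypothetical and the input is placeholder data, not a real JPEG. The encoded string is about a third larger than the raw bytes, which is the price of storing the thumbnail as text.

```python
import base64

def encode_thumbnail(data: bytes) -> str:
    """Turn raw thumbnail bytes into an ASCII string that fits a text column."""
    return base64.b64encode(data).decode("ascii")

def decode_thumbnail(text: str) -> bytes:
    """Recover the original thumbnail bytes from the stored string."""
    return base64.b64decode(text)

# Placeholder bytes standing in for a small JPEG thumbnail.
raw = bytes(range(256)) * 12
encoded = encode_thumbnail(raw)

# The round trip is lossless; only the stored size grows.
restored = decode_thumbnail(encoded)
print(restored == raw, len(raw), len(encoded))
```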