PyScan 0.6

10. October, 2009

I’ve done some more work on PyScan. Most of the work is under the hood, but there is now a system to load and save projects, and a bug was fixed: If you pressed the Scan button too quickly in 0.4, the last image could be overwritten. PyScan is now based on hplip 3.9.8.

PyScan 0.6.tar.gz (16KB, MD5 Sum: 2b5e23099be438ceceb69ec23d64cec6)

See the original post for features and the system requirements.

PyScan: A Little Helper For The HP CM1312nfi MFP Scanner

18. April, 2009

Update: Find the latest version (0.6) here.

I recently bought an HP CM1312nfi MFP scanner (multi function device with scanner and color laser printer). After scanning some 1’000 pages, I’m still satisfied with the device. The document feeder (ADF) sometimes tries to eat the paper after spitting it out and the colors could be a little more brilliant but overall, a good deal for the price.

What bothered me was that the “Start scan” button doesn’t work on Linux. But someone posted a script in the bug report which can poll the button by reading the URL http://$ip/hp/device/notifications.xml (replace “$ip” with the IP address or DNS name of the scanner). This returns some XML with two interesting elements: StartScan and ADFLoaded. The first one becomes 1 when someone presses the “Start scan” button on the scanner and the second one tells us whether there is some paper in the ADF.
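A minimal sketch of such a poll could look like this. Only the URL and the two element names come from the description above; the sample document, the assumption that both elements contain "1" when active, and the function names are made up for illustration. Because the real nesting and namespace of the XML are not known here, the code searches the whole tree by local tag name.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit('}', 1)[-1]


def parse_notifications(xml_text):
    """Return (start_scan, adf_loaded) as booleans.

    Assumption: both elements contain '1' when active. Their position
    and namespace in the real document are not known here, so the whole
    tree is searched by local tag name.
    """
    flags = {'StartScan': False, 'ADFLoaded': False}
    for element in ET.fromstring(xml_text).iter():
        name = local_name(element.tag)
        if name in flags:
            flags[name] = (element.text or '').strip() == '1'
    return flags['StartScan'], flags['ADFLoaded']


def poll_scanner(host, timeout=5):
    """Fetch the notification document from the scanner once."""
    url = 'http://%s/hp/device/notifications.xml' % host
    with urlopen(url, timeout=timeout) as response:
        return parse_notifications(response.read())


# Made-up sample document, only for illustration
sample = """<Notifications>
  <StartScan>1</StartScan>
  <ADFLoaded>0</ADFLoaded>
</Notifications>"""
print(parse_notifications(sample))
```

Calling poll_scanner() in a loop with a short sleep in between would then be enough to notice a button press.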

With that and some source code, it was simple to create a little tool that works quite like xsane but fixes a couple of things that annoyed me for a long time:

  1. The UI of xsane is dead while it scans
  2. There is no online preview of the scan; you have to open the file in some extra tool to verify that the scan looks OK
  3. xsane doesn’t know about scan “projects”
  4. xsane doesn’t start to scan when I press the button on the scanner

As with all OSS software, this thing can seriously ruin your day, so be a bit careful. One of the biggest problems is the file size: To be able to edit files without loss of quality, TIFF format is the default. Each full page scan takes 26MB, 100 pages need 2.6GB!
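The 26MB figure is consistent with an uncompressed A4 page at 300 dpi and 24-bit color. Those three parameters are assumptions (the text only gives the resulting size), but the arithmetic can be checked quickly:

```python
MM_PER_INCH = 25.4


def uncompressed_size(width_mm, height_mm, dpi, bytes_per_pixel):
    """Size in bytes of the raw pixel data of one scanned page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed scan settings: A4 (210 x 297 mm), 300 dpi, 3 bytes per pixel
page = uncompressed_size(210, 297, 300, 3)
print('%.1f MB per page' % (page / 1e6))             # 26.1 MB per page
print('%.2f GB per 100 pages' % (100 * page / 1e9))  # 2.61 GB per 100 pages
```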

Plans for 0.5: Allow editing projects in the UI, as well as selecting, saving and loading them. Right now, you must define your projects via the command line or by editing the source code.

Download: PyScan-0.4.tar.gz (12KB, MD5 checksum)

Dependencies (see README.txt for download links):

  • Python 2.6
  • PyQt4 4.4.3
  • Python Imaging Library 1.1.6
  • Python Imaging SANE 1.1.6 (needs included patch; see README.txt for instructions).


  • Code to load images in a background thread, generate thumbnails (compatible with Konqueror/Dolphin) and display them in a list view
  • Display a (big) image with various manual and automatic zoom levels and modes (fit to window, percent) with zoom and pan
  • Online preview of scan in progress

Hideous details of the source

Again and again, I’m astonished how simple some tasks are in Python and Qt … if you’re willing to accept some “non-OO-ness” of the solution. I’ll explain some things I did here to give you an idea what’s going on.

Online preview

PyScan has an online preview of the currently active scan. If you look at the documentation, the Python Imaging SANE interface offers no way to do that. After looking at the source, I found that the SANE interface simply reads bytes from the SANE scanner module and copies these into a PIL image which was created on the Python side.

So my solution is to be notified that a scan is in progress and then copy said image every second (all 26MB) into a string. That string is then used to build a QImage which is turned into a QPixmap which is then displayed in the right pixmap view. See pilImage2QImage() for the details.

Background threads

I also moved all expensive code into threads: Loading big TIFF images, scaling them down to thumbnails, saving the images, etc. All threads have a method to add work to their input queues and they send Qt signals when they’re done. Continuing the scanning when there is paper in the ADF tray was a bit of a problem, though.
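The general shape of such a thread can be sketched with the standard library alone. This is not the PyScan code: the class name, the callback and the fake work function are made up, and a plain callback stands in for the Qt signal.

```python
import queue
import threading


class WorkerThread(threading.Thread):
    """Process jobs from an input queue and report each result."""

    def __init__(self, work, on_done):
        super().__init__(daemon=True)
        self._work = work        # function that does the expensive part
        self._on_done = on_done  # stand-in for emitting a Qt signal
        self._jobs = queue.Queue()

    def add_job(self, job):
        """Called from the UI thread; returns immediately."""
        self._jobs.put(job)

    def stop(self):
        self._jobs.put(None)     # sentinel: finish after pending jobs

    def run(self):
        while True:
            job = self._jobs.get()
            if job is None:
                break
            self._on_done(job, self._work(job))


results = []
worker = WorkerThread(
    work=lambda name: 'thumb-' + name,   # pretend to scale an image
    on_done=lambda job, result: results.append((job, result)),
)
worker.start()
worker.add_job('page-001.tif')
worker.add_job('page-002.tif')
worker.stop()
worker.join()
print(results)
```

One difference to keep in mind: the callback here runs inside the worker thread, while a Qt signal connected across threads is delivered in the receiver's thread, which is what makes it safe to touch widgets in the handler.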

Since the saving of the images is happening in a background thread, the code could start the next scan before the saving was completed. This wasn’t such a big problem except that the “scan next image” code looks for files on the disk to determine the next filename. This would lead to overwrites. So I had to synchronize this somehow. My simple hack was to set a boolean “waiting” in the scanner thread which indicates that the scanner has more paper to process and waits for the save thread to complete. When the UI gets the “image saved” signal, it triggers the scanner to continue.
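Stripped of the threads, the handshake boils down to a small state machine. The sketch below uses made-up names and a plain callable instead of the real scanner thread; it only illustrates the order of events described above.

```python
class ScanCoordinator:
    """Keep the scanner from starting a page while a save is pending."""

    def __init__(self, start_scan):
        self._start_scan = start_scan  # callable that scans one page
        self.saving = False            # a save job is still running
        self.waiting = False           # more paper waits for that save

    def page_scanned(self, adf_loaded):
        """Called after a page was scanned and handed to the save thread."""
        self.saving = True
        self.waiting = adf_loaded

    def image_saved(self):
        """Called when the UI receives the 'image saved' signal."""
        self.saving = False
        if self.waiting:
            self.waiting = False
            self._start_scan()


started = []
coordinator = ScanCoordinator(start_scan=lambda: started.append('scan'))

coordinator.page_scanned(adf_loaded=True)   # page 1 done, more paper in ADF
print(started)                              # scanner is still waiting
coordinator.image_saved()                   # save finished, scan continues
print(started)

coordinator.page_scanned(adf_loaded=False)  # page 2 done, ADF is empty
coordinator.image_saved()                   # nothing left to scan
print(started)
```

Because the next scan only starts after the previous file is on disk, looking at the directory to pick the next filename is safe again.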

Generating thumbnails

The last hack in the code is the generation of the thumbnails. The main issue here was that I need the thumbnails for the gallery view really deep down in the Qt render code. Wasting time at that level is really a no-no but at first glance, the API offers no way to defer loading of the images and then later update the items in the list view when the data is available. Keep in mind what I need to do:

  1. Load a 26MB file from disk
  2. Scale it with antialiasing
  3. … for hundreds, possibly thousands of files!

My solution: In the render code, I create a LazyPixmap. This is just a dumb object that stores the filename and a placeholder pixmap which is used until the real thumbnail becomes available. The LazyPixmap will schedule a job for the LoaderThread.

In my first code, I tried to create a QPixmap in the LoaderThread but that doesn’t work: Only the UI thread is allowed to create a QPixmap. Duh. But luckily, Qt offers the QImage class which works even without a UI and which offers basically the same API as QPixmap. So the LoaderThread can load the image from disk and scale it down (to save memory and avoid heavy computation in the UI thread) right before emitting a “loaded” signal.
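The idea behind the LazyPixmap can be shown without Qt. In this sketch, strings stand in for the placeholder and the loaded image, a list stands in for the loader's input queue, and all names are hypothetical. In the real code, the handler for the “loaded” signal would run in the UI thread and turn the QImage into a QPixmap there, for example with QPixmap.fromImage().

```python
class LazyImage:
    """Hold a filename and a placeholder until the real image arrives."""

    def __init__(self, filename, placeholder, schedule):
        self.filename = filename
        self._image = placeholder
        self.loaded = False
        schedule(self)               # ask the loader to fetch the file

    def get_image(self):
        """Return the placeholder first, the real image once loaded."""
        return self._image

    def set_loaded(self, image):
        """Called from the 'loaded' handler in the UI thread."""
        self._image = image
        self.loaded = True


pending = []                         # stand-in for the loader's queue
lazy = LazyImage('page-001.tif', 'PLACEHOLDER', pending.append)
print(lazy.get_image())

# Later: the loader has finished and the UI thread handles its signal
job = pending.pop(0)
job.set_loaded('THUMBNAIL of ' + job.filename)
print(lazy.get_image())
```

The render code never blocks: it always gets something to draw, and the expensive work happens elsewhere.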

There are two places where a LazyPixmap is used: In the PixmapWidget (which can display and zoom a QPixmap) and in the ThumbnailDelegate which draws the thumbnails for the filenames in the GalleryModel.

In the case of the PixmapWidget, the signal will be handled in lazyLoaded(). Here, we convert the QImage into a QPixmap (in lpm.getPixmap()) and assign that pixmap, recalculate the zoom factor, realign the view, etc.

In the GalleryModel, I have the problem that I need to tell Qt somehow that the pixmap has changed, but the API offers nothing except re-rendering the whole widget by calling update(). This renders at most (on a huge screen) 30 pixmaps, happens once per visible pixmap and causes no flicker. Probably not worth wasting another second on it.

If you look at the code, you’ll see that a class called KDEThumbnailCache is used. This class accesses the same thumbnails as Konqueror (KDE3) or Dolphin (KDE4). This means once the images are scaled down (either by my code or Dolphin), all tools can quickly load the small, precalculated thumbnails instead of having to scale the 26MB files again.
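Sharing works because the desktop tools agree on where a thumbnail lives. Assuming KDEThumbnailCache follows the freedesktop.org thumbnail convention (the class itself is not shown here), the lookup can be sketched like this; the function name and the example path are made up.

```python
import hashlib
import os
from urllib.parse import quote


def thumbnail_path(image_path, size='normal', cache_dir='~/.thumbnails'):
    """Return where a shared thumbnail for image_path would be stored.

    Follows the freedesktop.org convention: the file name is the MD5
    hex digest of the image's file:// URI plus '.png'. 'normal' holds
    thumbnails up to 128x128 pixels, 'large' up to 256x256.
    """
    uri = 'file://' + quote(os.path.abspath(image_path))
    name = hashlib.md5(uri.encode('utf-8')).hexdigest() + '.png'
    return os.path.join(os.path.expanduser(cache_dir), size, name)


print(thumbnail_path('/home/user/scans/page-001.tif'))
```

Newer desktops keep the cache under ~/.cache/thumbnails instead, which is why the directory is a parameter here.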


Well, that’s it for a small walk through the code. Feel free to give feedback if you like PyScan (or not) or when you have patches.

UPCScan 0.7: Where is my stuff?

16. November, 2008

UPCScan 0.7 is released. New features:

  • UPCScan can now find music CDs
  • If UPCScan can’t find something on Amazon, it will still create an entry which you can then edit to fill in the details.
  • Entries can be deleted.
  • I’ve added lending information so you can quickly figure out who your new “ex-friends” should be.
  • I’m working on a series/issue information system to make it more simple to complete your collection. With this version, you’ll need to edit the database directly to add series/issue information but the user interface can already display this data.
  • I’m working on a feature to create an OpenOffice document with the locations. This would allow you to print this out and then scan the locations in as you scan your collection to tell UPCScan under which location to file the items. If you can’t wait, then you can use the script to generate PNG images with barcodes which you can import in OpenOffice to achieve the same effect.

Download: upcscan-0.7.tar.gz (26,921 Bytes, MD5)

UPCScan 0.6: It’s Qt, Man!

8. October, 2008

Update: Version 0.7 released.

Getting drowned in your ever growing CD, DVD, book or comic collection? Then UPCScan might be for you.

UPCScan 0.6 is ready for download. There are many fixes and improvements. The biggest one is probably the live PyQt4 user interface (live means that the UI saves all your changes instantly, so no data loss if your computer crashes because of some other program ;-)).

The search field accepts barcodes (from a barcode laser scanner) and ISBN numbers. There is a nice cover image dialog where you can download and assign images if Amazon doesn’t have one. Note: Amazon sometimes has an image but it’s marked as “customer image”. Use the “Visit” button on the UI to check if an image is missing and click on the “No Cover” button to open the “Cover Image” dialog where you can download and assign images. I haven’t checked if the result of the search query contains anything useful in this case.

UPCScan 0.6 – 24,055 bytes, MD5 Checksum. Needs Python 2.5. PyQt4 4.4.3 is optional.

Security notice: You need an Amazon Web Service Account (get one here). When you run the program for the first time, it will tell you what to do. This means two things:

  1. Your queries will be logged. So if you don’t want Amazon to know what you own, this program is not very useful for you.
  2. Your account ID will be stored in the article database at various places. I’m working on an export function which filters all private data out. Until then, don’t give this file to your friends unless you know what that means (and frankly, I don’t). You have been warned.

Portable UI

18. January, 2008

For many years, I’ve been looking for a way to write portable applications with a nice, responsive user interface. Many have tried and many have failed:

  • Python with tcl/tk – A nice experience from the developer side. The Python wrapper around the tk widget set shows how you can get compact, yet easily understandable code and write UIs in a short time. If it just weren’t that ugly …
  • Java with Swing – Swing borrows a lot from X11, the grandfather of all graphical desktops. I have yet to see anyone managing to impress the world with their grandfather …
  • Java with SWT – Now, here comes a contender. Java is pretty widely available (not quite as many platforms as Python, but still), it is pretty fast, okay, the download is a bit on the big side … but no DLL hell, easy to setup (especially if you don’t provide an installer and just push a ZIP out). SWT is nice, fast … and bare bones. MFC? Well, they have JFace and in a few years, there might even be a text editing component that can do word wrap and still show line numbers. Oh, and SWT is available on even fewer platforms than Java. Palm, anyone?
  • HTML – Web based apps are all the hype. If you want to use your app on the run, it gets tricky. I don’t know about the US, but here in Europe, going online with your mobile will ruin you. Literally. Also, I’ve had my struggles with HTML and CSS and I can do without. Either and both.

I’ve tried a few more but in the end, things never felt right. Until recently. I’m a big fan of treeline. Treeline uses Python and PyQt which wraps Qt (say: “cute”). Qt is a mature framework, currently at version 4.3.3, with 4.4 around the corner. It doesn’t have all the nifty stuff I can imagine (like an RTF editor; QTextEdit can only do a (big) fraction of that) but it gets closer to what I want than anything else.

In the past two weeks, I wrote a little clone of yWriter4. The little baby has currently about 8000 loc and about half of the functionality I want to give it (especially the text editing is still leaving a lot to be desired). Except for two bugs (signal names and GC issues), it’s been a real pleasure to use. I managed to implement almost every feature within a few minutes or few hours (the storyboard took 6 hours, the scene chart view took two), also thanks to the good defaults of the framework. Here is an impression of v0.2:

So if you’re considering writing a small to medium sized application which needs to run on Windows, Linux and MacOS, give PyQt a try.

Sorting Number Table Columns in PyQt4

14. January, 2008

Here is a simple trick to sort number columns in the QTableWidget of Qt4 and PyQt4: Format the number as a right-aligned string, so that the text order of the items matches their numeric order:

import random
from PyQt4.QtGui import QTableWidgetItem

for i in range(12):
    item = QTableWidgetItem(u'%7d' % random.randint(1, 10000))  # pad to width 7
    table.setItem(i, 1, item)
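Why this works can be checked without Qt: a space sorts before every digit, so strings padded to the same width compare like the numbers they contain. The short check below is only an illustration and holds for non-negative integers that fit into the chosen width.

```python
import random

numbers = [random.randint(1, 10000) for _ in range(12)]

# Sort the padded strings as text, then compare with a numeric sort
padded = sorted('%7d' % n for n in numbers)
print([int(s) for s in padded] == sorted(numbers))

# Without padding, text order and numeric order disagree
print(sorted(['9', '10', '100']))
print(sorted('%7d' % n for n in (9, 10, 100)))
```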
