PyScan: A Little Helper For The HP CM1312nfi MFP Scanner

Update: Find the latest version (0.6) here.

I recently bought a HP CM1312nfi MFP scanner (multi function device with scanner and color laser printer). After scanning some 1’000 pages, I’m still satisfied with the device. The document feeder (ADF) sometimes tries to eat the paper after spitting it out and the colors could be a little more brilliant but overall, a good deal for the price.

What bothered me was that the “Start scan” button doesn’t work on Linux. But someone posted a script in the bug report which can poll the button by reading the URL http://$ip/hp/device/notifications.xml (replace “$ip” with the IP address or DNS name of the scanner). This returns some XML with two interesting elements: StartScan and ADFLoaded. The first one becomes 1 when someone presses the “Start scan” button on the scanner and the second one tells us whether there is some paper in the ADF.

With that and some source code, it was simple to create a little tool that works quite like xsane but fixes a couple of things that annoyed me for a long time:

  1. The UI of xsane is dead while it scans
  2. There is no online preview of the scan; you have to open the file in some extra tool to verify that the scan looks OK
  3. xsane doesn’t know about scan “projects”
  4. xsane doesn’t start to scan when I press the button on the scanner

As with all OSS software, this thing can seriously ruin your day, so be a bit careful. One of the biggest problems is the file size: To be able to edit files without loss of quality, TIFF format is the default. Each full page scan takes 26MB, 100 pages need 2.6GB!

Plans for 0.5: Allow to edit projects in the UI, select them, save and load them. Right now, you must define your projects via the command line or by editing the source code.

Download: PyScan-0.4.tar.gz (12KB, MD5 checksum)

Dependencies (see README.txt for download links):

  • Python 2.6
  • PyQt4 4.4.3
  • Python Imaging Library 1.1.6
  • Python Imaging SANE 1.1.6 (needs included patch; see README.txt for instructions).

Features

  • Code to load images in a background thread, generate thumbnails (compatible to Konqueror/Dolphin) and display them in a list view
  • Display a (big) image with various manual and automatic zoom levels and modes (fit to window, percent) with zoom and pan
  • Online preview of scan in progress

Hideous details of the source

Again and again, I’m astonished how simple some tasks are in Python and Qt … if you’re willing to accept some “non-OO-ness” of the solution. I’ll explain some things I did here to give you an idea what’s going on.

Online preview

PyScan has an online preview of the currently active scan. If you look at the documentation, the Python Imaging SANE interface offers no way to do that. After looking at the source, I found that the SANE interface simply reads bytes from the SANE scanner module and copies these into a PIL image which was created on the Python side.

So my solution is to be notified that a scan is in progress and then copy said image every second (all 26MB) into a string. That string is then used to build a QImage which is turned into a QPixmap which is then displayed in the right pixmap view. See pilImage2QImage() for the details.

Background threads

I also moved all expensive code into threads: Loading big TIFF images, scaling them down to thumbnails, saving the images, etc. All threads have a method to add work to their input queues and they send Qt signals when they’re done. Continuing the scanning when there is paper in the ADF tray was a bit of a problem, through.

Since the saving of the images is happening in a background thread, the code could start the next scan before the saving was completed. This wasn’t such a big problem except that the “scan next image” code looks for files on the disk to determine the next filename. This would lead to overwrites. So I had to synchronize this somehow. My simple hack was to set a boolean “waiting” in the scanner thread which indicates that the scanner has more paper to process and waits for the save thread to complete. When the UI gets the “image saved” signal, it triggers the scanner to continue.

Generating thumbnails

The last hack in the code is the generation of the thumbnails. The main issue here was that I need the thumbnails for the gallery view really deep down in the Qt render code. Wasting time at that level is really a no-no but at first glance, the API offers no way to defer loading of the images and then later update the items in the list view when the data is available. Keep in mind what I need to do:

  1. Load a 26MB file from disk
  2. Scale it with antialiasing
  3. … for hundreds, possibly thousands of files!

My solution: In the render code, I create a LazyPixmap. This is just dumb object to save the filename and a placeholder pixmap which is used into the real thumbnail becomes available. The LazyPixmap will schedule a job for the LoaderThread.

In my first code, I tried to create a QPixmap in the LoaderThread but that doesn’t work: Only the UI thread is allowed to create a QPixmap. Duh. But luckily, Qt offers the QImage class which works even without a UI and which offers basically the same API as QPixmap. So the LoaderThread can load the image from disk and scale it down (to save memory and avoid heavy computation in the UI thread) right before emitting a “loaded” signal.

There are two places where a LazyPixmap is used: In the PixmapWidget (which can display and zoom a QPixmap) and in the ThumbnailDelegate which draws the thumbnails for the filenames in the GalleryModel.

In the case of the PixmapWidget, the signal will be handled in lazyLoaded(). Here, we convert the QImage into a QPixmap (in lpm.getPixmap()) and assign that pixmap, recalculate the zoom factor, realign the view, etc.

The GalleryModel, I have the problem that I need to tell Qt somehow that the pixmap has changed but the API offers nothing except rendering the whole widget by calling update(). This will render at most (on a huge screen) 30 pixmaps. Happens one time per visible pixmap, causes no flicker. Probably not worth to waste another second on it.

If you look at the code, you’ll see that a class called KDEThumbnailCache is used. This class accesses the same thumbnails als konqueror (KDE3) or Dolphin (KDE4). This means once the images are scaled down (either by my code or Dolphin), all tools can quickly load the small, precalculated thumbnails instead of having to scale the 26MB files again.

Conclusion

Well, that’s it for a small walk through the code. Feel free to give feedback if you like PyScan (or not) or when you have patches.

5 Responses to PyScan: A Little Helper For The HP CM1312nfi MFP Scanner

  1. fonsch says:

    Hello,

    Nice work you’re doing here. I have a problem though: I downloaded PyScan 0.4 and made sure my dependencies are correct. I read the readme.txt but I can’t get the patch-part to work (using Ubuntu 9.04).
    I can’t seem to find the file scan-py.patch. The output of “python -c ‘import sane; print sane.__file__'” is “/usr/lib/python2.6/dist-packages/sane.pyc”. Should this patch file be included in the tar.gz or am I missing something very obvious here?

    When I try the command “python2.6 scan.py _someFolder_ scan png” I get the following error (probably related to that patch?):

    Traceback (most recent call last):
    File “scan.py”, line 443, in
    dev = SaneDev(uri, False)
    TypeError: __init__() takes exactly 2 arguments (3 given)

    Did I do something wrong? If you need more information about my setup, just let me know.

    Thanks in advance

    PS. I noticed an IP-address in scan.py (line 440). I guess this has to be changed to the actual IP of my printer? If so, wouldn’t it be better to take the IP as an extra argument when starting the program?

    • digulla says:

      You’re absolutely right, I forgot to include the file. Sorry for that. I’ve built a new 0.4 package and uploaded it in the same place as the old one. Please try it again.

      As for the IP address, that will go into a config file. I was just too lazy to do it, yet. I’ve done a lot of work on PyScan and I hope to release 0.6 this weekend. But 0.6 needs hplip 3.9.4b. On the positive side, I can now update the image in a much saner way (without any hacks).

  2. Fridi says:

    Hi, I had the same approach but always had the problem that the signal that gets emitted by the LoaderThread is not received by the StandardItem (ui thread). Cant say why, and with migrating snippets of YOUR source code to my classes it is the same (no signals received)…did you have the same problems?

    • digulla says:

      I have fixed some bugs in my code and I dimly remember this issue. I’ll try to push a newer version to bitbucket.

      • Fridi says:

        Thanks, but digging deeper into the issue I found problems with my implementation of the underlying StandardItemModel.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s