Fernando J. Pereda’s blag

May 25, 2010

iostreams are VERY thread-unsafe on Mac OS X

Filed under: blag — Tags: , , , — Fernando J. Pereda @ 8:19 am

This was both surprising and disgusting. Our in-house data acquisition software needs to convert lots of floats to ascii so that they can ve viewed in real time. This is very easy to do with a stringify-like function like the one found in paludis/paludis/util/stringify.hh.

stringify looks trivial, there’s no shared state nor there is any static data, so I thought that I could call stringify from several threads at the same time. I don’t think there’s any guarantee about thread safety of ostringstream’s operator<< but it looked like a safe assumption. Well, incorrect.

operator<< calls vsnprintf, which should be thread-safe by itself. However, OSX's vsnprintf ultimately calls localeconv_l, which is not thread-safe. And, you end up with something like this:

#0 0x96f734a9 in malloc_error_break ()
#1 0x96f6e497 in szone_error ()
#2 0x96e98503 in szone_free ()
#3 0x96e9836d in free ()
#4 0x96e97f24 in localeconv_l ()
#5 0x96e93335 in __vfprintf ()
#6 0x96ecb9b5 in vsnprintf ()
#7 0x0144615e in std::__convert_from_v ()
#8 0x01437d2a in std::num_put<char, std::ostreambuf_iterator<char, std::char_traits > >::_M_insert_float ()
#9 0x0143803d in std::num_put<char, std::ostreambuf_iterator<char, std::char_traits > >::do_put ()
#10 0x0144ab6a in std::ostream::_M_insert ()
#11 0x0000fb25 in std::basic_ostringstream<char, std::char_traits, std::allocator >::str () at sstream:217
...

Which means there’s no way of converting stuff to human readable in a thread-safe manner. I’m not sure whether I’m missing something or not… but this looks very fishy. I ended up adding a stupid big mutex around my stringify calls, but that looks more like a workaround that a final solution to the problem. Plus, this doesn’t protect ANY other uses of iostream’s operator<< such as loggers.

Ok… looks like that wasn’t a very accurate diagnostic. This is going to be more annoying than I thought.

— ferdy

May 20, 2008

Gmandel, Gtk+ and Threads

Filed under: blag — Tags: , , , — Fernando J. Pereda @ 1:18 am

For gmandel (see Chaotic Stuff for more info) I had to add a little GtkProgressBar so that users have a rough estimate of the time it’ll take rendering the fractal.

I wanted to have a clean and easy to study code, so, to avoid threads (that are usually difficult to study for newcomers) I used the following hack to update the progress bar:

static inline void flush_events(void)
{
    while (gtk_events_pending())
        gtk_main_iteration();
}

And, with some code to save expose events and relaunch them once the image had been rendered, it worked fine. However, that’s a hack, and the real solution is to fire a new thread to do the rendering work and update the widget when it’s finished. Na├»ve I was.

Gtk+’s documentation states it very clearly: Make sure you protect all calls to Gtk with a gdk_threads_enter(), gdk_threads_leave() pair. With that, you can call Gtk functions from any thread you want and forget about implementing IPC mechanisms to make all the calls from one thread. One of the problems is that the default implementation of the Gdk lock is a non-reentrant mutex, which means you lock it while already holding it (otherwise, you deadlock).

It should be noted that gtk_main() should also be enclosed in an enter-leave which means that functions called by it, can’t try to lock Gdk’s mutex or they’ll deadlock. This is a problem if you are trying to share functions between event handlers and threads. The trick here is to replace that mutex with a couple of functions that lock a reentrant mutex using gtk_threads_set_lock_functions().

What isn’t clear is that gtk_events_pending() will try to acquire Gdk’s lock. For some reason, I forgot to remove the hack that events_flush() was, and had wasted lots of time trying to fix a small, difficult to reproduce, race condition because I was calling gtk_events_pending() while holding the Gdk lock.

So, really, working with threaded applications in Gtk+ is really easy. It is just a matter of protecting calls to Gtk with a critical section. I’m not sure why the documentation isn’t more clear, and the fact that some of the old information floating in the web still advises to only call Gtk from one thread doesn’t really help. Had I started with a threaded version instead of using events_flush() thing would’ve been far easier :)

Bottom line: If something is difficult, working it around won’t make it easier, quite the contrary.

— ferdy

Blog at WordPress.com.