author | rpj <rpj> | 2004-10-16 02:34:44 +0000 |
---|---|---|
committer | rpj <rpj> | 2004-10-16 02:34:44 +0000 |
commit | 45b1b8cb2a6588f9316f780d8cefe11c181a9a17 (patch) | |
tree | 24753e298d9933d48d764177baf183ef97f04156 /ChangeLog | |
parent | 9da8fdcb33373b4b2e1de2a8b7af3ed4b5811245 (diff) |
Mutex speedups cont'd
Diffstat (limited to 'ChangeLog')
-rw-r--r-- | ChangeLog | 117 |
1 files changed, 80 insertions, 37 deletions
@@ -1,45 +1,88 @@
-2004-10-08 Ross Johnson <rpj at callisto.canberra.edu.au>
+2004-10-15 Ross Johnson <rpj at callisto.canberra.edu.au>
- * pthread_mutex_destroy.c (pthread_mutex_destroy): Critical Section
- element is no longer required.
- * pthread_mutex_init.c (pthread_mutex_init): Likewise.
- * pthread_mutex_lock.c (pthread_mutex_lock): New algorithm following Drepper's
- paper at http://people.redhat.com/drepper/futex.pdf, but using the existing
- semaphore in place of the futex described in the paper. Idea suggested by
- Alexander Terekhov - see:
- http://sources.redhat.com/ml/pthreads-win32/2003/msg00108.html
- * pthread_mutex_timedlock.c pthread_mutex_timedlock(): Similarly.
- * pthread_mutex_trylock.c (pthread_mutex_trylock): Similarly.
- * pthread_mutex_unlock.c (pthread_mutex_unlock): Similarly.
- * pthread_barrier_wait.c (pthread_barrier_wait): Use inlined version of
- InterlockedCompareExchange() if possible - determined at build-time.
- * pthread_spin_destroy.c pthread_spin_destroy(): Likewise.
- * pthread_spin_lock.c pthread_spin_lock():Likewise.
- * pthread_spin_trylock.c (pthread_spin_trylock):Likewise.
- * pthread_spin_unlock.c (pthread_spin_unlock):Likewise.
- * ptw32_InterlockedCompareExchange.c: Sets up macro for inlined use.
- * implement.h (pthread_mutex_t_): Remove Critical Section element.
- (PTW32_INTERLOCKED_COMPARE_EXCHANGE): Set to default non-inlined version of
- InterlockedCompareExchange().
- * private.c: Include ptw32_InterlockedCompareExchange.c first for inlining.
- * GNUmakefile: Add commandline option to use inlined InterlockedCompareExchange().
- * Makefile: Likewise.
+ * implement.h (pthread_mutex_t_): Use an event in place of
+ the POSIX semaphore.
+ * pthread_mutex_init.c: Create the event; remove semaphore init.
+ * pthread_mutex_destroy.c: Delete the event.
+ * pthread_mutex_lock.c: Replace the semaphore wait with the event wait.
+ * pthread_mutex_trylock.c: Likewise.
+ * pthread_mutex_timedlock.c: Likewise.
+ * pthread_mutex_unlock.c: Set the event.
+
+2004-10-14 Ross Johnson <rpj at callisto.canberra.edu.au>
+
+ * pthread_mutex_lock.c (pthread_mutex_lock): New algorithm using
+ Terekhov's xchg based variation of Drepper's cmpxchg model.
+ Theoretically, xchg uses fewer clock cycles than cmpxchg (using IA-32
+ as a reference), however, in my opinion bus locking dominates the
+ equation on smp systems, so the model with the least number of bus
+ lock operations in the execution path should win, which is Terekhov's
+ variant. On IA-32 uni-processor systems, it's faster to use the
+ CMPXCHG instruction without locking the bus than to use the XCHG
+ instruction, which always locks the bus. This makes the two variants
+ equal for the non-contended lock (fast lane) execution path on up
+ IA-32. Testing shows that the xchg variant is faster on up IA-32 as
+ well if the test forces higher lock contention frequency, even though
+ kernel calls should be dominating the times (on up IA-32, both
+ variants used CMPXCHG instructions and neither locked the bus).
+ * pthread_mutex_timedlock.c pthread_mutex_timedlock(): Similarly.
+ * pthread_mutex_trylock.c (pthread_mutex_trylock): Similarly.
+ * pthread_mutex_unlock.c (pthread_mutex_unlock): Similarly.
+ * ptw32_InterlockedCompareExchange.c (ptw32_InterlockedExchange): New
+ function.
+ (PTW32_INTERLOCKED_EXCHANGE): Sets up macro to use inlined
+ ptw32_InterlockedExchange.
+ * implement.h (PTW32_INTERLOCKED_EXCHANGE): Set default to
+ InterlockedExchange().
+ * Makefile: Building using /Ob2 so that asm sections within inline
+ functions are inlined.
+
+2004-10-08 Ross Johnson <rpj at callisto.canberra.edu.au>
+
+ * pthread_mutex_destroy.c (pthread_mutex_destroy): Critical Section
+ element is no longer required.
+ * pthread_mutex_init.c (pthread_mutex_init): Likewise.
+ * pthread_mutex_lock.c (pthread_mutex_lock): New algorithm following
+ Drepper's paper at http://people.redhat.com/drepper/futex.pdf, but
+ using the existing semaphore in place of the futex described in the
+ paper. Idea suggested by Alexander Terekhov - see:
+ http://sources.redhat.com/ml/pthreads-win32/2003/msg00108.html
+ * pthread_mutex_timedlock.c pthread_mutex_timedlock(): Similarly.
+ * pthread_mutex_trylock.c (pthread_mutex_trylock): Similarly.
+ * pthread_mutex_unlock.c (pthread_mutex_unlock): Similarly.
+ * pthread_barrier_wait.c (pthread_barrier_wait): Use inlined version
+ of InterlockedCompareExchange() if possible - determined at
+ build-time.
+ * pthread_spin_destroy.c pthread_spin_destroy(): Likewise.
+ * pthread_spin_lock.c pthread_spin_lock():Likewise.
+ * pthread_spin_trylock.c (pthread_spin_trylock):Likewise.
+ * pthread_spin_unlock.c (pthread_spin_unlock):Likewise.
+ * ptw32_InterlockedCompareExchange.c: Sets up macro for inlined use.
+ * implement.h (pthread_mutex_t_): Remove Critical Section element.
+ (PTW32_INTERLOCKED_COMPARE_EXCHANGE): Set to default non-inlined
+ version of InterlockedCompareExchange().
+ * private.c: Include ptw32_InterlockedCompareExchange.c first for
+ inlining.
+ * GNUmakefile: Add commandline option to use inlined
+ InterlockedCompareExchange().
+ * Makefile: Likewise.
2004-09-27 Ross Johnson <rpj at callisto.canberra.edu.au>
- * pthread_mutex_lock.c (pthread_mutex_lock): Separate PTHREAD_MUTEX_NORMAL
- logic since we do not need to keep or check some state required by other
- mutex types; do not check mutex pointer arg for validity - leave this to
- the system since we are only checking for NULL pointers. This should improve
- speed of NORMAL mutexes and marginally improve speed of other type.
+ * pthread_mutex_lock.c (pthread_mutex_lock): Separate
+ PTHREAD_MUTEX_NORMAL logic since we do not need to keep or check some
+ state required by other mutex types; do not check mutex pointer arg
+ for validity - leave this to the system since we are only checking
+ for NULL pointers. This should improve speed of NORMAL mutexes and
+ marginally improve speed of other type.
* pthread_mutex_trylock.c (pthread_mutex_trylock): Likewise.
* pthread_mutex_unlock.c (pthread_mutex_unlock): Likewise; also avoid
- entering the critical section for the no-waiters case, with approx. 30%
- reduction in lock/unlock overhead for this case..
+ entering the critical section for the no-waiters case, with approx.
+ 30% reduction in lock/unlock overhead for this case.
* pthread_mutex_timedlock.c (pthread_mutex_timedlock): Likewise; also
- no longer keeps mutex if post-timeout second attempt succeeds - this will
- assist applications that wish to impose strict lock deadlines, rather than
- simply to escape from frozen locks.
+ no longer keeps mutex if post-timeout second attempt succeeds - this
+ will assist applications that wish to impose strict lock deadlines,
+ rather than simply to escape from frozen locks.
2004-09-09 Tristan Savatier <tristan at mpegtv.com>
* pthread.h (struct pthread_once_t_): Qualify the 'done' element
@@ -49,8 +92,8 @@
[Maintainer's note: the race condition is harmless on SPU systems
and only a problem on MPU systems if concurrent access results in an
exception (presumably generated by a hardware interrupt). There are
- other instances of similar harmless race conditions that have not been
- identified as issues.]
+ other instances of similar harmless race conditions that have not
+ been identified as issues.]
2004-09-09 Ross Johnson <rpj at callisto.canberra.edu.au>