
00:09:06evanwoop!
00:09:09evan45.92s!
00:09:11evanbest yet
00:09:22evannew InlineCache incoming!
00:09:23evanduck and cover!
00:09:28boyscoutIntroduce InlineCache, a SendSite replacement - 4cecf54 - Evan Phoenix
00:10:19boyscoutCI: Build 4cecf54 failed. http://ci.rubini.us/rubinius/builds/4cecf544aa0cfce2742e4f904d33abbd80822f15
00:10:24evanwoo
00:10:38brixenwell, you did warn us :)
00:10:46sbryantoh snap.
00:10:55brixenheh, that's a funny error
00:11:00evanyeah
00:11:01evanreally
00:11:02brixenum, you didn't define anything
00:11:14brixenor declare rather
00:11:28boyscoutRemove a declaration that doesn't declare anything - 743b24b - Evan Phoenix
00:11:30evanit's a declaration
00:11:32brixenheh
00:11:34brixenyeah
00:11:34evanhow can it not declare anything?
00:11:50evanthat's its raison d'être!
00:11:51brixenC++ added customs agents
00:11:59evanhah
00:12:07evanI didn't fill out the form on the airplane
00:12:10brixenyep
00:12:12evanso i have to wait at the end of the line
00:12:31brixenI zee a prob'em vis dis declarashun
00:12:36evanhah
00:12:39evanLOL
00:13:05evanok, lets look at this linux thread issue.
00:13:12evani should be able to just run the thread specs
00:13:14evanand see it
00:13:15evanyesh?
00:13:22brixenI believe so
00:13:29evanok
00:13:34brixenyou might run bin/mspec ci though
00:13:42evank
00:13:46brixenwe've had issues before where the full run would trigger it
00:13:53brixenbut not the thread specs in isolation
00:14:02evank.
00:15:57evanwtf
00:16:02evanrake is taking 99% of the CPU on elle
00:16:07evanand choking g++
00:16:25evanWEIRD
00:16:28evanit's in the background...
00:17:15evanthat was odd.
00:17:19evanit must be CI running
00:17:20evanbut still.
00:22:45boyscoutCI: 743b24b success. 2709 files, 10767 examples, 33781 expectations, 0 failures, 0 errors
00:22:51evanwell, thats good!
00:23:12sbryantjawsome!
00:23:19sbryantbuilding git and stuff now.
00:23:32evanlocks sbryant in the jaws-of-awesome
00:23:40sbryanthaha.
00:25:58evanok
00:26:04evani'm seeing the specs 'stop'
00:26:10evanand CPU was pegged
00:27:04tarcieriugh, just ran into some of my coworker's code which works on 1.8.7 but not 1.8.6
00:27:23tarcieri@langs.each_key.to_a
00:27:24scoopryou misspelled cow-orker
00:27:33tarciericow horker?
00:27:57evantarcieri: eww.
00:28:00brixenevan: running with -fs?
00:28:10evantarcieri: what kind of person calls that rather than
00:28:14evan@langs.keys
00:28:17tarcieriyeah seriously
00:28:18tarcierilol
00:28:28evan"I wanted it to take a while."
00:28:31evanSUCCESS SIR
00:28:31tarcieriheh
00:28:40evanbrixen: no, just 'rake' atm.
00:28:42tarcieriwell on 1.8.7
00:28:47tarcierithat gets the enumerator
00:29:05tarcieriso fucking annoying
00:29:21tarcieriI'm the only one still running 1.8.6
00:29:34evani see the pauses on "describes a running thread"
00:29:59nemerleevan: i did some test with a few combinations of flags, to see what had largest impact on that threading issue http://pastie.org/517576
00:30:28evanHUH
00:30:35evandebugging the GIL helped!
00:30:40nemerlestrange eh ? :)
00:30:41tarcierihaha
00:30:44evanyeah
00:30:52evanit's clearly a thread timing issue
00:30:58evanand debugging the gil slows stuff down
00:31:06evanto remove some timing problems
00:31:13nemerlei think it broke up the busy waiting a bit somewhere ;)
00:31:50evanok
00:31:57evanthese ALL use the exact same mechanism
00:32:01evanThreadSpecs.status_of_running_thread
00:32:12evaninternal warning sensor just went off
00:32:58brixenevan: are you debugging this on elle?
00:33:09evanyes.
00:33:12brixenk
00:33:19brixenI won't run the benches then
00:33:24evanyeah
00:33:25evangive me a sec.
00:33:27nemerleonly difference is: infinite loop in the 'running_thread'
00:33:33evanright
00:33:34evanso
00:33:43evanyou know that article
00:33:53evanon why python's gil is trouble
00:33:58evani'm betting this is the EXACT same issue
00:34:10evancheck_interrupts is being called in that 'loop {}'
00:34:12evanwhich is good
00:34:12nemerleyeah, Channel is using GIL for Conditional
00:34:23evanbut it releases the mutex and grabs it again right away
00:34:34evanand starves the other threads
00:34:47evanmakes perfect sense it would take a 'random' amount of time
00:34:57evanbecause it's purely based on who can acquire the GIL
00:35:18evanwhich depends on the kernel's timeslicing, signals, atomic operations
00:35:20evanthe works.
00:35:25ddubstupid dining philosophers
00:35:36ddubjust eat with your hands, dudes
00:36:11evanddub: MUNCH
00:37:05sbryantevan: ouch.
00:37:21evani'm going to do a few tests
00:37:22nemerlestupid question: would creating a new independent Mutex for each Channel solve the problem or just complicate things enormously ? :)
00:37:27evanshould be easy to show a fix
00:37:37evannemerle: it's nothing to do with Channel.
00:38:11nemerleoh.. hmm so i must've been following the symptoms not the cause
00:41:14nemerlethat hang appeared after the following seq: Th1:Channel::receive_timeout -> Th1 goes to sleep | Th2:Channel::send -> Th1 does not wake after (signal())
00:42:38evanyeah
00:42:41nemerlebut since that conditional is using GIL, which flickers like crazy at that point, it would explain that
00:42:41evanit's the same thing
00:42:52evanit's all about who manages to get the GIL when there is contention
00:42:58evanlinux seems to favor the last owner
00:43:04evanbecause thats the thread with the current time slice
00:43:31evanyep
00:43:33evanthat fixes it.
00:43:42evani added a call to nanosleep in check_interrupts
00:43:49evanto sleep for 1 microsecond
00:43:57evanrunning the thread specs now takes 3 seconds
00:44:15evanthats enough to make linux try and actually give the lock to someone else
00:44:45evanthe question is, what if i tell it sleep for 1 nanosecond
00:44:49evanis that enough to upset the balance.
00:44:54sbryantis it possible to do something like a GIL pair?
00:45:03evanGIL pair?
00:45:10evan2 locks
00:45:22evanyou unlock one, and lock the other?
00:45:42sbryantyeah to ease the contention on the lock, then signal both
00:45:53evanthe GIL has no signaling
00:45:53sbryantNot really well thought out.
00:45:58evanit doesn't use a condition variable
00:46:02sbryantOh.
00:46:02evanit's just a mutex
00:46:08ddubponders inviting some philosophers over for dinner
00:46:12evanthats actually part of this problem
00:46:16sbryantevan: oh yeah.
00:46:22sbryantTry a cond?
00:46:44sbryantor is it even applicable?
00:46:44evani think it would have the same problem
00:46:59evanbecause it would still boil down to who locks a contended lock
00:47:24evanunless running the signal causes the current thread to give up its timeslice
00:47:29evanwhich i'm doubting it does.
00:47:45nemerleit doesn't
00:47:47ddubcondition variables only do stuff when you are waiting
00:48:01ddubwhat does the lock protect again?
00:48:13evanthe whole VM
00:48:25evanit's totally macrograined
00:48:27ddubwell then, the answer _obviously_ is to make the VM threadsafe :)
00:48:34evanyep!
00:48:40sbryantNot it.
00:48:42evanwe're on the way to that already
00:48:50evanok
00:48:52dduboh perfect, then
00:48:53evanso
00:48:55ddubI'll take a nap
00:48:59evandoing this
00:49:06evanstruct timespec ts = {0, 1};
00:49:13evannanosleep(&ts, NULL);
00:49:15evanfixes it.
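[Annotation] The two lines evan pasted can be wrapped into a small yield helper. This is only a minimal sketch of the idea, assuming a plain pthread mutex stands in for the GIL; the names `gil` and `gil_yield` are hypothetical and this is not the Rubinius source.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the VM's global lock. */
static pthread_mutex_t gil = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold `gil`. Drops it, sleeps for the shortest interval that
 * can be requested, then takes it again. The sleep blocks the caller, which
 * gives a thread stuck in pthread_mutex_lock() a window to run. */
void gil_yield(void) {
    struct timespec ts = {0, 1};  /* 0 seconds, 1 nanosecond */
    int rc = pthread_mutex_unlock(&gil);
    assert(rc == 0);
    nanosleep(&ts, NULL);
    rc = pthread_mutex_lock(&gil);
    assert(rc == 0);
    (void)rc;  /* silences unused-variable warnings when NDEBUG is set */
}
```

A hot loop that holds the lock would call `gil_yield()` from time to time, in the same spot where the log describes check_interrupts releasing and re-grabbing the mutex.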
00:49:24ddubso basically the issue is that on some operating systems, when there is contention around a mutex the current process wins
00:49:40evanexactly
00:49:56evanif T2 is stuck on pthread_mutex_lock()
00:49:58evanand T1 does
00:50:04ddubthis is linux, right?
00:50:10evanpthread_mutex_unlock(); pthread_mutex_lock();
00:50:13evanwill T2 be woken up?
00:50:20evanon linux, the answer seems to be a big NO
00:50:26evanon darwin, the answer is yes.
00:50:28ddubits a big maybe
00:50:30ddubactually :)
00:50:32ddubnptl
00:50:34evancourse
00:50:40evanotherwise these specs would NEVER finish
00:50:45ddubhmm
00:50:48evanon linux, it's just highly unlikely
00:50:51ddubon solaris, they may very well never finish
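[Annotation] The T1/T2 question above can be tried outside the VM. The following is a rough sketch, not the thread specs themselves: T1 holds a mutex and repeatedly does unlock/lock while T2 waits for it, and the function reports how many rounds T1 finished before T2 first got in. All names here (`contended`, `spins_before_handoff`, `t2_waiter`) are made up for illustration, and T2 may not have reached pthread_mutex_lock() yet when T1 starts spinning, so the count is only indicative.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t contended = PTHREAD_MUTEX_INITIALIZER;
static atomic_long spins;      /* unlock/lock rounds completed by T1 */
static atomic_int handed_off;  /* set once T2 has held the lock */

/* T2: block on the mutex and record how far T1 got before we were let in. */
static void *t2_waiter(void *arg) {
    long *seen = arg;
    pthread_mutex_lock(&contended);
    *seen = atomic_load(&spins);
    atomic_store(&handed_off, 1);
    pthread_mutex_unlock(&contended);
    return NULL;
}

/* T1: hold the mutex, start T2, then unlock/lock up to max_spins times.
 * Returns the number of rounds T1 completed before T2 acquired the mutex. */
long spins_before_handoff(long max_spins) {
    pthread_t t2;
    long seen = -1;
    atomic_store(&spins, 0);
    atomic_store(&handed_off, 0);
    pthread_mutex_lock(&contended);
    int rc = pthread_create(&t2, NULL, t2_waiter, &seen);
    assert(rc == 0);
    (void)rc;
    for (long i = 0; i < max_spins && !atomic_load(&handed_off); i++) {
        pthread_mutex_unlock(&contended);
        pthread_mutex_lock(&contended);
        atomic_fetch_add(&spins, 1);
    }
    pthread_mutex_unlock(&contended);
    pthread_join(t2, NULL);  /* T2 always gets the lock once T1 stops */
    return seen;
}
```

A small result means the waiter was let in quickly; a result close to `max_spins` is the starvation pattern discussed here. The number depends on the kernel, the pthread implementation and the machine, so it will vary between runs.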
00:50:59yakischlobatry sched_yield()?
00:51:04evanthe nanosleep does a syscall as I recall
00:51:06ddublinux has two thread types, user and process
00:51:09evanyakischloba: OH
00:51:10evanof course!
00:51:11evander.
00:51:13yakischloba:P
00:51:21evanyakischloba: i KNEW there was something i was forgetting
00:51:30yakischlobaheh.
00:51:31evannanosleep with 1ns is pretty much sched_yield()
00:51:32evan:D
00:51:41yakischlobathere was some thing like
00:51:54yakischlobapthread_yield() that i saw when i was learning the pthread api this past week
00:52:04yakischlobabut that function doesnt seem to exist so i came upon sched_yield() somehow
00:52:27yakischlobai guess that is the correct one now.
00:52:30evanyep!
00:52:31evantesting now
00:52:33evanif this works
00:52:37evanyou get the lurker award of the day!
00:52:48yakischlobahhah
00:53:52ddubawww
00:53:56yakischlobahavent really been following the whole thing but that is at least how i solved my problem, which was similar.
00:53:59ddubjust gives up and comes back tomorrow
00:54:48evanah damn.
00:54:50evanit doesn't solve it.
00:54:56yakischloba:(
00:55:14dduboooh, still might have a chance!
00:55:31evanthough the notes here do say it's useful for dealing with heavily contended mutexes
00:55:35evanof which this is exactly that.
00:55:52evani'll try.. 3!
00:56:04yakischlobayeah. without that, my other thread was like *never* getting scheduled during the brief unlock
00:56:05evanthis is called the "I SAID YIELD BITCH" method.
00:56:20yakischlobaheh
00:56:21yakischlobasleep(5)
00:56:32evannanosleep of 1ns does it fine
00:56:35sbryantevan: that is surprising.
00:56:45nemerlei wonder if thread scheduling policy might change things ( SCHED_RR ) ?
00:56:47yakischlobabut sched_yield() doesnt? weird.
00:56:53evanyakischloba: yeah, i thought so
00:56:56evani'm going to do some testing
00:57:02yakischlobaall i was doing in my app was
00:57:24yakischlobapthread_mutex_unlock(); sched_yield(); pthread_mutex_lock();, and the other thread acquired like every single time no problem
00:57:25evanmaybe even sleep(0)
00:57:27evanthat might do it
00:57:35scooprevan, http://www.greenteapress.com/semaphores/downey08semaphores.pdf ~page 87, no-starve mutex ;)
00:58:12scoopr(no, I don't know anything about threading/locking etc, but I remembered reading about that)
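[Annotation] One common way to get a lock that cannot starve a waiter is a ticket lock. Note that this is a different technique from the semaphore-based no-starve mutex in the book scoopr links, and it is not something the log says was tried; it is shown only as a sketch of what "fair" could look like. The type and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical fair lock: holders are served strictly in ticket order. */
typedef struct {
    pthread_mutex_t guard;      /* protects the two counters */
    pthread_cond_t turn;        /* broadcast whenever now_serving changes */
    unsigned long next_ticket;  /* next ticket to hand out */
    unsigned long now_serving;  /* ticket currently allowed to hold the lock */
} ticket_lock;

void ticket_lock_init(ticket_lock *l) {
    int rc = pthread_mutex_init(&l->guard, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&l->turn, NULL);
    assert(rc == 0);
    (void)rc;
    l->next_ticket = 0;
    l->now_serving = 0;
}

void ticket_lock_acquire(ticket_lock *l) {
    pthread_mutex_lock(&l->guard);
    unsigned long mine = l->next_ticket++;   /* take a place in line */
    while (mine != l->now_serving)
        pthread_cond_wait(&l->turn, &l->guard);
    pthread_mutex_unlock(&l->guard);
}

void ticket_lock_release(ticket_lock *l) {
    pthread_mutex_lock(&l->guard);
    l->now_serving++;                        /* admit the next ticket */
    pthread_cond_broadcast(&l->turn);
    pthread_mutex_unlock(&l->guard);
}
```

With this shape, a thread that releases and immediately re-acquires draws a ticket behind any thread that already has one, so it has to wait its turn instead of winning the lock back. The inner `guard` mutex is still an ordinary mutex, but it is held only for a few instructions at a time.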
00:58:33evanhuh
00:58:37evansched_yield doesn't seem to work.
00:58:41evanweird.
00:59:11sbryantThat is just strange.
00:59:38sbryantany custom policies or params?
00:59:51evannope
01:00:01evanLinux elle 2.6.25-2-686 #1 SMP Fri Jun 27 03:23:20 UTC 2008 i686 GNU/Linux
01:00:14evanperhaps because it's SMP?
01:00:14evanand the threads are getting scheduled on different CPUs
01:00:15scooproh right, you could try sched_setscheduler :P
01:01:23evani can't imagine it would be anything except for the standard policy
01:01:24yakischlobai was doing it on multi core machine as well with no prob
01:01:26sbryantHRRM maybe you could try some priority switching before yielding?
01:01:26evani certainly didn't set it
01:01:49evanto validate this experiment, i need to try nanosleep again
01:01:55evanto verify before continuing
01:02:44ddubdoes nanosleep actually wait the process?
01:03:01ddubguess it does
01:04:28evanyeah
01:04:31evaneven with 1ns
01:04:34evanit seems to
01:04:46evani'm moving some code around to make experimenting with this a little easier
01:04:47evanone sec
01:04:52evanthen i'm going to try sleeping for 0ns
01:04:54evansee what happens
01:06:58sbryantSomething is very very strange with this
01:07:17yakischlobayeah. i dont really know the details about what you are doing but it should work without anything janky
01:07:42mahargit's pretty standard for sleeping for 0 time to mean to the thread scheduler "give up my slice"
01:12:52dgtizedso do benchmarks just run nightly?
01:13:06evanyeah
01:15:14dgtizedhmm, as long as we are running all this stuff like that, could we do a published profile run every night for these benchmarks as well?
01:16:05evanwhat do ya mean
01:16:57dgtizedwell we could finish running each of the benchmarks, and then run them again with a profiler on, and then generate the profiler data and put that up too
01:17:07evanoh
01:17:12evani guess.
01:17:39dgtizedI mean it doesn't matter as much as a time sensitive check, but it would do a good job of reminding us where the hotspots are
01:18:09dgtizedby time sensitive check, I mean the fact that we would have a timeline for it is not important, we could just always publish the latest set
01:19:00evanyeah
01:19:52dgtizeddunno might not be the most useful, but it could be nice to see as things like Hash drop off as a major bottleneck
01:20:30sbryantthe profiler will clear up the contention probably
01:40:22evanyeah, sched_yield really does not work.
01:40:25evanstrange.
01:40:52sbryantbut nanosleep of zero does?
01:41:16evanyep
01:42:35sbryantuse what works!
01:43:20evani should probably change that spec
01:43:24evanto not use a hot loop
01:43:28evanit should at least do a sleep or something in it
01:43:41evanoh, the test is that it's running though...
01:43:47evannot sleeping..
01:43:50evanwell, something
01:43:53evan1 + 1
01:43:55evananything
01:52:42boyscoutAdd code to deal with linux's mutex contention - a81968a - Evan Phoenix
01:52:51evanok, using the nanosleep hack for now.
01:53:02evanrelevant code goes in vm/global_lock.cpp
01:53:05evanGlobalLock::yield
01:53:10evanif anyone wants to play with it.
01:53:17sbryantI'll take a look at it.
01:53:22sbryantI've actually seen this before.
01:54:35sbryantI'll be back later.
02:00:09boyscoutCI: a81968a success. 2709 files, 10767 examples, 33781 expectations, 0 failures, 0 errors
02:00:32evanwell, off to get abby an iphone 3gs
02:01:47brixenheh sweet
02:01:54brixenshe has come over to the dark side
02:12:56ddubI want a 3gs :(
02:13:08ddubof course, I used to want a ][ GS as well
02:13:42brixenheh, me too!
10:51:30nemerle_nThe latest fix ( nanosleep ) to threading fixes the 'hanging' issue, but adding -Xvm.gil.debug flag still speeds up the mspec-ci run :)
10:51:59nemerle_nhttp://pastie.org/518414
11:27:26dbussinknemerle_n: they're probably all asleep :)
11:28:16nemerle_ni figured :P
11:28:59nemerle_ntimezone distribution of top rubini-minds sucks :)
11:31:34dbussinknemerle_n: yeah, where are you from?
11:31:53nemerle_npoland
11:32:25dbussinknemerle_n: ah, i could have whois'd too i guess :)
11:39:58radareknemerle_n: oo, I'm from Poland too :P
11:40:30nemerle_nradarek: cool
19:21:15radarekhttp://pastie.org/518678
19:21:34radarekwhen building rubinius it's segfaulting
19:21:46radarekubuntu 9.04, 64bit
19:41:45radarekhere is more specific info:
19:41:46radarekhttp://pastie.org/518694
19:42:16radarek(I'm trying to build rbx with exported RBX_LLVM=1)