Show enters and exits. Hide enters and exits.
| 00:09:06 | evan | woop! |
| 00:09:09 | evan | 45.92s! |
| 00:09:11 | evan | best yet |
| 00:09:22 | evan | new InlineCode incoming@! |
| 00:09:23 | evan | duck and cover! |
| 00:09:28 | boyscout | Introduce InlineCache, a SendSite replacement - 4cecf54 - Evan Phoenix |
| 00:10:19 | boyscout | CI: Build 4cecf54 failed. http://ci.rubini.us/rubinius/builds/4cecf544aa0cfce2742e4f904d33abbd80822f15 |
| 00:10:24 | evan | woo |
| 00:10:38 | brixen | well, you did warn us :) |
| 00:10:46 | sbryant | oh snap. |
| 00:10:55 | brixen | heh, that's a funny error |
| 00:11:00 | evan | yeah |
| 00:11:01 | evan | really |
| 00:11:02 | brixen | um, you didn't define anything |
| 00:11:14 | brixen | or declare rather |
| 00:11:28 | boyscout | Remove a declaration that doesn't declare anything - 743b24b - Evan Phoenix |
| 00:11:30 | evan | it's a declaration |
| 00:11:32 | brixen | heh |
| 00:11:34 | brixen | yeah |
| 00:11:34 | evan | how can it not declare anything? |
| 00:11:50 | evan | thats it's reason de etre! |
| 00:11:51 | brixen | C++ added customs agents |
| 00:11:59 | evan | hah |
| 00:12:07 | evan | I didn't fill out the form on the airplane |
| 00:12:10 | brixen | yep |
| 00:12:12 | evan | so i have to wait at the end of the line |
| 00:12:31 | brixen | I zee a prob'em vis dis declarashun |
| 00:12:36 | evan | hah |
| 00:12:39 | evan | LOL |
| 00:13:05 | evan | ok, lets look at this linux thread issue. |
| 00:13:12 | evan | i should be able to just run the thread specs |
| 00:13:14 | evan | and see it |
| 00:13:15 | evan | yesh? |
| 00:13:22 | brixen | I believe so |
| 00:13:29 | evan | ok |
| 00:13:34 | brixen | you might run bin/mspec ci though |
| 00:13:42 | evan | k |
| 00:13:46 | brixen | we've had issue before where the full run would trigger it |
| 00:13:53 | brixen | but not the thread specs in isolation |
| 00:14:02 | evan | k. |
| 00:15:57 | evan | wtf |
| 00:16:02 | evan | rake is taking 99% of the CPU on elle |
| 00:16:07 | evan | and choking g++ |
| 00:16:25 | evan | WEIRD |
| 00:16:28 | evan | it's in the background... |
| 00:17:15 | evan | that was odd. |
| 00:17:19 | evan | it must be CI running |
| 00:17:20 | evan | but still. |
| 00:22:45 | boyscout | CI: 743b24b success. 2709 files, 10767 examples, 33781 expectations, 0 failures, 0 errors |
| 00:22:51 | evan | well, thats good! |
| 00:23:12 | sbryant | jawsome! |
| 00:23:19 | sbryant | building git and stuff now. |
| 00:23:32 | evan | locks sbryant in the jaws-of-awesome |
| 00:23:40 | sbryant | haha. |
| 00:25:58 | evan | ok |
| 00:26:04 | evan | i'm seeing the specs 'stop' |
| 00:26:10 | evan | and CPU was pegged |
| 00:27:04 | tarcieri | ugh, just ran into some of my coworker's code which works on 1.8.7 but not 1.8.6 |
| 00:27:23 | tarcieri | @langs.each_key.to_a |
| 00:27:24 | scoopr | you misspelled cow-orker |
| 00:27:33 | tarcieri | cow horker? |
| 00:27:57 | evan | tarcieri: eww. |
| 00:28:00 | brixen | evan: running with -fs? |
| 00:28:10 | evan | tarcieri: what kind of person calls that rather than |
| 00:28:14 | evan | @langs.keys |
| 00:28:17 | tarcieri | yeah seriously |
| 00:28:18 | tarcieri | lol |
| 00:28:28 | evan | "I wanted it to take a while." |
| 00:28:31 | evan | SUCCESS SIR |
| 00:28:31 | tarcieri | heh |
| 00:28:40 | evan | brixen: no, just 'rake' atm. |
| 00:28:42 | tarcieri | well on 1.8.7 |
| 00:28:47 | tarcieri | that gets the enumerator |
| 00:29:05 | tarcieri | so fucking annoying |
| 00:29:21 | tarcieri | I'm the only one still running 1.8.6 |
| 00:29:34 | evan | i see the pauses on "describes a running thread" |
| 00:29:59 | nemerle | evan: i did some test with a few combinations of flags, to see what had largest impact on that threading issue http://pastie.org/517576 |
| 00:30:28 | evan | HUH |
| 00:30:35 | evan | debugging the GIL helped! |
| 00:30:40 | nemerle | strange eh ? :) |
| 00:30:41 | tarcieri | haha |
| 00:30:44 | evan | yeah |
| 00:30:52 | evan | it's clearly a thread timing issue |
| 00:30:58 | evan | and debugging the gil slows stuff down |
| 00:31:06 | evan | to remove some timing problems |
| 00:31:13 | nemerle | i think it broke up the busy waiting a bit somewhere ;) |
| 00:31:50 | evan | ok |
| 00:31:57 | evan | these ALL use the exact same mechanism |
| 00:32:01 | evan | ThreadSpecs.status_of_running_thread |
| 00:32:12 | evan | internal warning sensor just went off |
| 00:32:58 | brixen | evan: are you debugging this on elle? |
| 00:33:09 | evan | yes. |
| 00:33:12 | brixen | k |
| 00:33:19 | brixen | I won't run the benches then |
| 00:33:24 | evan | yeah |
| 00:33:25 | evan | give me a sec. |
| 00:33:27 | nemerle | only difference is: infinite loop in the 'running_thread' |
| 00:33:33 | evan | right |
| 00:33:34 | evan | so |
| 00:33:43 | evan | you know that article |
| 00:33:53 | evan | on why python's gil is trouble |
| 00:33:58 | evan | i'm betting this is the EXACT same issue |
| 00:34:10 | evan | check_interrupts is being called in that 'loop {}' |
| 00:34:12 | evan | which is good |
| 00:34:12 | nemerle | yeah, Channel is using GIL for Conditional |
| 00:34:23 | evan | but it releases the mutex and grabs it again right away |
| 00:34:34 | evan | and starves the other threads |
| 00:34:47 | evan | makes perfect sense it would a 'random' amount of time |
| 00:34:57 | evan | because it's purely based on who can require the GIL |
| 00:35:18 | evan | which depends on the kernel's timeslicing, signals, atomic operations |
| 00:35:20 | evan | the works. |
| 00:35:25 | ddub | stupid dining philosophers |
| 00:35:36 | ddub | just eat with your hands, dudes |
| 00:36:11 | evan | ddub: MUNCH |
| 00:37:05 | sbryant | evan: ouch. |
| 00:37:21 | evan | i'm going to do a few tests |
| 00:37:22 | nemerle | stupid question: would creating a new independent Mutex for each Channel solve the problem or just complicate things enormously ? :) |
| 00:37:27 | evan | should be easy to show a fix |
| 00:37:37 | evan | nemerle: it's not nothing to do with Channel. |
| 00:38:11 | nemerle | oh.. hmm so i must've been following the symptoms not the cause |
| 00:41:14 | nemerle | that hang appeard after following seq: Th1:Channel::receive_timeout -> Th1 goes to sleep | Th2:Channel::send -> Th1 does not wake after (signal()) |
| 00:42:38 | evan | yeah |
| 00:42:41 | nemerle | but since that conditional is using GIL, which flickers like crazy at that point, it would explain that |
| 00:42:41 | evan | it's the same thing |
| 00:42:52 | evan | it's all about who manages to get the GIL when there is contention |
| 00:42:58 | evan | linux seems to favor the last owner |
| 00:43:04 | evan | because thats the thread with the current time slice |
| 00:43:31 | evan | yep |
| 00:43:33 | evan | that fixes it. |
| 00:43:42 | evan | i added a call to nanosleep in check_interrupts |
| 00:43:49 | evan | to sleep for 1 microsecond |
| 00:43:57 | evan | running the thread specs now takes 3 seconds |
| 00:44:15 | evan | thats enough to make linux try and actually give the lock to someone else |
| 00:44:45 | evan | the question is, what if i tell it sleep for 1 nanosecond |
| 00:44:49 | evan | is that enough to upset the balance. |
| 00:44:54 | sbryant | is it possible to do a something like GIL pair? |
| 00:45:03 | evan | GIL pair? |
| 00:45:10 | evan | 2 locks |
| 00:45:22 | evan | you unlock one, and lock the other? |
| 00:45:42 | sbryant | yeah to ease the contention on the lock, then signal both |
| 00:45:53 | evan | the GIL has no signaling |
| 00:45:53 | sbryant | Not really well thought out. |
| 00:45:58 | evan | it doesn't use a condition variable |
| 00:46:02 | sbryant | Oh. |
| 00:46:02 | evan | it's just a mutex |
| 00:46:08 | ddub | ponders inviting some philosophers over for dinner |
| 00:46:12 | evan | thats actually part of this problem |
| 00:46:16 | sbryant | evan: oh yeah. |
| 00:46:22 | sbryant | Try a cond? |
| 00:46:44 | sbryant | or is it even applicable? |
| 00:46:44 | evan | i think it would have the same problem |
| 00:46:59 | evan | because it would still boil down to who locks a contended lock |
| 00:47:24 | evan | unless running the signal causes the current thread to give up it's timeslice |
| 00:47:29 | evan | which i'm doubting it does. |
| 00:47:45 | nemerle | it doesn't |
| 00:47:47 | ddub | condition variables only do stuff when you are waiting |
| 00:48:01 | ddub | what does the lock protect again? |
| 00:48:13 | evan | the whole VM |
| 00:48:25 | evan | it's totally macrograined |
| 00:48:27 | ddub | well then, the answer _obviously_ is to make the VM threadsafe :) |
| 00:48:34 | evan | yep! |
| 00:48:40 | sbryant | Not it. |
| 00:48:42 | evan | we're on the way to that already |
| 00:48:50 | evan | ok |
| 00:48:52 | ddub | oh perfect, then |
| 00:48:53 | evan | so |
| 00:48:55 | ddub | I'll take a nap |
| 00:48:59 | evan | doing this |
| 00:49:06 | evan | struct timespec ts = {0, 1}; |
| 00:49:13 | evan | nanosleep(&ts, NULL); |
| 00:49:15 | evan | fixes it. |
| 00:49:24 | ddub | so basically the issue is that on some operating systems, when there is contention around a mutex the current process wins |
| 00:49:40 | evan | exactly |
| 00:49:56 | evan | if T2 is stuck on pthread_mutex_lock() |
| 00:49:58 | evan | and T1 does |
| 00:50:04 | ddub | this linux, right? |
| 00:50:10 | evan | pthread_mutex_unlock(); pthread_mutex_lock(); |
| 00:50:13 | evan | will T2 be woken up? |
| 00:50:20 | evan | on linux, the answer seems to be a big NO |
| 00:50:26 | evan | on darwin, the answer is yes. |
| 00:50:28 | ddub | its a big maybe |
| 00:50:30 | ddub | actually :) |
| 00:50:32 | ddub | nptl |
| 00:50:34 | evan | course |
| 00:50:40 | evan | otherwise these specs would NEVER finish |
| 00:50:45 | ddub | hmm |
| 00:50:48 | evan | on linux, it's just highly unlikely |
| 00:50:51 | ddub | on solaris, they may very well never finish |
| 00:50:59 | yakischloba | try sched_yield()? |
| 00:51:04 | evan | the nanosleep does a syscall as I recall |
| 00:51:06 | ddub | linux has two thread types, user and process |
| 00:51:09 | evan | yakischloba: OH |
| 00:51:10 | evan | of course! |
| 00:51:11 | evan | der. |
| 00:51:13 | yakischloba | :P |
| 00:51:21 | evan | yakischloba: i KNEW there was something i was forgetting |
| 00:51:30 | yakischloba | heh. |
| 00:51:31 | evan | nanosleep with 1ns is pretty much sched_yield() |
| 00:51:32 | evan | :D |
| 00:51:41 | yakischloba | there was some thing like |
| 00:51:54 | yakischloba | pthread_yield() that i saw when i was learning the pthread api this past weeek |
| 00:52:04 | yakischloba | but that function doesnt seem to exist so i came upon sched_yield() somehow |
| 00:52:27 | yakischloba | i guess that is the correct one now. |
| 00:52:30 | evan | yep! |
| 00:52:31 | evan | testing now |
| 00:52:33 | evan | if this works |
| 00:52:37 | evan | you get the lurker award of the day! |
| 00:52:48 | yakischloba | hhah |
| 00:53:52 | ddub | awww |
| 00:53:56 | yakischloba | havent really been following the whole thing but that is at least how i solved my problem, which was similar. |
| 00:53:59 | ddub | just gives up and comes back tomorrow |
| 00:54:48 | evan | ah damn. |
| 00:54:50 | evan | it doesn't solve it. |
| 00:54:56 | yakischloba | :( |
| 00:55:14 | ddub | oooh, still might have a chance! |
| 00:55:31 | evan | though the notes here do say it's useful for deal with heavily contented mutexs |
| 00:55:35 | evan | of which this is exactly that. |
| 00:55:52 | evan | i'll try.. 3! |
| 00:56:04 | yakischloba | yeah. without that, my other thread was like *never* getting scheduled durng the brief unlock |
| 00:56:05 | evan | this is call the "I SAID YIELD BITCH" method. |
| 00:56:20 | yakischloba | heh |
| 00:56:21 | yakischloba | sleep(5) |
| 00:56:32 | evan | nanosleep of 1ns does it fine |
| 00:56:35 | sbryant | evan: that is surprising. |
| 00:56:45 | nemerle | i wonder if thread scheduling policy might change things ( SCHED_RR ) ? |
| 00:56:47 | yakischloba | but sched_yield() doesnt? weird. |
| 00:56:53 | evan | yakischloba: yeah, i thought so |
| 00:56:56 | evan | i'm doing to do some testing |
| 00:57:02 | yakischloba | all i was doing in my app was |
| 00:57:24 | yakischloba | pthread_mutex_unlock(); sched_yield(); pthread_mutex_lock();, and the other thread acquired like every single time no problem |
| 00:57:25 | evan | maybe even sleep(0) |
| 00:57:27 | evan | that might do it |
| 00:57:35 | scoopr | evan, http://www.greenteapress.com/semaphores/downey08semaphores.pdf ~page 87, no-starve mutex ;) |
| 00:58:12 | scoopr | (no, I don't know anyting about threading/locking etc, but I remembered reading about that) |
| 00:58:33 | evan | huh |
| 00:58:37 | evan | sched_yield doesn't seem to work. |
| 00:58:41 | evan | weird. |
| 00:59:11 | sbryant | Thtat is just strange. |
| 00:59:38 | sbryant | any custom policies or params? |
| 00:59:51 | evan | nope |
| 01:00:01 | evan | Linux elle 2.6.25-2-686 #1 SMP Fri Jun 27 03:23:20 UTC 2008 i686 GNU/Linux |
| 01:00:14 | evan | perhaps because it's SMP? |
| 01:00:14 | evan | and the threads are getting scheduled on different CPUs |
| 01:00:15 | scoopr | oh right, you could try sched_setscheduler :P |
| 01:01:23 | evan | i can't imagine it would be anything except for the standard policy |
| 01:01:24 | yakischloba | i was doing it on multi core machine as well with no prob |
| 01:01:26 | sbryant | HRRM maybe you could try some priorty switching before yiedling? |
| 01:01:26 | evan | i certainly didn't set it |
| 01:01:49 | evan | to validate this experiment, in need to try nanosleep again |
| 01:01:55 | evan | to verify before continuinig |
| 01:02:44 | ddub | does nanosleep actually wait the process? |
| 01:03:01 | ddub | guess it does |
| 01:04:28 | evan | yeah |
| 01:04:31 | evan | even with 1ns |
| 01:04:34 | evan | it seems to |
| 01:04:46 | evan | i'm moving some code around to make experimenting with this a little easier |
| 01:04:47 | evan | one sec |
| 01:04:52 | evan | then i'm going to try sleeping for 0ns |
| 01:04:54 | evan | see what happens |
| 01:06:58 | sbryant | Something is very very strange with this |
| 01:07:17 | yakischloba | yeah. i dont really know the details about what you are doing but it should work without anything janky |
| 01:07:42 | maharg | it's pretty standard for sleeping for 0 time to mean to the thread scheduler "give up my slice" |
| 01:12:52 | dgtized | so do benchmarks just run nightly? |
| 01:13:06 | evan | yeah |
| 01:15:14 | dgtized | hmm, as long as we are running all this stuff like that, could we do a published profile run every night for these benchmarks as well? |
| 01:16:05 | evan | what do ya mean |
| 01:16:57 | dgtized | well we could finish running each of the benchmarks, and then run them again with a profiler on, and then generate the profiler data and put that up too |
| 01:17:07 | evan | oh |
| 01:17:12 | evan | i guess. |
| 01:17:39 | dgtized | I mean it doesn't matter as much as a time sensitive check, but it would do a good job of reminding us where the hotspots are |
| 01:18:09 | dgtized | by time sensitive check, I mean the fact that we would have a timeline for it is not important, we could just always publish the latest set |
| 01:19:00 | evan | yeah |
| 01:19:52 | dgtized | dunno might not be the most useful, but it could be nice to see as things like Hash drop off as a major bottleneck |
| 01:20:30 | sbryant | the profiler will clear up the contention probably |
| 01:40:22 | evan | yeah, sched_yield really does not work. |
| 01:40:25 | evan | strange. |
| 01:40:52 | sbryant | but nanosleep of zero does? |
| 01:41:16 | evan | yep |
| 01:42:35 | sbryant | use what works! |
| 01:43:20 | evan | i should probably change that spec |
| 01:43:24 | evan | to not use a hot loop |
| 01:43:28 | evan | it should at least do a sleep or something in it |
| 01:43:41 | evan | oh, the test is that it's running though... |
| 01:43:47 | evan | not sleeping.. |
| 01:43:50 | evan | well, something |
| 01:43:53 | evan | 1 + 1 |
| 01:43:55 | evan | anything |
| 01:52:42 | boyscout | Add code to deal with linux's mutex contention - a81968a - Evan Phoenix |
| 01:52:51 | evan | ok, using the nanosleep hack for now. |
| 01:53:02 | evan | relevant code goes in vm/global_lock.cpp |
| 01:53:05 | evan | GlobalLock::yield |
| 01:53:10 | evan | if anyone wants to play with it. |
| 01:53:17 | sbryant | I'll take a look at it. |
| 01:53:22 | sbryant | I've actually seen this before. |
| 01:54:35 | sbryant | I'll be back later. |
| 02:00:09 | boyscout | CI: a81968a success. 2709 files, 10767 examples, 33781 expectations, 0 failures, 0 errors |
| 02:00:32 | evan | well, off to get abby an iphone 3gs |
| 02:01:47 | brixen | heh sweet |
| 02:01:54 | brixen | she has come over to the dark side |
| 02:12:56 | ddub | I want a 3gs :( |
| 02:13:08 | ddub | of course, I used to want a ][ GS as well |
| 02:13:42 | brixen | heh, me too! |
| 10:51:30 | nemerle_n | The latest fix ( nanosleep ) to threading fixes the 'hanging' issue, but adding -Xvm.gil.debug flag still speeds up the mspec-ci run :) |
| 10:51:59 | nemerle_n | http://pastie.org/518414 |
| 11:27:26 | dbussink | nemerle_n: they're probably all asleep :) |
| 11:28:16 | nemerle_n | i figured :P |
| 11:28:59 | nemerle_n | timezone distribution of top rubini-minds sucks :) |
| 11:31:34 | dbussink | nemerle_n: yeah, where are you from? |
| 11:31:53 | nemerle_n | poland |
| 11:32:25 | dbussink | nemerle_n: ah, i could have whois'd too i guess :) |
| 11:39:58 | radarek | nemerle_n: oo, I'm from Poland to:P |
| 11:40:30 | nemerle_n | radarek: cool |
| 19:21:15 | radarek | http://pastie.org/518678 |
| 19:21:34 | radarek | when building rubinius it's segfaulting |
| 19:21:46 | radarek | ubuntu 9.04, 64bit |
| 19:41:45 | radarek | here is more specific info: |
| 19:41:46 | radarek | http://pastie.org/518694 |
| 19:42:16 | radarek | (I'm trying build rbx with exported RBX_LLVM=1 |