Show enters and exits. Hide enters and exits.
| 18:59:38 | dbussink | silence before the release storm? :) |
| 18:59:56 | evan | :) |
| 19:00:06 | evan | just finished up the last of the debugger functionality it looks like. |
| 19:00:17 | evan | namely, the ability to debug with the JIT on. |
| 19:18:42 | dbussink | evan: cool :) |
| 19:18:57 | evan | going to do some debugging docs and cleanup later today |
| 19:19:21 | dbussink | ah nice, brixen is finishing the audit i guess, those are the last things? |
| 19:19:37 | evan | brixen is going to finish up pack/unpack |
| 19:19:59 | evan | then need to get the rails3 tests green. |
| 19:37:13 | boyscout | Support switching a method from JIT to debugging interpreter - 7e97a1c - Evan Phoenix |
| 19:37:13 | boyscout | Merge branch 'jit_deopt' - 903d16d - Evan Phoenix |
| 19:37:14 | boyscout | Cleanup some old breakpoint cruft - 8fdb4a4 - Evan Phoenix |
| 19:37:45 | dbussink | evan: hehe, i can even get headius and enebo bug hunting in jruby ;) |
| 19:49:17 | boyscout | CI: rubinius: 8fdb4a4 successful: 3522 files, 15318 examples, 43186 expectations, 0 failures, 0 errors |
| 19:49:54 | dbussink | evan: ah, it completely deopts for the debugger? |
| 19:57:25 | dbussink | brixen: irc issues? |
| 19:57:30 | brixen | dunno |
| 19:57:35 | brixen | just got back from lunch |
| 19:57:57 | dbussink | brixen: btw, just got a nice compiler blow up found due to a bug in jruby: ./bin/rbx -e "class Fixnum; def -@; 'hello'; end; end; puts eval('-1')" |
| 19:58:08 | dbussink | just wanted to show you what that does :) |
| 19:58:35 | brixen | what is "that"? |
| 19:59:24 | dbussink | brixen: the compiler blow up |
| 20:00:02 | brixen | you can't redefine @- |
| 20:00:07 | brixen | what sense does that make? |
| 20:01:26 | dbussink | it doesn't |
| 20:01:37 | brixen | so what is the point? |
| 20:01:49 | dbussink | but i was discussing some bug in #jruby and headius tested this to see how mri and rbx behave here |
| 20:01:58 | dbussink | and he was just surprised by the output |
| 20:02:02 | brixen | why? |
| 20:02:11 | brixen | what would you reasonably expect? |
| 20:02:24 | dbussink | i guess he expects a more descriptive error |
| 20:02:29 | dbussink | that's the main thing |
| 20:02:52 | brixen | suddenly a Fixnum is a string |
| 20:03:03 | brixen | I suppose I should test for that |
| 20:03:10 | brixen | that's fucking stupid |
| 20:03:14 | brixen | :) |
| 20:03:53 | dbussink | looks like mri just ignores it |
| 20:04:06 | brixen | big surprise |
| 20:04:12 | brixen | MRI is a C library |
| 20:04:26 | brixen | rbx is a (mostly) Ruby library |
| 20:04:53 | dbussink | yeah, exactly |
| 20:05:06 | brixen | besides |
| 20:05:08 | dbussink | but anyway, it's not that i would expect anything to happen with it |
| 20:05:11 | brixen | you should be testing this: |
| 20:05:12 | dbussink | just wanted to show you |
| 20:05:12 | brixen | ruby -e "class Fixnum; def -@; 'hello'; end; end; puts -eval('1')" |
| 20:09:09 | brixen | dbussink: just fyi, so you know what you are testing: http://gist.github.com/562857 |
| 20:09:54 | dbussink | brixen: it actually was about how :negate came into play here |
| 20:10:02 | dbussink | but anyway, doesn't really matter |
| 20:10:32 | brixen | my point is there is no requirement that negate call -@ |
| 20:23:47 | slava | brixen: its amazing that you can monkeypatch Fixnum#- and ruby doesn't crash immediately |
| 20:27:52 | brixen | slava: it's amazing that anyone would monkeypatch Fixnum#- |
| 20:28:00 | slava | do people do this in real code? |
| 20:28:00 | evan | slava: thats not subtraction |
| 20:28:03 | evan | thats negation |
| 20:28:07 | evan | like |
| 20:28:07 | brixen | slava: no |
| 20:28:10 | evan | y = -x |
| 20:28:15 | slava | hi evan :) |
| 20:28:22 | brixen | yeah, actually Fixnum#-@ |
| 20:28:41 | dbussink | brixen: btw, for bignum, mri does run the -@ |
| 20:28:48 | slava | in factor, redefining fixnum ops wouldn't get as far as printing the next REPL prompt |
| 20:28:55 | evan | dbussink: yeah, thats not surprising. |
| 20:31:24 | slava | it seems problematic in general if every method that a given method X calls is part of the public API of X's class |
| 20:31:39 | slava | and people override random things expecting things to work based on how MRI happens to be implemented |
| 20:31:55 | dbussink | slava: sadly that is part of how people write some ruby code |
| 20:32:07 | dbussink | slava: there's even ruby stdlib code that does that |
| 20:32:10 | slava | I bet if you subclass Array and override [], then most MRI array ops wont' go through your custom [] |
| 20:32:15 | slava | but in rbx they would |
| 20:32:23 | slava | and some monkey's ajax frontend crap will break |
| 20:32:30 | evan | it's getting better actually |
| 20:32:38 | evan | rubinius is a stick in the mud |
| 20:32:47 | evan | so people have a harder time being stupid in their libraries |
| 20:32:55 | dbussink | slava: brixen wrote a nice article on it to raise some awareness: http://www.engineyard.com/blog/2010/rubinius-wants-to-help-you-make-ruby-better/ |
| 20:33:01 | evan | because users report "this library doesn't work on rbx" and we push back a little |
| 20:34:06 | slava | that's a good blog post |
| 20:35:52 | evan | yeah, i'm pretty proud of brixen. That post has gotten a lot of traction. |
| 20:36:06 | slava | has there been much resistance to doing things right? |
| 20:36:16 | slava | people who say "it works with MRI so I won't bother?" |
| 20:36:54 | evan | most people recognize pretty quickly when we push |
| 20:37:01 | evan | if they're doing something dumb |
| 20:37:18 | evan | a common response is "yeah, a quick hack, worked fine in MRI, but pretty dumb." |
| 20:37:39 | evan | when we provide them with better solutions, they're usually amenable. |
| 20:39:51 | brixen | yeah, very rarely are people trying to do something wrong |
| 20:40:11 | brixen | it's easy to not notice something and MRI goes happily away calling C functions |
| 20:40:20 | brixen | that's the biggest case |
| 20:40:57 | evan | knowing about the weirdness is 80% of the battle |
| 20:41:01 | evan | which we're winning. |
| 20:43:55 | brixen | evan: so, trying to debug a segfault where the GC is trying to promote a 32M tuple |
| 20:44:08 | brixen | which of course is not really true |
| 20:44:15 | evan | hm. |
| 20:44:26 | brixen | but I'm not seeing where full_size_ could get written to |
| 20:44:32 | brixen | the header is not corrupted |
| 20:44:46 | evan | full_size_ is 32M? |
| 20:44:49 | brixen | yeah |
| 20:44:58 | evan | maybe it's not corrupted |
| 20:45:05 | evan | maybe it was set that at the beginning |
| 20:45:15 | brixen | well, it couldn't be in young space could it? |
| 20:45:25 | evan | mm, no, it could not. |
| 20:45:42 | evan | but maybe the code that sets full_size_ is screwed up |
| 20:45:42 | evan | ? |
| 20:45:54 | brixen | I have the assert in Tuple::create |
| 20:46:03 | brixen | which is the only place I see full_size_ being set |
| 20:46:09 | evan | is this crash happening every time? |
| 20:46:17 | brixen | yeah |
| 20:46:21 | evan | oh good. |
| 20:46:26 | evan | thats excellent for a GC bug. |
| 20:46:35 | evan | gist me your changes |
| 20:46:41 | brixen | sec.. |
| 20:49:39 | brixen | evan: http://gist.github.com/562918 |
| 20:52:12 | evan | hrm |
| 20:52:24 | evan | want me to replicate it here? |
| 20:52:36 | brixen | well, I suppose |
| 20:52:41 | brixen | I can keep looking at it |
| 20:52:43 | evan | i don't see anything off hand |
| 20:52:52 | brixen | just curious what you might do to debug |
| 20:53:02 | brixen | I can watch full_size_ being set |
| 20:53:18 | evan | thats going to be beyond tedious to do in gdb |
| 20:53:23 | evan | because there are so many Tuples |
| 20:53:34 | brixen | well, I was just going to puts it and grep :) |
| 20:53:38 | brixen | rather than a watchpoint |
| 20:54:12 | evan | why did you introduce int32_t? |
| 20:54:17 | evan | btw |
| 20:54:35 | brixen | I thought I changed that to native_int |
| 20:54:46 | brixen | since the uint32_t will always fit in that |
| 20:54:50 | evan | I see you changing it to int32_t |
| 20:54:53 | brixen | and the external code will use that |
| 20:55:24 | brixen | for num_fields() ? |
| 20:55:30 | evan | the argument to at() |
| 20:56:05 | brixen | ah, that's old |
| 20:56:12 | evan | old? |
| 20:56:23 | brixen | well, it's incorrecty |
| 20:56:26 | brixen | er -y |
| 20:56:37 | brixen | I was trying to work out the type for num_fields |
| 20:56:56 | brixen | my reasoning is, num_fields should be native_int because that's what code will use when eg getting a Fixnum from ruby |
| 20:57:07 | brixen | both at() used size_t before |
| 20:57:08 | evan | ok |
| 20:57:16 | evan | i'm going to replicate it here |
| 20:57:19 | brixen | I changed num_fields() and forgot that |
| 20:57:25 | evan | i don't know how to tell you how to debug it unless I do it myself :) |
| 20:57:29 | evan | i could screencast it again. |
| 20:57:41 | brixen | well, one more thing to look at |
| 20:57:44 | brixen | Tuple::reverse |
| 20:58:06 | brixen | i just realized we set limit, but never use it |
| 20:58:21 | brixen | but I think there was a boundary bug using full_size_ directly initially |
| 20:58:43 | brixen | because the value of o_start comes in as a Fixnum |
| 20:58:55 | brixen | and full_size_ will be bigger than num_fields() |
| 20:59:10 | brixen | so you could say start at num_fields() + 1 |
| 20:59:11 | evan | um |
| 20:59:13 | evan | wtf |
| 20:59:16 | evan | totally |
| 20:59:22 | evan | why the fuck is this code using full_size_ directly |
| 20:59:26 | brixen | not sure |
| 20:59:31 | brixen | you wrote it :) |
| 20:59:34 | evan | it's not related to the number of slots at all |
| 20:59:37 | evan | well, mega fail on me. |
| 20:59:39 | brixen | hah |
| 20:59:58 | evan | it works because limit isn't used |
| 21:00:00 | evan | i'm guessing. |
| 21:00:03 | brixen | yeah |
| 21:00:06 | brixen | just realized that |
| 21:00:13 | evan | but the bounds check is fully wrong. |
| 21:00:37 | brixen | well if(start > num_fields()) seems right |
| 21:00:38 | evan | it will happily write off the end of the Tuple |
| 21:01:11 | brixen | but yes, the bound needs to be fixed |
| 21:02:44 | evan | what workload cases the crash? |
| 21:02:51 | brixen | oh, running specs |
| 21:03:04 | brixen | I'm just doing: bin/mspec ci --gdb |
| 21:03:14 | evan | your gist looks cut off |
| 21:03:34 | evan | and patch complains about that too |
| 21:03:37 | brixen | I think it's right |
| 21:03:38 | evan | could you double check it? |
| 21:03:53 | brixen | looks the same as my cat'd file of the diff |
| 21:04:00 | evan | k |
| 21:04:09 | brixen | did it not apply? |
| 21:04:13 | evan | nope |
| 21:04:16 | evan | patch didn't |
| 21:04:19 | evan | i'll try git apply |
| 21:04:28 | evan | nope |
| 21:04:30 | evan | says it's corrupt |
| 21:04:34 | brixen | weird |
| 21:04:47 | evan | git apply says it ends early |
| 21:05:10 | brixen | oh, you didn't patch < it |
| 21:05:12 | brixen | ? |
| 21:05:16 | evan | i did |
| 21:05:21 | brixen | hum |
| 21:05:21 | evan | I got rejects in regexp.cpp |
| 21:05:29 | evan | and it complained about it ending unexpectedly |
| 21:05:40 | brixen | I'll give you a git fp |
| 21:05:57 | evan | k |
| 21:06:19 | brixen | actually, le'me fix ::reverse |
| 21:06:29 | evan | k |
| 21:27:56 | evan | which one do you guys prefer: |
| 21:27:59 | evan | Debugger.here |
| 21:28:00 | evan | or |
| 21:28:01 | evan | Debugger.start |
| 21:28:09 | evan | i'm taking votes for the next 5 minutes. |
| 21:28:59 | cremes | is there a contra command like Debugger.stop? |
| 21:29:06 | bougyman | Debugger.exterminate |
| 21:29:10 | evan | no. |
| 21:29:12 | evan | there is not. |
| 21:29:17 | evan | it's like Kernel.debugger |
| 21:29:23 | evan | ie, it starts the debugger at that point |
| 21:29:30 | evan | for you to begin debugging |
| 21:32:03 | cremes | 'start' sounds funny to me without a related 'stop'; 'here' is okay but unfamiliar... |
| 21:32:23 | evan | ok |
| 21:32:30 | evan | any other votes? |
| 21:32:41 | cremes | is it supposed to act like a breakpoint? |
| 21:32:58 | cremes | (i looked up Kernel.debugger but didn't find anything for MRI) |
| 21:33:05 | evan | it's in ruby-debug |
| 21:33:07 | evan | yeah |
| 21:33:11 | cremes | oh |
| 21:33:15 | evan | when you call it, it acts like a breakpoint |
| 21:33:22 | evan | it starts you into the CLI |
| 21:33:27 | brixen | Debugger.start is my pref |
| 21:33:30 | kronos_vano | votes -> Debugger.start |
| 21:33:32 | evan | inside the called method |
| 21:33:36 | evan | ok |
| 21:33:36 | cremes | then i like Debugger.break |
| 21:33:43 | cremes | :) |
| 21:33:45 | evan | break is a keyword |
| 21:33:50 | evan | i dislike using break as a method. |
| 21:34:00 | cremes | Debugger.breakpoint |
| 21:34:06 | cremes | or break_point |
| 21:34:19 | brixen | start seem simpler to me |
| 21:34:28 | brixen | you set a breakpoint *in* the debugger |
| 21:34:36 | kevinclark | Debugger.whats_the_problem_already? |
| 21:34:38 | cremes | 'start' makes it sound like a profiler |
| 21:34:44 | brixen | kevinclark! |
| 21:34:58 | evan | kevinclark: Debugger.pay_attention_to_me! |
| 21:35:10 | kevinclark | ! |
| 21:35:29 | kevinclark | how's it going brian? |
| 21:35:34 | cremes | i don't really have a dog in this hunt; my code doesn't have bugs so i don't need a debugger.... :P |
| 21:35:38 | brixen | evan: just pulled and merged, rebuilding, then I'll give you an fp |
| 21:35:44 | evan | k |
| 21:35:52 | brixen | kevinclark: good, you? |
| 21:36:32 | kevinclark | good good. post-vacation I get to wait and see how much of the team evaporates. (my boss gave notice this morning) |
| 21:36:54 | evan | kevinclark: you didn't give notice too? |
| 21:37:14 | brixen | kevinclark: you could be the boss then! :) |
| 21:38:12 | kevinclark | haven't yet. Decided that I should start moving towards teaching (which is what I've wanted to do for a while anyway), so I might let MS help pay for me to finish undergrad |
| 21:38:30 | kevinclark | need to figure out what system I'd want to to teach in, though, and what the reqs would be |
| 21:38:34 | kevinclark | maybe I still don't need it |
| 21:41:15 | brixen | evan: http://gist.github.com/563006 |
| 21:42:15 | brixen | kevinclark: you should see if they'll pay for your school |
| 21:42:42 | brixen | kevinclark: despite all the "you don't need a degree" talk you hear, I think it's good and if they pay, win! :) |
| 21:43:38 | evan | brixen: do you have changes in regexp? or something? |
| 21:43:45 | evan | this won't apply for me. |
| 21:44:03 | kevinclark | brixen: yeah, I think they give 6.5k a year towards it, which'd help a lot |
| 21:44:55 | brixen | evan: hrm |
| 21:45:32 | evan | just push this to a public branch |
| 21:45:32 | brixen | evan: no changes not in that fp |
| 21:45:38 | evan | i can't get it to apply |
| 21:45:46 | evan | i don't know wtf is going on. |
| 21:50:14 | evan | wtf git |
| 21:50:17 | evan | this should totally apply |
| 21:50:27 | evan | i have no clue whats going on. |
| 21:51:31 | evan | it's telling me this: |
| 21:51:34 | evan | Did you hand edit your patch? |
| 21:51:34 | evan | It does not apply to blobs recorded in its index. |
| 21:51:39 | boyscout | Normalizing all access of Tuple and ByteArray. - 3db5989 - Brian Ford (native_int) |
| 21:51:56 | brixen | I just git fp'd it |
| 21:51:57 | evan | how were you getting that into a gist? |
| 21:52:00 | brixen | and pasted it in gist |
| 21:52:03 | evan | how? |
| 21:52:15 | evan | do you have the gist cli tool? |
| 21:52:17 | brixen | copy from the terminal, paste into the gist textarea |
| 21:52:19 | brixen | nope |
| 21:52:21 | evan | UG |
| 21:52:23 | evan | thats probably it |
| 21:52:27 | evan | the cut and paste screwed it up somehow. |
| 21:52:33 | brixen | weird |
| 21:53:34 | brixen | well, I pushed you a branch |
| 21:53:39 | evan | yep |
| 21:53:41 | evan | compiling now. |
| 21:55:05 | evan | oh shit. |
| 21:55:06 | evan | i see |
| 21:55:15 | evan | i diff'd your gist with my making a fp from your commit |
| 21:55:19 | evan | there are hard tabs in there |
| 21:55:26 | evan | and your cut and paste turned them into spaces |
| 21:55:36 | brixen | hard tabs in regexp? |
| 21:55:39 | evan | yep. |
| 21:55:43 | brixen | ugh |
| 21:55:50 | evan | those Tuple::from lines |
| 21:55:54 | evan | that are way intended |
| 21:56:02 | evan | have hard tabs |
| 21:56:21 | brixen | do you have vim highlight tabs? |
| 21:56:24 | evan | you should get the gist cli tool |
| 21:56:49 | evan | i don't recall how to have vim show you |
| 21:57:16 | brixen | evan: btw, there's some spec failures I'm fixing |
| 21:57:28 | brixen | was just trying to get past that segfault to a complete run |
| 21:57:28 | evan | http://gist.github.com/563028 |
| 21:57:33 | evan | thats my diff of the diffs |
| 21:57:39 | evan | fyi. |
| 21:57:43 | evan | ok |
| 21:57:45 | brixen | k |
| 21:57:49 | brixen | I'll fix those tabs |
| 21:57:49 | evan | i'm not getting the segfault yet |
| 21:58:18 | brixen | it appears to happen for me reliably in the spec/core/method/parameters specs from a full run |
| 21:58:27 | brixen | no segfault on just that file |
| 21:59:03 | evan | for me, rbx started sucking down memory |
| 21:59:09 | evan | now at 750M |
| 21:59:20 | evan | i stopped it. |
| 22:00:09 | brixen | hm |
| 22:00:41 | evan | #7 0x00000001000840c6 in rubinius::ObjectMemory::promote_object (this=0x10131a1d0, obj=0x1022c5ef8) at vm/objectmemory.cpp:160 |
| 22:00:41 | evan | 160 Object* copy = immix_->allocate(sz); |
| 22:00:41 | evan | (gdb) p sz |
| 22:00:43 | evan | $1 = 36459912 |
| 22:00:50 | evan | 36M |
| 22:00:58 | brixen | is it a tuple? |
| 22:01:00 | evan | and it's a tuple. |
| 22:01:03 | brixen | k |
| 22:22:05 | evan | brixen: hm |
| 22:22:08 | evan | i'm.... stumped. |
| 22:22:17 | brixen | evan: you probably already know this, but an assert on full_size_ in Tuple::create for this big a size never fails |
| 22:22:31 | evan | yeah |
| 22:22:32 | brixen | if course, it could never get allocated |
| 22:22:33 | evan | i tried that too. |
| 22:22:35 | brixen | ok |
| 22:23:11 | brixen | allocated in young gen I mean |
| 22:23:46 | evan | yep |
| 22:23:50 | brixen | so it could be a random write, or a write behind? |
| 22:24:02 | brixen | ie, it doesn't seem like something wrote over the header |
| 22:24:07 | brixen | into full_size_ |
| 22:25:03 | evan | yeah, it doesn't seem that way |
| 22:25:16 | evan | the klass pointer is right |
| 22:25:21 | evan | and it's adjacent to the size |
| 22:26:30 | brixen | how do you watch for an obj address being alloc'd again? |
| 22:27:03 | evan | set the env var WATCH to it's address |
| 22:28:22 | brixen | hmm, I wonder if this is an odd coincidence, the full_size_ value is really close to the same value as the *address* of the obj being copied |
| 22:29:44 | evan | hm... |
| 22:29:50 | evan | i didn't notice that |
| 22:29:53 | evan | but you might be onto something |
| 22:29:57 | evan | i'm backing out your tuple changes |
| 22:30:00 | evan | and seeing if it goes away |
| 22:30:29 | brixen | well, interestingly enough, I've had random segfaults running specs that I could not repro |
| 22:30:34 | brixen | before any of the tuple stuff |
| 22:30:47 | brixen | about the same place |
| 22:30:58 | brixen | the bt would be ?? and the abort frame |
| 22:31:08 | evan | it was an abort? |
| 22:31:15 | evan | what was the condition? |
| 22:31:48 | brixen | what's interesting here, the gc is getting called from the rbx_ prolog_check |
| 22:31:50 | evan | brixen: oh, run the specs with the JIT off |
| 22:31:52 | evan | see what happens |
| 22:31:55 | brixen | k |
| 22:32:52 | brixen | well, by abort frame I mean the mangled cpp terminate frame |
| 22:33:08 | brixen | so, like 3 frames, ??, something, cpp terminate |
| 22:33:30 | evan | ok |
| 22:33:34 | evan | good news |
| 22:33:41 | evan | going back to before your tuple changes fixes the problem |
| 22:33:56 | evan | so now it's time to binary search the changes |
| 22:33:57 | brixen | heh, well, "fixes" |
| 22:34:20 | evan | yes |
| 22:34:22 | evan | fixes. |
| 22:34:23 | evan | :) |
| 22:34:49 | evan | oh maybe not |
| 22:34:54 | evan | got a crash. |
| 22:34:59 | evan | but different |
| 22:35:21 | evan | :/ |
| 22:35:28 | evan | it's a corrupt object |
| 22:37:14 | brixen | ok, got it with -Xint |
| 22:37:36 | evan | k |
| 22:37:38 | evan | good. |
| 22:37:40 | brixen | let's see if it's the same object addr repeatably with -Xint |
| 22:37:42 | evan | not a JIT bug then. |
| 22:38:27 | evan | oh oh |
| 22:38:33 | evan | i caused the crash with my debugging code |
| 22:38:33 | evan | yay. |
| 22:38:46 | brixen | yay |
| 22:38:52 | brixen | what's your debugging code? |
| 22:39:11 | evan | i did something silly |
| 22:39:15 | evan | used an undefined local |
| 22:39:18 | evan | which i guess sort of worked sometimes |
| 22:39:27 | evan | gotta redo my initial test now actually |
| 22:39:31 | brixen | ok |
| 22:41:28 | brixen | unfortunately, it's not easy to separate parts of that diff |
| 22:42:39 | brixen | I'm really curious about this value being so close to an obj address |
| 22:42:39 | evan | sure it is |
| 22:42:44 | evan | i did |
| 22:42:51 | brixen | which parts? |
| 22:42:53 | evan | git checkout HEAD^ -- vm/builtin/tuple.?pp |
| 22:42:57 | evan | to go back on them |
| 22:43:34 | brixen | well, yeah, tuple is better isolated than bytearray |
| 22:44:07 | evan | hm so |
| 22:44:29 | evan | it's almost definitely part of an object reference |
| 22:44:45 | evan | (gdb) p t->full_size_ |
| 22:44:45 | evan | $3 = 35747248 |
| 22:44:45 | evan | (gdb) p (void*)$3 |
| 22:44:47 | evan | $4 = (void *) 0x22175b0 |
| 22:45:12 | evan | t is 0x102217c48 |
| 22:45:13 | brixen | what is the diff between the size and other's address? |
| 22:45:19 | evan | which other? |
| 22:45:31 | evan | it's only part of an address |
| 22:45:31 | brixen | in copy_body |
| 22:45:35 | evan | huh? |
| 22:45:41 | evan | i'm not getting a crash in copy_body |
| 22:45:46 | evan | so i'm not sure what you're refering to |
| 22:45:52 | brixen | my segfault is in ObjectHeader::copy_body |
| 22:45:57 | evan | ok |
| 22:45:58 | evan | not here. |
| 22:46:03 | brixen | ah |
| 22:46:03 | evan | i put in an assert in baker.cpp |
| 22:46:08 | evan | to catch a giant tuple before it gets promoted |
| 22:46:17 | evan | otherwise we spin out chewing up memory |
| 22:46:30 | brixen | oh, well you're just a couple frames up then I'm guessing |
| 22:46:42 | brixen | where's the assert? |
| 22:46:51 | evan | if(Tuple* t = try_as<Tuple>(obj)) { |
| 22:46:52 | evan | assert(t->full_size_ < 20000000); |
| 22:46:52 | evan | } |
| 22:46:57 | evan | on line 67 of baker.cpp |
| 22:47:20 | brixen | :sp |
| 22:47:23 | brixen | er ... :) |
| 22:47:32 | evan | heh |
| 22:48:06 | brixen | ok, sure, and I'm below promote_object, so in the same place basically |
| 22:48:35 | brixen | what's the diff between the 32bits of obj's address and the full_size_ there? |
| 22:48:49 | brixen | I'm just wondering if it's a constant difference |
| 22:50:05 | evan | thats not a full address |
| 22:50:10 | evan | it's missing the 10 in the high bits |
| 22:51:01 | evan | if I add the high bits in, it's 1688 |
| 22:51:06 | evan | i doubt it's related. |
| 22:51:16 | evan | i think it's just another straw reference |
| 22:52:01 | evan | stray, rather. |
| 23:03:42 | brixen | yay, hit a watch point with a funny value :) |
| 23:04:09 | brixen | http://gist.github.com/563125 |
| 23:05:04 | evan | i have a hunch. |
| 23:05:05 | brixen | hmm, the right magnitude |
| 23:05:16 | evan | we might have a latent bug |
| 23:05:28 | evan | where we tried to set the -1'st field in the tuple |
| 23:05:37 | brixen | yeah |
| 23:05:39 | evan | and when they were unsigned, that was a field way out in the middle of nowhere |
| 23:05:43 | evan | but not that they ARE signed |
| 23:05:44 | evan | when it sets -1 |
| 23:05:49 | evan | thats where full_size_ is. |
| 23:06:16 | evan | i'm putting asserts around all field[] code to be sure the index is positive |
| 23:06:20 | evan | we'll see what that does. |
| 23:06:21 | brixen | I've seen Tuple::put come through this watch |
| 23:09:09 | evan | k |
| 23:09:35 | evan | we need to be guarding field[] more |
| 23:09:44 | evan | we should do that as part of this cleanup. |
| 23:10:11 | brixen | k |
| 23:21:51 | brixen | awesome, got an assert failure |
| 23:22:41 | evan | where at? |
| 23:23:12 | brixen | well, it was building, so I'm retrying in gdb |
| 23:23:55 | brixen | ah dammit, wrong condition |
| 23:23:58 | brixen | :/ |
| 23:24:04 | evan | :/ |
| 23:24:09 | evan | i got it to go away here |
| 23:24:10 | evan | i think i'm close |
| 23:27:12 | evan | BINGO. |
| 23:27:23 | evan | got it |
| 23:27:25 | evan | yep |
| 23:27:28 | evan | code is passing -1 to put |
| 23:27:50 | evan | :/ |
| 23:27:53 | evan | it's code you just added I think. |
| 23:28:22 | evan | did you just add |
| 23:28:30 | evan | it "raises an Rubinius::ObjectBoundsExceededError when index is less than zero" do |
| 23:28:37 | evan | in core/tuple/shared/put.rb |
| 23:28:38 | evan | ? |
| 23:29:36 | brixen | erm, no that's been there |
| 23:29:47 | evan | ok, then we had a latent bug |
| 23:29:52 | evan | because we were using a size_t there |
| 23:30:01 | evan | so it was setting some far off memory |
| 23:30:10 | evan | could be the source of some of those random crashes we've seen. |
| 23:30:19 | evan | i'm going to shore up tuple right now |
| 23:30:29 | evan | with object_bounds and asserts. |
| 23:30:54 | brixen | ok |
| 23:30:56 | evan | oh oh |
| 23:30:57 | evan | no |
| 23:30:59 | evan | it wasn't a latent bug |
| 23:31:15 | evan | the -1 would turn into a giant positive size_t |
| 23:31:25 | evan | which would fail the num_fields() <= idx comparison |
| 23:31:31 | evan | and raise the exception |
| 23:31:35 | evan | ok |
| 23:31:46 | evan | so all these places that we changed size_t to native_int |
| 23:31:55 | evan | we need to also add code to check for negative values |
| 23:32:13 | brixen | yeah |
| 23:33:32 | brixen | ok, now I have a real assert failure |
| 23:33:46 | brixen | Array::Iterator::at |
| 23:38:23 | evan | brixen: i'm making the code a little less forgiving |
| 23:38:28 | evan | and closer to what i talked about before |
| 23:38:36 | evan | ie, more C++ exception isolation |
| 23:39:51 | brixen | hmm ok |
| 23:41:52 | evan | you'll see |
| 23:41:54 | evan | it's not a big change. |
| 23:42:17 | brixen | that's cool |
| 23:42:28 | brixen | are you just going to push it to native_int br? |
| 23:42:35 | evan | yep |
| 23:42:37 | brixen | k |
| 23:42:38 | evan | you can merge it back then. |
| 23:42:48 | brixen | ok |
| 23:50:00 | brixen | hmm, this is a challenging issue |
| 23:50:09 | brixen | ideally, we would not have redundant bounds checks |
| 23:50:24 | brixen | but Array typically returns nil for out-of-bounds access |
| 23:50:47 | evan | yeah, i know |
| 23:50:48 | brixen | but if we rely on users of Tuple to do bounds checks, Tuple is vulnerable |
| 23:50:56 | evan | i'm refactoring out the part that is core |
| 23:51:03 | brixen | but if we do bounds checks before calling Tuple, we duplicate it |
| 23:51:06 | evan | and wrapping it multiple places with the different bounds checks |
| 23:51:35 | brixen | so, Array::Iterator blindly dispatched and expects Tuple to raise |
| 23:51:45 | brixen | but I don't think raising is the best idea |
| 23:51:59 | brixen | I think Array::Iterator should perhaps have Array behavior |
| 23:52:05 | brixen | ie, nil for OOB access |
| 23:52:50 | brixen | ideally, all of Array would be in Ruby and the JIT would handle removing duplicate checks :) |
| 23:53:25 | evan | yeah |
| 23:53:29 | evan | evenutally it will handle that |
| 23:53:42 | brixen | so what do you think for Array::Iterator? |
| 23:53:42 | evan | better to have more checks than not enough |
| 23:53:44 | evan | as we just saw. |
| 23:53:58 | brixen | should it return nil or depend on Tuple to raise? |
| 23:54:11 | evan | well, where are we using Array::Iterator? |
| 23:54:37 | brixen | a bunch of places in Array |
| 23:55:00 | evan | well, if it's there |
| 23:55:03 | evan | it needs to be checking. |
| 23:55:10 | evan | even if they are redundent |
| 23:55:45 | brixen | hmm, yeah, but I'm wondering whether to return nil or raise |
| 23:55:52 | evan | ah. |
| 23:56:10 | evan | mmm |
| 23:56:22 | evan | return nil seems more consistent |
| 23:56:25 | evan | since Array does that |
| 23:56:34 | evan | Tuple cases bounds errors |
| 23:56:35 | brixen | yeah |
| 23:56:40 | evan | tuple raises, rather. |
| 23:56:51 | brixen | I think I was thinking it was more implementation level, so it should fail harder |
| 23:57:24 | evan | up to you. |
| 23:57:31 | brixen | but I think the bounds errors apply more clearly to a very low level structure like Tuple |
| 23:57:53 | evan | performance wise, Array::Iterator is already a bit slower than accessing the underlying tuple directly |
| 23:58:07 | evan | so if we're worried about performance there, we should take a critical eye to Array::Iterator usage. |
| 23:58:19 | brixen | well, perhaps we should remove it then |
| 23:58:25 | brixen | since the jit is performing so well |
| 23:58:34 | evan | ok |
| 23:58:37 | evan | i'd be fine with that. |
| 23:58:42 | evan | i've removed it in a few places. |
| 23:58:42 | brixen | the jit should handle any while loops pretty well these days, right? |
| 23:58:52 | evan | yeah |
| 23:58:56 | brixen | k |
| 23:59:05 | brixen | this predates the jit being on by default |
| 23:59:05 | evan | we don't need another abstraction for tuple to sit behind |
| 23:59:07 | evan | Array does that. |
| 23:59:15 | brixen | I'd rather put more pressure on the jit |
| 23:59:32 | evan | well, do you think Array::Iterator is faster in the interpreter than not using it? |
| 23:59:35 | brixen | ie, if we get slow reports that involve looping, we are better situated to handle that now |
| 23:59:42 | brixen | yeah, it was faster |