| 00:02:13 | RyanTM leaves the room. | |
| 00:06:33 | tarcieri | josb: here now |
| 00:06:47 | josb | tarcieri: thanks for your email |
| 00:07:18 | tarcieri | yeah, np |
| 00:07:36 | josb | tarcieri: I was aware of Rev, it looks great. the issue is the difficulty of porting swiftiply to use it. |
| 00:07:39 | therealadam leaves the room. | |
| 00:08:30 | tarcieri | yeah, I've talked with kirk about it before |
| 00:08:33 | josb | tarcieri: I don't want to say this too loudly but EM, while it fixes a particular problem well enough, doesn't seem to be a good long-term solution. |
| 00:08:59 | tarcieri | as soon as I can multiplex Scheduler.send_on_readable/send_on_writable on the same channel I'll be able to port Rev to Rubinius |
| 00:09:40 | josb | That will be cool. |
| 00:09:44 | brixen | heh, someone help that man already :) |
| 00:10:09 | josb | Realistically it looks like I'll have to install nginx to solve my immediate problem. |
| 00:10:11 | brixen | we need to have *everything* cooler than 1.9 |
| 00:10:26 | tarcieri | yeah |
| 00:10:44 | tarcieri | except a working Reactor :) |
| 00:11:26 | josb | Because adding the bits Puppet needs looks hard. Heck, I can't even store the extracted subject_name in a member variable. Compiles fine, SEGVs hard. |
| 00:11:43 | tarcieri | josb: there's a pretty minimal EM compatibility API for Rev, although it's not nearly feature complete enough to run Swiftiply |
| 00:12:26 | josb | tarcieri: that's what I thought. The only way to make it work would be to port Swiftiply to Rev, which sounds like a major undertaking. |
| 00:13:01 | wyhaines | It'd be a fair amount of work. |
| 00:13:04 | josb | I wish I was smarter :( |
| 00:13:05 | tarcieri | josb: The main problem with EM right now is it's over 3 klocs of confusing Ruby and 7 klocs of even more confusing C++... and the maintainer is completely awol, afaict |
| 00:13:28 | wyhaines | em could use a lot of cleanup, for sure. |
| 00:13:29 | tarcieri | he used to be really responsive and quick at fixing bugs, now he hasn't responded to any e-mail for at least a month if not two |
| 00:13:51 | josb | tarcieri: agreed. It's non-trivial to make any kind of change to it. |
| 00:14:10 | wyhaines | tarcieri, yeah. I interacted with him about a month ago...he was in the midst of some sort of VC funding thing for his company. Haven't heard from him since. |
| 00:14:18 | tarcieri | aah |
| 00:14:34 | josb | tarcieri: Kirk said he thought Francis is trying to do a startup and get funding. |
| 00:14:49 | tarcieri | yeah, he just said that a few seconds ago too |
| 00:14:49 | tarcieri | heh |
| 00:14:50 | josb | wyhaines: Hi :) |
| 00:15:02 | josb | I don't type very fast... |
| 00:15:28 | wyhaines | Not a startup. His company has been around for a while, but yeah.... |
| 00:15:43 | tarcieri | josb: I think doing an EM compatibility layer is a better approach, as it would let you run EM applications on both Rev and "Revinius" |
| 00:15:52 | wyhaines | I do have commit privs on the repository, but I don't have a toolset installed that will build the windows gems for a release. |
| 00:16:00 | josb | wyhaines: okay, misremembered. |
| 00:17:09 | josb | tarcieri: true. How hard is that? |
| 00:17:52 | tarcieri | josb: well, the API's pretty huge... |
| 00:18:09 | tarcieri | but for the most part the wrappers are trivially thin |
| 00:18:10 | josb | Can we just do the part that Swiftiply needs, then? <g> |
| 00:18:37 | tarcieri | http://pastie.caboo.se/178879 |
| 00:18:52 | tarcieri | ^^^ there's a wrapper for the core EventMachine functionality |
| 00:19:06 | josb | I can try that and see what's missing...? |
| 00:19:24 | tarcieri | yeah |
| 00:19:55 | josb | OK, I'll install rev in my test setup and give it a spin, see how far I get |
| 00:20:01 | josb | Thanks! |
| 00:20:32 | tarcieri | cool |
| 00:20:54 | RyanTM enters the room. | |
| 00:21:39 | tarcieri | I really need to send a patch to ruby-core with nonblocking SSL support |
| 00:21:45 | d2dchat leaves the room. | |
| 00:21:57 | tarcieri | trying to add it through a 3rd party C extension is ludicrous |
| 00:22:00 | josb | Yeah. |
| 00:22:08 | tarcieri | seems to work, I guess :/ |
| 00:22:15 | KirinDave leaves the room. | |
| 00:22:22 | josb | At least you'll get more testers that way... |
| 00:22:38 | tarcieri | I have externs that reference global variables in the openssl extension :/ |
| 00:22:48 | tarcieri | ick |
| 00:22:58 | josb | Oh well :) |
| 00:23:10 | josb | One step at a time |
| 00:23:24 | tarcieri | yeah |
| 00:29:19 | evan | i hate that having global VALUE's is commonplace in MRI extensions |
| 00:32:39 | tarcieri | heh, well it has global VALUEs... but it also has a bunch of other random crap which gets referenced via extern across a bunch of modules |
| 00:32:49 | tarcieri | gg common coupling... ugh |
| 00:33:10 | evan | i was just complaining in general |
| 00:33:33 | tarcieri | heh |
| 00:33:35 | josb | can't wait until Rubinius can run Swiftiply |
| 00:33:51 | evan | me too |
| 00:33:55 | wyhaines | I think there will be some cool potential, there. |
| 00:34:01 | wyhaines | With the multi-VM support..... |
| 00:34:14 | tarcieri | I can't wait until Rubinius can run everything I have on Rev/Revactor/EventMachine |
| 00:35:04 | tarcieri | it will be nice to have massively concurrent I/O be a core feature of the VM, not something added as an afterthought through a hokey C extension |
| 00:36:06 | josb | You mean fastthread? |
| 00:36:15 | tarcieri | I mean things like EventMachine and Rev |
| 00:36:23 | josb | Oh. |
| 00:36:34 | josb | Gotcha. |
| 00:36:51 | tarcieri | All I/O in MRI is based around select() |
| 00:36:51 | boyscout | 1 commit by Vladimir Sizikov |
| 00:36:52 | boyscout | * New rubyspecs for BigDecimal#truncate.; 8ff9ae4 |
| 00:36:57 | tarcieri | I/O multiplexing, that is |
| 00:37:42 | josb | Yeah. |
| 00:39:24 | tarcieri | in MRI things like EventMachine and Rev have to busy wait to allow Ruby to run its green threads |
| 00:39:58 | josb | Because they are not part of the same big select() loop? |
| 00:40:02 | tarcieri | yep |
| 00:40:27 | josb | This is what Channels solve in Rubinius? |
| 00:40:28 | tarcieri | I'd like to try using libev's idle watchers to run Ruby threads |
| 00:40:31 | tarcieri | yep |
| 00:40:47 | josb | Neat. |
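
The select()-based multiplexing described above can be sketched in plain Ruby. This is only an illustration of the general shape of a single select() loop, not how MRI, EventMachine, or Rev implement it; `TinyReactor` and its methods are hypothetical names invented for this example.

```ruby
# Minimal select()-based reactor sketch. TinyReactor is a hypothetical
# class for illustration; it is not part of Rev, EventMachine, or Rubinius.
class TinyReactor
  def initialize
    @readers = {} # maps each watched IO to its callback
  end

  # Register a callback to run when +io+ becomes readable.
  def on_readable(io, &callback)
    @readers[io] = callback
  end

  # One loop iteration: wait in select() until a watched IO is readable
  # or the timeout expires, then dispatch callbacks.
  # Returns the number of IOs that were ready.
  def run_once(timeout = 0.1)
    return 0 if @readers.empty?

    ready, = IO.select(@readers.keys, nil, nil, timeout)
    return 0 unless ready

    ready.each { |io| @readers[io]&.call(io) }
    ready.size
  end
end

reader, writer = IO.pipe
received = []

reactor = TinyReactor.new
reactor.on_readable(reader) { |io| received << io.read_nonblock(1024) }

writer.write("ping")
handled = reactor.run_once
```

Anything that wants to take part in the multiplexing has to be registered with this one loop, which is the constraint josb and tarcieri are describing.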
| 00:41:26 | d2dchat enters the room. | |
| 00:41:29 | evan | MRI's scheduler is entrenched |
| 00:41:41 | evan | libev is the scheduler in rubinius |
| 00:41:56 | evan | so it's quite extensible |
| 00:42:05 | josb | So other things can hook into it? |
| 00:42:20 | josb | Through the Channel abstraction |
| 00:42:24 | evan | yeah |
| 00:42:34 | evan | i'm sure there will be a lot more API we wire up for the Scheduler |
| 00:42:36 | josb | Sounds like a good design. |
| 00:42:38 | evan | i'm open to it |
| 00:42:47 | evan | depending on what tarcieri and EM and such need |
| 00:46:00 | tarcieri | all I really need for now is controlling what gets sent to the channel when an event occurs |
| 00:46:25 | rubuildius_amd64 | Vladimir Sizikov: 8ff9ae455; 1989 files, 6478 examples, 22571 expectations, 0 failures, 0 errors; http://rafb.net/p/YC6ySd56.html |
| 00:50:39 | KirinDave enters the room. | |
| 00:53:05 | KirinDave leaves the room. | |
| 00:53:22 | rubuildius_ppc | Vladimir Sizikov: 8ff9ae455; 1989 files, 6481 examples, 22600 expectations, 0 failures, 0 errors; http://pastie.caboo.se/paste/178904 |
| 00:55:41 | jayWHY leaves the room. | |
| 00:58:37 | benstiglitz leaves the room. | |
| 00:58:54 | radarek leaves the room. | |
| 00:59:00 | twbray enters the room. | |
| 00:59:21 | antares leaves the room. | |
| 01:00:58 | mernen enters the room. | |
| 01:01:57 | RyanTM leaves the room. | |
| 01:05:07 | imajes leaves the room. | |
| 01:08:05 | cored enters the room. | |
| 01:08:10 | cored leaves the room. | |
| 01:08:55 | lopex leaves the room. | |
| 01:09:19 | cored enters the room. | |
| 01:11:14 | smparke1 leaves the room. | |
| 01:13:29 | headius enters the room. | |
| 01:13:30 | rue | EventMachine seems a lost cause at this point |
| 01:14:00 | rue | tarcieri: We should try to do a short planning meeting in the next week or two |
| 01:15:04 | tarcieri | I don't think anyone will ever step up to maintain EventMachine besides Francis... that has to be the lest idiomatic Ruby I've seen next to mkmf, not to mention the gargantuan C++ extension |
| 01:15:09 | tarcieri | and rue: regarding? |
| 01:15:19 | tarcieri | s/lest/least/ |
| 01:15:33 | JimRoepcke | speaking of Revactor, does anyone know what happened to the Revactor web site? It's been down for quite a while |
| 01:15:39 | tarcieri | it's back up |
| 01:15:42 | tarcieri | the server it was on crashed |
| 01:15:46 | wyhaines | I'd maintain it, but I'd end up rewriting a lot of it. |
| 01:15:58 | tarcieri | oh wait |
| 01:15:59 | tarcieri | ugh |
| 01:16:10 | rue | tarcieri: Massively concurrent I/O |
| 01:16:18 | tarcieri | rue: aah |
| 01:16:23 | rue | wyhaines: You too, I am co-opting you ;) |
| 01:16:37 | josb | wyhaines: please, port swiftiply to Rev instead :) |
| 01:16:57 | foysavas leaves the room. | |
| 01:17:04 | rue | tarcieri, wyhaines: I originally expected to start working on the cluster/server mode in early May |
| 01:17:18 | tarcieri | JimRoepcke: okay, *now* it's back up :) |
| 01:17:25 | wyhaines | josb, may not be an either/or. I think there is a lot of usable structure in EM. It just needs...love. |
| 01:17:48 | tarcieri | happy to just throw away EM's entire implementation and keep its API around |
| 01:17:53 | tarcieri | there's enough stuff built on top of it already |
| 01:18:12 | tarcieri | and it's not like EM will *ever* work with Rubinius |
| 01:18:31 | wyhaines | tarcieri: Yeah, that's kind of what I mean. |
| 01:18:35 | josb | wyhaines: that's where the EM compat API in Rev comes in... |
| 01:18:45 | smparke1 enters the room. | |
| 01:18:46 | tarcieri | yeah, if it were fleshed out a little better |
| 01:18:46 | tarcieri | heh |
| 01:19:15 | josb | Once that's in place we're good. I'll be getting to testing this with Swiftiply shortly... |
| 01:19:31 | wyhaines | I think, in the long run, EM would end up being completely rewritten with better Ruby, and, at least for Rubinius, a complete reactor replacement. |
| 01:19:50 | tarcieri | the nice thing is that 95% of Rev's existing codebase could be reused on top of Rubinius if I just rewrote Rev::IO to use Channels |
| 01:20:01 | tarcieri | but Scheduler doesn't quite have all the API features I need to do that yet |
| 01:20:22 | josb | Is somebody working on the Scheduler API? |
| 01:20:42 | chris2 leaves the room. | |
| 01:20:46 | tarcieri | has seriously considered trying to modify it, but it doesn't appear to be trivial by any stretch of the imagination :/ |
| 01:20:50 | evan | tarcieri: well, lets be sure to add whatever you need to Scheduler |
| 01:21:00 | JimRoepcke | Thanks tarcieri. I ported my Erlang code (a fairly simple producer/consumer example) over to Rubinius actors yesterday, but I won't have time to try out Revactor in time for my talk on Sunday. I'll check it out later though |
| 01:21:28 | tarcieri | JimRoepcke: yeah, Revactor and Rubinius Actors should hopefully be API compatible |
| 01:21:36 | tarcieri | except for the Actor::TCP stuff |
| 01:21:38 | tarcieri | evan: cool |
| 01:22:20 | evan | the Scheduler API should be cleaner inside the VM with the C++ version |
| 01:23:13 | JimRoepcke | tarcieri: cool, i should at least mention it. Do you have any idea which performs better? I had trouble getting more than a few hundred simple actors collaborating with Rubinius |
| 01:23:48 | JimRoepcke | with Erlang I had 900,000 actors working on my MacBook Pro :) |
| 01:24:05 | JimRoepcke | i would have tried more but i was getting worried about melting points of various components |
| 01:24:13 | evan | huzzah! newest valgrind on debian unstable has all the proper suppressions now |
| 01:24:18 | evan | a fully clean run |
| 01:24:48 | tarcieri | JimRoepcke: Revactor had about twice the messaging throughput of Rubinius Actors last I checked, but that's probably not a valid figure anymore |
| 01:25:01 | tarcieri | JimRoepcke: Erlang was, uhh, two orders of magnitude higher |
| 01:25:24 | rue | tarcieri: How high a level of abstraction do you think we can provide through the VM itself? |
| 01:25:50 | tarcieri | rue: abstraction for what? |
| 01:26:03 | rue | tarcieri: Concurrency support |
| 01:26:49 | tarcieri | rue: well I already asked evan about separate heaps for each Task and he said that was a no-go... that'd be the main thing that would help, as you could automatically load balance Tasks across CPUs... |
| 01:26:57 | tarcieri | ala the Erlang SMP scheduler |
| 01:27:11 | rue | evan: Reconsider ;) ^ |
| 01:27:45 | evan | isn't that just multi-VM? |
| 01:28:19 | tarcieri | evan: it'd be MVM... with load balancing at the concurrency primitive level |
| 01:28:35 | kentaur enters the room. | |
| 01:28:48 | tarcieri | evan: Erlang runs one LWP per CPU and load balances processes between the threads |
| 01:29:21 | evan | so, where are the separate heaps in there? |
| 01:29:28 | tarcieri | each process has its own heap |
| 01:29:33 | tarcieri | i.e. shared-nothing |
| 01:29:45 | evan | and one process per cpu? |
| 01:29:56 | tarcieri | An Erlang process is a lot closer to a Task |
| 01:29:59 | evan | LWP == native thread |
| 01:30:02 | tarcieri | yeah |
| 01:30:13 | kentaur leaves the room. | |
| 01:30:22 | evan | ok, so again, how is that not MVM? |
| 01:30:25 | evan | if they're share nothing |
| 01:30:40 | rue | Our model currently is VM per LWP |
| 01:30:40 | tarcieri | well, I suppose I should say, an Erlang process is an Actor... |
| 01:30:45 | tarcieri | yeah |
| 01:30:50 | kentaur enters the room. | |
| 01:30:53 | tarcieri | Erlang has a "node" abstraction... a node has many processes |
| 01:30:59 | tarcieri | One node can span multiple LWPs |
| 01:31:02 | tarcieri | that's the distinction |
| 01:31:35 | tarcieri | it's effectively one "VM" per multiple hardware threads |
| 01:31:38 | evan | so, shared operations on a single heap between LWPs is the hard thing |
| 01:31:54 | tarcieri | well, that's just it |
| 01:31:58 | rue | Erlang also migrates the processes between LWPs? |
| 01:31:59 | tarcieri | there isn't a single heap |
| 01:32:02 | evan | breaking the heaps apart, and not allowing them to share things is a lot easier |
| 01:32:03 | tarcieri | rue: yes |
| 01:32:08 | tarcieri | evan: exactly |
| 01:32:27 | evan | ok, so, in MVM, there are separate heaps, one per LWP |
| 01:32:31 | evan | so how is that different? |
| 01:32:40 | evan | i keep asking because to me, they sound the same. |
| 01:32:55 | evan | ok, perhaps the migration is the point you're trying to make |
| 01:33:03 | tarcieri | evan: It'd be like MVM, if you could serialize an Actor and message pass it to another VM |
| 01:33:04 | evan | but if they're share nothing, how is that possible? |
| 01:33:14 | tarcieri | evan: and you had something in the background doing that automatically |
| 01:33:21 | evan | each process is shared nothing? so it and all its data is moved? |
| 01:33:24 | tarcieri | i.e. this VM is too busy, so offload some Actors onto a different VM |
| 01:33:33 | tarcieri | evan: it'd be as if each Task were shared nothing |
| 01:33:43 | evan | ok, i think i gotcha |
| 01:34:17 | evan | until we can allow for cooperative access to a single heap by multiple LWPs, i'm not sure how we'll get there |
| 01:34:38 | tarcieri | yeah, I think it's fundamentally a much different architecture |
| 01:34:39 | evan | Erlang was designed with this restriction, ruby was probably designed in exactly the opposite. |
| 01:34:44 | tarcieri | yep |
| 01:34:57 | evan | it's certainly possible, java does it. |
| 01:35:00 | evan | we'll get there eventually |
| 01:35:39 | tarcieri | I mean, BEAM is designed to where there's no shared heap save for a few select cases |
| 01:35:50 | tarcieri | there's a few experimental shared heaps |
| 01:36:10 | tarcieri | but for the most part everything happens at the "process" level |
| 01:36:15 | tarcieri | including the garbage collection |
| 01:36:38 | evan | right |
| 01:36:41 | headius | hmmm |
| 01:36:42 | tarcieri | when one "process" sends the other one a message, it's copied from the first one's heap to the second's |
| 01:36:57 | evan | yep |
| 01:36:57 | tarcieri | so the first one can garbage collect it if it wants |
| 01:37:05 | tarcieri | the second one has its own copy |
| 01:37:36 | tarcieri | and when you have that kind of setup, you can copy entire processes between hardware threads, because they're not dependent on anything in the original LWP |
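
The copy-on-send behaviour tarcieri describes can be imitated in Ruby as a rough sketch. It assumes `Marshal` round-tripping as a stand-in for copying between heaps; `CopyingMailbox` is a hypothetical class, not an Erlang, Revactor, or Rubinius API, and Ruby threads still share one heap, so this only mimics the semantics.

```ruby
require "thread"

# Hypothetical mailbox that deep-copies every message on send, so sender
# and receiver never hold references to the same object.
class CopyingMailbox
  def initialize
    @queue = Queue.new
  end

  # Deep-copy via Marshal (an assumption standing in for a heap-to-heap copy).
  def send_message(message)
    @queue << Marshal.load(Marshal.dump(message))
    self
  end

  # Blocks until a message is available.
  def receive
    @queue.pop
  end
end

mailbox = CopyingMailbox.new
original = { job: "resize", sizes: [64, 128] }
mailbox.send_message(original)

# The sender may mutate (or discard) its object after sending;
# the receiver's copy is unaffected.
original[:sizes] << 256

consumer = Thread.new { mailbox.receive }
copy = consumer.value
```

Because the receiver holds nothing that points back into the sender's data, either side could in principle be collected or moved independently, which is the property that makes migration possible in the Erlang design discussed here.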
| 01:37:36 | headius | as far as I know most modern JVMs don't have lightweight processes...but of course actual threads end up being even harder to allow heap sharing across |
| 01:37:52 | tarcieri | headius: well, LWP == OS thread/hardware thread/whatever |
| 01:38:22 | evan | he's just using that as a generic term for a native thread |
| 01:38:24 | headius | I'd argue that actual concurrent threads are a whole different ballgame when it comes to safe access to a shared heap |
| 01:38:27 | tarcieri | I don't want to call an Erlang process a "coroutine" |
| 01:38:33 | tarcieri | because they are pre-emptable |
| 01:38:47 | tarcieri | but they don't have the normal headaches of threads because they're shared-nothing |
| 01:42:36 | rue | *catching up* OK, so issue for us is the inability to migrate Tasks to other VMs and thereby achieving load balancing? |
| 01:42:55 | boyscout | 1 commit by Vladimir Sizikov |
| 01:42:56 | tarcieri | I mean, it's not strictly necessary |
| 01:42:56 | boyscout | * A bit more rubyspecs for BigDecimal#sub and #to_s.; 71a4b0a |
| 01:43:14 | tarcieri | Erlang/OTP already wraps up the idea of distributing an application across multiple nodes |
| 01:43:38 | tarcieri | just need something similar for, say, MVM |
| 01:43:43 | evan | tarcieri: what if you could migrate a Task between 2 VMs? |
| 01:43:47 | tarcieri | and you can scatter/gather tasks across multiple VMs |
| 01:43:50 | dgtized | I kind of think it's going to be painful to get that style of copy without share nothing semantics |
| 01:43:52 | tarcieri | evan: that would be awesome |
| 01:44:02 | evan | thats not too outlandish |
| 01:44:04 | tarcieri | dgtized: I do too |
| 01:44:07 | headius | what if you had native threads |
| 01:44:16 | evan | and is an isolated chunk of work |
| 01:44:22 | dgtized | what I think would be awesome though |
| 01:44:32 | dgtized | is if we could serialize a VM instance |
| 01:44:38 | dgtized | because that is share nothing |
| 01:44:44 | tarcieri | dgtized: It doesn't work unless Actors talking in the same VM behave exactly like Actors in two different VMs |
| 01:44:51 | tarcieri | well, it works, but it's error prone |
| 01:44:54 | evan | and then perhaps reload it at a later time? perhaps with an object editor builtin? |
| 01:44:55 | rue | tarcieri: Addendum, are nodes split over several OS processes? |
| 01:44:57 | evan | :D |
| 01:45:08 | dgtized | evan: no, reload it on another machine |
| 01:45:17 | tarcieri | rue: With Erlang's SMP scheduler a node spans multiple OS threads |
| 01:45:20 | tarcieri | with a single process |
| 01:45:29 | evan | dgtized: the only trouble with that is how to handle references to external resources |
| 01:45:35 | evan | file descriptors, sockets, etc. |
| 01:45:39 | rue | dgtized: Basically an image? |
| 01:45:53 | tarcieri | evan: That was a huge bitch in Erlang according to Joe Armstrong |
| 01:46:01 | dgtized | evan: but those aren't actually in that VM instance right? they are at a higher level in the event machine? |
| 01:46:01 | evan | which was? |
| 01:46:06 | tarcieri | reentrant I/O across multiple hardware threads |
| 01:46:28 | evan | dgtized: but they're still exposed as such |
| 01:46:36 | dgtized | evan: I'm not saying you could do it seamlessly for all programs, but I bet there is a subset that could work |
| 01:46:48 | evan | certainly. |
| 01:46:53 | twbray leaves the room. | |
| 01:47:02 | rue | Yeah, an implementation with a caveat would probably help lots of folks |
| 01:47:05 | evan | the semantics of what happens to those things that reference external resources would have to be decided |
| 01:47:16 | evan | thats all i'm saying. |
| 01:47:21 | tarcieri | I think the much simpler abstraction is to be able to scatter what you're doing across VMs |
| 01:47:23 | evan | it's software people, nothing is impossible. |
| 01:47:26 | rue | Maybe try to re-establish and if it fails, give the user a hook to try and then fail |
| 01:47:29 | tarcieri | pmap |
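
A scatter/gather `pmap` in the sense mentioned above might look like the following sketch. It uses plain Ruby threads as a stand-in for separate VMs, which is an assumption made only to keep the example self-contained: threads share a heap and, on MRI, do not run Ruby code in parallel. The `pmap` helper and its `workers:` parameter are hypothetical.

```ruby
# Hypothetical pmap: scatter slices of the input across workers,
# then gather the results back in the original order.
def pmap(items, workers: 4, &block)
  slice_size = [(items.size.to_f / workers).ceil, 1].max

  # Scatter: one worker thread per slice.
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { slice.map(&block) }
  end

  # Gather: Thread#value joins the thread and returns its result.
  threads.flat_map(&:value)
end

squares = pmap([1, 2, 3, 4, 5], workers: 2) { |n| n * n }
# squares is [1, 4, 9, 16, 25]
```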
| 01:48:04 | tarcieri | Erlang's SMP scheduler is awesome, but something like that doesn't really seem feasible in Rubinius |
| 01:48:13 | dgtized | I'm thinking you would have to make a safe way to talk between VM's that could do with a freeze/reload |
| 01:48:19 | trythil leaves the room. | |
| 01:48:23 | tarcieri | the VM needs to be designed from the ground up with a shared-nothing concurrency primitive |
| 01:48:24 | JimRoepcke | why not tarcieri? |
| 01:48:45 | EugZol enters the room. | |
| 01:48:48 | dgtized | that way you leave all the io stuff in a particular VM instance, and do processing in the other ones |
| 01:48:49 | JimRoepcke | could shared-nothing be an option on a per VM basis? |
| 01:49:01 | EugZol leaves the room. | |
| 01:49:39 | dgtized | that way the only external resources for that VM would be the inter-vm communication |
| 01:50:09 | tarcieri | dgtized: VMActor lets you scatter/gather across VMs already |
| 01:50:11 | JimRoepcke | Erlang is shared-nothing from the programmer's perspective but internally it does share things. Having immutable data and single-assignment variables would make that easier though |
| 01:50:27 | tarcieri | JimRoepcke: there's some shared state, but it's moot |
| 01:50:33 | dgtized | tarcieri: but this would let you scatter across a VM, then pick up a VM and move it to another machine |
| 01:50:39 | tarcieri | JimRoepcke: Unless you're using the hybrid heap processes share nothing |
| 01:50:41 | dgtized | tarcieri: and finish the scatter |
| 01:50:42 | headius | seriously though, what's wrong with just making rubinius support native threads |
| 01:50:48 | headius | why wouldn't that solve this |
| 01:50:55 | tarcieri | headius: the current model is fine, imo |
| 01:51:05 | tarcieri | in fact, it's optimal |
| 01:51:06 | headius | well, except that it doesn't do what you want, yeah? |
| 01:51:20 | rubuildius_amd64 | Vladimir Sizikov: 71a4b0a51; 1989 files, 6478 examples, 22571 expectations, 0 failures, 0 errors; http://rafb.net/p/TUCKec44.html |
| 01:51:25 | dgtized | headius: I think native threads actually complicate the model, and I don't think they do solve these issues |
| 01:51:28 | tarcieri | it gets as close to what I want as I think Ruby can |
| 01:51:39 | headius | hmm, how do you figure? |
| 01:51:52 | headius | if you could spin up a few tasks and have them run in parallel in the same VM, why isn't that better? |
| 01:51:57 | dgtized | headius: because native threads are a share object space approach |
| 01:52:00 | tarcieri | ideally you want to distribute load across one (or two) OS threads per CPU |
| 01:52:10 | headius | dgtized: only if you actually choose to share the objects |
| 01:52:33 | headius | you take the same precautions you would with native threads on any other system and you can have exactly the same level of concurrency |
| 01:52:57 | rue | But that is no fun :) |
| 01:53:05 | dgtized | headius: right but if we set it up right then the restrictions would make it easier to program, while gaining the performance |
| 01:53:24 | tarcieri | you can use a thread pool, but I think the Erlang approach is much more elegant, and I don't think Rubinius is far off from it now |
| 01:53:35 | tarcieri | you just treat each VM as an Erlang "node" |
| 01:53:45 | tarcieri | and distribute your application across one node per CPU |
| 01:53:50 | JimRoepcke | headius: native threads are way too heavy for big actor systems |
| 01:53:53 | headius | I haven't heard why threads are a problem |
| 01:54:01 | tarcieri | native threads? |
| 01:54:11 | tarcieri | context switching penalty |
| 01:54:11 | headius | well rubinius mvms have their own threads anyway |
| 01:54:12 | JimRoepcke | headius: you can only do a few thousand on most OS before they go kaput |
| 01:54:19 | tarcieri | you can use a thread pool |
| 01:54:31 | tarcieri | but that will require all sorts of thread synchronization |
| 01:54:34 | headius | so you're either talking about something completely different or a pretty major rework of how rubinius VMs are pumped |
| 01:54:44 | tarcieri | Rubinius MVM is practically shared nothing as is |
| 01:54:58 | tarcieri | which virtually eliminates the overhead of all those thread synchronizing syscalls |
| 01:55:11 | tarcieri | the more concurrency you can handle in userspace, the better it's going to work |
| 01:55:29 | rubuildius_ppc | Vladimir Sizikov: 71a4b0a51; 1989 files, 6481 examples, 22600 expectations, 0 failures, 0 errors; http://pastie.caboo.se/paste/178923 |
| 01:55:43 | rue | Is there anything we can do with OS processes? |
| 01:55:58 | rue | Aside from COW semantics and such |
| 01:56:00 | tarcieri | it sounds like the VM's designed in such a way that HWPs aren't necessary |
| 01:56:44 | rue | Yeah, they should not be necessary--but I wonder if they could be exploited for additional fun |
| 01:56:48 | dgtized | rue: you are saying load up the application code, and then fork to the CPU count? |
| 01:57:05 | tarcieri | dgtized: that's what I'd like to do with MVM... one hardware thread per CPU |
| 01:57:27 | enebo | but each Rubinius MVM is a separate Native thread in practice no? |
| 01:57:33 | dgtized | tarcieri: right, I know that's actually what we want, but I think rue is suggesting a halfway easier point |
| 01:57:36 | tarcieri | speaking of which, I have some ncpu code I can commit, which is at least portable across Linux/*BSD/Darwin |
| 01:57:50 | tarcieri | like uhh, a call to retrieve the number of CPUs |
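
For comparison, modern Ruby exposes a CPU count through the standard library's `Etc.nprocessors`. The sketch below is not the ncpu code tarcieri refers to; `worker_count` is a hypothetical helper showing how a "one (or two) workers per CPU" policy could be derived from it.

```ruby
require "etc"

# Hypothetical helper: number of workers to start, given a per-CPU factor.
def worker_count(per_cpu: 1)
  Etc.nprocessors * per_cpu
end

one_per_cpu = worker_count
two_per_cpu = worker_count(per_cpu: 2)
```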
| 01:58:07 | tarcieri | enebo: yes |
| 01:58:16 | tarcieri | dgtized: and Rubinius MVM does that already |
| 01:58:37 | stepheneb enters the room. | |
| 01:58:47 | tarcieri | dgtized: each VM is shared nothing and runs in its own hardware thread... one process |
| 01:58:52 | dgtized | tarcieri: I don't see a problem with supporting both a forking and threading model, I mean we would cost legacy if we didn't |
| 01:59:20 | tarcieri | threading just means any data you want to send between VMs doesn't have to go through the kernel first |
| 01:59:27 | kentaur leaves the room. | |
| 01:59:55 | evan | to the effect that MVM is shared nothing, if we work in the ability to migrate a Task between them, that task could be anywhere on the planet in that case |
| 02:00:06 | evan | because the Task has to be effectively serialized to a stream in that case |
| 02:00:21 | evan | er. the destination |
| 02:00:33 | rue | We could buy a small island and fill it with commodity hardware |
| 02:00:34 | tarcieri | evan: honestly I don't see the use of migrating Tasks between VMs unless you can have Tasks that are shared nothing in the first place |
| 02:00:53 | dgtized | so we can migrate continuations? |
| 02:01:00 | evan | effectively |
| 02:01:02 | gnufied_ leaves the room. | |
| 02:01:26 | tarcieri | evan: the problem being that unless a Task is shared nothing it will behave differently before and after it's migrated, which is tremendously confusing |
| 02:01:55 | boyscout | 1 commit by Vladimir Sizikov |
| 02:01:57 | boyscout | * Some more test cases for BigDecimas#finite? and #nonzero?.; bfa69d9 |
| 02:02:37 | rue | You are all over that BigDecimal :P |
| 02:02:54 | evan | tarcieri: thats true |
| 02:02:59 | evan | oh btw |
| 02:03:02 | evan | more code in cpp branch |
| 02:03:09 | evan | i spent all day in valgrind |
| 02:03:21 | evan | 0 bytes in, 0 bytes out now |
| 02:03:27 | agardiner | woot! |
| 02:03:28 | tarcieri | sweet |
| 02:03:38 | evan | there is one last thing that valgrind reports, where it seems like onig perhaps is holding on to a few bytes |
| 02:03:45 | evan | but valgrind reports that it's still reachable |
| 02:03:50 | evan | i'll look at it later |
| 02:04:35 | djwhitt | is the vm actually runable at this point? I mean, how do you use valgrind on it? |
| 02:04:45 | evan | oh, also compiles cleanly on linux now. |
| 02:04:47 | evan | djwhitt: tests. |
| 02:04:50 | evan | 183 of them. |
| 02:05:04 | evan | MUCH better way of finding memory errors. |
| 02:05:11 | rue | Nicely done |
| 02:05:15 | evan | than just running it and hoping to find things |
| 02:05:22 | rue | I poked around the nfunc code a bit |
| 02:05:24 | djwhitt | cool |
| 02:07:02 | imajes enters the room. | |
| 02:08:28 | djwhitt | evan: did you leave that assert in there on purpose in the last commit? |
| 02:08:55 | evan | hm. |
| 02:09:00 | evan | probably not. |
| 02:10:27 | headius_ enters the room. | |
| 02:11:21 | rubuildius_amd64 | Vladimir Sizikov: bfa69d930; 1989 files, 6478 examples, 22571 expectations, 0 failures, 0 errors; http://rafb.net/p/K3aAs897.html |
| 02:11:37 | headius leaves the room. | |
| 02:12:21 | stepheneb leaves the room. | |
| 02:12:32 | cremes enters the room. | |
| 02:13:28 | rubuildius_ppc | Vladimir Sizikov: bfa69d930; 1989 files, 6481 examples, 22600 expectations, 0 failures, 0 errors; http://pastie.caboo.se/paste/178929 |
| 02:20:53 | nicksieger leaves the room. | |
| 02:23:38 | rue | tarcieri: I was also thinking HWPs in the context of transparent hardware clustering |
| 02:23:55 | VVSiz_ enters the room. | |
| 02:25:06 | jayWHY enters the room. | |
| 02:29:25 | tarcieri | rue: yeah, HWPs on different machines makes sense |
| 02:29:45 | trythil enters the room. | |
| 02:30:22 | VVSiz leaves the room. | |
| 02:32:05 | ezmobius leaves the room. | |
| 02:40:53 | enebo leaves the room. | |
| 02:42:04 | stepheneb enters the room. | |
| 02:43:58 | cored leaves the room. | |
| 02:44:05 | binary42 enters the room. | |
| 02:53:07 | stepheneb_ enters the room. | |
| 02:54:31 | stepheneb leaves the room. | |
| 02:57:55 | wyhaines leaves the room. | |
| 02:58:41 | ttmrichter enters the room. | |
| 03:05:15 | twbray enters the room. | |
| 03:21:23 | headius leaves the room. | |
| 03:22:07 | imajes leaves the room. | |
| 03:22:23 | dysinger leaves the room. | |
| 03:25:50 | mernen leaves the room. | |
| 03:29:54 | jayWHY leaves the room. | |
| 03:31:13 | RyanTM enters the room. | |
| 03:34:02 | twbray leaves the room. | |
| 03:34:52 | agardiner | evan: is there some reason why e.g. Selector::create defined name to be an OBJECT instead of a Symbol? |
| 03:38:37 | jtoy enters the room. | |
| 03:41:45 | loincloth leaves the room. | |
| 03:44:05 | stepheneb_ leaves the room. | |
| 03:47:26 | wyhaines leaves the room. | |
| 03:54:42 | stepheneb enters the room. | |
| 04:03:06 | srbaker enters the room. | |
| 04:06:31 | dysinger enters the room. | |
| 04:10:48 | rue | Libtool *strangle* |
| 04:13:12 | rue | Weak symbol handling is fugly |
| 04:14:30 | JimRoepcke enters the room. | |
| 04:14:30 | trythil leaves the room. | |
| 04:14:33 | trythil enters the room. | |
| 04:25:55 | binary42 leaves the room. | |
| 04:38:26 | srbaker leaves the room. | |
| 04:43:10 | rubuildius_ppc leaves the room. | |
| 04:45:39 | stepheneb_ enters the room. | |
| 04:56:20 | twbray enters the room. | |
| 04:56:58 | MenTaLguY enters the room. | |
| 04:57:01 | MenTaLguY | howdy |
| 04:57:12 | MenTaLguY | OOC, when did I first get commit rights in Rubinius? |
| 04:57:18 | MenTaLguY | I think it was early 2007? |
| 05:00:50 | Defiler | Your first commit is on May 24, 2007 |
| 05:02:13 | Defiler | sorry, wrong |
| 05:02:23 | Defiler | 5db681808a3b047 on Jan 16th |
| 05:02:34 | Defiler | (2007) |
| 05:02:54 | stepheneb leaves the room. | |
| 05:06:17 | MenTaLguY | ah, cool |
| 05:06:20 | MenTaLguY | that's close enough |
| 05:08:41 | MenTaLguY | thanks |
| 05:20:44 | wyhaines leaves the room. | |
| 05:24:55 | ssmoot enters the room. | |
| 05:25:16 | ssmoot | evan: ping? got a second to enlighten me on encodings? |
| 05:27:05 | ssmoot | pasted http://pastie.textmate.org/private/fcyiym3jdsee2g9l0kek3q |
| 05:27:27 | ssmoot | well, if anyone is curious what I'm talking about ^^^ ;) It's my crusade of the week. :) |
| 05:29:07 | drbrain | ssmoot: automatic encoding conversion is the devil |
| 05:29:13 | drbrain | since it is lossy by definition |
| 05:29:37 | drbrain | and, Unicode is not yet The Answer |
| 05:29:45 | drbrain | give it a decade or two, but it isn't yet |
| 05:30:08 | ssmoot | drbrain: yeah, but integration with services that only accept one encoding or another is a nightmare without it... I mean, still no Ruby library to properly integrate with MSSQL NVARCHAR fields AFAIK. |
| 05:30:41 | ssmoot | drbrain: UTF16 no, but UCS32 and UTF-8 everybody was happy with I thought? |
| 05:30:53 | drbrain | ssmoot: I trust matz in doing what's right wrt M17N |
| 05:31:15 | drbrain | ssmoot: http://en.wikipedia.org/wiki/Han_unification |
| 05:32:01 | drbrain | also, I believe Japan has two popular encodings for Japanese documents |
| 05:32:15 | drbrain | (one is 7-bit ASCII compatible, the other is not) |
| 05:33:01 | drbrain | so matz and crew probably have enough experience to do something that's going to work |
| 05:33:05 | ssmoot | drbrain: I know. I worked with EUC-JP and ShiftJIS on a Rails app with MSSQL Server. And because there's no easy conversion, it was a massive failure. That was 2 years ago and nothing's changed. M17N won't help... |
| 05:33:28 | Defiler | The only safe thing to do with ShiftJIS is not convert it at all |
| 05:33:44 | ssmoot | drbrain: btw, just figuring things out... ;) Didn't come in to stir up a storm. No offense. :) |
| 05:33:49 | Defiler | I don't think M17N is a bad idea, but the default should be UTF8 not US-ASCII |
| 05:34:35 | Defiler | UTF8 is totally fine, today, for everything but these few use cases, and those are easily fixed by having the default encoding be configurable |
| 05:34:53 | Defiler | Preferably along with this selector namespace junk, so you could just set it for the code you were writing |
| 05:35:06 | drbrain | ssmoot: none taken |
| 05:35:15 | ssmoot | So ok, maybe I have the wrong impression... but my impression is that M17N is going to contain overhead, and that when you have individual strings normalizing instead of normalizing multiple charsets at IO boundaries, pain is likely. |
| 05:35:28 | ssmoot | But I could be wrong... |
| 05:35:55 | Defiler | My understanding was that there would be no conversion if you were operating on like-encoded objects |
| 05:36:01 | drbrain | ssmoot: I have the impression that 1.9 will work kinda-sorta like 1.8 does |
| 05:36:10 | Defiler | e.g. String#<< being passed another shift-jis string won't try to mangle anything up |
| 05:36:26 | drbrain | where you say "I'm working in encoding X" and everything will stay in X unless you explicitly need to use Y |
| 05:37:39 | ssmoot | But regardless, internationalized databases are UTF16 or UTF8 exclusively as far as I'm aware, so is Han Unification even an issue? I'm ~90% certain that ShiftJIS is pure Katakana anyways, a 7bit character-set, and there is no loss... at least our Japanese clients never complained when ShiftJIS was normalized to UTF16-LE. |
| 05:38:21 | Defiler | What do you mean 'pure Katakana'? |
| 05:38:27 | Defiler | That phrase doesn't make sense to me |
| 05:39:02 | Defiler | Most Japanese webapps that need to interface with mobile phones (which is most of them) just round-trip the data in shift-jis, to my understanding |
| 05:39:03 | tarcieri | yeah, what? Shift-JIS supports katakana, hiragana, and kanji... |
| 05:39:06 | ssmoot | I could be insane. ;) I only picked up enough of the lingo to talk to the clients (in English) and write the code. :D |
| 05:40:03 | ssmoot | tarcieri: ah, ok, thanks for the correction then. :) I could be misremembering... or maybe the transaction files we were parsing were just Katakana. I'm not confident now... :o |
| 05:40:18 | drbrain | postgres, at least, supports all these: http://www.postgresql.org/docs/current/static/multibyte.html |
| 05:40:44 | Defiler | Man, a file that only contained katakana would be hard to read. |
| 05:41:24 | drbrain | and the table implies that it operates on data differently depending on character set |
| 05:41:37 | ssmoot | Defiler: well.. numbers too :) |
| 05:41:38 | Defiler | I am definitely worried about what M17N may shackle us with |
| 05:42:04 | drbrain | there is ongoing work on M17N stuff |
| 05:42:42 | ssmoot | drbrain: hmm... but how many of those are internationalized sets? (Just curious) It's obviously much more robust (in Postgres at least) than I suspected... |
| 05:42:42 | drbrain | as something between 1.9.0-0 and present broke some bits of RubyGems, as files were being read as UTF-8 instead of 8-bit ASCII |
| 05:43:10 | drbrain | so we haven't seen the full design for M17N in ruby 1.9 |
| 05:43:22 | drbrain | ssmoot: I'm not sure |
| 05:43:29 | Defiler | What's an 'internationalized set'? |
| 05:43:29 | ssmoot | k, thanks. |
| 05:44:06 | rue | Diplomats and so on |
| 05:44:34 | ssmoot | Defiler: my lack of lingo. ;) I'm trying to make a distinction between charsets intended to unify vs any multi-byte charset. Just for curiosity. No followup argument to make. ;) |
| 05:45:07 | Defiler | Most of those on the list, to my knowledge, are easily preserved losslessly in UTF-8 |
| 05:45:16 | ssmoot | EUC_JP for example is probably English from what I remember, but probably not a lot else. |
| 05:45:22 | drbrain | Unicode and UTF-8 are the combination of a technical problem with a political problem, so everybody whining about it now isn't helping much :( |
| 05:45:38 | drbrain | the two combined will take at least a decade to sort out :( |
| 05:45:59 | Defiler | You have to be careful with user data |
| 05:46:17 | Defiler | If I post a form with Shift-JIS encoding, it is best to store what I sent in Shift-JIS in the database |
| 05:46:48 | Defiler | This belief that it was safe to convert user data seems relatively new, and pernicious |
| 05:47:02 | Defiler | 'was ever', I mean |
| 05:47:22 | ssmoot | drbrain: Windows is UTF-16LE. I just don't see any pragmatic value in Ruby trying to make a leap ahead of everyone else then I suppose. It'll likely cause a decade of integration issues for what? It seems a very unpragmatic choice. |
| 05:47:34 | wyhaines enters the room. | |
| 05:47:55 | Defiler | 'is' glosses over too much, really |
| 05:48:09 | Defiler | Windows definitely still has a local non-Unicode codepage option that must be respected |
| 05:48:09 | drbrain | ssmoot: the decade of integration issues is going to occur across the entire software industry |
| 05:48:15 | drbrain | it's not just a ruby problem |
| 05:48:30 | Defiler | For example, if you set your locale to Japanese, you get yen characters instead of backslashes |
| 05:48:39 | ssmoot | Defiler: I was being vague I suppose I meant .NET Strings specifically. |
| 05:48:45 | drbrain | "yen sign problem" |
| 05:49:02 | Defiler | .NET doesn't have a lot to do with Ruby |
| 05:49:29 | Defiler | A bigger concern is what to do when you call, say, CreateFileEx |
| 05:49:36 | ssmoot | drbrain: but compared to .NET, it is "just a Ruby problem"... that's what I don't understand. A massive chunk of the market has laid down the law, and the market is OK with that. So why are we creating problems for ourselves? |
| 05:49:42 | Defiler | ..and you pass it a filename that the user read in in shift-jis |
| 05:50:45 | drbrain | ssmoot: remember, there's more than one market. Ruby has a very large Japanese user base, and they have a different idea of what is the best way to solve the problem |
| 05:51:01 | Defiler | .NET also lets you specify the encoding of strings |
| 05:51:23 | Defiler | hence the existence of Japanese keitai-compatible sites written in VB.NET |
| 05:51:23 | ssmoot | Defiler: not since I used it... ? Unless I'm just high. :) |
| 05:52:05 | Defiler | <globalization> config section in web.config lets you set it for a whole project |
| 05:52:21 | ssmoot | Defiler: .NET allows you to specify encodings on IO objects. Strings are assumed to be UTF-16LE, and you have to jump through hoops. |
| 05:52:24 | Defiler | http://www.codeproject.com/KB/aspnet/Encoding_in_ASPNET.aspx |
| 05:52:33 | RyanTM leaves the room. | |
| 05:52:38 | ssmoot | Defiler: that's just setting the default encoding of IO. Not strings. |
| 05:52:41 | ssmoot | I think ;) |
| 05:54:17 | Defiler | I guess I just haven't heard any .NET people complain about its behavior |
| 05:54:20 | madsimian enters the room. | |
| 05:54:25 | Defiler | Maybe ko1 knows some .NET people? |
| 05:54:31 | ssmoot | So yeah, I don't mean to be a pain... thanks for discussing this with me guys. :) |
| 05:54:42 | drbrain | you're not |
| 05:55:04 | Defiler | Yeah, it is an important concern |
| 05:55:25 | drbrain | I remain skeptical that matz & crew are going to do something impossibly heinous to duplicate |
| 05:55:31 | drbrain | skeptical of the belief that |
| 05:55:38 | ssmoot | But I can't help but feel Ruby is trying to tackle problems that don't need to exist? I mean, if .NET does it X way, and nobody's complaining, why are we going with the (presumably) more complex and slower M17N? |
| 05:55:56 | ssmoot | And I guess that's what my problem is. A presumption that M17N is slower and more complex. |
| 05:55:58 | drbrain | I think the best thing to do would be to jump into 1.9, play around, and give feedback |
| 05:56:04 | ssmoot | But I could be way off base. |
| 05:56:59 | drbrain | ssmoot: I think your question is valid, but I think there's also problems that the Japanese have that latin-based alphabet people don't have |
| 05:57:11 | ssmoot | drbrain: I would love to... this is the first time I've made it back to #rubinius in a year tho'... ;) stupid job :p |
| 05:57:43 | ssmoot | at least I think it's been about a year... time flies when you're stressed. :D |
| 05:57:51 | drbrain | with the rest of ruby as reference material, I find it difficult to believe that they're making a mountain out of a mole hill |
| 05:57:57 | joachimm_ leaves the room. | |
| 05:58:23 | Defiler | So is this stuff implemented in 1.9? |
| 05:58:35 | drbrain | the M17N stuff? yes |
| 05:58:36 | Defiler | I know some of it is |
| 05:58:43 | Defiler | Is it in a near-final state? |
| 05:58:58 | ssmoot | drbrain: you make a good argument... Hopefully it's fast. Then I guess as long as I can set encoding on IO boundaries for external libs that expect UTF16LE for example, I'll survive. ;) |
| 05:59:15 | drbrain | I believe it is 75%-85% complete |
| 05:59:37 | drbrain | for example, to open a file in "binary" mode, use "rb:ascii-8bit" |
| 05:59:41 | drbrain | as the mode |
| 06:00:09 | trythil_ enters the room. | |
| 06:00:15 | drbrain | if you want utf-8, "r:UTF-8" |
| 06:00:49 | trythil leaves the room. | |
| 06:01:41 | ssmoot | Defiler: btw, so if you were curious about .NET Strings, the Remarks paragraph(s) in this article explain it: http://msdn2.microsoft.com/en-us/library/system.string.aspx Which lines up with how I've always understood it. |
| 06:02:04 | drbrain | hehe "(irb):2: warning: Unsupported encoding UTF-16 ignored" |
| 06:02:40 | rue | Ewps |
| 06:02:43 | ssmoot | drbrain: UTF16-LE or BE. There is not "real" UTF16 AFAIK. |
| 06:02:58 | ssmoot | there is no even. |
| 06:03:29 | twbray leaves the room. | |
| 06:04:10 | ssmoot | drbrain: good ol' FEFF or FFFE :) http://unicode.org/faq/utf_bom.html#25 |
| 06:04:48 | drbrain | thanks |
| 06:05:43 | ssmoot | It took me a long time to understand encodings, and I'm still getting there I suppose. ;) |
| 06:06:13 | Defiler | ssmoot: Well, there is that whole Encoding class in .NET, right? |
| 06:06:32 | ssmoot | But yeah, the Han Unification stuff sounds like a big deal... but in practice, it's not. Because banks and whomever are more concerned about interop than not. |
| 06:07:07 | ssmoot | Defiler: Enum I believe. But maybe singletons. But they're just constants basically to translate bytes for IO objects. |
| 06:07:12 | agardiner | rue: what's the syntax to use your compile-time include/exclude macro thingy? |
| 06:07:26 | Defiler | The Japanese web is very different than the western web |
| 06:07:38 | Defiler | people care about precise text display a whole lot |
| 06:07:53 | Defiler | see the shift-jis art on 2chan, etc |
| 06:07:53 | ssmoot | Defiler: like you have to read a byte buffer from a StringReader if you don't want the UTF-16 conversion IIRC. |
| 06:08:06 | Defiler | .NET is horrible. Heh |
| 06:08:52 | drbrain | http://rafb.net/p/OCCP4X23.html |
| 06:09:02 | drbrain | haha, something got screwed up in the pasting |
| 06:09:07 | drbrain | it should be "πº" |
| 06:09:08 | ssmoot | Defiler: no argument from me. ;) I just use it as a point of reference... but still, I do think this is one of the things that MS got right... because, you know, tons of .NET Japanese websites out there still. :) |
| 06:09:10 | drbrain | (pi degrees) |
| 06:11:44 | twbray enters the room. | |
| 06:11:44 | Defiler | ssmoot: I'm not sure they got it right, necessarily. I just don't have the .NET experience I need to say for sure |
| 06:12:09 | Defiler | I've never actually had to implement a shift-jis user-driven site in .NET. :) |
| 06:12:36 | ssmoot | Defiler: well that's the point... you wouldn't. :) it'd be UTF-8. |
| 06:12:48 | rue | agardiner: Rubinius.compile_if($global) { ... } |
| 06:13:03 | agardiner | thanks! |
| 06:13:03 | ssmoot | I mean, I guess you could use shift-jis, but why? |
| 06:13:09 | rue | agardiner: The $global is evaluated at compile time, naturally :) |
| 06:13:19 | Defiler | ssmoot: No, I would. It's a hard requirement |
| 06:13:27 | agardiner | yeah, i just plan to use $DEBUG |
| 06:13:29 | Defiler | Nobody uses a web browser in Japan. Everything is cell phones |
| 06:13:31 | be9 enters the room. | |
| 06:13:52 | Defiler | ..and older ones don't support UTF-8 correctly, so everything is shift-jis or EUC, etc |
| 06:14:01 | ssmoot | Defiler: ah... well... :/ |
| 06:14:01 | rue | Slightly hyperbolic but sure :P |
| 06:14:19 | agardiner | hmm, actually perhaps that's not a good global to use for the debugger! |
| 06:14:27 | Defiler | OK, yeah.. they know what web browsers are and stuff.. :) |
| 06:14:38 | agardiner | since there's a good chance it would be set when using the debugger to debug something else! |
| 06:14:55 | ssmoot | heh... |
| 06:15:07 | rue | agardiner: Well possible--on the other hand, the file must still be compiled *while* under $DEBUG |
| 06:15:10 | ssmoot | so yeah, once upon a time: http://www.gpnet.ne.jp/home-e.html |
| 06:15:26 | agardiner | ah yeah -- good point! |
| 06:16:03 | ssmoot | Defiler: and yeah, it's shift-jis. |
| 06:16:04 | Defiler | ssmoot: click on 'Japanese' at the top. It is in shift_jis encoding |
| 06:16:18 | ssmoot | :) |
| 06:16:21 | Defiler | ..and does the horrible usual thing of putting most of the text on images |
| 06:16:23 | Defiler | =( |
| 06:17:33 | rue | Well, there is one good thing about M17N or whatever stupid acronym they use now |
| 06:17:34 | Defiler | むかつくね |
| 06:17:44 | rue | It makes my Apache woes seem surmountable |
| 06:18:33 | ssmoot | rue: what does it have to do with apache? Or do you mean you could abandon Apache or something? |
| 06:19:42 | ssmoot | Defiler: so what does Java do? strings are byte-arrays to be manipulated? |
| 06:20:20 | MenTaLguY | sort of |
| 06:20:22 | Defiler | twbray: Got a second to enlighten us on how non-unicode strings are handled in Java? |
| 06:20:29 | MenTaLguY | we keep a byte array and a Java String around |
| 06:20:36 | MenTaLguY | caching to some extent as needed |
| 06:20:47 | MenTaLguY | Java itself doesn't really do non-Unicode strings |
| 06:21:00 | rue | ssmoot: I just like to complain about Apache. I am compiling a huge statically linked library out of apache to be able to enable weak symbols |
| 06:21:15 | MenTaLguY | Java Strings are UTF-16, originally UCS-2 |
| 06:21:29 | ssmoot | rue: If I were a smarter man I'd know what you meant. :) |
| 06:21:40 | stepheneb_ leaves the room. | |
| 06:21:43 | ssmoot | a-ha. :) |
| 06:22:05 | Defiler | http://osdir.com/ml/text.unicode.general/2003-05/msg00742.html |
| 06:22:08 | Defiler | oh god those poor people |
| 06:22:10 | ssmoot | see? UTF16: 2, M17N: 0. ;) |
| 06:22:14 | ssmoot | heh |
| 06:23:09 | tarcieri | why UTF-16 over UTF-8? |
| 06:23:21 | MenTaLguY | Java strings were already sequences of 16-bit characters |
| 06:23:29 | MenTaLguY | as I said, it was originally UCS-2 |
| 06:23:47 | tarcieri | Java predates UTF-8's emergence as the popular standard |
| 06:23:51 | MenTaLguY | well, that too |
| 06:24:08 | MenTaLguY | but you couldn't go from UCS-2 to UTF-8 and be nice about backwards compatibility anyway |
| 06:24:18 | MenTaLguY | well, that and also UTF-8 hates asian text |
| 06:24:19 | tarcieri | yeah |
| 06:24:21 | MenTaLguY | size-wise |
| 06:24:26 | MenTaLguY | UTF-16 is half as worse |
| 06:24:32 | tarcieri | heh, I suppose |
| 06:24:33 | MenTaLguY | give or take |
| 06:25:20 | tarcieri | UTF-8 means less thunking between encodings in the general case |
| 06:25:53 | tarcieri | that is: talking to network services which encode strings in UTF-8, and reading/writing from files that are UTF-8 |
| 06:26:08 | MenTaLguY | only inasmuch as UTF-8 is common |
| 06:26:09 | ssmoot | tarcieri: well, I would prefer UTF8, I was just putting forth the idea that if the 2 biggest platforms out there (ok, maybe a little hyperbolic) are UTF16 native with automatic conversion at IO boundaries, then M17N seems like overkill and will probably cause as many headaches as it tries to solve. |
| 06:26:14 | MenTaLguY | (now) |
| 06:26:30 | tarcieri | ssmoot: biggest virtual machine platforms? |
| 06:27:13 | Defiler | Like it or not, Ruby was developed in Japan, and is always going to be sensitive to the needs of that locale |
| 06:27:39 | Defiler | Hopefully it will all work out |
| 06:28:14 | ssmoot | tarcieri: I guess so? I mean, I suppose in Japan they probably only have 50% of the jobs, but still, it's significant enough that to deviate without a really good justification just seems... not ideal? And if most of the market can settle on UTF16 then Han Unification seems a really sorry excuse no? |
| 06:29:03 | Defiler | The whole writing system over there is totally retarded, so some people get defensive |
| 06:29:26 | Defiler | There has never been a language (other than maybe Indonesian) more suitable for an alphabet-based writing system than Japanese |
| 06:29:38 | Defiler | but due to a trick of history, they got saddled with the worst possible fit |
| 06:30:02 | Defiler | Literally native speakers cannot read a business card and tell you who it belongs to |
| 06:30:48 | Defiler | Makes for some amusing gags, though, because you can say "It is written 'Bob', but pronounced 'Swift Sword of Justice'" |
| 06:31:08 | benburkert leaves the room. | |
| 06:31:26 | tarcieri | you're talking about hanko? |
| 06:31:55 | Defiler | No, nanori |
| 06:35:58 | dysinger leaves the room. | |
| 06:37:39 | stepheneb enters the room. | |
| 06:39:09 | twbray | Defiler: Java doesn't grok non-Unicode strings. The fact that Java uses UTF-16 is a history-related botch, it'd totally be UTF-8 if you were doing it today. |
| 06:40:25 | Defiler | twbray: OK, so presumably Java apps that need to serve content up in, say, Shift-JIS store their data in UTF-16 and convert at the last moment? |
| 06:40:40 | twbray | UTF-8 has 50% overhead vs UTF-16 on Asian texts, but my disks are filling up with video & audio & the text is to the right of the decimal point, so maybe no biggie |
| 06:41:03 | twbray | Defiler: Exactly. Which is probably the right approach. Except for it should be UTF-8. |
| 06:42:17 | Defiler | My inclination is to go with UTF-8 as the default encoding, introduce a CodePoint immediate type, and handle things in a compatible way only when the encodings absolutely require it |
| 06:42:19 | MenTaLguY | the only exceptions are where you're dealing with encodings which can't roundtrip through Unicode, and if you do a lot of bytestring manipulation in the native code for some externally-driven reason |
| 06:42:31 | Defiler | I suspect we can be M17N compatible without making exactly the same decisions as 1.9 |
| 06:43:06 | boyscout | 9 commits by Adam Gardiner |
| 06:43:07 | boyscout | * Remove TaskBreakpoint class, breakpoint debugging statements; 8e9bb15 |
| 06:43:08 | boyscout | * Get breakpoint handling working properly; 37d3127 |
| 06:43:09 | boyscout | * Make ISeq#decode return symbols rather than objects by default; 498b95a |
| 06:43:09 | boyscout | * Simplify and correct breakpoint handling; e89c7b0 |
| 06:43:10 | boyscout | * Add hit count to breakpoints, list breakpoint display; a585c83 |
| 06:43:12 | boyscout | ... |
| 06:45:29 | ssmoot | booo M17N! :p |
| 06:45:33 | joseph enters the room. | |
| 06:45:44 | ssmoot | decided to make a meaningful contribution :) |
| 06:46:16 | benburkert enters the room. | |
| 06:47:25 | benburkert leaves the room. | |
| 06:47:58 | boyscout | 1 commit by Adam Gardiner |
| 06:47:59 | boyscout | * Fix breakpoint hit incrementing; c4663aa |
| 06:49:11 | ssmoot | but seriously... IronRuby is probably going to assume Ruby UTF-8 to .NET internal UTF-16 Strings right? So they'll be ahead of the game. They'll just make integrating easy. And I don't want to use IronRuby. :p |
| 06:50:33 | MenTaLguY | I'm not sure I follow. |
| 06:51:45 | dysinger enters the room. | |
| 06:52:07 | ssmoot | MenTaLguY: well, I mean, the way RubyClr worked, it just assumed the Ruby strings were ASCII IIRC. But that's probably going to change to UTF-8 I'm sure. So you'll be able to use IronRuby to use a Merb app with MSSQL. |
| 06:52:59 | ssmoot | MenTaLguY: but if M17N means encoding aware strings, and not IO (I'm fuzzy on this), then integrating say Rubinius with MSSQL will still be (almost) as big a PITA tomorrow as it would be today. |
| 06:53:27 | drbrain | damn, Alien Director's Cut is playing for the midnight movie |
| 06:53:28 | rubuildius_amd64 leaves the room. | |
| 06:53:30 | ssmoot | MenTaLguY: maybe JRuby already has this figured out tho'... |
| 06:53:45 | benburkert enters the room. | |
| 06:54:08 | MenTaLguY | I'm a little bit fuzzy on JRuby's M17N handling, except that we do use the dual representation I mentioned earlier |
| 06:55:58 | ssmoot | MenTaLguY: anyways, point is, without automatic conversion and encoding aware IO, it practically guarantees YARV or Rubinius won't see any real action in businesses that have to deal with multiple encodings right? I mean, at least if I had the choice, I'd rather say "this file is shift-jis, handle everything else transparently". |
| 06:56:09 | Defiler | drbrain: Have you seen it? It's really good |
| 06:56:10 | ssmoot | I'm being hyperbolic again I know. :) |
| 06:56:27 | drbrain | Defiler: I saw it when it first was released |
| 06:56:33 | Defiler | word. |
| 06:56:39 | ssmoot | (I think hyperbolic is my Word Of The Day btw...) |
| 06:57:21 | MenTaLguY | Isn't that what 1.9's stuff does? |
| 06:57:29 | MenTaLguY | IIRC you can specify the encoding when opening a file |
| 06:57:42 | ssmoot | MenTaLguY: encoding at boundaries? That's what I'm fuzzy on... |
| 06:58:17 | ssmoot | I mean, if it's not going to normalize to a uniform encoding, what's the point other than it'll stamp the Strings returned with an enum? |
| 06:58:25 | drbrain | open some_file, 'r:encoding_of_file' do |io| ... end |
| 06:58:41 | tarcieri | ssmoot: you can always write an itsy bitsy subclass to ensure all I/O thunks to unicode |
| 06:58:51 | drbrain | ssmoot: automatic transformation is not guaranteed to work |
| 06:58:56 | drbrain | that's the point of it |
| 06:59:11 | ssmoot | drbrain: can you specify a default encoding for IO in 1.9 and have it take Strings with other encodings and translate them automatically just by passing them to that IO? |
| 06:59:30 | tarcieri | ssmoot: write an itsy bitsy subclass |
| 06:59:32 | drbrain | ssmoot: I'd have to experiment... |
| 06:59:41 | tarcieri | ssmoot: or hell, a module |
| 07:00:06 | ssmoot | tarcieri: to overwrite IO#<< or something you mean? |
| 07:00:40 | tarcieri | or just define something like #uread and #uwrite that implicitly convert to/from unicode |
| 07:00:53 | obvio171 enters the room. | |
| 07:01:01 | MenTaLguY | my understanding had been that it keeps things in the original encoding until/unless conversion is required |
| 07:01:58 | ssmoot | tarcieri: makes sense... but... you'd have to do the same for FasterCSV then... and Net::HTTP... and mini-exiftools... and... |
| 07:02:12 | agardiner leaves the room. | |
| 07:02:24 | ssmoot | MenTaLguY: but then it's automagic? I have nothing to complain about then possibly? |
| 07:02:41 | drbrain | irb(main):012:0> open "ascii", 'w:us-ascii' do |io| io.write utf_16le end RuntimeError: conversion undefined for byte sequence (maybe invalid byte sequence) |
| 07:02:44 | drbrain | oops |
| 07:02:48 | twbray leaves the room. | |
| 07:02:55 | drbrain | RuntimeError: conversion undefined for byte sequence (maybe invalid byte sequence) |
| 07:03:41 | ssmoot | drbrain: well of course. :) Pretty sure .NET throws a similar error... |
| 07:03:43 | drbrain | but... |
| 07:04:17 | drbrain | irb(main):017:0> open "ascii", 'w:us-ascii' do |io| io.write "foo".encode('utf-16le') end |
| 07:04:21 | drbrain | => 3 |
| 07:04:28 | drbrain | so it appears it auto-converts on autput |
| 07:04:31 | drbrain | output |
| 07:05:17 | joseph leaves the room. | |
| 07:05:24 | ssmoot | Then I suppose I have nothing to complain about. ;) |
| 07:06:18 | drbrain | irb(main):018:0> open "utf-8", 'w:utf-8' do |io| io.write utf_16le end => 7 |
| 07:06:23 | rubuildius_amd64 enters the room. | |
| 07:06:46 | obvio leaves the room. | |
| 07:07:04 | benburkert leaves the room. | |
| 07:07:10 | drbrain | yeah, and writing to a 'w:utf-16le' IO gives 6 bytes written |
| 07:07:26 | ssmoot | er... |
| 07:07:47 | ssmoot | drbrain: it's writing out the BOM then? |
| 07:08:13 | benburkert enters the room. | |
| 07:08:19 | ssmoot | drbrain: because UTF-8 should be 3... except I think its BOM is 3 bytes long. Or 4. I forget. |
| 07:08:41 | ssmoot | drbrain: or did you use a different string? |
| 07:08:53 | drbrain | ssmoot: my string was "πº" (pi degrees) |
| 07:09:20 | drbrain | $ od -Ax utf-16le |
| 07:09:30 | drbrain | 0000000 177377 001700 000272 |
| 07:09:35 | drbrain | $ od -A x utf-8 |
| 07:09:45 | drbrain | 0000000 135757 147677 141200 000272 |
| 07:10:06 | ssmoot | drbrain: ah, gotchas. so no BOM. :/ |
| 07:10:33 | drbrain | I guess not |
| 07:10:39 | scoopr | utf8 can have bom |
| 07:10:42 | scoopr | fefffe I think |
| 07:10:50 | scoopr | or was it fffeff |
| 07:11:10 | drbrain | here's proper output of hex string content for utf-16le: 03c0feff000000ba |
| 07:11:19 | scoopr | EFBBBF it seems =) |
| 07:11:30 | ssmoot | that's a (minor) problem... at least if you're not opening it as binary you'd think the BOM should be default. |
| 07:12:07 | scoopr | http://en.wikipedia.org/wiki/Byte_Order_Mark#Representations_of_byte_order_marks_by_encoding |
| 07:12:15 | ssmoot | but whatevs. I'm happy. thanks drbrain :) |
| 07:12:22 | tarcieri | or you could use UTF-8 and not have to deal with that |
| 07:13:40 | ssmoot | tarcieri: true... BOM is just nice because it makes it really easy to tell what you're working with, and File libs can open with the correct encoding automatically then. |
| 07:14:09 | ssmoot | Which is so much better than trying to guess at random what encoding something is. :D |
| 07:15:27 | ezmobius enters the room. | |
| 07:16:24 | rubuildius_amd64 | Adam Gardiner: c4663aa29; 1990 files, 6478 examples, 22572 expectations, 0 failures, 0 errors; http://rafb.net/p/F6esNY77.html |
| 07:19:05 | yaroslav enters the room. | |
| 07:22:09 | Defiler | UTF-8 often is used without a BOM because most UTF-8 is just ASCII |
| 07:22:19 | Defiler | ..and the BOM breaks compatibility with things that don't speak UTF |
| 07:22:30 | MenTaLguY | UTF-8 doesn't need a BOM |
| 07:22:51 | MenTaLguY | it is byte-oriented rather than word-oriented |
| 07:24:17 | ssmoot | your words make sense. :) I suppose you could just open any plain-text file as UTF-8, and if it was just ASCII, no harm done. |
| 07:30:42 | kade enters the room. | |
| 07:31:01 | kade | ivars == instance variables? |
| 07:31:35 | yaroslav leaves the room. | |
| 07:32:01 | MenTaLguY | ssmoot: well, not all valid iso-9660 sequences are valid UTF-8 sequences |
| 07:32:14 | yaroslav enters the room. | |
| 07:32:19 | _mk_ enters the room. | |
| 07:32:38 | _mk_ leaves the room. | |
| 07:32:47 | MenTaLguY | er, is it 9660? or is that CD-ROM filesystems? |
| 07:33:05 | tarcieri | CD-ROM filesystem |
| 07:34:07 | ssmoot | MenTaLguY: I don't ever remember if ASCII is 7bit or 8bit to be honest. I _think_ 7. So I think UTF-8 is a full superset, but I could be wrong. |
| 07:34:21 | MenTaLguY | US-ASCII is 7 bits |
| 07:35:42 | drbrain | http://en.wikipedia.org/wiki/Joliet_%28file_system%29 |
| 07:35:51 | drbrain | 64 UCS-2 characters |
| 07:37:12 | kw leaves the room. | |
| 07:37:36 | ssmoot | you guys are better than google. :) |
| 07:37:51 | drbrain | it looks like ISO-9660 is /\A[A-Z0-9_]{,8}.[A-Z0-9_]{,3}\Z/ |
| 07:38:02 | drbrain | oh, that's level 1 |
| 07:38:14 | ssmoot | dreams of writing an ad-supported technical mechanical turk that posts to #rubinius for great riches... |
| 07:38:23 | ssmoot | :) |
| 07:38:37 | drbrain | level 2 can have file names up to ~180 characters |
| 07:39:18 | drbrain | http://en.wikipedia.org/wiki/ISO-9660#File_and_directory_name_restrictions |
| 07:39:27 | drbrain | but, still no lowercase characters |
| 07:41:55 | Defiler | Yeah, Joliet is the lowercase stuff |
| 07:43:27 | drbrain | "According to legend, the El Torito CD/DVD extension to ISO 9660 gained its name because its design originated in an El Torito restaurant in Irvine, California." |
| 07:44:05 | Maledictus enters the room. | |
| 07:44:47 | Defiler | I think that is a totally legit naming convention |
| 07:45:01 | drbrain | me too |
| 07:45:46 | MenTaLguY | maybe Rubinius should adopt it |
| 07:46:28 | drbrain | "The [Rock Ridge] standard takes its name from the fictional town in Mel Brooks' film Blazing Saddles." |
| 07:47:13 | tarcieri | Rumeo and Joliet? |
| 07:48:05 | drbrain | wikipedia doesn't say where the name came from |
| 07:51:59 | cypher23 enters the room. | |
| 07:58:07 | mentz enters the room. | |
| 08:00:47 | MenTaLguY leaves the room. | |
| 08:02:04 | zimbatm_ enters the room. | |
| 08:02:57 | mentz_ enters the room. | |
| 08:03:42 | mentz_ leaves the room. | |
| 08:03:46 | mentz leaves the room. | |
| 08:04:13 | mentz enters the room. | |
| 08:05:15 | mentz leaves the room. | |
| 08:17:39 | octopod enters the room. | |
| 08:21:21 | thehcdreamer enters the room. | |
| 08:42:42 | stepheneb leaves the room. | |
| 08:44:37 | trythil_ leaves the room. | |
| 08:44:42 | trythil enters the room. | |
| 08:47:39 | TheVoice leaves the room. | |
| 08:48:06 | Skip enters the room. | |
| 08:52:43 | benburkert leaves the room. | |
| 08:53:13 | benburkert enters the room. | |
| 08:56:04 | qwert666 enters the room. | |
| 08:58:52 | ariekeren enters the room. | |
| 08:59:09 | w1rele55 enters the room. | |
| 08:59:52 | mutle enters the room. | |
| 09:00:23 | trythil leaves the room. | |
| 09:02:13 | Fullmoon enters the room. | |
| 09:10:27 | kade leaves the room. | |
| 09:13:19 | wycats_ leaves the room. | |
| 09:15:47 | _mk_ enters the room. | |
| 09:15:54 | Fullmoon leaves the room. | |
| 09:16:09 | ezmobius leaves the room. | |
| 09:25:48 | Arjen_ enters the room. | |
| 09:31:27 | wycats enters the room. | |
| 09:33:02 | cypher23 leaves the room. | |
| 09:40:39 | qwert666 leaves the room. | |
| 09:55:28 | benburkert leaves the room. | |
| 10:05:44 | ariekeren leaves the room. | |
| 10:09:54 | retnuH enters the room. | |
| 10:22:01 | chris2 enters the room. | |
| 10:23:02 | mentz enters the room. | |
| 10:33:52 | rue | Good morning, Europe |
| 10:36:13 | scoopr | good afternoon =) |
| 10:36:13 | Arjen | Morning. |
| 10:38:36 | mentz leaves the room. | |
| 10:40:27 | mentz enters the room. | |
| 10:46:12 | yugui enters the room. | |
| 10:52:32 | ko1 | eban? |
| 10:52:34 | ko1 | evan? |
| 10:54:40 | rue | ko1_: He will probably not be up for a couple hours |
| 10:54:51 | ko1 | rue: thanks! |
| 10:56:35 | rue | ko1_: I can try to help if you have questions about the codebase although you may want to e-mail him or check back in 5-6 hours or so |
| 10:57:47 | rue | drbrain: Minor, the RDoc TemplatePage diagram is *really* big :P |
| 10:58:13 | _mk_ leaves the room. | |
| 10:58:13 | w1rele55 leaves the room. | |
| 10:58:13 | wyhaines leaves the room. | |
| 10:58:13 | djwhitt leaves the room. | |
| 10:58:13 | goodney leaves the room. | |
| 10:58:13 | _goodney_ leaves the room. | |
| 10:58:13 | Fobax_ leaves the room. | |
| 10:58:13 | boyscout leaves the room. | |
| 10:58:13 | squeegy leaves the room. | |
| 10:58:13 | Defiler leaves the room. | |
| 10:58:13 | brixen leaves the room. | |
| 10:58:13 | Ingmar leaves the room. | |
| 10:58:13 | crayz__ leaves the room. | |
| 10:58:13 | chris2 leaves the room. | |
| 10:58:13 | jtoy leaves the room. | |
| 10:58:13 | smparke1 leaves the room. | |
| 10:58:13 | rudebwoy leaves the room. | |
| 10:58:13 | mitsuhiko leaves the room. | |
| 10:58:13 | cyndis leaves the room. | |
| 10:58:13 | ndemonner leaves the room. | |
| 10:58:13 | _eric leaves the room. | |
| 10:58:13 | flori leaves the room. | |
| 10:58:13 | ko1_ leaves the room. | |
| 10:58:13 | fearoffish leaves the room. | |
| 10:58:13 | VVSiz leaves the room. | |
| 10:58:13 | yugui leaves the room. | |
| 10:58:13 | retnuH leaves the room. | |
| 10:58:13 | zimbatm_ leaves the room. | |
| 10:58:13 | obvio171 leaves the room. | |
| 10:58:13 | ttmrichter leaves the room. | |
| 10:58:13 | winescou1 leaves |