Language support for external iterators (new for loop)

classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Language support for external iterators (new for loop)

Daniel Micay
This is a followup from the previous discussion on internal vs. external
iterators.[1]

Since then, most iterators in the standard library have been converted to
external ones. Almost all uses of the `for` loop are now using the `.advance`
wrapper, and I don't think there will be any use cases left for the old
internal iteration protocol.

# External iterators

To reiterate the benefits of the external iteration protocol:

* It's generic and works well with traits, so functions can be written to work
  on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`,
  whether it's a value, reference or mutable reference.

* Iteration state is an object, so iterators can be interleaved. This is
  required for a generic zip, merge, union, intersect, etc. and is often useful
  in an ad-hoc fashion to consume only some of an iterator without
  losing it.

* In the future, Rust can have generators using a `yield` statement like C#,
  compiling down to a fast state machine without requiring context switches,
  virtual functions or even closures. This would eliminate the difficulty of
  coding recursive traversals by-hand with external iterators.

# Alternatives

The iteration protocol can't be based on anything requiring virtual method
calls, heap allocations or context switches without the performance becoming
significantly worse.

There have been some suggestions about following the lead of Clojure and using
reducers[2], but the implementation suffers from the same limitations of not
having an external state.

Rust doesn't yet have a way to write data-parallel code, but when it does gain
that, containers can just support partitioning themselves into ranges via
`Iterator`. It will work for in-place mutation in parallel too.

# A new loop

I think it's a foregone conclusion that we'll be replacing `for`, so I suggest
that we just reuse the current syntax and change the semantics:

    for iterator |pattern| { body }

This can just be compiled as the following would be:

    let mut it = iterator;
    loop {
        match it.next() {
            Some(pattern) => { body }
            None => break
        }
    }

A lang item can be added for the Iterator trait requirement.

This would avoid the `.advance` tacked onto almost every `for` loop at the
moment, and rid us of the problems associated with the current `for`:

* The `break` and `return` statements can fail to work, so borrow/liveness
  checking can't trust them. A loop body can't take an `&mut` reference and
  return it because it could result in grabbing the reference again. This also
  seems to be why we forbid `return` inside closures and do statements, since it
  would be confusing to have to act differently than `for`.

* The function's local variables are upvars in the closure, so using them is
  very restricted. It's very obvious that it's not just another block because
  of this.

* Compiling closures is slow, as they have to broken down by SROA and involve
  inlining a function after proving the function pointer is constant. If we
  were marking the function pointers as `llvm.invariant` it might stop being a
  performance hit, but it would remain a compile time issue.

# Iterables

The `for` loop above can also be extended to work for *any* `Iterable` in the
future, not just iterators themselves.

    for iterable |x| { ... } // use the default iterator
    for iterable.rev_iter |x| { ... } // use another iterator

At the moment, the `Iterable` trait cannot be defined/used because the compiler
ignores bounds on trait type parameters, but it would be something like the
following:

    #[lang = "iterable"]
    trait Iterable<A, T: Iterator<A>> {
        fn iter(&self) -> T;
    }

    trait ReverseIterable<A, T: Iterator<A>> {
        fn rev_iter(&self) -> T;
    }

    trait MutableIterable<A, T: Iterator<A>> {
        fn mut_iter(&mut self) -> T;
    }

    trait MutableReverseIterable<A, T: Iterator<A>> {
        fn mut_rev_iter(&mut self) -> T;
    }

# Links

[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html
[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev

signature.asc (853 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Timothy Kuehn
What motivates your suggestion to keep the syntax the same? I think it's confusing that it looks like a closure but is functionally different from one. I remember seeing suggestions to change it to "for pattern in iterator." Besides being less confusing, I think it simply looks nicer.



From: "Daniel Micay" <[hidden email]>
To: [hidden email]
Sent: Tuesday, June 25, 2013 8:44:10 PM
Subject: [rust-dev] Language support for external iterators (new for loop)

This is a followup from the previous discussion on internal vs. external
iterators.[1]

Since then, most iterators in the standard library have been converted to
external ones. Almost all uses of the `for` loop are now using the `.advance`
wrapper, and I don't think there will be any use cases left for the old
internal iteration protocol.

# External iterators

To reiterate the benefits of the external iteration protocol:

* It's generic and works well with traits, so functions can be written to work
  on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`,
  whether it's a value, reference or mutable reference.

* Iteration state is an object, so iterators can be interleaved. This is
  required for a generic zip, merge, union, intersect, etc. and is often useful
  in an ad-hoc fashion to consume only some of an iterator without
  losing it.

* In the future, Rust can have generators using a `yield` statement like C#,
  compiling down to a fast state machine without requiring context switches,
  virtual functions or even closures. This would eliminate the difficulty of
  coding recursive traversals by-hand with external iterators.

# Alternatives

The iteration protocol can't be based on anything requiring virtual method
calls, heap allocations or context switches without the performance becoming
significantly worse.

There have been some suggestions about following the lead of Clojure and using
reducers[2], but the implementation suffers from the same limitations of not
having an external state.

Rust doesn't yet have a way to write data-parallel code, but when it does gain
that, containers can just support partitioning themselves into ranges via
`Iterator`. It will work for in-place mutation in parallel too.

# A new loop

I think it's a foregone conclusion that we'll be replacing `for`, so I suggest
that we just reuse the current syntax and change the semantics:

    for iterator |pattern| { body }

This can just be compiled as the following would be:

    let mut it = iterator;
    loop {
        match it.next() {
            Some(pattern) => { body }
            None => break
        }
    }

A lang item can be added for the Iterator trait requirement.

This would avoid the `.advance` tacked onto almost every `for` loop at the
moment, and rid us of the problems associated with the current `for`:

* The `break` and `return` statements can fail to work, so borrow/liveness
  checking can't trust them. A loop body can't take an `&mut` reference and
  return it because it could result in grabbing the reference again. This also
  seems to be why we forbid `return` inside closures and do statements, since it
  would be confusing to have to act differently than `for`.

* The function's local variables are upvars in the closure, so using them is
  very restricted. It's very obvious that it's not just another block because
  of this.

* Compiling closures is slow, as they have to broken down by SROA and involve
  inlining a function after proving the function pointer is constant. If we
  were marking the function pointers as `llvm.invariant` it might stop being a
  performance hit, but it would remain a compile time issue.

# Iterables

The `for` loop above can also be extended to work for *any* `Iterable` in the
future, not just iterators themselves.

    for iterable |x| { ... } // use the default iterator
    for iterable.rev_iter |x| { ... } // use another iterator

At the moment, the `Iterable` trait cannot be defined/used because the compiler
ignores bounds on trait type parameters, but it would be something like the
following:

    #[lang = "iterable"]
    trait Iterable<A, T: Iterator<A>> {
        fn iter(&self) -> T;
    }

    trait ReverseIterable<A, T: Iterator<A>> {
        fn rev_iter(&self) -> T;
    }

    trait MutableIterable<A, T: Iterator<A>> {
        fn mut_iter(&mut self) -> T;
    }

    trait MutableReverseIterable<A, T: Iterator<A>> {
        fn mut_rev_iter(&mut self) -> T;
    }

# Links

[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html
[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Daniel Micay
On Wed, Jun 26, 2013 at 3:44 PM, Timothy Kuehn <[hidden email]> wrote:
> What motivates your suggestion to keep the syntax the same? I think it's
> confusing that it looks like a closure but is functionally different from
> one. I remember seeing suggestions to change it to "for pattern in
> iterator." Besides being less confusing, I think it simply looks nicer.

Changing the syntax should be a separate breaking change, if it's
going to happen. It would make it a *lot* harder to land this and I
think we need this ASAP.
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Benjamin Striegel
Yes, let's not get bogged down with syntax discussions until the important semantic changes have landed.

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Uther
In reply to this post by Daniel Micay
The proposed syntax already a breaking change, isn't it?

The "for x in iterable { }" syntax could live along the old "for
iterable.each |x| {}" syntax,
making the transition easier.

>> What motivates your suggestion to keep the syntax the same? I think it's
>> confusing that it looks like a closure but is functionally different from
>> one. I remember seeing suggestions to change it to "for pattern in
>> iterator." Besides being less confusing, I think it simply looks nicer.
> Changing the syntax should be a separate breaking change, if it's
> going to happen. It would make it a *lot* harder to land this and I
> think we need this ASAP.
> _______________________________________________
> Rust-dev mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/rust-dev

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Daniel Micay
On Wed, Jun 26, 2013 at 4:29 PM, Uther <[hidden email]> wrote:
> The proposed syntax already a breaking change, isn't it?
>
> The "for x in iterable { }" syntax could live along the old "for
> iterable.each |x| {}" syntax,
> making the transition easier.

The proposed change can be done by making `advance` the identity
function and having the compiler build in both stage0 (old for loop)
and stage1/stage2 (new for loop).

Changing the syntax of the for loop would be a lot harder... it
shouldn't be tied to this important semantic change.
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Gareth Smith
In reply to this post by Daniel Micay
One downside of external iterators is that they produce function signatures that expose implementation details and are harder to understand.

For example, an internal iterator function that loops over odd numbers would have a signature like this:

fn each_odd(ints: &[int], fn(int) -> bool) -> bool;

The same function written to produce an external iterator would have a signature more like this:

fn iter_odds<'a>(ints: &'a [int]) -> iterator::FilterMapIterator<'a, &int, int, vec::VecIterator<'a, int>>;

This signature is more complicated. It is also likely to change if the implementation of iter_odds changes (e.g. to no longer use filter_map).

Gareth

On 26/06/13 04:44, Daniel Micay wrote:
This is a followup from the previous discussion on internal vs. external
iterators.[1]

Since then, most iterators in the standard library have been converted to
external ones. Almost all uses of the `for` loop are now using the `.advance`
wrapper, and I don't think there will be any use cases left for the old
internal iteration protocol.

# External iterators

To reiterate the benefits of the external iteration protocol:

* It's generic and works well with traits, so functions can be written to work
  on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`,
  whether it's a value, reference or mutable reference.

* Iteration state is an object, so iterators can be interleaved. This is
  required for a generic zip, merge, union, intersect, etc. and is often useful
  in an ad-hoc fashion to consume only some of an iterator without
  losing it.

* In the future, Rust can have generators using a `yield` statement like C#,
  compiling down to a fast state machine without requiring context switches,
  virtual functions or even closures. This would eliminate the difficulty of
  coding recursive traversals by-hand with external iterators.

# Alternatives

The iteration protocol can't be based on anything requiring virtual method
calls, heap allocations or context switches without the performance becoming
significantly worse.

There have been some suggestions about following the lead of Clojure and using
reducers[2], but the implementation suffers from the same limitations of not
having an external state.

Rust doesn't yet have a way to write data-parallel code, but when it does gain
that, containers can just support partitioning themselves into ranges via
`Iterator`. It will work for in-place mutation in parallel too.

# A new loop

I think it's a foregone conclusion that we'll be replacing `for`, so I suggest
that we just reuse the current syntax and change the semantics:

    for iterator |pattern| { body }

This can just be compiled as the following would be:

    let mut it = iterator;
    loop {
        match it.next() {
            Some(pattern) => { body }
            None => break
        }
    }

A lang item can be added for the Iterator trait requirement.

This would avoid the `.advance` tacked onto almost every `for` loop at the
moment, and rid us of the problems associated with the current `for`:

* The `break` and `return` statements can fail to work, so borrow/liveness
  checking can't trust them. A loop body can't take an `&mut` reference and
  return it because it could result in grabbing the reference again. This also
  seems to be why we forbid `return` inside closures and do statements, since it
  would be confusing to have to act differently than `for`.

* The function's local variables are upvars in the closure, so using them is
  very restricted. It's very obvious that it's not just another block because
  of this.

* Compiling closures is slow, as they have to broken down by SROA and involve
  inlining a function after proving the function pointer is constant. If we
  were marking the function pointers as `llvm.invariant` it might stop being a
  performance hit, but it would remain a compile time issue.

# Iterables

The `for` loop above can also be extended to work for *any* `Iterable` in the
future, not just iterators themselves.

    for iterable |x| { ... } // use the default iterator
    for iterable.rev_iter |x| { ... } // use another iterator

At the moment, the `Iterable` trait cannot be defined/used because the compiler
ignores bounds on trait type parameters, but it would be something like the
following:

    #[lang = "iterable"]
    trait Iterable<A, T: Iterator<A>> {
        fn iter(&self) -> T;
    }

    trait ReverseIterable<A, T: Iterator<A>> {
        fn rev_iter(&self) -> T;
    }

    trait MutableIterable<A, T: Iterator<A>> {
        fn mut_iter(&mut self) -> T;
    }

    trait MutableReverseIterable<A, T: Iterator<A>> {
        fn mut_rev_iter(&mut self) -> T;
    }

# Links

[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html
[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Daniel Micay
On Wed, Jun 26, 2013 at 6:49 PM, Gareth Smith
<[hidden email]> wrote:

> One downside of external iterators is that they produce function signatures
> that expose implementation details and are harder to understand.
>
> For example, an internal iterator function that loops over odd numbers would
> have a signature like this:
>
> fn each_odd(ints: &[int], fn(int) -> bool) -> bool;
>
> The same function written to produce an external iterator would have a
> signature more like this:
>
> fn iter_odds<'a>(ints: &'a [int]) -> iterator::FilterMapIterator<'a, &int,
> int, vec::VecIterator<'a, int>>;
>
> This signature is more complicated. It is also likely to change if the
> implementation of iter_odds changes (e.g. to no longer use filter_map).

You can implement a struct and make the fields private, no need to
expose the implementation details. A syntax for generators with
`yield` would do this for you.
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Niko Matsakis
In reply to this post by Gareth Smith
Specificity is the cost of non-virtual dispatch. However, if it is
truly undesirable in some cases, we can eventually permit you to
return `~Iterator<int>`, once ~-objects work properly.


Niko

On Wed, Jun 26, 2013 at 11:49:07PM +0100, Gareth Smith wrote:

> One downside of external iterators is that they produce function
> signatures that expose implementation details and are harder to
> understand.
>
> For example, an internal iterator function that loops over odd
> numbers would have a signature like this:
>
> fn each_odd(ints: &[int], fn(int) -> bool) -> bool;
>
> The same function written to produce an external iterator would have
> a signature more like this:
>
> fn iter_odds<'a>(ints: &'a [int]) -> iterator::FilterMapIterator<'a,
> &int, int, vec::VecIterator<'a, int>>;
>
> This signature is more complicated. It is also likely to change if
> the implementation of iter_odds changes (e.g. to no longer use
> filter_map).
>
> Gareth
>
> On 26/06/13 04:44, Daniel Micay wrote:
> >This is a followup from the previous discussion on internal vs. external
> >iterators.[1]
> >
> >Since then, most iterators in the standard library have been converted to
> >external ones. Almost all uses of the `for` loop are now using the `.advance`
> >wrapper, and I don't think there will be any use cases left for the old
> >internal iteration protocol.
> >
> ># External iterators
> >
> >To reiterate the benefits of the external iteration protocol:
> >
> >* It's generic and works well with traits, so functions can be written to work
> >   on any arbitrary `Iterator<A>`. Most adaptors can work for any type `A`,
> >   whether it's a value, reference or mutable reference.
> >
> >* Iteration state is an object, so iterators can be interleaved. This is
> >   required for a generic zip, merge, union, intersect, etc. and is often useful
> >   in an ad-hoc fashion to consume only some of an iterator without
> >   losing it.
> >
> >* In the future, Rust can have generators using a `yield` statement like C#,
> >   compiling down to a fast state machine without requiring context switches,
> >   virtual functions or even closures. This would eliminate the difficulty of
> >   coding recursive traversals by-hand with external iterators.
> >
> ># Alternatives
> >
> >The iteration protocol can't be based on anything requiring virtual method
> >calls, heap allocations or context switches without the performance becoming
> >significantly worse.
> >
> >There have been some suggestions about following the lead of Clojure and using
> >reducers[2], but the implementation suffers from the same limitations of not
> >having an external state.
> >
> >Rust doesn't yet have a way to write data-parallel code, but when it does gain
> >that, containers can just support partitioning themselves into ranges via
> >`Iterator`. It will work for in-place mutation in parallel too.
> >
> ># A new loop
> >
> >I think it's a foregone conclusion that we'll be replacing `for`, so I suggest
> >that we just reuse the current syntax and change the semantics:
> >
> >     for iterator |pattern| { body }
> >
> >This can just be compiled as the following would be:
> >
> >     let mut it = iterator;
> >     loop {
> >         match it.next() {
> >             Some(pattern) => { body }
> >             None => break
> >         }
> >     }
> >
> >A lang item can be added for the Iterator trait requirement.
> >
> >This would avoid the `.advance` tacked onto almost every `for` loop at the
> >moment, and rid us of the problems associated with the current `for`:
> >
> >* The `break` and `return` statements can fail to work, so borrow/liveness
> >   checking can't trust them. A loop body can't take an `&mut` reference and
> >   return it because it could result in grabbing the reference again. This also
> >   seems to be why we forbid `return` inside closures and do statements, since it
> >   would be confusing to have to act differently than `for`.
> >
> >* The function's local variables are upvars in the closure, so using them is
> >   very restricted. It's very obvious that it's not just another block because
> >   of this.
> >
> >* Compiling closures is slow, as they have to broken down by SROA and involve
> >   inlining a function after proving the function pointer is constant. If we
> >   were marking the function pointers as `llvm.invariant` it might stop being a
> >   performance hit, but it would remain a compile time issue.
> >
> ># Iterables
> >
> >The `for` loop above can also be extended to work for *any* `Iterable` in the
> >future, not just iterators themselves.
> >
> >     for iterable |x| { ... } // use the default iterator
> >     for iterable.rev_iter |x| { ... } // use another iterator
> >
> >At the moment, the `Iterable` trait cannot be defined/used because the compiler
> >ignores bounds on trait type parameters, but it would be something like the
> >following:
> >
> >     #[lang = "iterable"]
> >     trait Iterable<A, T: Iterator<A>> {
> >         fn iter(&self) -> T;
> >     }
> >
> >     trait ReverseIterable<A, T: Iterator<A>> {
> >         fn rev_iter(&self) -> T;
> >     }
> >
> >     trait MutableIterable<A, T: Iterator<A>> {
> >         fn mut_iter(&mut self) -> T;
> >     }
> >
> >     trait MutableReverseIterable<A, T: Iterator<A>> {
> >         fn mut_rev_iter(&mut self) -> T;
> >     }
> >
> ># Links
> >
> >[1]: https://mail.mozilla.org/pipermail/rust-dev/2013-June/004364.html
> >[2]: https://github.com/mozilla/rust/wiki/Meeting-weekly-2013-06-25#iterators
> >
> >
> >_______________________________________________
> >Rust-dev mailing list
> >[hidden email]
> >https://mail.mozilla.org/listinfo/rust-dev
>

> _______________________________________________
> Rust-dev mailing list
> [hidden email]
> https://mail.mozilla.org/listinfo/rust-dev

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Patrick Walton
On 6/28/13 11:23 AM, Niko Matsakis wrote:
> Specificity is the cost of non-virtual dispatch. However, if it is
> truly undesirable in some cases, we can eventually permit you to
> return `~Iterator<int>`, once ~-objects work properly.

They basically do now (in master), no?

Patrick

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Niko Matsakis
In reply to this post by Daniel Micay
On Tue, Jun 25, 2013 at 11:44:10PM -0400, Daniel Micay wrote:
> Rust doesn't yet have a way to write data-parallel code, but when it does gain
> that, containers can just support partitioning themselves into ranges via
> `Iterator`. It will work for in-place mutation in parallel too.

I do not follow what you mean by "support partitioning themselves into
ranges via `Iterator`".  That said, I don't think the optimal
sequential protocol will necessarily be the optimal parallel protocol,
and it's not clear to me how general it will be to have a protocol
that yields slices. It'd work ok for vectors and btrees but perhaps
not that much else?

I suspect the best thing will be to have the iterable trait optimized
for sequential access, and provide a second trait to be used for
optimal parallel access, probably some kind of
"divide-and-conquer"-like trait that can split an iterator into two
other iterators. For types that lack the DivideAndConquer trait, you
could also have a poor man's D&C that is implemented in terms of an
iterator, or just serialize to a vec.


Niko
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Daniel Micay
On Fri, Jun 28, 2013 at 2:55 PM, Niko Matsakis <[hidden email]> wrote:

> On Tue, Jun 25, 2013 at 11:44:10PM -0400, Daniel Micay wrote:
>> Rust doesn't yet have a way to write data-parallel code, but when it does gain
>> that, containers can just support partitioning themselves into ranges via
>> `Iterator`. It will work for in-place mutation in parallel too.
>
> I do not follow what you mean by "support partitioning themselves into
> ranges via `Iterator`".  That said, I don't think the optimal
> sequential protocol will necessarily be the optimal parallel protocol,
> and it's not clear to me how general it will be to have a protocol
> that yields slices. It'd work ok for vectors and btrees but perhaps
> not that much else?
>
> I suspect the best thing will be to have the iterable trait optimized
> for sequential access, and provide a second trait to be used for
> optimal parallel access, probably some kind of
> "divide-and-conquer"-like trait that can split an iterator into two
> other iterators. For types that lack the DivideAndConquer trait, you
> could also have a poor man's D&C that is implemented in terms of an
> iterator, or just serialize to a vec.

What I meant by partitioning is the divide-and-conquer strategy you
mentioned, with either `Iterator<&mut [T]>` or `Iterator<Iterator<&mut
T>>`. If we had a method on vectors returning two disjoint sub-slices
as `(&mut [T], &mut [T])`, you could just divide it up as much as you
needed.

I don't think most containers will need anything special to do this
with immutable slices/iterators though.
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Graydon Hoare
In reply to this post by Niko Matsakis
On 13-06-28 11:23 AM, Niko Matsakis wrote:
> Specificity is the cost of non-virtual dispatch. However, if it is
> truly undesirable in some cases, we can eventually permit you to
> return `~Iterator<int>`, once ~-objects work properly.

This is interesting. I assume we'd want these to be Iterator<> cast to
&Iterator<>, no? But we can't really do that in a return value; we'd
have to return the Iterator and have the caller cast it to &Iterator<>.
~-allocating here seems a bit much..

-Graydon


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Niko Matsakis
On Fri, Jun 28, 2013 at 02:46:43PM -0700, Graydon Hoare wrote:
> >Specificity is the cost of non-virtual dispatch. However, if it is
> >truly undesirable in some cases, we can eventually permit you to
> >return `~Iterator<int>`, once ~-objects work properly.
>
> This is interesting. I assume we'd want these to be Iterator<> cast
> to &Iterator<>, no? But we can't really do that in a return value;
> we'd have to return the Iterator and have the caller cast it to
> &Iterator<>. ~-allocating here seems a bit much..

Yes, allocating is somewhat heavy handed, but I don't see any
alternative if you want to hide the precise type of the iterator: even
if you were going to store the result on the caller's stack frame, the
caller would have to know how much space to allocate! Conceivably one
could pass in an arena or something once those work. Otherwise, you
need something like an internal iterator. But in the end I just don't
see this as a big problem one way or the other.


Niko
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

james
On 29/06/2013 18:39, Niko Matsakis wrote:
if you were going to store the result on the caller's stack frame, the
caller would have to know how much space to allocate! Conceivably one

If you can have a function that returns an allocated iterator, can't you instead have
a function that informs how big it would be, and a function that uses a passed in
pointer from alloca?


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Daniel Micay
On Sat, Jun 29, 2013 at 5:29 PM, james <[hidden email]> wrote:

> On 29/06/2013 18:39, Niko Matsakis wrote:
>
> if you were going to store the result on the caller's stack frame, the
> caller would have to know how much space to allocate! Conceivably one
>
>
> If you can have a function that returns an allocated iterator, can't you
> instead have
> a function that informs how big it would be, and a function that uses a
> passed in
> pointer from alloca?

We don't have alloca, but if we did, it would be less efficient than a
statically sized allocation since it would involve an extra stack size
check. A low-level, unsafe workaround like that isn't needed when you
can just have a function return an iterator of a specific type.
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

james
On 29/06/2013 22:32, Daniel Micay wrote:
On Sat, Jun 29, 2013 at 5:29 PM, james [hidden email] wrote:
> On 29/06/2013 18:39, Niko Matsakis wrote:
>
> if you were going to store the result on the caller's stack frame, the
> caller would have to know how much space to allocate! Conceivably one
>
>
> If you can have a function that returns an allocated iterator, can't you
> instead have
> a function that informs how big it would be, and a function that uses a
> passed in
> pointer from alloca?
We don't have alloca, but if we did, it would be less efficient than a
statically sized allocation since it would involve an extra stack size
check. A low-level, unsafe workaround like that isn't needed when you
can just have a function return an iterator of a specific type.
Well, if the caller knows the type of the returned object and it is returned by value - yes.

But I thought the discussion had strayed to considering a case where the type is hidden
inside the iterated-over object, so the caller using the pattern does not know how to
allocate space for it and receive such an object by value.  I was trying to suggest that
it is not necessary for the caller to know much about the iterator object to avoid a
heap allocation - it has to ask the size, and it has to then allocate and pass some raw
storage on its stack.  And I guess it has to ask for a function to call on the raw store
when it has finished with it.

I'm not claiming that this is more efficient that a return by value, just that it may be
possible to avoid a heap allocation even in the case where the call site only sees an
abstract iterator interface, and does not know any details.

This is very much similar to tradeoffs in C++ between using inheritance and interfaces
vs templates, and my experience has been that moving to templates has in some
cases swapped a small runtime overhead for major problems with compilation speed
and emitted code size - and the latter has caused runtime performance issues all of
its own.


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Matthieu Monrocq
Hello,

Regarding type-erasure without stack-allocation I remember a question on StackOverflow on how to implement this.

In C++ this can be done using templates => template <typename T, size_t Size, size_t Alignment> class Pimpl;

Pimpl will then declare raw storage (char [] in C++03, std::aligned_storage<Size, Alignment>::type in C++11), and then this space will be used by `T` (which was forward declared).

An important (and maybe overlooked) aspect is that Size and Alignment are upper-bounds. Whilst to avoid wasting space it is better they be as close to the necessary value, equality is not necessary. And thus one could perfectly imagine a AgnosticIterator<RandomIterator, 16, 8> and it is up to the builder to create a type that fit... and maybe use dynamic allocation as a fallback if the iterator state cannot fit within the provided size.

-- Matthieu.



On Sun, Jun 30, 2013 at 4:22 PM, james <[hidden email]> wrote:
On 29/06/2013 22:32, Daniel Micay wrote:
On Sat, Jun 29, 2013 at 5:29 PM, james [hidden email] wrote:
> On 29/06/2013 18:39, Niko Matsakis wrote:
>
> if you were going to store the result on the caller's stack frame, the
> caller would have to know how much space to allocate! Conceivably one
>
>
> If you can have a function that returns an allocated iterator, can't you
> instead have
> a function that informs how big it would be, and a function that uses a
> passed in
> pointer from alloca?
We don't have alloca, but if we did, it would be less efficient than a
statically sized allocation since it would involve an extra stack size
check. A low-level, unsafe workaround like that isn't needed when you
can just have a function return an iterator of a specific type.
Well, if the caller knows the type of the returned object and it is returned by value - yes.

But I thought the discussion had strayed to considering a case where the type is hidden
inside the iterated-over object, so the caller using the pattern does not know how to
allocate space for it and receive such an object by value.  I was trying to suggest that
it is not necessary for the caller to know much about the iterator object to avoid a
heap allocation - it has to ask the size, and it has to then allocate and pass some raw
storage on its stack.  And I guess it has to ask for a function to call on the raw store
when it has finished with it.

I'm not claiming that this is more efficient that a return by value, just that it may be
possible to avoid a heap allocation even in the case where the call site only sees an
abstract iterator interface, and does not know any details.

This is very much similar to tradeoffs in C++ between using inheritance and interfaces
vs templates, and my experience has been that moving to templates has in some
cases swapped a small runtime overhead for major problems with compilation speed
and emitted code size - and the latter has caused runtime performance issues all of
its own.


_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev



_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev
Reply | Threaded
Open this post in threaded view
|

Re: Language support for external iterators (new for loop)

Daniel Micay
In reply to this post by james
On Sun, Jun 30, 2013 at 10:22 AM, james <[hidden email]> wrote:

> On 29/06/2013 22:32, Daniel Micay wrote:
>
> On Sat, Jun 29, 2013 at 5:29 PM, james <[hidden email]> wrote:
>
>> On 29/06/2013 18:39, Niko Matsakis wrote:
>>
>> if you were going to store the result on the caller's stack frame, the
>> caller would have to know how much space to allocate! Conceivably one
>>
>>
>> If you can have a function that returns an allocated iterator, can't you
>> instead have
>> a function that informs how big it would be, and a function that uses a
>> passed in
>> pointer from alloca?
>
> We don't have alloca, but if we did, it would be less efficient than a
> statically sized allocation since it would involve an extra stack size
> check. A low-level, unsafe workaround like that isn't needed when you
> can just have a function return an iterator of a specific type.
>
> Well, if the caller knows the type of the returned object and it is returned
> by value - yes.
>
> But I thought the discussion had strayed to considering a case where the
> type is hidden
> inside the iterated-over object, so the caller using the pattern does not
> know how to
> allocate space for it and receive such an object by value.  I was trying to
> suggest that
> it is not necessary for the caller to know much about the iterator object to
> avoid a
> heap allocation - it has to ask the size, and it has to then allocate and
> pass some raw
> storage on its stack.  And I guess it has to ask for a function to call on
> the raw store
> when it has finished with it.

There's no way to dynamically allocate space on the stack in Rust. If
it was possible, it would require unsafe code to obtain the arbitrary
pointer and more unsafe code to pass it into functions requiring that
object. All of those functions would have to be explicitly designed
for this pattern and marked as unsafe.

> I'm not claiming that this is more efficient that a return by value, just
> that it may be
> possible to avoid a heap allocation even in the case where the call site
> only sees an
> abstract iterator interface, and does not know any details.

I don't think it's possible.

> This is very much similar to tradeoffs in C++ between using inheritance and
> interfaces
> vs templates, and my experience has been that moving to templates has in
> some
> cases swapped a small runtime overhead for major problems with compilation
> speed
> and emitted code size - and the latter has caused runtime performance issues
> all of
> its own.

Rust could offer alternate strategies for compiling generics, we
aren't restricted to specializing them in every case forever. Keep in
mind iterator adaptors are tiny functions and that inlining them is
always desired, for every one that's currently done. If you do write a
large function using iterators, you can hoist out the parts that don't
require generic code as a workaround.
_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev