Today's Rust contribution ideas

Today's Rust contribution ideas

Brian Anderson
People interested in Rust are often looking for ways to have a greater
impact on its development, and while the issue tracker lists lots of
stuff that one *could* work on, it's not always clear what one *should*
work on. There is consistently an overwhelming number of very important
tasks to do which nobody is tackling, so this is an effort to update
folks on what high-impact, yet accessible, contribution opportunities
are available. These are of varying difficulty, but progress on any of
them is worthy of *extreme kudos*.

# Break up libextra (#8784)

Getting our library ecosystem in shape is critical for Rust 1.0. We want
Rust to be a "batteries included" language, distributed with many crates
for common uses, but the way our libraries are organized - everything
divided between std and extra - has long been very unsatisfactory.
libextra needs to be split up into a number of subject-specific crates,
setting the precedent for future expansion of the standard libraries,
and with the impending merging of #11787 the floodgates can be opened.

This is simply a matter of identifying which modules in extra logically
belong in their own libraries, extracting them to a directory in src/,
and adding a minimal amount of boilerplate to the makefiles. Multiple
people can work on this, coordinating on the issue tracker.

# Improve the official cheatsheet

We have the beginnings of a 'cheatsheet', documenting various common
patterns in Rust code
(http://static.rust-lang.org/doc/master/complement-cheatsheet.html), but
there is so much more that could be here. This style of documentation is
hugely useful for newcomers. There are a few ways to approach this:
simply review the current document, editing and augmenting the existing
examples; think of the questions you had about Rust when you started and
add them; solicit questions (and answers!) from the broader community
and add them; finally, organize a doc sprint with several people to make
some quick improvements over a few hours.

# Implement the `Share` kind (#11781)

Future concurrency code is going to need to reason about types that can
be shared across threads. The canonical example is fork/join concurrency
using a shared closure, where the closure environment is bounded by
`Share`. We have the `Freeze` kind which covers a limited version of
this use case, but it's not sufficient, and may end up completely
supplanted by `Share`. This is quite important to have sorted out for
1.0 but the design is not done yet. Work with other developers to figure
out the design, then once that's done the implementation - while
involving a fair bit of compiler hacking and library modifications -
should be relatively easy.
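
As a rough illustration of the fork/join case, here is a minimal sketch in present-day Rust, where the bound that ended up playing this role is spelled `Sync`. The helper name `fork_join` and its signature are assumptions made up for this example, not a proposed API, and the sketch relies on `std::thread::scope` from the current standard library.

```rust
use std::thread;

/// Hypothetical helper: runs `f(0)`, `f(1)`, ..., `f(n - 1)` on separate
/// threads and joins them all. The closure is shared by reference, so its
/// environment must be safe to share across threads (`Sync`).
fn fork_join<F>(n: usize, f: F) -> Vec<u64>
where
    F: Fn(usize) -> u64 + Sync,
{
    let f = &f; // every worker borrows the same closure
    thread::scope(|s| {
        // Fork: spawn one scoped thread per index.
        let handles: Vec<_> = (0..n).map(|i| s.spawn(move || f(i))).collect();
        // Join: collect the results in index order.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let table = vec![10u64, 20, 30, 40];
    // `table` is captured by reference; `Vec<u64>` is `Sync`, so this compiles.
    let doubled = fork_join(table.len(), |i| table[i] * 2);
    assert_eq!(doubled, vec![20, 40, 60, 80]);
    println!("{:?}", doubled);
}
```

A closure capturing something that is not shareable, such as an `Rc` or a `Cell`, would be rejected at compile time by the `Sync` bound, which is the kind of reasoning this item asks for.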

# Remove `do` (#10815)

Consensus is that the `do` keyword is no longer pulling its weight.
Remove all uses of it, then remove support from the compiler. This is a
1.0 issue.

# Experiment with faster hash maps (#11783)

Rust's HashMap uses a cryptographically secure hash, and at least partly
as a result of that it is quite slow. HashMap continues to show up very,
very high in performance profiles of a variety of code. It's not clear
what the solution to this is, but it is clear that - at least sometimes
- we need a much faster hash map solution. Figure out how to create
faster hash maps in Rust, potentially sacrificing some amount of
DoS-resistance by using weaker hash functions. This is fairly open-ended
and researchy, but a solution to this could have a big impact on the
performance of rustc and other projects.
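
To make the trade-off concrete, here is a minimal sketch of swapping in a weaker but cheaper hash function. It is written against the present-day standard library, where `HashMap` takes a hasher as its third type parameter; the names `Fnv1a` and `FastMap` are made up for this example, and FNV-1a is only one possible choice, not a recommendation from this thread.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// 64-bit FNV-1a: cheap for short keys, but NOT resistant to collision attacks.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A `HashMap` that hashes with `Fnv1a` instead of the default hasher.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

fn main() {
    let mut counts: FastMap<&str, u32> = FastMap::default();
    for word in ["std", "extra", "std"] {
        *counts.entry(word).or_insert(0) += 1;
    }
    assert_eq!(counts["std"], 2);
    assert_eq!(counts["extra"], 1);
}
```

Whether this is actually faster depends on key sizes and the workload, so it would need to be measured; it also gives up the DoS-resistance mentioned above.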

# Replace 'extern mod' with 'extern crate' (#9880)

Using 'extern mod' as the syntax for linking to another crate has long
been a bit cringeworthy. The consensus here is to simply rename it to
`extern crate`. This is a fairly easy change that involves adding
`crate` as a keyword, modifying the parser to parse the new syntax, then
changing all uses, either after a snapshot or using conditional
compilation. This is a 1.0 issue.

# Introduce a design FAQ to the official docs (#4047)

Questions about a language's design tend to be asked repeatedly, so many
languages have a document simply explaining the rationale for various
decisions. Particularly as we approach 1.0 we'll want a place to point
newcomers to when these questions are asked. The issue on the bug
tracker already contains quite a lot of questions, and some answers as
well. Add a new Markdown file to the doc/ folder and the documentation
index, and add as many of the answers as you can. Consider recruiting
others in #rust to help.

_______________________________________________
Rust-dev mailing list
[hidden email]
https://mail.mozilla.org/listinfo/rust-dev

Re: Today's Rust contribution ideas

Matthieu Monrocq



On Mon, Jan 27, 2014 at 3:39 AM, Brian Anderson <[hidden email]> wrote:

# Experiment with faster hash maps (#11783)

Rust's HashMap uses a cryptographically secure hash, and at least partly as a result of that it is quite slow. HashMap continues to show up very, very high in performance profiles of a variety of code. It's not clear what the solution to this is, but it is clear that - at least sometimes - we need a much faster hash map solution. Figure out how to create faster hash maps in Rust, potentially sacrificing some amount of DoS-resistance by using weaker hash functions. This is fairly open-ended and researchy, but a solution to this could have a big impact on the performance of rustc and other projects.

You might be interested in a series of articles by Joaquín M López Muñoz, who maintains the Boost.MultiIndex library. He did a relatively comprehensive overview of the hash map implementations of Dinkumware (MSVC), libstdc++ and libc++, in addition to Boost.MultiIndex, with a lot of benchmarks showing the performance of insertion/removal/search in a variety of setups.

One of the last articles: http://bannalia.blogspot.fr/2014/01/a-better-hash-table-clang.html
 


Re: Today's Rust contribution ideas

Sebastian Sylvan



On Mon, Jan 27, 2014 at 9:33 AM, Matthieu Monrocq <[hidden email]> wrote:



You might be interested in a series of articles by Joaquín M López Muñoz, who maintains the Boost.MultiIndex library. He did a relatively comprehensive overview of the hash map implementations of Dinkumware (MSVC), libstdc++ and libc++, in addition to Boost.MultiIndex, with a lot of benchmarks showing the performance of insertion/removal/search in a variety of setups.

One of the last articles: http://bannalia.blogspot.fr/2014/01/a-better-hash-table-clang.html

Let me also plug this blog post from a while back: http://sebastiansylvan.com/2013/05/08/robin-hood-hashing-should-be-your-default-hash-table-implementation/ . There's also a follow-up on improving deletions*, which makes the final form the fastest hash map I know of. It's also compact (95% load factor, 32 bits of overhead per element, which you can reduce to 2 bits per element if you sacrifice some perf) and doesn't allocate (other than doubling the size of the table when you hit the load factor).

For a benchmark with lots of std::strings it was 23%, 66% and 25% faster for insertions, deletions and lookups respectively (compared to MSVC unordered_map), and it also used 30% less memory in that case.

Seb

* The basic form has an issue where repeated deletes gradually increase the probe count. In pathological cases this can reduce performance by a lot. The fix is to incrementally fix up the table on each delete (you could also do it in batch every now and then). It's still faster in all cases, and the probe length as well as the probe-length variance remain low even in the most pathological circumstances.
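
For readers unfamiliar with the technique, the following is a minimal sketch of Robin Hood insertion combined with a fix-up on delete (done here as a backward shift). It is not the implementation from the linked posts: the type `RobinHood`, its method names, the fixed capacity, the `u64` keys and the multiplicative hash are all assumptions chosen to keep the example short, and it recomputes probe distances instead of storing hash bits per element.

```rust
/// Fixed-capacity Robin Hood table for `u64` keys (hypothetical sketch, no resizing).
struct RobinHood {
    slots: Vec<Option<(u64, u32)>>, // (key, value); `None` marks an empty slot
    mask: usize,                    // capacity - 1; capacity is a power of two
    len: usize,
}

impl RobinHood {
    fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        RobinHood { slots: vec![None; capacity], mask: capacity - 1, len: 0 }
    }

    /// Slot where `key` would live if there were no collisions.
    fn home(&self, key: u64) -> usize {
        (key.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 32) as usize & self.mask
    }

    /// How far an entry with `key` stored at `pos` sits from its home slot.
    fn distance(&self, key: u64, pos: usize) -> usize {
        pos.wrapping_sub(self.home(key)) & self.mask
    }

    fn insert(&mut self, key: u64, value: u32) {
        assert!(self.len < self.mask, "sketch keeps one slot empty and never grows");
        let (mut entry, mut pos, mut dist) = ((key, value), self.home(key), 0);
        loop {
            let slot = self.slots[pos];
            match slot {
                None => {
                    self.slots[pos] = Some(entry);
                    self.len += 1;
                    return;
                }
                Some((k, _)) if k == entry.0 => {
                    self.slots[pos] = Some(entry); // overwrite an existing key
                    return;
                }
                Some(resident) => {
                    let resident_dist = self.distance(resident.0, pos);
                    if resident_dist < dist {
                        // The resident is closer to home than the entry being
                        // carried, so it gives up its slot and moves on instead.
                        self.slots[pos] = Some(entry);
                        entry = resident;
                        dist = resident_dist;
                    }
                }
            }
            pos = (pos + 1) & self.mask;
            dist += 1;
        }
    }

    fn find(&self, key: u64) -> Option<usize> {
        let (mut pos, mut dist) = (self.home(key), 0);
        loop {
            match self.slots[pos] {
                None => return None,
                Some((k, _)) if k == key => return Some(pos),
                // A resident closer to home than we are proves `key` is absent.
                Some((k, _)) if self.distance(k, pos) < dist => return None,
                Some(_) => {}
            }
            pos = (pos + 1) & self.mask;
            dist += 1;
        }
    }

    fn get(&self, key: u64) -> Option<u32> {
        self.find(key).and_then(|pos| self.slots[pos]).map(|(_, v)| v)
    }

    /// Removes `key`, then shifts the entries that follow back by one slot.
    fn remove(&mut self, key: u64) -> Option<u32> {
        let mut pos = self.find(key)?;
        let removed = self.slots[pos].take().map(|(_, v)| v);
        self.len -= 1;
        loop {
            let next = (pos + 1) & self.mask;
            match self.slots[next] {
                // Keep shifting until an empty slot or an entry already at home.
                Some((k, _)) if self.distance(k, next) > 0 => {
                    self.slots[pos] = self.slots[next].take();
                    pos = next;
                }
                _ => return removed,
            }
        }
    }
}

fn main() {
    let mut table = RobinHood::new(16);
    for k in 0..12u64 {
        table.insert(k, (k * 10) as u32);
    }
    assert_eq!(table.remove(7), Some(70));
    assert_eq!(table.get(7), None);
    assert_eq!(table.get(8), Some(80)); // other entries survive the shift
}
```

Because deletion restores the layout that insertion would have produced, no tombstones accumulate, which is the property the footnote above is after.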


Re: Today's Rust contribution ideas

Matthieu Monrocq



On Mon, Jan 27, 2014 at 11:41 PM, Sebastian Sylvan <[hidden email]> wrote:



Let me also plug this blog post from a while back: http://sebastiansylvan.com/2013/05/08/robin-hood-hashing-should-be-your-default-hash-table-implementation/ . There's also a follow-up on improving deletions*, which makes the final form the fastest hash map I know of. It's also compact (95% load factor, 32 bits of overhead per element, which you can reduce to 2 bits per element if you sacrifice some perf) and doesn't allocate (other than doubling the size of the table when you hit the load factor).

For a benchmark with lots of std::strings it was 23%, 66% and 25% faster for insertions, deletions and lookups respectively (compared to MSVC unordered_map), and it also used 30% less memory in that case.

Seb

* The basic form has an issue where repeated deletes gradually increase the probe count. In pathological cases this can reduce performance by a lot. The fix is to incrementally fix up the table on each delete (you could also do it in batch every now and then). It's still faster in all cases, and the probe length as well as the probe-length variance remain low even in the most pathological circumstances.

Thanks for the link. I should have mentioned that the C++ Standard version is constrained by a memory stability requirement which may or may not apply to Rust (thanks to borrow checks, it should be possible to know statically whether an element is borrowed or not). This memory stability requirement, as well as some other requirements such as relative stability of items within the same equivalence class during insert/erase, severely constrains the design; indeed, if those requirements can be lifted, the designs proposed on bannalia will be suboptimal.

-- Matthieu
