@aeden
Forked from copiousfreetime/gem-index-with-dns.md
Created June 30, 2011 21:51

Currently the RubyGems index is stored as a gzipped file containing a marshalled array. Whenever Bundler needs to install a gem that is not yet installed, it downloads the index, gunzips it and unmarshals it. It then looks up dependencies in a second file, which is also gzipped and marshalled. Several issues arise from this setup:

  • The full index must be downloaded and parsed, yet it does not contain dependency data, which must then be downloaded and parsed separately. This is a relatively time-consuming process.
  • The index must be centralized.

Additionally, the gems themselves are currently centralized, since nothing in the metadata indicates where a gem should be downloaded from. Allowing gems to be hosted elsewhere, however, requires a way of keeping the index from being poisoned (is this an issue in the centralized system?).

Dependency Resolution

I'd like to propose an alternate way of indexing RubyGems: let's use DNS.

Here's how this might work. For this example, I want to get the latest version of Rails, which is 3.0.1 (in this scenario):

  • Client sends question to local name server for ALL records at rails.index.rubygems.org
  • Local name server does not have the record so it sends the usual response indicating that the search should go upstream to the roots
  • Root delegates to .org name servers
  • .org name servers delegate to rubygems.org name servers
  • rubygems.org name servers can either respond to the query or delegate to another set of name servers. It'll answer in this case.

Examples

When a query is received for a specific gem, a collection of PTR records is returned representing all available versions of that gem:

  rails.index.rubygems.org.         84600   PTR   1.0.3.rails.index.rubygems.org.
  rails.index.rubygems.org.         84600   PTR   2.0.3.rails.index.rubygems.org.
  rails.index.rubygems.org.         84600   PTR   3.0.3.rails.index.rubygems.org.

If a specific version is requested then PTR records will be returned that represent all of the dependencies for that version. For example:

  1.0.3.rails.index.rubygems.org.   84600   PTR   0.0.3.activesupport.index.rubygems.org.
  1.0.3.rails.index.rubygems.org.   84600   PTR   0.0.3.actionpack.index.rubygems.org.
  1.0.3.rails.index.rubygems.org.   84600   PTR   0.0.3.activerecord.index.rubygems.org.
  1.0.3.rails.index.rubygems.org.   84600   PTR   0.0.3.activeresource.index.rubygems.org.
  1.0.3.rails.index.rubygems.org.   84600   PTR   0.0.3.actionmailer.index.rubygems.org.
  1.0.3.rails.index.rubygems.org.   84600   PTR   0.0.3.railties.index.rubygems.org.
  1.0.3.rails.index.rubygems.org.   84600   PTR   1.bundler.index.rubygems.org.

Note that some PTR targets are canonical gem names, while others are CNAMEs pointing to the appropriate canonical version. The last record is an example of the latter: its CNAME record would likely resolve to something like 7.0.1.bundler.index.rubygems.org (which is the reverse notation for bundler-1.0.7). This scheme supports ~>, = and >= directly and, with some small CNAME manipulations, <, <= and != as well. More information on this below.
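The reverse notation can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, the zone is the index.rubygems.org name used in the examples, and it assumes a gem name fits in a single DNS label.

```python
INDEX_ZONE = "index.rubygems.org"  # zone name taken from the examples above


def gem_to_dns_name(gem, version, zone=INDEX_ZONE):
    """Encode gem + version as a DNS name with the version segments reversed."""
    reversed_version = ".".join(reversed(version.split(".")))
    return f"{reversed_version}.{gem}.{zone}"


def dns_name_to_gem(name, zone=INDEX_ZONE):
    """Decode a canonical index name back into (gem, version)."""
    name = name.rstrip(".")  # tolerate the trailing root dot
    suffix = "." + zone
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} is not inside {zone!r}")
    labels = name[: -len(suffix)].split(".")
    # Assumption: the last remaining label is the gem name,
    # everything before it is the reversed version.
    gem, version_labels = labels[-1], labels[:-1]
    return gem, ".".join(reversed(version_labels))


print(gem_to_dns_name("bundler", "1.0.7"))
print(dns_name_to_gem("1.0.3.rails.index.rubygems.org."))
```

Reversing the segments puts the most significant part of the version closest to the gem name, which is what lets a shorter name such as 1.bundler stand in for "any 1.x release".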

If the latest version of a gem is requested:

 latest.rails.index.rubygems.org.   600  CNAME   10.0.3.rails.index.rubygems.org.

Twiddlewakka

For instance, the Amalgalite 1.0.0 gem has runtime dependencies on

  • arrayfields ~> 4.7.4
  • fastercsv ~> 1.5.4

This can be modeled with the following set of records:

latest.amalgalite.index.rubygems.org  600     CNAME   0.0.1.amalgalite.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org   84600   PTR     5.1.fastercsv.index.rubygems.org 
0.0.1.amalgalite.index.rubygems.org   84600   PTR     7.4.arrayfields.index.rubygems.org
5.1.fastercsv.index.rubygems.org      600     CNAME   4.5.1.fastercsv.index.rubygems.org
7.4.arrayfields.index.rubygems.org    600     CNAME   4.7.4.arrayfields.index.rubygems.org

This is not exactly the same as ~>, but it is close enough: 5.1.fastercsv.index.rubygems.org is a CNAME record for the latest 1.5.x version of fastercsv, whereas ~> 1.5.4 additionally requires at least 1.5.4.
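As a rough sketch of how an index server might maintain these ~> aliases, the Python below derives the alias name from the requirement and picks the newest matching release as the CNAME target. The helper names and the list of released versions are hypothetical, and only purely numeric versions are handled.

```python
ZONE = "index.rubygems.org"


def parse(version):
    """Turn '1.5.4' into (1, 5, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def reverse_name(gem, version):
    return ".".join(reversed(version.split("."))) + f".{gem}.{ZONE}"


def pessimistic_alias(gem, requirement_version):
    """Alias name for '~> requirement_version': drop the last segment, then reverse."""
    prefix = requirement_version.split(".")[:-1]
    if not prefix:
        raise ValueError("~> needs at least two version segments in this sketch")
    return reverse_name(gem, ".".join(prefix))


def pessimistic_target(gem, requirement_version, released):
    """Canonical name of the newest release sharing the requirement's prefix."""
    prefix = parse(requirement_version)[:-1]
    matching = [v for v in released if parse(v)[: len(prefix)] == prefix]
    if not matching:
        return None
    return reverse_name(gem, max(matching, key=parse))


released = ["1.4.9", "1.5.3", "1.5.4", "1.6.0"]  # made-up release history
print(pessimistic_alias("fastercsv", "1.5.4"))
print(pessimistic_target("fastercsv", "1.5.4", released))
```

With this release history the alias 5.1.fastercsv.index.rubygems.org points at 4.5.1.fastercsv.index.rubygems.org, matching the records above.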

Equal To

For an = dependency, the records would be:

latest.amalgalite.index.rubygems.org  600     CNAME   0.0.1.amalgalite.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org   84600   PTR     4.5.1.fastercsv.index.rubygems.org 
0.0.1.amalgalite.index.rubygems.org   84600   PTR     4.7.4.arrayfields.index.rubygems.org 

Greater Than or Equal To

For a >= dependency, the records point at the most recent release of the gem in question, which is always available as the latest CNAME for that gem name:

latest.amalgalite.index.rubygems.org  600     CNAME   0.0.1.amalgalite.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org   84600   PTR     latest.fastercsv.index.rubygems.org 
0.0.1.amalgalite.index.rubygems.org   84600   PTR     latest.arrayfields.index.rubygems.org
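To make the lookup flow concrete, here is a small in-memory sketch of how a client might follow these records: chase CNAMEs to a canonical name, read its PTR records, then chase each target in turn. The record table stands in for real DNS responses, and the two latest.* CNAME targets for fastercsv and arrayfields are assumed values for illustration.

```python
RECORDS = {
    "latest.amalgalite.index.rubygems.org": [
        ("CNAME", "0.0.1.amalgalite.index.rubygems.org"),
    ],
    "0.0.1.amalgalite.index.rubygems.org": [
        ("PTR", "latest.fastercsv.index.rubygems.org"),
        ("PTR", "latest.arrayfields.index.rubygems.org"),
    ],
    # Assumed targets, not taken from the records above:
    "latest.fastercsv.index.rubygems.org": [
        ("CNAME", "4.5.1.fastercsv.index.rubygems.org"),
    ],
    "latest.arrayfields.index.rubygems.org": [
        ("CNAME", "4.7.4.arrayfields.index.rubygems.org"),
    ],
}


def canonical(name, records, max_hops=8):
    """Follow CNAME records until a name without a CNAME is reached."""
    hops = 0
    while True:
        targets = [t for rtype, t in records.get(name, []) if rtype == "CNAME"]
        if not targets:
            return name
        if hops == max_hops:
            raise RuntimeError(f"CNAME chain too long at {name!r}")
        name = targets[0]
        hops += 1


def dependencies(name, records):
    """Canonical names of every dependency listed for the given gem name."""
    owner = canonical(name, records)
    return [
        canonical(target, records)
        for rtype, target in records.get(owner, [])
        if rtype == "PTR"
    ]


print(dependencies("latest.amalgalite.index.rubygems.org", RECORDS))
```

The hop limit is a guard against CNAME loops; a real resolver would enforce something similar on the client's behalf.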

Complex Resolutions

For <, <=, != or dependencies with more than one requirement, we need some trickery. The PTR record for the dependency points to a CNAME record prefixed with the name of the dependency. This CNAME record is then updated as new versions of the given gem are released, as long as the dependency can still be satisfied.

Let's look at some examples:

Less Than or Equal To & Less Than

"< 4.5.1" for fastercsv and "<= 4.7.4" for arrayfields:

latest.amalgalite.index.rubygems.org              600     CNAME   0.0.1.amalgalite.index.rubygems.org
fastercsv.0.0.1.amalgalite.index.rubygems.org     600     CNAME   0.5.4.fastercsv.index.rubygems.org
arrayfields.0.0.1.amalgalite.index.rubygems.org   600     CNAME   4.7.4.arrayfields.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org               84600   PTR     fastercsv.0.0.1.amalgalite.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org               84600   PTR     arrayfields.0.0.1.amalgalite.index.rubygems.org

Not Equal To

The != dependency op is essentially the same. Consider "!= 4.5.2" for fastercsv:

latest.amalgalite.index.rubygems.org              600     CNAME   0.0.1.amalgalite.index.rubygems.org
fastercsv.0.0.1.amalgalite.index.rubygems.org     600     CNAME   1.5.4.fastercsv.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org               84600   PTR     fastercsv.0.0.1.amalgalite.index.rubygems.org

If a patch release of fastercsv were published as version 4.5.2, the CNAME record for fastercsv.0.0.1.amalgalite.index.rubygems.org would not change. A patch release of 4.5.3, on the other hand, would cause the CNAME to change:

latest.amalgalite.index.rubygems.org             600     CNAME   0.0.1.amalgalite.index.rubygems.org
fastercsv.0.0.1.amalgalite.index.rubygems.org    600     CNAME   3.5.4.fastercsv.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org              84600   PTR     fastercsv.0.0.1.amalgalite.index.rubygems.org

Dependencies with 2 or more requirements

For example, if the dependency is on fastercsv [">= 1.0.4", "< 1.7.0"], and the current version of fastercsv is 1.1.0 then the records would look like this:

latest.amalgalite.index.rubygems.org             600     CNAME   0.0.1.amalgalite.index.rubygems.org
fastercsv.0.0.1.amalgalite.index.rubygems.org    600     CNAME   0.1.1.fastercsv.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org              84600   PTR     fastercsv.0.0.1.amalgalite.index.rubygems.org

If the version of fastercsv was changed to 1.6.9 then the records would be:

latest.amalgalite.index.rubygems.org             600     CNAME   0.0.1.amalgalite.index.rubygems.org
fastercsv.0.0.1.amalgalite.index.rubygems.org    600     CNAME   9.6.1.fastercsv.index.rubygems.org
0.0.1.amalgalite.index.rubygems.org              84600   PTR     fastercsv.0.0.1.amalgalite.index.rubygems.org

If fastercsv 1.7.0 or higher were then released, the CNAME would not change, since those versions fail the < 1.7.0 requirement.
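The rule behind all of these per-dependency CNAMEs is "point at the newest release that satisfies every requirement". A minimal Python sketch of that rule follows, assuming purely numeric versions and requirement strings of the form "<op> <version>". The function names are illustrative and are not part of RubyGems.

```python
import operator

OPERATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def parse(version):
    """'1.7.0' -> (1, 7); trailing zeros are dropped so 1.7 equals 1.7.0."""
    parts = [int(part) for part in version.split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def reverse_name(gem, version):
    return ".".join(reversed(version.split("."))) + f".{gem}.index.rubygems.org"


def satisfies(version, requirements):
    """True when the version meets every requirement in the list."""
    for requirement in requirements:
        op, wanted = requirement.split()
        if not OPERATORS[op](parse(version), parse(wanted)):
            return False
    return True


def cname_target(dependency, requirements, released):
    """Canonical name of the newest release meeting all requirements, or None."""
    candidates = [v for v in released if satisfies(v, requirements)]
    if not candidates:
        return None
    return reverse_name(dependency, max(candidates, key=parse))


requirements = [">= 1.0.4", "< 1.7.0"]
print(cname_target("fastercsv", requirements, ["1.0.4", "1.1.0"]))
print(cname_target("fastercsv", requirements, ["1.0.4", "1.1.0", "1.6.9", "1.7.0"]))
```

Running the same function each time a gem is released reproduces the examples above: the target moves from 0.1.1 to 9.6.1 and then stays put once 1.7.0 appears.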

Development Dependencies

All of the above dependencies are assumed to be runtime. If using the gem command you typed:

gem install --development amalgalite

Then that would install all of amalgalite's development dependencies. To provide the same functionality, we will add additional PTR records for all the development dependencies, using 'gemname-development' as the namespace.

latest.amalgalite.index.rubygems.org             600     CNAME   0.0.1.amalgalite.index.rubygems.org
0.0.1.amalgalite-development.index.rubygems.org  84600   PTR     8.0.rake.index.rubygems.org
0.0.1.amalgalite-development.index.rubygems.org  84600   PTR     2.1.configuration.index.rubygems.org
0.0.1.amalgalite-development.index.rubygems.org  84600   PTR     5.2.rspec.index.rubygems.org
8.0.rake.index.rubygems.org                      600     CNAME   0.8.0.rake.index.rubygems.org
2.1.configuration.index.rubygems.org             600     CNAME   0.2.1.configuration.index.rubygems.org
etc ...
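On the client side, the development namespace only changes which names are queried. A short sketch, assuming the client asks for the runtime name and, when development dependencies are wanted, the gemname-development name as well (the function name is made up for illustration):

```python
def dependency_owners(gem, version, development=False):
    """Names whose PTR records list the dependencies of gem-version."""
    reversed_version = ".".join(reversed(version.split(".")))
    owners = [f"{reversed_version}.{gem}.index.rubygems.org"]
    if development:
        # Development dependencies live under the 'gemname-development' namespace.
        owners.append(f"{reversed_version}.{gem}-development.index.rubygems.org")
    return owners


print(dependency_owners("amalgalite", "1.0.0", development=True))
```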

Downloads

In addition to dependency management another interesting use of DNS is to provide references to where gems can be downloaded. Here is how this might work:

  • Client sends question to local name server for ALL records at rails.index.rubygems.org
  • Local name server does not have the record so it sends the usual response indicating that the search should go upstream to the roots
  • Root delegates to .org name servers
  • .org name servers delegate to rubygems.org name servers
  • rubygems.org name servers can either respond to the query or delegate to another set of name servers. It'll answer in this case.
  • When queried for latest.rails.index.rubygems.org, the rubygems.org name servers respond with a CNAME record pointing to 1.0.3.rails.index.rubygems.org and all NAPTR records for 1.0.3.rails.index.rubygems.org.

For example:

latest.rails.index.rubygems.org.  600   CNAME   1.0.3.rails.index.rubygems.org.
1.0.3.rails.index.rubygems.org.   60    NAPTR   100 10 "U" "TCP+http" "!^.*$!http://rubygems.org/rails-3.0.1.gem!i" .
1.0.3.rails.index.rubygems.org.   60    NAPTR   100 20 "U" "TCP+http" "!^.*$!http://backup.rubygems.org/rails-3.0.1.gem!i" .

Note that there is no need to do any complex regex translation to get the various URLs since they are mapped directly to the canonical name of the gem.
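A client would turn those NAPTR records into an ordered list of download URLs by sorting on order and preference (lower values first) and applying each record's substitution expression. The Python below is a sketch that works on the record data as text, exactly as printed above. Fetching the records from DNS is left out, and the helper names are hypothetical.

```python
import re
import shlex
from typing import NamedTuple


class Naptr(NamedTuple):
    order: int
    preference: int
    flags: str
    service: str
    regexp: str
    replacement: str


def parse_naptr(rdata):
    """Split NAPTR record data in presentation format into its six fields."""
    order, preference, flags, service, regexp, replacement = shlex.split(rdata)
    return Naptr(int(order), int(preference), flags, service, regexp, replacement)


def apply_regexp(regexp, name):
    """Apply a '<delim>pattern<delim>substitution<delim>flags' expression to name."""
    delimiter = regexp[0]
    _, pattern, substitution, flags = regexp.split(delimiter)
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.sub(pattern, substitution, name, count=1, flags=re_flags)


def download_urls(name, rdatas):
    """URLs from all 'U' (URI) records, best candidate first."""
    records = sorted(
        (parse_naptr(rdata) for rdata in rdatas),
        key=lambda record: (record.order, record.preference),
    )
    return [apply_regexp(r.regexp, name) for r in records if r.flags.upper() == "U"]


rdatas = [
    '100 20 "U" "TCP+http" "!^.*$!http://backup.rubygems.org/rails-3.0.1.gem!i" .',
    '100 10 "U" "TCP+http" "!^.*$!http://rubygems.org/rails-3.0.1.gem!i" .',
]
print(download_urls("1.0.3.rails.index.rubygems.org", rdatas))
```

The primary URL comes out first and the backup second, so a client can simply try them in order.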

Other Considerations

Platforms

To support multiple platforms (e.g. jruby), the client will first try platform.z.y.x.gemname.index.rubygems.org. If this is not found, the client should fall back to z.y.x.gemname.index.rubygems.org. If a platform gem is provided, CNAME records will also need to be provided for all of the variations, i.e. platform.y.x, platform.x and platform.
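The fallback order can be sketched as follows. The exists callback is a stand-in for an actual DNS lookup, and the gem, version and platform used in the example are made up.

```python
def platform_candidates(gem, version, platform=None):
    """Names to try, most specific first."""
    reversed_version = ".".join(reversed(version.split(".")))
    base = f"{reversed_version}.{gem}.index.rubygems.org"
    if platform:
        return [f"{platform}.{base}", base]
    return [base]


def first_existing(candidates, exists):
    """Return the first candidate for which the lookup succeeds, else None."""
    for name in candidates:
        if exists(name):
            return name
    return None


# Hypothetical index that only has a platform-independent entry.
published = {"3.4.1.nokogiri.index.rubygems.org"}
names = platform_candidates("nokogiri", "1.4.3", platform="java")
print(first_existing(names, published.__contains__))
```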

Decentralization

DNS provides the tools necessary to make this a decentralized system if we desire. This would be accomplished by delegating responsibility for gem names to DNS servers other than the rubygems.org servers. For example, if responsibility for management of the Rails gem metadata were decentralized, the interaction might look like this:

  • Client sends question to local name server for TXT records at rails.index.rubygems.org

  • Local name server does not have the record so it sends the usual response indicating that the search should go upstream to the roots

  • Root delegates to .org name servers

  • .org name servers delegate to rubygems.org name servers

  • rubygems.org name servers respond with the following NS record:

    rails.index.rubygems.org.   600   NS   ds1.rubyonrails.org
    rails.index.rubygems.org.   600   NS   ds2.rubyonrails.org
    
  • The question is then sent to one of the two name servers which responds with a CNAME record pointing rails.index.rubygems.org to 1.0.3.rails.index.rubyonrails.org.

  • The rubyonrails.org name servers would then respond as shown in the scenarios above.

Security

DNSSEC provides a means of signing DNS records so that you can verify that the name server is authoritative for the particular question. This technology is not yet widely deployed, but it has the potential to provide a layer of protection against gem poisoning when used in conjunction with an SHA signature. The SHA signature could also be stored in the name servers using a TXT or SIG record. This technology is still very experimental, but the potential exists for a highly trusted distribution system.
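As one possible shape for the checksum part, the sketch below compares a downloaded gem against a digest published in a TXT record. The "sha256=<hex>" record format is purely an assumption made for this example, and the check is only meaningful if the TXT record itself has been validated (for instance via DNSSEC), which this sketch does not do.

```python
import hashlib
import hmac


def parse_digest_txt(txt):
    """Extract the hex digest from a TXT value of the assumed form 'sha256=<hex>'."""
    algorithm, _, hex_digest = txt.partition("=")
    if algorithm != "sha256" or not hex_digest:
        raise ValueError(f"unsupported digest record: {txt!r}")
    return hex_digest.lower()


def gem_matches_digest(gem_bytes, txt):
    """True when the SHA-256 of the downloaded gem equals the published digest."""
    expected = parse_digest_txt(txt)
    actual = hashlib.sha256(gem_bytes).hexdigest()
    return hmac.compare_digest(actual, expected)


gem_bytes = b"pretend this is rails-3.0.1.gem"
txt = "sha256=" + hashlib.sha256(gem_bytes).hexdigest()
print(gem_matches_digest(gem_bytes, txt))
print(gem_matches_digest(b"tampered contents", txt))
```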

Searching

DNS does not provide a mechanism for searching for records given only part of a name. For example, there is no mechanism in DNS to query for the term "active" and get "activerecord", "activeresource", etc. This functionality would need to be provided using a protocol other than DNS.

Reference

List available gem versions

dig @ns8.dnsimple.com rails.index.rubygems.org ptr

List dependencies for the latest version of a gem

dig @ns8.dnsimple.com latest.rails.index.rubygems.org ptr

List dependencies for a specific version of a gem

dig @ns8.dnsimple.com 10.3.2.rails.index.rubygems.org ptr