Anatomy of Gmane v2

Many people have been asking what technology and hardware are behind Gmane these days, so I thought I’d put pen to paper (so to speak) and explain what’s going on under the hood.

In mid-August we received a disk from Lars with the Gmane spool on it. We had already decided to go with ElasticSearch for the document store: it gives us great scalability and, as we rebuild the site, it will allow us to have a fast search engine.

We’ve currently set up:

  • 4 x ElasticSearch data servers (these are off-the-shelf Delimiter dedicated servers) each with Dual L5630, 48GB RAM, 2 x 2TB disk.
  • 2 x ElasticSearch routers (Delimiter Cloud) each with 4 Core KVM VM, 16GB RAM, 50GB NVMe accelerated storage (Ceph).
  • 2 x Nginx webservers (Delimiter Cloud) each with 4 Core KVM VM, 32GB RAM, 100GB NVMe accelerated storage (Ceph).
  • 2 x Redis servers (Delimiter Cloud) each with 4 Core KVM VM, 32GB RAM, 100GB NVMe accelerated storage (Ceph).
  • 10TB ObjSpace (S3 compatible object storage) which handles the ElasticSearch backups.

On the webservers we have a mix of Python and PHP handling the various lookup functions. Redis caches the hot data to relieve some of the pressure on ElasticSearch during busy periods, and the ElasticSearch routers handle the queries into ElasticSearch.
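To make the caching layer a little more concrete, here is a minimal sketch of the cache-aside pattern described above. It is not the code running on the site: `TTLCache` is an in-memory stand-in for Redis, `search_backend` is a hypothetical callable standing in for a query sent through the ElasticSearch routers, and the `msg:` key prefix and 300 second expiry are assumptions made purely for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis: string keys whose values expire after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (self._clock() + ttl, value)


def fetch_message(
    message_id: str,
    cache: TTLCache,
    search_backend: Callable[[str], Optional[str]],
    ttl: float = 300.0,  # assumed expiry, not a value from the real setup
) -> Optional[str]:
    """Return a message body, asking the search backend only on a cache miss."""
    key = f"msg:{message_id}"  # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:
        return cached  # hot data is served without touching the backend
    body = search_backend(message_id)
    if body is not None:
        cache.set(key, body, ttl)  # remember it for later requests
    return body
```

With this shape, repeated requests for the same popular message reach the search cluster only once per expiry window, which is the pressure relief the paragraph above refers to.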

We’re working on adding the NNRP functionality into this, and Martin is coding an NNRP server that will use ElasticSearch as a backend. It works but is not ready for prime time yet. For now, NNRP remains running off INN.
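As a rough illustration of what a news reader server on top of a document store involves (this is not Martin’s code), the sketch below answers a single command, ARTICLE with a message-id, using the standard NNTP response codes 220, 430, 500 and 501. The function name `handle_command` and the `lookup` callable are hypothetical; `lookup` stands in for whatever query would fetch the raw article from ElasticSearch. Article numbers and every other command are deliberately left out.

```python
from typing import Callable, Optional


def handle_command(line: str, lookup: Callable[[str], Optional[str]]) -> str:
    """Answer one client line; only 'ARTICLE <message-id>' is supported here."""
    parts = line.strip().split()
    if not parts or parts[0].upper() != "ARTICLE":
        return "500 Unknown command\r\n"
    # This sketch only accepts the message-id form, e.g. ARTICLE <id@host>.
    if len(parts) != 2 or not (parts[1].startswith("<") and parts[1].endswith(">")):
        return "501 Syntax error\r\n"

    message_id = parts[1]
    article = lookup(message_id)  # hypothetical backend query
    if article is None:
        return "430 No article with that message-id\r\n"

    # Multi-line responses are dot-stuffed: a line starting with "." gets an
    # extra "." so it cannot be mistaken for the terminating ".".
    stuffed = ["." + text if text.startswith(".") else text
               for text in article.splitlines()]
    body = "".join(text + "\r\n" for text in stuffed)
    return f"220 0 {message_id}\r\n{body}.\r\n"
```

For example, `handle_command("ARTICLE <1@example>", store.get)` with a plain dict `store` keyed by message-id returns either the full 220 response or a 430 line, which makes the backend easy to swap out later.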

We’re juggling two priorities at the moment: a new NNRP frontend and a new mailer frontend/backend. Once all the functionality is restored, we can start looking at the web interface and fixing up some of the rushed scripting that was done to get the site back online.

We’d love to hear your feedback: what needs sorting, and what would you like to see?

~ Mark