15. Moving Forward

Too Many SD Cards Problem

I realized pretty quickly when I took on this project that one of the major problems with it is that I need to flash 40 SD cards any time I want to reimage the entire cluster. Sure, I can just try to reimage it as infrequently as possible, but sometimes I know it will be necessary. On that note, I present you with my solution:

SD Card Reader Array

I think this cost me about $70 to put together, which is way less expensive than an SD card duplicator. Here’s the onslaught of messages I got when I first plugged it in:

SD Card Reader Array - In Action

With this setup, I can image 7 SD cards at a time at about 30 MB/s total. Class 4 cards are rated for at least 4 MB/s of sequential writes, so seven of them need roughly 28 MB/s combined, which means all seven can be flashed at the same time at full speed. I think the top speed would be even faster if my laptop had USB 3.0 ports.
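The post doesn't say which tool does the actual writing. As a rough sketch only, here is one way a single image could be written to several cards in parallel from Python, assuming each reader shows up as its own block device. The function names and the device paths in the usage comment are placeholders for illustration, not taken from the post.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

CHUNK = 4 * 1024 * 1024  # copy in 4 MiB chunks


def write_image(image: Path, target: Path) -> int:
    """Copy `image` onto `target` and return the number of bytes written."""
    written = 0
    with open(image, "rb") as src, open(target, "wb") as dst:
        while chunk := src.read(CHUNK):
            dst.write(chunk)
            written += len(chunk)
        # Push buffered data out to the device before reporting success.
        dst.flush()
        os.fsync(dst.fileno())
    return written


def flash_all(image: Path, targets: list) -> dict:
    """Write `image` to every target at once, one worker thread per target."""
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        sizes = list(pool.map(lambda t: write_image(image, t), targets))
    return dict(zip(targets, sizes))


# Hypothetical usage (run as root; the paths are assumptions):
#   flash_all(Path("cluster.img"), [Path("/dev/sdb"), Path("/dev/sdc")])
```

Writing to the wrong device destroys whatever is on it, so the target list deserves a careful check against the kernel messages before anything like this is run.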

SD Card Reader Array - In Action

Assuming 20 minutes per batch of SD cards, I should be able to reimage the entire cluster in about 2 to 2.5 hours. I like this a lot better than the 14 to 17.5 hours it would take without my array of SD card readers.
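The arithmetic behind that estimate can be sketched as follows, using only the numbers from the post (40 cards, 7 readers, 20 minutes per batch). The pure write time comes out a little below the ranges quoted above; the difference is presumably slack for swapping cards between batches, though that is an assumption on my part.

```python
import math

CARDS = 40              # one SD card per node
READERS = 7             # cards written per batch with the array
MINUTES_PER_BATCH = 20  # assumed time to write one batch


def reimage_minutes(cards: int, cards_per_batch: int, minutes_per_batch: int) -> int:
    """Total write time in minutes, ignoring the time spent swapping cards."""
    batches = math.ceil(cards / cards_per_batch)
    return batches * minutes_per_batch


with_array = reimage_minutes(CARDS, READERS, MINUTES_PER_BATCH)
one_at_a_time = reimage_minutes(CARDS, 1, MINUTES_PER_BATCH)

print(f"with the array: {with_array} min ({with_array / 60:.1f} h)")
print(f"one at a time:  {one_at_a_time} min ({one_at_a_time / 60:.1f} h)")
```

Six batches cover all 40 cards, with the last batch only partly full.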

Software Installation

I’m in the process of installing software for a Linux image that will allow me to run a number of different distributed software packages. What I want to create is a Raspbian-based image with Mesos and a number of the packages that run on top of it.

Benchmarks

I have not run benchmarks yet, but I intend to try to run the LINPACK benchmark once I get the prerequisites installed.


11 thoughts on “15. Moving Forward”

  1. Hipska

    Maybe you can have a look at Puppet. Then you only need an image with Raspbian and Puppet installed, and the rest of the configuration and installation can be done with Puppet.

    1. David Guill Post author

      I have no experience with Puppet or Chef. I assume they’re good tools, because people seem to say good things about them. However, a Puppet license for my cluster would cost $112/yr. I can’t justify that yet. I’ll make it a point to try the free version on a few machines at some point and decide based on that if it would be worth paying for it.

      1. Giuseppe Turitto

        Well, you spent $70 on your solution, versus $112 for using Puppet. So you spend an extra $42, learn something new, and have an infrastructure that can be upgraded anytime without having to pull 40 cards…

        By the way, great project. I have been working on something similar of my own, but so far I have only had enough time and resources to set up a 7-node cluster with no HDD support.

        1. David Guill Post author

          Thanks.

          Just so we’re clear though, I would have bought the array of SD card readers regardless of whether or not I install a configuration manager. If I have to reflash all the SD cards in this thing twice, I’ll consider it to have paid for itself in saved time.

          As I said to Hipska, I expect to at least try Puppet at some point. But I’ll probably also try other configuration managers. There are quite a few out there. Wikipedia has a nice page that lists several. Others may turn out to be more appropriate to the Pi than Puppet.

      2. Thomas

        Okay, so you already know that there is also a free open source version. Good, because Puppet Enterprise mainly adds things like more support and a GUI, which are not really needed in this situation. I would go with Puppet Open Source and not pay for the Enterprise version, as this is a hobby DIY project.

        1. David Guill Post author

          I admit, I initially didn’t realize it was available with a dual license, but through further research I did figure it out. I was able to install the free version of Puppet on a couple cluster nodes. My current plan is to try out Puppet, CFEngine, and BCFG2 (most likely in that order), with the intent of learning the basics of all of them. Lately, I’ve been sidetracked by other things, including putting together a simple Python script that will allow me to bootstrap them a little more efficiently.

  2. illuxio

    This may be a solution to your problem that needs no additional investment. Add an extra partition to the SD card containing a minimal Linux that handles flashing the main partition. That way, every node handles the flashing itself by fetching the image from a source and writing it to the main partition.

    My configuration does the following when I want to flash a new image to my Raspberry Pi:
    1. A bash script modifies /boot/cmdline.txt to boot into the flash partition, adds a /boot/source.txt that says where the image can be found, and reboots.
    2. The minimal Linux for flashing boots and automatically runs the flashing process.
    3. When it finishes, /boot/source.txt is deleted, the changes to /boot/cmdline.txt are undone, and the system reboots.
    4. Optional extras follow, such as sending an email to say the process is done.
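For readers who want to picture steps 1 and 3 above, here is a minimal sketch. It is not illuxio's actual script (which is described as bash and isn't shown). It assumes the kernel command line selects the root filesystem with a single `root=` entry. The backup file name `cmdline.txt.orig`, the function names, and the example partition and URL are all made up for illustration. The reboot itself is left out.

```python
import re
from pathlib import Path


def arm_reflash(boot_dir: Path, flasher_root: str, image_source: str) -> None:
    """Step 1: point the next boot at the flashing partition."""
    cmdline = boot_dir / "cmdline.txt"
    original = cmdline.read_text()
    # Keep a copy so the change can be undone afterwards (hypothetical name).
    (boot_dir / "cmdline.txt.orig").write_text(original)
    # Swap only the root= entry; rootfstype= and the rest stay untouched.
    patched, count = re.subn(
        r"\broot=\S+", lambda m: f"root={flasher_root}", original
    )
    if count != 1:
        raise ValueError("expected exactly one root= entry in cmdline.txt")
    cmdline.write_text(patched)
    # Tell the flashing system where to fetch the new image from.
    (boot_dir / "source.txt").write_text(image_source + "\n")


def disarm_reflash(boot_dir: Path) -> None:
    """Step 3: restore the normal boot configuration after flashing."""
    backup = boot_dir / "cmdline.txt.orig"
    (boot_dir / "cmdline.txt").write_text(backup.read_text())
    backup.unlink()
    (boot_dir / "source.txt").unlink(missing_ok=True)


# Hypothetical usage on a node:
#   arm_reflash(Path("/boot"), "/dev/mmcblk0p3", "http://example.invalid/cluster.img")
```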

    1. David Guill Post author

      I’ve considered this type of solution. But transferring the image would have a huge time cost unless the Pis were able to share the downloaded file automatically. Otherwise, there are 40 of them to one Internet connection and 20 of them sharing a 10/100 connection, which would severely slow down the transfer.

      I intend to try to set something like this up after I can easily generate a complete image by script. At this stage, I’m still working on compiling all the software for ARM for the first time.

      1. foxjog

        Have you considered using torrents to share the images across the local network?

        1. David Guill Post author

          The BitTorrent protocol doesn’t solve any of the problems that are currently on my plate, but I have it in mind in case a problem comes along that it does solve.

  3. Pingback: คลัสเตอร์ 40-node | Unofficial of Raspberry Pi Fan in Thailand
