PuppetConf 2015 recordings now available!

PuppetConf 2015 took place during the first week of October, and Puppet Labs has now released the recordings. Here are a few that I can recommend:

Chocolatey and Puppet: Managing Your Windows Software Since 2011
Many people are still scared of managing Windows with Puppet (in fact, they are scared of almost anything involving Windows). Rob Reynolds (Puppet Labs employee) talks about managing software on Windows with Chocolatey.

Are You Sure This Works? Test Your Modules for PE and Puppet Using Beaker, Docker and More
Richard Pijnenburg (Elastic employee), who helped me a lot with setting up an Elasticsearch cluster, talks about tricks and different ways of testing Puppet modules.

Under the Hood: C++ Development at Puppet Labs
Puppet Labs was once a Ruby-only development company. Now they are also heavily using Clojure, C++ and other languages (choose the best language for the job at hand, not the language you know best). Kylo Ginsberg (Puppet Labs employee) talks about C++ at Puppet Labs; their best-known C++ project is probably Facter. Since version 3.0 it has been written in C++, while the older versions (which are way slower) are pure Ruby.

Puppet, After a Decade and Change
Puppet itself is now more than 10 years old. Eric Sorenson (Puppet Labs employee) looks back at the changes in Puppet over the last decade and gives an outlook on what will and could happen in the future.

Hacking Types and Providers – Introduction and Hands-On
Felix Frank (yay, a German dude) contributes heavily to Puppet itself and to our puppet-community namespace, which made him a featured community member very early. He talks about how to read the types and providers source code, his struggles with it, and how to extend the existing code base.

200,000 Lines Later: Our Journey to Manageable Puppet Code
Constant Contact is a U.S. company that started using Puppet very early but failed to keep up with newer versions. David Danzilio (the man with the best GitHub avatar, just ahead of daenney) started there to clean up the mess. He did an awesome job and talks about the cleanup.

Puppetizing Your Organization: Taking Puppet from a Proof of Concept to the Configuration Management of Choice
I already wrote about the awesomeness of Rob (hey Rob o/) so I don’t need to repeat that here. He is currently working at AT&T and talks about proofs of concept in configuration management and how to introduce them to colleagues.

Am I Awesome or Does This Refactor Suck?
Gary Larizza (who has an awesome blog that I mentioned earlier and does even more awesome work and talks) talks about refactoring. Please watch this video, I really like his way of presenting and talking.

Last but not least: Identity: LGBTQ in Tech
Daniele Sluijters (the person with the second-best GitHub avatar) talks about the people in open source communities. I don’t want to spoil the talk, so please just watch it.

All the videos are available at puppetlabs.com

Posted in 30in30, General, Internet found pieces, Linux, Nerd Stuff, Puppet

(not really) weekly link collection

Not Even Close: The State of Computer Security (with slides) – James Mickens from NDC Conferences on Vimeo.

Create Puppet modules with solid foundations

How to Use Puppet right

The CAP Theorem (only Wikipedia)

Serving Dovecot mailbox quota status to Postfix

Problems with programming and time part 1

Problems with programming and time part 2

Problems with time part 3

Problems during naming things in programming

Last but not least: David Danzilio said something wise about testing with a code example:
17:50:36 danzilio | it tests the implementation instead of the interface
17:50:44 danzilio | which is the number one rule of testing
17:50:52 danzilio | test the interface, not the implementation

Posted in 30in30, General, Internet found pieces, IT-Security, Linux, Nerd Stuff, Puppet

LARS – Live Arch Rescue System

Yesterday I wrote about solutions for deploying and installing machines with installimage. As mentioned, those scripts need to be executed from a live Linux system. Today I’m presenting LARS, a live Linux perfectly suited for this purpose. LARS is maintained by the VirtAPI Team and is based on a modified Arch Linux (also called Arch) environment. How does this magic work?

build.sh
The main script is build.sh. It fetches a minimal list of needed packages, installs them into a subdirectory, and then adds the packages from packages.both on top. At that point you already have a small Arch installation in a subdirectory that you can chroot into. build.sh also adds syslinux and EFI support, which allows us to boot the ISO in BIOS and (U)EFI mode from USB/CD drives as well as from a PXE server.
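
A minimal sketch of that step, assuming an archiso-style layout (the directory name and package handling are illustrative, not the literal build.sh contents; older pacstrap versions also insist that the target directory is a mountpoint):

# sketch: bootstrap a small Arch installation into a subdirectory
work_dir="work/airootfs"
mkdir -p "$work_dir"
# minimal package set, reusing the host's package cache (-c)
pacstrap -c "$work_dir" base syslinux
# then add everything listed in packages.both on top of it
pacstrap -c "$work_dir" $(grep -v '^#' packages.both)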

customize_airootfs.sh
Before the ISO file is created, the customize_airootfs.sh script is executed in a sort of chroot. This allows us to place a hardcoded SSH key in the authorized_keys file, use internal mirrors, or modify the fstab so that the NFS server which provides installimage itself and the images is mounted automatically on startup.
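
For illustration, a few typical lines such a script could contain (the SSH key, mirror URL and NFS export are placeholders, not the actual LARS values):

# runs inside the chroot before the ISO is assembled
mkdir -p /root/.ssh && chmod 700 /root/.ssh
echo 'ssh-ed25519 AAAA... deploy@example' > /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
# prefer an internal pacman mirror
sed -i '1i Server = http://mirror.internal.example/archlinux/$repo/os/$arch' /etc/pacman.d/mirrorlist
# automount the NFS server that provides installimage and the images
echo 'nfs.internal.example:/srv/images  /mnt/images  nfs  ro,noatime  0 0' >> /etc/fstab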

config.sh
Every configuration option should be available in config.sh. This file is used to configure the deployment of the image, but also for settings inside the ISO itself.

automated_script.sh
The ISO has a cool default behaviour: you can add a parameter like “script=http://…bla.sh” to the kernel cmdline; after startup the ISO will detect it, do an autologin on TTY1, download the script and then execute it. This magic is done by automated_script.sh, which is simply triggered by the .bash_profile.
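
The mechanism boils down to something like this (a rough sketch, the real automated_script.sh may differ in the details):

# pick the script=... parameter out of the kernel command line
script_url=$(sed -n 's/.*script=\([^ ]*\).*/\1/p' /proc/cmdline)
if [ -n "$script_url" ]; then
    curl -fsSL -o /tmp/autorun.sh "$script_url"   # download the script
    chmod +x /tmp/autorun.sh
    /tmp/autorun.sh                               # and execute it
fi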

rebuild_and_copy.sh
We created the rebuild_and_copy.sh script to easily rebuild the ISO and copy it to the DHCP server. It takes care of cleaning up your build environment before building the ISO via build.sh and also copies the result to the DHCP server, where it gets extracted.

How to boot it?
We wrote an example pxelinux config that you can use; we will add it to our GitHub repo later. In the next few weeks we will also create a Puppet module which configures a server with NFS, tftpd + pxelinux, nginx and isc-dhcpd to serve LARS and installimage to all nodes on your network. Let me know if you have any suggestions or questions!
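
Until the example lands in the repo, a pxelinux entry for LARS could look roughly like this (the server addresses, paths and archiso boot parameters are assumptions, check the archiso documentation for your build):

# /srv/tftp/pxelinux.cfg/default (hypothetical example)
DEFAULT lars
LABEL lars
    KERNEL lars/vmlinuz
    INITRD lars/initramfs-linux.img
    APPEND ip=dhcp archisobasedir=arch archiso_nfs_srv=10.0.0.1:/srv/lars script=http://10.0.0.1/installimage.sh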

Posted in 30in30, General, Linux

Deploying compute nodes/instances the awesome way!

Everybody is talking about cloud computing and how to use it. As a system engineer I’m thinking more about “how the hell do we build this magic black box”. One important aspect of delivering compute instances (like AWS EC2) is: how do I deploy/install these instances?

I need a generic solution that can install different Linux distributions on different hypervisors (like QEMU and VMware vSphere). Instances still run on physical hardware, so I also need to deploy those. Every solution works with some kind of images, and nobody wants to maintain the same distribution in two different image formats, so it would be good to have a really generic solution that can provision both virtual machines and physical nodes. I’m the founder of the VirtAPI project, and one of our projects is installimage. This is a collection of bash scripts originally developed and still in use by Hetzner Online GmbH. The idea is to boot a small live Linux via PXE and then run these scripts. They will:

  • create software RAIDs
  • create partitions
  • format them with your favourite filesystem
  • unpack a Linux image
  • install GRUB in a chroot
  • set the root password/SSH key/autostart options

All of this can be completely automated with a config file; another option is to use the guided installer. installimage currently supports Debian, CentOS, Ubuntu, Fedora and openSUSE based on zipped images. It also features Gentoo and Arch Linux as live installations (pacstrap/stage3 archive), but this isn’t well tested yet. I’m currently updating the code to remove some legacy calls (for example parsing ifconfig instead of using ifdata/ip). The VirtAPI Team will also release a new repository to build an Arch ISO which is optimized for booting via PXE and running installimage.
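
As a hedged example (the key names follow the upstream Hetzner format, but the image path is made up and the supported options depend on your installimage version), a fully automated run could use an /autosetup file like this:

# /autosetup -- if this file exists, installimage runs unattended
DRIVE1 /dev/sda
DRIVE2 /dev/sdb
SWRAID 1
SWRAIDLEVEL 1
BOOTLOADER grub
HOSTNAME node01.example.com
PART /boot ext3 512M
PART lvm  vg0  all
LV vg0 root /    ext4 20G
LV vg0 swap swap swap  4G
IMAGE /mnt/images/Debian-81-jessie-64-minimal.tar.gz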

PXE is a perfect deployment solution: it is a well-established protocol stack that has been around for nearly two decades and works on every physical node and on every paravirtualized virtual machine. installimage is a generic solution without a “vendor lock-in” like Red Hat’s Kickstart or Debian’s debootstrap. installimage is already widely deployed; let me know if you’re interested in this project and want to use it or participate!

Posted in 30in30, General, Linux

Datawarehousing for Cloud Computing Metrics

Besides being a system engineer in the area of cloud computing, I’m currently studying information technology with a focus on computer engineering. We’re a group of 16 students and need to form teams of two to five people and do our thesis as a group. I will probably lead a three-man team for the project. One possible topic that I came up with is:

Datawarehousing for Cloud Computing Metrics

Description of the topic:
$company is running several Linux hypervisors with many virtual machines. We want to log usage statistics like CPU time/IO/RAM usage/disk usage/network traffic (for example with collectd and its virt plugin), send these values to a database, transform the historical values into trends by summarising them, and make all data accessible via an API and a web interface.

Goal:

  • Provide in-depth statistics for every VM/VM owner, which would allow per-usage billing (for example with Grafana and Cyclops)
  • Create an API where you can input the specs of a virtual machine that you want to create, and our data warehouse will find the most suitable node for you based on its recent usage

Project scope:
We’ve got a time frame of at least 120 hours per group member, so 360 hours for the complete project. We will meet with $company in person for kickoff/milestone meetings and work 1-3 weeks in their office, but most of the time we will work remotely because we still have our regular jobs plus evening classes in Germany.

This is a huge project and we need to focus on specific aspects of it; these could be:

  • Which use cases exist for the data warehouse, which metrics suit these cases, how do we get them, how long do we have to keep them and how often do we have to poll them?
  • What is the best solution for aggregating values?
  • What are the requirements for our database and which DBMS meets them? (document based, relational, time series, graph database)
  • Which information can we get from our database and how do we use it?
  • How do we provide that information?

1. There are many tools to collect information on a Linux node. I already mentioned collectd; alternative solutions are sysstat and atop, and it would also be possible to write our own tool to gather this information. An important point to think about is: which information do we actually need? Many people like to save everything they can get “because I may use it later”. But in a huge setup with 35,000 virtual machines, collecting information every 10 seconds and keeping it for a few months or for the complete life cycle of a virtual machine creates a huge amount of data (and may also slow down the database). Depending on the storage type (cheap disks or more expensive SSDs), it is worth thinking about the amount of metrics and whether you really need them. We also need to consider the different metrics that are possible with the different tools; all of them offer different metrics, so do we need all metrics from all tools?
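
To make “a huge amount of data” concrete, a quick back-of-the-envelope estimate (only the 35,000 VMs and the 10-second interval come from the text above; metrics per VM, bytes per sample and retention are assumptions):

# rough storage estimate: 35,000 VMs, 20 metrics each, one sample every 10 s,
# 16 bytes per raw sample, 90 days of retention
vms=35000; metrics=20; interval=10; bytes=16; days=90
samples_per_day=$(( 86400 / interval ))
total=$(( vms * metrics * samples_per_day * days * bytes ))
echo "$(( total / 1024**3 )) GiB of raw samples"   # roughly 8,100 GiB, before indexes or replication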

2. The central collecting and aggregation service is the core part of this project. There are existing solutions for this problem, for example Logstash with the ganglia or udp input plugin. Another option is Riemann, an event stream processor which handles many different kinds of event streams, combines them and triggers actions based on them. It would also be possible to write our own service in C/C++, Ruby or Clojure. The basic requirements for all solutions are: listen on an interface for incoming values or pull them from the nodes, possibly aggregate them in some way, and write them into $database.
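
As a toy illustration of the “aggregate them in any way” step (this is not how Logstash or Riemann work internally, it only shows the idea of turning raw history into a trend, and cpu_usage.samples is a made-up input file of “epoch-seconds value” lines):

# downsample raw samples to hourly averages
awk '{ h = int($1 / 3600); sum[h] += $2; n[h]++ }
     END { for (h in sum) printf "%d %.2f\n", h * 3600, sum[h] / n[h] }' cpu_usage.samples | sort -n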

3. Things to think about: do we need a distributed search and analytics engine like Elasticsearch, or a distributed NoSQL setup with Cassandra, or is it okay to work with a single relational database like Postgres? Another option would be a time series database (a time series is a flow of multiple related values, for example the CPU temperature measured every 30 seconds over a period of 10 minutes) like OpenTSDB (which also has a built-in API for retrieving the values). Network links inside a datacenter can be considered stable, so faults are the exception. Is it worth building a huge cluster if the possible downtime is only a few minutes (for example during maintenance work)? Or is a distributed setup needed because of IO bottlenecks? Should the central collecting service spool the values if the database is unavailable? In a large environment capacity planning is also important: how efficiently can the database compress and store the values? Which system has the lowest memory/CPU usage per saved and processed value (this also matters for the software that processes the values, see point 2), and how do you measure that efficiency?

4. We can get detailed usage statistics from our database and use them for per-usage billing, and it is also possible to find the perfect node for a new virtual machine. Another idea is to forecast usage based on old data. For example, one machine always had high IO usage at Christmas; our service could inform you before Christmas that this machine will probably produce high IO again and offer you an alternative node with more IO capacity. We can also analyse the current usage of the whole platform and of each node and optimize the platform by offering recommendations for virtual machine migrations that increase packing density (free up as many host systems as possible, optionally restricted to a list of allowed source and/or destination nodes). Last idea: you often have to do maintenance work and exchange old hardware for new hardware. We can create an algorithm which accepts any node as the source (and an optional list of allowed destination nodes) and outputs suitable migration destinations with the goal of migrating as fast as possible.

5. We need different ways to access the data. One web interface already mentioned is Grafana; an alternative is Graphite. Both are fine for end users to get stats, but what about system administrators? They need a working API to interact with. Which is the best data serialisation format? Do we need to expose graphs or the raw history/trends? How does the API need to be structured?

Posted in 30in30, General, Linux, Nerd Stuff

30 Posts in 30 Days

Robert is again doing a blog series where he writes 30 posts in 30 days. I’m also participating (as I did last year) and will try to write one post each day, from the 2nd of November until the 2nd of December.

Let us start this 30 posts challenge with a special word of praise to Robert.

I first met him in #r10k, where I had come to learn about solutions for creating Puppet test environments somewhat dynamically. Robert contributed heavily to the r10k documentation and is always present there to answer even the stupidest questions. Those questions were almost always mine; without his help and that of finch (the r10k author) I would probably still be trying to figure out how r10k works and how to implement it in my employer’s infrastructure. After setting all this up I had a great utility to create dynamic environments on all of our Puppet masters, but then I had a new issue: how do I structure my Puppet code to make it easily distributable to any Puppet environment, and how do I manage which node gets which code? Robert blogs heavily and also wrote about the solution to my issue, the roles and profiles design pattern.

I try to give support on the #puppet IRC channel, and many people have the same problems I had. I stopped describing the solution a long time ago and only refer to his blog post (series), which really covers all the issues you can run into while designing your Puppet infrastructure.

He even improved the complete workflow that comes with the roles/profiles pattern, wrote a very detailed article about it, didn’t forget Hiera, and even created a bootstrap Puppet file and rspec tests for it.

I’m writing this blog series in English to honor his work. He is always reachable via IRC and open to any question, usually has a good Muppet Show clip up his sleeve, and regularly blogs about cool stuff (mostly Puppet related). Sadly I didn’t make it to PuppetConf 2015, where he gave a talk; hopefully I can meet him at PuppetConf 2016.

Robert, I owe you a beer!

Posted in 30in30, General, Linux, Nerd Stuff, Puppet

Short Tip: Start SMART tests on all drives

First we use find to get a list of all drives:

find /dev/sd?

Then we work out the command to start the test:

smartctl -t long /dev/sdX

And then the two can be nicely combined:

find /dev/sd? -exec smartctl -t long {} \;
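
Once the long tests have finished (they can take hours), the results can be collected the same way; -l selftest prints the self-test log of each drive:

find /dev/sd? -exec smartctl -l selftest {} \;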
Posted in General, Linux, Short Tips

IRC Quote of the Day

22:34:09 canta | normally gcc just blows everything away anyway and inlines your mother

Posted in General, Internet found pieces, Linux, Nerd Stuff

Arch Linux: Uninstall packages that are no longer needed

The package manager pacman can remove a package including its configuration files and, recursively, its no-longer-needed dependencies:

pacman -Rns paketname

In addition, pacman can print a list of packages that were installed automatically as dependencies but are no longer needed:

pacman -Qtdq

The two can now be combined to clean up your system:

pacman -Rns $(pacman -Qtdq)
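
One caveat (my addition, not part of the original tip): if there are no orphaned packages, the inner pacman -Qtdq prints nothing and pacman -Rns aborts with “no targets specified”. A small guard avoids that:

# only call pacman -Rns when -Qtdq actually found orphans
# ($orphans is deliberately left unquoted so each package name becomes its own argument)
orphans=$(pacman -Qtdq) && pacman -Rns $orphans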
Posted in General, Linux, Short Tips

Turkey/Chicken BBQ Rub

Ingredients for 1 kg of chicken breast fillets:

  • 3 tbsp cane sugar
  • 1.5 tbsp hot paprika powder
  • 1.5 tbsp sweet paprika powder
  • 1.5 tbsp freshly ground salt
  • 1 tbsp freshly ground mixed peppercorns
  • 1 tbsp garlic powder
  • 0.5 tbsp cayenne pepper

Preparation:
Pretty simple: throw everything together and stir. Trim the chicken breast fillets (and, if possible, cut out pieces of equal thickness) and coat them generously with the rub. For a better flavour, cut into the thicker parts and fill the cuts with the rub. The fillets should rest in a container for at least 12 hours; excess rub can stay in the container. Sugar and salt draw the water out of the meat; it will collect in the box and can be poured off.

Tip: To let the sugar really come through, the chicken breast fillets should be cut into very thin strips (under 2 cm). That way the meat only needs to be grilled very briefly and the rub does not burn. Alternatively, you can use classic turkey fillets, which are always relatively thin.

(The recipe originally comes from the German Grillsportverein)

Posted in General, Recipes