Puppet | Scaling the Control Repository to Large Organizations

The main purpose of this flexible setup is to provide controlled organizational growth towards DevOps principles in large organizations. 

Many larger organizations have made acquisitions along the way, and the acquired companies generally remain almost completely separate entities that want to share when they can. Some organizations have teams that work under different rulesets (such as compliance regimes), which makes “treat all nodes exactly the same” a difficult goal that cannot be reached all at once. These organizations are seeking ways to do things in a standardized and automated way, but often have trouble building meaningful projects to do it. PayPal popularized the term “InnerSource”, which means taking the lessons learned from open source and applying them to the way a company develops software internally. These large organizations, which can really be thought of as communities of smaller organizations seeking to achieve similar goals, won’t necessarily do things exactly the same way, but they can agree on things they’d like to share.

Larger enterprises I’ve worked with almost always seek to deploy Puppet first to some type of internal Infrastructure as a Service or Platform as a Service team (often called “Infrastructure”). This team deploys a baseline OS (with all security rules, company policies and compliance policies in place) and often some middleware such as Tomcat or JBoss, and its scope could extend to container managers such as Kubernetes or Docker Swarm. The various organizations consume this code and, thanks to git, have the ability to contribute back to the profiles module and provide global data (via Hiera) to the organization. The infrastructure team acts as the governing body of the profiles module: it can approve or reject code, and even write automated acceptance tests for new code in RSpec and Beaker.

By splitting the control repository into smaller control repositories, each team can achieve automation and share code with the others via git pull requests. Because the code is modular, each team can approach automation on its own schedule and at its own pace, while still taking advantage of automation already completed in other parts of the organization. With the changes in this article, the application organizations can govern their own configurations, call on company-approved baseline infrastructure as code and improve on it for their specific needs, and even self-nominate their own code to be shared globally across the organization, at whatever pace the organization allows.

Scaling code with Puppet
Control repositories act as a collection of things that define a Puppet environment. The control repo can be a strange git repository to maintain, because it’s easy to write code in it that you never want to merge into another branch. At a small scale this can go unnoticed, but as more teams and users get involved, we start to see data at our profile layer that becomes harder to test across Puppet environments. Our traditional control repository contains a few things: a Puppetfile, a site.pp, a Hiera hierarchy and dataset, and modules for our roles and profiles. One of the first steps of scaling the control repo is redefining it for your organization. My general thought process is that a control repo is for defining a shared environment, and anything that doesn’t directly contribute to that should be moved out of it.

As a git repository, the control repo is a little weird. Because we use the control repo to define both our organizational and code environments, it’s easy to end up with a hodgepodge of long-lived and short-lived git branches. This often makes the control repo difficult to fully understand, and leaves users manually plucking features across branches and struggling to roll back effectively. Because of this, from a UX perspective, I think the control repo is a difficult code repository to work in. I want to minimize the amount of work I do inside the control repo, and keep it as a record that defines an entire environment.

Putting together a control repo based only on libraries, global configuration and datasets leaves us with a smaller footprint inside this repository, letting us essentially create a set of files that only describes the environment.

Your libraries are defined by the Puppetfile, which is a simple list of required code, a repository location to pull each item from, and optional data (such as a version) to ensure we select an exact match.

Your global configuration lives in manifests/site.pp. Hiera and the Puppet Enterprise Console have generally taken over the role that site.pp served in earlier versions of Puppet, and a site.pp rarely contains individual node or regex node definitions anymore. We do still use site.pp to set global values for our environment, such as setting the default exec provider on Windows nodes to PowerShell, or the default package provider to Chocolatey. We sometimes set the header and footer of our firewall rules here, or other “always use this” configuration (such as a filebucket) that’s hard to contain even in roles and profiles when we want it to apply to all machines.

Finally, we have our Hiera dataset. How we use it doesn’t actually differ that much: we build a Hiera hierarchy and a data folder to contain the data. What we choose to store here is a bit different. My general rule for hieradata at a large enterprise: don’t store it here unless multiple profiles (or manifests) will want to access the data. This level of Hiera is no longer the best place to store individual manifest configurations like profile::ntp::servers. It is a great place to store things like LDAP settings or a common pre-shared key (in eyaml) that can be used by the entire organization.
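To make the site.pp part of this concrete, here is a minimal sketch of what such a file might contain. It assumes that modules supplying the powershell exec provider and the chocolatey package provider are already listed in the Puppetfile, and that node classification happens in the PE Console and Hiera; treat it as an illustration rather than a recommended layout.

```puppet
# manifests/site.pp (illustrative sketch, not a definitive layout)

# Environment-wide resource defaults for Windows nodes.
# Assumes the modules that ship the `powershell` exec provider and the
# `chocolatey` package provider are listed in the Puppetfile.
if $facts['os']['family'] == 'windows' {
  Exec {
    provider => powershell,
  }

  Package {
    provider => chocolatey,
  }
}

# "Always use this" configuration: declare a server-side filebucket and
# make it the default backup target for every File resource.
filebucket { 'main':
  path => false, # false selects the remote bucket on the Puppet server
}

File {
  backup => 'main',
}

# No per-node or regex node definitions live here anymore.
node default { }
```

Because these are resource defaults declared at top scope, they apply to every node in the environment unless a class overrides them, which is why they fit better here than in any single profile.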

Stripping everything else out of your control repo leaves you with a small set of files that uniquely describes each environment at your organization, without any confusion about versioning (which is common when roles and profiles are integrated into the control repo). You can even version the control repository easily with a new branch, and point the Puppet Enterprise Console at that branch from an environment group with a more human-friendly name (such as Preprod). We still need to account for roles, profiles, and single-instance Hiera configuration.

RBAC and governance become much more notable issues at large organizations. Often, processes around separation of duties and ownership already exist in the organization and cannot just be discarded. This makes decisions about where to store data (such as encrypted eyaml data) and when to update organization-wide roles and profiles a contentious topic for teams. When rules exist around separation of duties, it can also become difficult to decide who is allowed to release changes into the wild in different scenarios. By relegating the control repo to just configuration management of our configuration management system (managing the managers), we allow different models of code governance to flourish.

We end up with two very different types of modules. We have traditional modules, which manage a single piece of technology and allow for maximum customization. We also have a new “team” class of module, which actually mimics a control repository. Thanks to data in modules, we have a Hiera hierarchy that is unique to every team, yet still has complete “break glass in case of emergency” functionality via the environment-level Hiera for engineers, and via the PE Console variables for tier 0/1 support. We can even engineer the PE Console groups around these values to write tier 0 support code right into our manifests, as long as they’re well documented and delivered. Teams can ingest their own hieradata, specific only to the applications or nodes they control, while still making use of the datasets at the environment level that are meant to be consumed by all users (like LDAP). A concept of “local” and “global” hieradata is pretty useful in this context.
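The “local” versus “global” split can be sketched from the manifest side. In the hypothetical profile below, the class name, the parameter and both Hiera keys are invented for illustration; only the lookup behavior is standard Puppet. Hiera consults the environment layer before the module layer, and a module’s own data can only supply keys inside that module’s namespace.

```puppet
# manifests/profiles/ldap_client.pp in the team module.
# The class name, parameter and both Hiera keys are hypothetical.
class myteam::profiles::ldap_client (
  # "Local" data: automatic parameter lookup resolves the key
  # myteam::profiles::ldap_client::bind_user. The team module's own data
  # can supply it, and because the environment layer is consulted first,
  # engineers can still override it there in an emergency.
  String $bind_user = 'svc_myteam',
) {
  # "Global" data: an explicit lookup of a key outside the module's
  # namespace, so module data cannot supply it; it has to come from the
  # environment-level (or global) hierarchy. The last argument is a
  # fallback default used only when no layer defines the key.
  $ldap_server = lookup('ldap::server', String, 'first', 'ldap.example.com')

  notify { "LDAP bind as ${bind_user} against ${ldap_server}": }
}
```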

Our other major change is how we use the manifests directory. We don’t use init.pp in this module, as it acts like a mini control repo. Inside our manifests directory, we create a roles folder and a profiles folder, and work only on roles and profiles. Now each individual team is in control of its own roles and profiles, which are easily namespaced to the teams that actually have RBAC or governance responsibility for those nodes. Roles, for example, just become myteam::roles::app3, and files are referenced as puppet:///modules/myteam/app3. By its nature, this change encourages some stovepipes. The best advice I can give here is what I say about roles and profiles in general: if it becomes too big, get it out of roles and profiles and make a module out of it. If your team is managing a chunk of code that is being consumed in that many different ways, InnerSource it. Place it in a module in git, decide who the governing body of that code is (anyone technical enough to really read the code, who has an active role in it) and share it across the organization. It’s just like starting a new open source project for the company, which is awesome!
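As a rough sketch of what a team-owned role and profile might look like under this layout: the myteam and app3 names follow the example above, while the managed file path and the app3/app3.conf file inside the module’s files directory are assumptions made for illustration. profile::baseline stands in for a shared baseline profile owned by the infrastructure team.

```puppet
# manifests/roles/app3.pp
# A role only composes profiles: the company baseline plus the team's own.
class myteam::roles::app3 {
  include profile::baseline       # assumed shared profile from the infrastructure team
  include myteam::profiles::app3
}

# manifests/profiles/app3.pp
# The team-owned profile. The target path and source file are hypothetical.
class myteam::profiles::app3 {
  file { '/etc/app3.conf':
    ensure => file,
    owner  => 'root',
    group  => 'root',
    mode   => '0644',
    # Served from files/app3/app3.conf inside the myteam module.
    source => 'puppet:///modules/myteam/app3/app3.conf',
  }
}
```

Each class lives in its own file so the autoloader can find it; they are shown together here only for brevity.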

Our original control repo can still contain our traditional long-lived environment branches (Development, Pre-Prod, Production). It can also support new environments for self-provisioning, simply by creating a new branch of the repository. We can protect the real environments like production with protected branches, and allow users or teams to create their own temporary environments for individual development. The organization just needs to assign someone technical to manage pull requests and effectively own the repository. Since this work is mostly just incrementing versions and adding global data, the repository doesn’t change as often as it did before.

When I’ve worked on this model with organizations, and they feel they can control the pace of automation, you actually hear teams talking to each other about how to solve problems. While one team may not intend to start writing code immediately due to other obligations, another two are planning the first modules they’ll write to deploy something. Another team that already had Puppet before the rest of the organization starts talking about which modules they’ll recommend, which of their own they’ll open source, and which code they may want to fix before sharing with others (much of it built on improper use of Forge modules). Automated tests start to make sense when the expert on the application is also the one writing the roles, profiles and even modules that support it. The development team is often ecstatic that they won’t need to wait for some manual process to get a custom machine for application development, and wants to write a role and profile that ingests profile::baseline and some middleware built by the infrastructure team.

The best part is some of the major changes that can happen just by developing good habits across teams. Many companies have duplicate environments, often numbered like: dev1, dev2, test, stg1, stg2, pre-prod, prod. Eventually, when we get good at deploying easily to dev1 and dev2, and they’re almost always the same due to automation, we just combine them into dev. We repeat this as we keep gaining confidence in our deployments, until we are finally sending out many more deployments per day to fewer duplicate environments. Integration is expensive unless you do it all up front, continuously. The ability to do it at different paces gives an organization the ability to walk the walk, and not be forced to be a “DevOps company” on the first day. DevOps is a culture of good habits that can be started at any time, but changing a lot of habits simply takes time.

If you want a sample of what this control repo and team repo might look like with one team:

RyanR at IGNWs-MacBook-Pro in ~/workspace/ignw/puppet
$ tree control-repo
control-repo
├── hiera.yaml
├── Puppetfile
├── data
│   ├── common.yaml # Just global values!
│   └── datacenter
│       ├── pdx.yaml # Just global values!
│       ├── sea.yaml # Just global values!
│       └── tx.yaml # Just global values!
└── manifests
    └── site.pp

3 directories, 7 files

RyanR at IGNWs-MacBook-Pro in ~/workspace/ignw/puppet
$ tree myteam
myteam
├── CHANGELOG.md
├── Gemfile
├── README.md
├── Rakefile
├── appveyor.yml
├── examples
├── files
├── hiera.yaml
├── data
│   ├── common.yaml
│   └── os
│       ├── RedHat.yaml
│       ├── Ubuntu.yaml
│       └── Windows.yaml
├── manifests
│   ├── profile
│   │   └── myapp.pp
│   └── role
│       └── myapp.pp
├── metadata.json
├── spec
│   ├── classes
│   │   ├── profile
│   │   │   └── myapp_spec.rb
│   │   └── role
│   │       └── myapp_spec.rb
│   ├── default_facts.yml
│   └── spec_helper.rb
├── tasks
└── templates

13 directories, 17 files

This module was created with the Puppet Development Kit (PDK). Check out the PDK if you haven’t; it’s a great quality-of-life tool for building good modules.

Download: https://puppet.com/download-puppet-development-kit 

Git: https://github.com/puppetlabs/pdk