San Francisco, Sep 23-24, 2014

Videos and slides are now available, though you need to enter your contact info to access them.

Good summary reviews on the OpenTable Blog (part 1, part 2, part 3) and on the Ship Show Podcast.

Day 1: Keynotes

Keynote 1: Luke Kanies, Puppet Labs

  • Native (C++) versions of Facter and the agents are in progress (expected later in 2014 or sometime in 2015)
  • Puppet-Server (Clojure + JRuby) is the future of the server side, with huge performance boosts from both of these options.  Available in early-access form *today*.  Metrics plug into Graphite.
  • Puppet-supported and puppet-approved modules.  Fully tested on lots of platforms, follow all best practices, serve as great examples for low-level modules.

Keynote 2: Gene Kim, Phoenix Project, Lessons Learned.  High performers (that use puppet or similar best practices): “...have 30x more deployments and 8000x faster lead time, 2x the change success rate and 12x faster recovery.”

Day 1: Sessions

Session 1 - r10k:

  • middleman between git repository and puppetmaster
  • each environment checks out its own copy of a module with a specific version
  • environments are assigned
    • on puppet agent: in puppet.conf or on command line, not as secure
    • on master w/node classifier: authoritative/secure
  • development/
    • Puppetfile
    •   git@internal:yum 119c30a063
    •   forge modules: puppetlabs/java
  • git hook
    • on git push to the repository, r10k does a git pull and creates the environment folders
    • should happen on all masters in a multi-datacenter environment
  • puppet agent -t --environment new_feature --noop
    • allows you to create a new feature branch and simulate what would happen before it gets applied
  • Git hook looks like
    • r10k deploy -c r10k.yaml environment production
  • git workflow
    • git checkout -b branchname
    • make mods
    • git add; git commit; git push
    • test
    • git checkout production
    • git merge branchname
    • git push
    • delete the old environment
    • git push origin :branchname
    • git branch -d branchname
    • ...
    • r10k deploy -c r10k.yaml environment
  • ls environment    # should have deleted branchname
  • Migrating from monolithic repo to repo per module
    • versioning Puppetfile
    • environment.conf 
  • alternatives to r10k:
    • shell scripts
    • librarian-puppet: just checks out the modules listed in the Puppetfile (it does not do "given a source of environments, make them on disk")
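
The branch-per-environment idea in this session (every branch gets its own directory, and deleting a branch removes its environment on the next deploy) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not r10k's actual implementation; the function name `plan_environments` and the sample branch names are hypothetical.

```python
def plan_environments(branches, existing_dirs):
    """Return (to_create, to_update, to_delete) for a branch-per-environment layout.

    branches: branch names currently present in the git repository
    existing_dirs: environment directory names already on the master
    """
    branches = set(branches)
    existing = set(existing_dirs)
    to_create = sorted(branches - existing)  # new branch -> new environment
    to_update = sorted(branches & existing)  # branch still exists -> refresh checkout
    to_delete = sorted(existing - branches)  # branch deleted -> remove environment
    return to_create, to_update, to_delete


# Hypothetical example: 'old_feature' was merged and its branch deleted.
create, update, delete = plan_environments(
    branches=["production", "new_feature"],
    existing_dirs=["production", "old_feature"],
)
print("create:", create)
print("update:", update)
print("delete:", delete)
```

Treating the set of branches as the source of truth is what makes the "delete the branch, redeploy, environment is gone" step of the workflow work without any manual cleanup.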

Session 2 - Doing the Refactor Dance - Making your Puppet Modules more Modular

Gary Larizza, Puppet Labs Professional Services, @glarizza
Slides from the talk

Blog posts: Roles and Profiles, Class containment in Puppet (anchor and contains() pattern), on R10k, On environments and R10k, and Directory-based Environments
Music recommendations: Above and Beyond, Kaskade, some Dada Life
Component Modules:

  • base implementation for /anything/.  Collection of classes that set up individual bits
  • tip "stop writing custom component modules" - migrate things we did over to modules that exist
  • get out of the business of maintaining & updating modules to leverage the community - you are not unique
  • Parameterize classes
class apache (
  $confdir  = $apache::params::confdir,
  $conffile = $apache::params::conffile,
) inherits apache::params {
  file { $confdir:
    ensure => directory,
  }
}

class apache::params {
  case $::osfamily {
    'RedHat': {
      $confdir = '...'
    }
    'Debian': {
      $confdir = '...'
    }
  }
}
- Params are API!
- Params at top of module
- Give yourself a default from params class
- Single Entry Point
- The "Forge" test -- "Can I take your module and put it up on the Forge?"
- "Shareable Data" like above belongs in the class.  Private data goes into Hiera.


  • validate_* functions, including validate_absolute_path (from puppetlabs stdlib)
  • Never pass unvalidated data to resources
Class Containment
- Example: mysql::server from the forge
class mysql::server (
  ## params here
) inherits mysql::params {
  include ::mysql::server::install
  include ::mysql::server::config
  include ::mysql::server::service

  # Solution was to use the Anchor pattern
  anchor { 'mysql::start': }
  -> Class['mysql::server::install']
  -> Class['mysql::server::config']
  -> Class['mysql::server::service']
  -> anchor { 'mysql::end': }
}

- If class is included in another part, we don't know which class included it in the first place
- Solution: in Puppet >= 3.4.0 you can use the "contain" function:
class mysql::server (
  ## params here
) inherits mysql::params {
  contain ::mysql::server::install
  contain ::mysql::server::config
  contain ::mysql::server::service
}

Use "contain" when those classes should be logically contained -- BLOG POST (in google doc)


Hiera:

  • Mechanism to extract *data* from puppet code
  • Where's $osfamily ? - when in Hiera hierarchy it introduces a WORLD of pain.  NOT RECOMMENDED!
  • What's "Application Tier" where you're using dev, test, prod
  • Concept of 'Environment'
    • Idea is it is short-lived, a migration path to 'production', 'The Model'.  Use for testing, then kill them off
  • Concept of 'Application Tier'
    • Long Lived, Data usually separate, 'The Data'
  • Hierarchy Structure?
    • How/where is data different?  (e.g. location BOU, TUX)
    • Most specific to Least Specific "nodes/%{clientcert}" before "location/%{location}" before "tier/%{application_tier}", then common last
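
The "most specific to least specific" ordering can be illustrated with a small Python sketch. It is not Hiera itself: the `DATA` dictionary stands in for the YAML files, the host names and values are made up, and skipping a level whose fact is missing is a simplification chosen for this example.

```python
import re

# Hierarchy from the notes, most specific first.
HIERARCHY = [
    "nodes/%{clientcert}",
    "location/%{location}",
    "tier/%{application_tier}",
    "common",
]

# Hypothetical data sources standing in for the YAML files on disk.
DATA = {
    "nodes/web01.example.com": {"ntp_server": "ntp.web01.example.com"},
    "location/BOU": {"ntp_server": "ntp.bou.example.com"},
    "tier/prod": {"java_version": "7"},
    "common": {"ntp_server": "pool.ntp.org", "java_version": "6"},
}


def interpolate(template, facts):
    """Replace %{name} tokens with fact values; return None if a fact is missing."""
    missing = []

    def substitute(match):
        name = match.group(1)
        if name not in facts:
            missing.append(name)
            return ""
        return str(facts[name])

    path = re.sub(r"%\{(\w+)\}", substitute, template)
    return None if missing else path


def lookup(key, facts, hierarchy=HIERARCHY, data=DATA):
    """Return the value from the first (most specific) level that defines key."""
    for template in hierarchy:
        path = interpolate(template, facts)
        if path is None:
            continue  # simplification: skip levels we cannot resolve
        level = data.get(path, {})
        if key in level:
            return level[key]
    raise KeyError(key)


facts = {"clientcert": "app02.example.com", "location": "BOU", "application_tier": "prod"}
print(lookup("ntp_server", facts))    # resolved at the location level
print(lookup("java_version", facts))  # resolved at the tier level
```

Because the first match wins, a node-level file only needs to carry the handful of keys that really differ for that node; everything else falls through to location, tier, and finally common.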


Profiles:

  • Where you should be working the majority of the time.  A technology wrapper where you reuse Component classes.
  • Namespacing -- Problem:
class profiles::jenkins {
  # Throws an error because class is already defined:
  include jenkins
}
- Solution - namespace your includes
class profiles::jenkins {
  include ::jenkins
}

Data Separation - Hiera lookup with parameterized classes

class profiles::tomcat {
  $java_version   = hiera('java_version')
  $tomcat_version = hiera('tomcat_version')
  class { '::tomcat':
    version => $tomcat_version,
  }
  class { '::java':
    version => $java_version,
  }
}
- Base component module (apache) should NOT include corp specific things
- Profile should include corp specific things
class profiles::apache {
  include apache
  $keypath = hiera('apache_keypath')
  file { "${keypath}/key.pem":
    ensure => file,
    source => 'puppet:///modules/profiles/key.pem',
  }
}

Dependencies - Bad (need to fix in our modules)
class tomcat {
  class { 'java':
    version => '6.0',
  }
  -> Class['tomcat']
}
- Better: set up the dependency in the *Profile*, so that the profile names all the ordering and dependencies.
class profiles::tomcat {
  $java_version   = hiera('java_version')
  $tomcat_version = hiera('tomcat_version')
  class { '::tomcat':
    version => $tomcat_version,
  }
  class { '::java':
    version => $java_version,
  }
  -> Class['::tomcat']
}
- Best: create a **Profile** for Java and then include it in the Tomcat profile
class profiles::tomcat {
  include profiles::java
  $tomcat_version = hiera('tomcat_version')
  class { '::tomcat':
    version => $tomcat_version,
  }
  Class['profiles::java']
  -> Class['::tomcat']
}
- Could also use the "require" function on the profile, which will set up the dependency
- Recommendation - use hiera *function* in profile rather than automatic parameter lookup since it is *less* magical.


  • Hiera for *business* specific data
  • Proprietary Resources (that should not go in a shared module on the Forge)
  • Inter-class dependencies and containment
  • Implementation 'libraries' - should be designed to be "include profile:..." in manifests


Roles:
- Business Specific Classifications of nodes
- "How do you know when a node checks in, what classes should be assigned to it?"
- Roles are designed to be what the machine needs to be -- MINUS the environment data from Hiera!
- Designed to differentiate what classes should be assigned to a node
- Inheritance may actually make sense here:
class roles {
  include profiles::security::base
  include profiles::mycorp::users
  include profiles::mycorp::os_base
}
- Then can use inheritance to add:
class roles::app_server inherits roles {
  include profiles::tomcat
  include profiles::our_app
  Class['profiles::tomcat']
  -> ...
}
class roles::app_server::pci inherits roles::app_server {
  include profiles::pci
}
- That may be an issue for users new to Puppet.  Alternatively, build it up at every level so that each role includes *all* of its profiles.  This may increase legibility and readability.

- Like hostnames minus Hiera, are Technology independent, Inheritance can make sense.
- Goal is that when a Node comes to life, you classify it with a Single Role!

- Module Pinning: how do I use it with more than just 1 person and make it work?

Puppetfile:
- Lists the Forge, individual modules, or modules from git pinned to a specific version or tag.
- For Forge modules, it can either fetch the latest version (the default) or a specific version given after a comma
R10k - Bad name, good robot:
- Ensuring modules based on a Puppetfile
- Dynamically creating Puppet environments
- Webinar: (Google for Adrien Thebo)
Control Repository
- Contains: Puppetfile, Manifest (manifests/site.pp), Hieradata (hieradata/**)
- Every *branch* in the control repository becomes a Puppet Environment
- Every Puppet Environment, gets all its modules checked out based on Puppetfile

COOLNESS: ini_setting type
- Allows changing settings in ini files like puppet.conf without entire template!

r10k puppetfile install -v
- Pulls down all the modules and checks them out next to the Puppetfile - useful even without Environments

r10k deploy environment -pv
- Deploys environments to production - creates one environment per branch:
- production
- branchname

The only thing that's not in the environments directory is hiera.yaml, so you need to
take that into account in the top-level hiera.yaml

basemodulepath in puppet.conf [main]
- Will be shared across all environments!
 e.g. basemodulepath = $confdir/modules:/opt/puppet/share/modules/
- environmentpath = $confdir/environments - controls where environments are set up
- Should work after setting up: puppet module list (will now work)
- in environments directory, can have environment.conf
- Can control modulepath in environment.conf:
# This says environment specific modules come *before* shared ones from basemodulepath
modulepath          = modules:$basemodulepath
# Can now have an environment specific config version
config_version      = '/usr/bin/git --git-dir $confdir/environments/$environment/.git rev-parse HEAD'
# How often to refresh the cache on the puppetmaster
environment_timeout = 3m

Today: 5pm Phil Zimmerman to show r10k workflow.

Passwords in Hieradata:
- hiera-eyaml module and hiera-eyaml-gpg allow encrypted passwords to be checked in under version control
- still requires the puppetmaster and the user puppet runs as to have the gpg key to be able to decode

- Should be able to install multiple versions of java on single machine

Session 3 - Using Git to manage Puppet Code

Terri Haber, Puppetlabs Prof Services

Workshop Code -

Why Git? - distributed, popular, used on forge/puppetlabs, Can use open source git hosting tools:

  • Gitlab: provides most of the function of github, but internally

Recommendation: ONE Repo Per Module!

  • Treat each module as its own project: less confusion!
  • Exception: r10k Control Repo
  • Automate simple testing with Hooks
    • Stop simple errors in code
    • pre-commit is nice, pre-receive is better
    • puppet and puppet-lint

Git Hooks:

  • Get installed in a .git/hooks/ subdir in each project
  • Shell script that checks things and exits with error code if error.
  • Problem:
    • Each developer needed to copy that script into .git/hooks dir!  Not likely to happen on every developer desktop!
    • Pre-receive is done on the git server!
  • Pre-Receive hook
    • ssh into git server, install into project .git/hooks dir
  • puppet-lint:
    • can tune rules to match corp guidelines.

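The server-side hook described above can be sketched in Python. Git feeds a pre-receive hook one line per updated ref on stdin, in the form "old-sha new-sha refname". The sketch below only covers parsing that input and deciding which checks to run; the function names are hypothetical and the sample SHAs are placeholders.

```python
ZERO_SHA = "0" * 40  # git uses an all-zero SHA for a created or deleted ref


def parse_updates(lines):
    """Parse pre-receive input lines of the form '<old> <new> <ref>'."""
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore malformed lines
        old, new, ref = parts
        if new == ZERO_SHA:
            continue  # branch deletion: nothing to validate
        updates.append((old, new, ref))
    return updates


def checks_for(path):
    """Return the commands a hook would run for one changed file."""
    if path.endswith(".pp"):
        return [
            ["puppet", "parser", "validate", path],  # syntax check
            ["puppet-lint", path],                   # style check
        ]
    return []


# Placeholder input: one pushed branch and one deleted branch.
sample = [
    "a" * 40 + " " + "b" * 40 + " refs/heads/new_feature",
    "c" * 40 + " " + ZERO_SHA + " refs/heads/old_feature",
]
for old, new, ref in parse_updates(sample):
    print("validate", ref)
for command in checks_for("manifests/init.pp"):
    print(" ".join(command))
```

A real hook would read the lines from `sys.stdin`, export each changed file from the pushed commit (for example with `git show <new>:<path>`) before checking it, and exit non-zero to reject the push when any check fails.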
Session 4: Easy Monitoring with Puppet Exported Resources

Derrick Dymock, PuppetLabs Technical Operations, @actown

  • Moved Puppetlabs from Nagios to Icinga
  • Modules used:
    • arioch/puppet-icinga
    • puppetlabs/concat
    • puppetlabs/stdlib
  • Setup
    • Centos
    • PE 3.3.2
    • Puppet Debug Kit (vagrant package with master and agent)
  • Exported Resource
    • a node or module can put a resource into puppetdb so that the resource can be realized later on
    • see the puppetlabs type reference for the nagios_* types
  • Define "Nagios_service <<||>>" on your monitoring master.
  • Define "@@nagios_service { "check_name_${::fqdn}": }" on the monitored nodes
  • Pros:
    • quick n easy, set & forget, never look at nagios configs again, easy to standup in case of DR.
  • Cons:
    • Sometimes can get out of hand
    • Need to deactivate and purge nodes from puppetdb
    • Needs cleaning & purge from time to time
    • Collecting resources can be slow
    • Checks might have a window of 1 puppet run to show up
  • Solutions:
    • puppetdb-external-naginator queries puppetdb for resources matching nagios_ and uses a jinja template
    • github/favoretti/puppetdb-external-naginator
  • Tags on Exported Resources
    • Use to limit exported resources to collect per application_tier
    • So that the "dev" nagios/icinga server only collects resources for @@exported resources that have that tag.
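
The export/collect-by-tag mechanism can be modeled in a few lines of Python. This is a toy stand-in for PuppetDB, meant only to show why tagging keeps a "dev" monitoring server from collecting production checks; the class name, host names, and tag values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportedResource:
    type: str
    title: str
    tags: frozenset = frozenset()


def collect(store, type_name, tag=None):
    """Mimic a collector such as Nagios_service <<| tag == 'dev' |>>."""
    return [
        res for res in store
        if res.type == type_name and (tag is None or tag in res.tags)
    ]


# Each monitored node "exports" a check, tagged with its application tier.
store = [
    ExportedResource("nagios_service", "check_ssh_web01.example.com", frozenset({"prod"})),
    ExportedResource("nagios_service", "check_ssh_dev01.example.com", frozenset({"dev"})),
    ExportedResource("nagios_host", "dev01.example.com", frozenset({"dev"})),
]

dev_checks = collect(store, "nagios_service", tag="dev")
print([res.title for res in dev_checks])
```

Without the tag filter, `collect(store, "nagios_service")` returns the checks from every tier, which is exactly the leak the tagging is meant to prevent.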

Session 5: Killer R10K Workflow

Phil Zimmerman, Time Warner, @phil_zimmerman

  • R10k drives all the process, also using jenkins, capistrano, ...  R. Tyler Croy uses it at Jenkins
  • Workflow: 
    • understanding your job
    • understanding your tools
    • not thinking about it any more
  • Single Repo: Bad!
    • was originally to simplify development
    • easy jenkins flow
    • having puppet code & hiera together let everything stay in lock step.  KISS.
  • Jenkins CI Job
    • would do rspec-puppet, syntax check, lint for all modules
    • single release job: create/push tag
    • single deploy job: capistrano tasks, poor man's dynamic environments
    • worked well... until it didn't... PuppetForge modules: add, upgrade, remove....
  • Toolset today:
    • Add Sinatra, Ruby, R10k
    • Deploys Puppet Code/modules, Handles Git/Svn fu, is Awesome, does caching
  • Puppetfile:
    • inventory of all modules and their versions/branches (git ref's)
  • r10k deploy:
    • world: go deploy every branch in my Puppetfile repo (slow)
    • single environment (used often)
    • single module (used occasionally)
    • r10k deploy environment test -p
    • r10k deploy module tomcat
  • Jenkins
    • Single CI Job *per-module*
    • Release Job *per-module*
    • Deploy Job for each module and Hiera
    • Puppetfile manip/branch creation


Day 2: Sessions and Wrap Up

Session 1: Continuously Testing Infrastructure
Gareth Rushgrove, Puppet Labs, @garethr, DevOps Weekly
garethr/erlang, ...

1. Testing images and containers
Packer: building images based on a JSON template, has some puppet integration.
- Verifying image: use packer provisioner to run shell script to verify image works
- shaunduncan/packer-provisioner-host-command
- Serverspec: RSpec tests for your servers - helpers (used by beaker) - uses ruby, supports port, file, ppa, selinux, user, group, lxc, iptables, cron, and some windows primitives.
- only publish the image if the tests pass.  Run tests automatically in a CI loop
- Wercker: lets you define a bunch of steps
- Same approach works for containers too: garethr/docker-spec-example (test both inside & outside container before publish in CI workflow)

2. Test drive IaaS - Test driven development
- FIRST the developer writes an automated test case
-- meaning your infrastructure has an API to inspect/monitor its state and be able to verify it
-- you write tests against your API - leads to Policy driven development - how do you assert Policies?
- garethr/digitalocean-expect (clojure example of tests against cloud API)
- run all the tests against infrastructure all the time
- run the tests *first* then provision the infrastructure.  When tests pass, you're done (w/step 2!)
- puppetlabs/gce-compute module

3. Testing with PuppetDB -- ("seriously, awesome, amount of data in there is excellent")
- Stores a LOT of data about your infrastructure - spelunking
- e.g., most recent facts, catalog, metrics from every node
- -- write tests against that API
- use to test "what is on the node and running w/o error"
- e.g. verify every node running desired operatingsystem
- e.g., verify security enforcing packages installed everywhere ("an auditor's love is a special kind of ...")
- github garethr/puppetdb-expect - examples of querying/testing puppetdb API
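
A policy check such as "every node runs the desired operatingsystem" reduces to a simple assertion over fact data. The Python sketch below runs against in-memory sample records rather than a live PuppetDB; the record shape (certname, name, value) is modeled loosely on PuppetDB's facts output and should be treated as an assumption, as should the host names and the helper name `nodes_violating`.

```python
def nodes_violating(fact_records, fact_name, expected):
    """Return sorted certnames whose fact is missing or differs from expected."""
    seen = {}
    for record in fact_records:
        seen.setdefault(record["certname"], None)  # remember every node we saw
        if record["name"] == fact_name:
            seen[record["certname"]] = record["value"]
    return sorted(node for node, value in seen.items() if value != expected)


# Hypothetical fact records, as they might come back from a facts query.
FACTS = [
    {"certname": "web01.example.com", "name": "operatingsystem", "value": "CentOS"},
    {"certname": "web01.example.com", "name": "kernel", "value": "Linux"},
    {"certname": "db01.example.com", "name": "operatingsystem", "value": "Debian"},
    {"certname": "app01.example.com", "name": "kernel", "value": "Linux"},
]

violations = nodes_violating(FACTS, "operatingsystem", "CentOS")
print("nodes violating policy:", violations)
```

Run in a CI loop, a non-empty result would fail the build, which turns the policy into an executable test rather than a document.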

3.2. Testing based on PuppetDB API
- Could serverspec tests be generated from puppetdb data? (is this useful?)
- Match puppet resources to serverspec resources
- github garethr/serverspec-puppetdb
- Is this monitoring?  Probably? 
- Talking about Policy As Code might help communicate intent

Session 2: DevOps Field Guide to Cognitive Biases (2nd Ed.)   !!!!!!!
Lindsay Holmwood, Bulletproof Networks


Session 3: Puppetmaster on the JVM - Introducing Puppet Server

Chris Price, Puppet Labs, @cprice404

  • Puppet Server - new open source project, introduced on Monday, alternative to Webserver functions
  • Clojure, Jetty, JRuby
  • Performance, Scaling, Availability
  • Avg Request response time: drop from 80ms down to 24.5ms
  • Catalog Compilation time down from 1400ms to 1000ms
  • Agent Run Time: from 9s down to 3s
  • At 2000 agents, went up to 60s Agent Run time on older stack, at 4s on new stack
  • Focuses on: correctness, backward compatibility, stability
  • Architecture
    • based on puppetdb architecture which has been successful & high performance
    • open source libraries
    • Clojure Trapperkeeper Process: manages embedded jetty, thread pools, JRuby Interpreter Pool.  Still uses existing puppetmaster Ruby code
    • Metrics Service - added as a background thread that collects metrics from inside the process
      • Exposed via JMX and allowed to send to Graphite
      • Will publish documentation & puppet modules to install/config graphite & grafana + JSON grafana config file
  • Extending Puppet Server
    • scheduling async tasks without affecting ongoing requests
    • still love ruby
  • Trapperkeeper & SOA
    • t.k. currently used just to turn individual services on and off
    • Certificate Authority has been ported from Ruby to Clojure
    • Puppetmaster still runs in JRuby interpreter pool
    • Could run 3 puppet servers: 2 running puppet master/jruby with CA turned off, 1 running CA but no JRuby
    • Goal is to break Master into Node, Catalog, File Server, and Report service
  • Packages are released in Puppetlabs Package repository *today*
    • (not considered production ready yet - can try in Test environment)
  • Useful projects used in development:
    • JRuby
      • Gatling - used to record & replay traffic from client to master - VERY efficient at generating load, graphs!
      • Codahale Metrics
  • Source:
  • Package name in repos: 'puppetserver'

Session 4: Puppetizing Multi-Tier Architecture

Reid Vandewiele, Solutions Engineer, Puppet Labs

  • Multi-Tier Application - Puppet Enterprise Itself
    • Is a classic multi-tier application: PG database, PuppetDB, Master, Dashboard, ...
    • Can be monolithic install or Split Install
    • Goal: develop puppet driven implementation of multi-tier, multi-node puppet enterprise installation
  • Defining a Multi-Tier application
    • Most puppet definitions are node-centric
    • Roles & Profiles for: CA, LB, Master, PG/PuppetDb, ActiveMQ Hub, AMQ Spoke, Puppet Agent

Application Class - is empty (no resources, just parameters to hold data)

# Model all settings/connections/endpoints for application
class pe (
  $puppetdb_port = 8081,
) { }

class pe::puppet_master (
  $puppetdb_port = $pe::puppetdb_port
) inherits pe {
  # ... resources ...
}
  • Dynamism/Elasticity
    • *Exported Resources* - or Erik Dalen's puppetdb query module "dalen-puppetdbquery"
    • each master exports @@pool_member { 'master_1': }, @@pool_member { 'master_2': }, ...
    • Load Balancer gathers exported resources: Pool_member <<| filter |>>
    • Can take multiple runs for eventual consistency since exported resources won't exist yet
  • ENC Problems:
    • When ENC returns node classification with class parameters *it can break* because results are returned as an *unordered set* !!!
    • Solution: Hiera - add "env_tier=development" to some nodes and "env_tier=production" to others and have a hiera set of settings under env_tier/development.yaml, env_tier/production.yaml
Alternative -- Global Variables
class pe (
$puppet_master_host = $::puppet_master_host,
) { }
- Then set global variables in Node Classifier
  • How To Deploy
    • In a Multi-Tier application *Ordering Matters*
    • Puppet DAG is Node-Centric, cannot control ordering between nodes
    • Ways to trigger puppet agents: every 30 minutes, on puppet agent -t, or triggered by mcollective
    • The clock method would work in about 1.5 hours because of eventual consistency
    • Could have mcollective trigger 3x in a row separated by about 30-60s
  • External Tool
    • All it needs to do is do correct ordering and tell nodes to run "puppet agent -t" in the right order (Rundeck? just shell script?)
    • DB Servers, then App Servers, Then front end WWW servers, then Load Balancer
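
The "external tool" only needs to know which tier depends on which and then trigger agent runs in that order. A minimal Python sketch using the standard library's `graphlib` (Python 3.9+) follows; the tier names, host names, and the helper functions are hypothetical, and actually dispatching the command (ssh, mcollective, Rundeck) is left out.

```python
from graphlib import TopologicalSorter

# Each tier maps to the set of tiers that must be configured before it.
DEPENDS_ON = {
    "db": set(),
    "app": {"db"},
    "www": {"app"},
    "lb": {"www"},
}

# Hypothetical inventory of hosts per tier.
HOSTS = {
    "db": ["db01"],
    "app": ["app01", "app02"],
    "www": ["www01"],
    "lb": ["lb01"],
}


def run_order(depends_on):
    """Return tiers in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


def agent_runs(order, hosts_by_tier):
    """Expand the tier order into (host, command) pairs."""
    runs = []
    for tier in order:
        for host in hosts_by_tier.get(tier, []):
            runs.append((host, "puppet agent -t"))
    return runs


for host, command in agent_runs(run_order(DEPENDS_ON), HOSTS):
    print(host, "->", command)
```

Expressing the dependencies as data rather than hard-coding the sequence means a new tier (say, a cache in front of the app servers) only needs one new entry in `DEPENDS_ON`.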
- Ordering and dependencies
class pe (
) {
  anchor { 'barrier: pe certificate authority': }
  -> anchor { 'barrier: pe puppetdb_database': }
  -> anchor { 'barrier: pe puppetdb': }
}

- Model the *whole* app as a single class

Exported Resources/PuppetDB are *NOT* environment-aware
- So "test" resources could leak into prod
- Need to use tags on "env_tier" to isolate exported resources.

Session 5: Using Docker with Puppet

James Turnbull, VP Engineering, Kickstarter
"Containerization is the new Virtualization" -

  • Docker is Operating System Level Virtualization
    • namespaces: pid, network, ...; cgroups: no hypervisor, no HW virtualization
    • build, ship, run
    • Can be up to 28x !!! faster than in a VM
    • Build once.  Share with the rest of the team.  Run in many places.
    • Isolated, layered, standard, content-agnostic.
    • Not new (Solaris Zones, IBM LPARs, OpenVZ, ...) but focused on Easy To Use
    • Self Provisioning for Dev Teams
    • Spin up 1 big box per environment, let teams self-service!
    • Can run on a VM, on Bare Metal, or in a Cloud Provider
  • Docker Basics
    • Images & Dockerfile, Layers, Copy-on-Write
    • Dockerfile - basically a sequence of shell commands to build an image
    • Docker hub/registry - for sharing images
    • docker build, docker push, docker run -ti -p 80:80 jamtur01/apache2
  • Docker and Puppet
    • Doesn't that Dockerfile look like a puppet manifest?
    • CM: library of reusable, composable templates; bad: learning curve, requires trigger, resource-intensive
    • Before: use puppet to set up hardware, install packages, deploy code, run services
    • After: use puppet to set up hardware, install Docker, run containers.  Use Dockerfiles to install packages, deploy code, run services
  • Deploying a puppet-powered container
    • FROM ubuntu
    • RUN apt-get .. update, install rubygems
    • RUN gem install --no-ri --no-rdoc puppet
    • RUN mkdir /puppet
    • WORKDIR /puppet
    • ADD site.pp /puppet/site.pp
    • RUN puppet apply site.pp
    • # Then inherit from that image to do other cool stuff
  • Use librarian-puppet (or r10k) to install modules, then puppet apply
    • FROM ubuntu
    • RUN gem install --no-ri --no-rdoc puppet librarian-puppet
    • ... set up Puppetfile, run librarian, run ssh
  • What if we get rid of:
    • sshd: access via nsenter or docker exec (next version)
    • crond in a container - create a 2nd container that runs crond
    • logging in a container - create a 2nd container that runs syslog/logstash/hector to pick up logs from others
  • Creates a new Architecture
    • separate orthogonal concerns
    • don't rebuild your app to change services
    • have different policies in domains
    • ship lighter apps
  • What if?
    • we could run puppet agent outside the container?  Run single agent for many containers? Share cost of agent?
  • Q&A: Orchestration: start with fig and go from there

Session 6: Building & Testing Puppet with Docker

Carla Sousa, Reliant, Puppet since 2010, Containers since 2008
Environment: ~15k nodes, Redboxes (Debian), Amazon EC2 instances, Virtual machines, OpenVZ containers, KVM

  • Private git repo, push to on-site puppet masters, agents pull changes from puppetmaster
  • Code QA.  Syntax: puppet parser validate, puppet-lint, yaml syntax check, erb syntax check
  • Variable data type validation: use validate_* functions in stdlib on parameters in base modules
  • Look at puppetlabs "ntp" module for a sensible example
  • Smoke testing: class { 'apt': }; puppet apply --noop ...
  • rspec-puppet - see examples in puppetlabs-apt
  • beaker: write rspec tests and can apply in multiple operating systems to test in closer to real machine
  • beaker: HOSTS and CONFIG in yaml file
  • Code Review - using gitlab or other process: submit new merge requests (like pull requests in github). Have team review pull requests
  • "I don't always test code... But when I do, I test in production..." (Dos Equis meme)

[Had to leave early to catch flight - follow up with recorded session]

Much to digest - worth tracking Docker.

Author: D. J. Hagberg

24- to 28-Feb-2014 with a special "hack" day on 26-Feb-2014

Day 1: Monday

Chaired session 1 day 1 on Continuous Delivery strategies and tools.  Full discussion and audio now available:

Summary: lots of tools to help make it happen, takes a concerted cross-functional and large effort across development, release engineering, test (specifically automated test development), operations (heavy emphasis on monitoring, A/B testing ability, and ability to deploy fixes quickly).  Single-Jar deployment is a good thing: Dropwizard & Spring Boot.  Gradle can be a lot more flexible as a build tool than Maven. Zuul as a "Shunt" proxy idea to test development code in parallel with production load. Go Forth and Deploy.

To Read: Orbiting the Giant Hairball.

Session 2: UI - What is the Way Forward/HTML 5 & JavaScript.  Summary: HTML and JavaScript: not dead yet.  They will inherit the earth, like C.

Day 2: Tuesday

Chaired session 3 day 2: NoSQL State of the Art / Cassandra.  Full discussion and audio now available:  

Summary: MongoDB - Dick's experience is it focuses on speed at all costs, including reliability - but very fast.  Issue: silent failure/silent data loss. Alternative commercial back-end w/higher reliability: TokuMX.  Slick Framework: mapping Scala to RDBMS. NoSQL tends to map better to Scala's immutable view of the world, better than JPA, but frameworks not as mature as JPA. Versioning/journaling databases: CouchDB, Datomic (from the Clojure guy). 

Cassandra: CQL queries are very practical, atomic batches, fewer modification race conditions if properly structured.  Still VERY very robust high availability and automated recovery. Concurrency drove Netflix and other users to make this switch in a high transaction environment (100,000 inserts/second). Tradeoff: eventual consistency (see above re: journaling/versioning).

Graph: Neo4j. The major trend in graph databases is to distribute, but that is a difficult problem because of graph connections between objects.

APIs: JPA/Annotations: require mutable beans.  Spring Data: one API to rule them all.  Try putting a REST API in front of data store to be able to tune/cache/swap out data store, even different API for reading vs writing. Functional: Monads for interacting with the database. Storm & Kafka to handle database inserts in a queueing fashion.

  • Criteria for selecting a NoSQL database:
    • Couch/BigCouch - good materialized views, bad: data loss
    • Cassandra - write throughput/resiliency across regions/recovery from failure/linear scaling; bad: hiring is difficult. Priam tool for starting instance.
    • Mongo - VERY easy to get started; bad: possible data loss, writes can fall behind

Considerations: migration costs, batch analysis/data cleanup/consistency checking, possible dump to RDBMS to analyze data with reporting tools (or you have to write your own)

Session 1: AntiFragile and Beyond the Simian Army - chaired by Andrew - lots of discussion on building resilient & clustered systems, managing state, and (VERY important) verifying the actual resilience of systems in test and production.  Full discussion on Netflix tools.

Session 2: Open Sourcing Corporate Code - chaired by Joe.  Thoughts on buy-in from upper management.  Enlightenment: actually helped Netflix with *recruiting*.  Get your corporate github Corpname-oss account NOW to have a place on the web to put stuff.

Hack Day: Wednesday

Docker - very interesting method for application deployment, build an image to run a single service, deploy as an "immutable container".  Questions: maturity, ability to use puppet to build container configuration, how to manage separate instances/configurations for dev/test/prod.  Minimal support for data persistence and cross-machine networking

etcd - Convenient way to have services register their ports and addresses in a cluster-aware high-availability fashion across hosts.  Can help with the Docker cross-machine networking aspect.

Day 3: 27-Feb-2014

Session 1: Venturing Beyond the JVM/Alternative System Languages.  Discussion of Go and Rust - highly recommended.  Useful for building small command-line tools or single-focus network service.  Rust still maturing, Go nearing point of maturity.  Possible consideration: JavaScript on node.js -- useful for JS-heavy shops but there are a LOT of flaky corner cases and scalability problems under load compared to anything on the JVM (try Nashorn instead?)

Session 2: Reactive/RxJava/Akka: Rapidly maturing set of frameworks to start moving more functional.  Hugh/HP: very very positive experience refactoring a large legacy application to run on Akka-on-Java.  Downside: lots of casts and instanceof in Akka Java vs the more clean syntax of Scala, *but* you can actually hire Java developers.  RxJava: useful, ramping up in usage inside Netflix and getting considerable contributions from outside.  Reactive Manifesto.  SpringSource: started Reactor framework as an alternative implementation.  Currently have to pick 1 - no common API.

Session 3 - Monads WTF?  Getting closer to understanding, but elusive like Heisenberg.  First: drop the term "monad".  Think of it more like a sequence/series combined with a JS-style transitive closure.  Listen:

Day 4: 28-Feb-2014

Session 1 - TDD - Chaired by Julie.  Lots of info on alternatives to JUnit.  TestNG, performance/load testing, A/B testing, see above re: testing development code with production load using Zuul.  Testing deployment/VM build pipeline, etc.  Audio TBD.

Session 2 - Inspire Me -  Lots of info on continuous learning: Coursera, Berkeley, Stanford, MIT.  Huge reading list - see links. Podcasts, science, and "everything is awesome".

Wrap up session - let's do it again next year!


Author: D. J. Hagberg