August 04, 2016

About

@signal_vs_noise on Twitter

I am an engineer at Google in Madison, WI, where I work on the Andromeda software defined network (NSDI 2018). Prior to this, I worked on the virtual machine monitor powering Google Compute Engine and as a committer on the DragonFly BSD project. My professional interests include operating systems and performance analysis & optimization.

This is my personal weblog. All writing here reflects my opinions alone, not those of my employer or any other affiliations.

In the summer of 2018 I attended Recurse Center, a self-directed & collaborative programming retreat in New York City. I studied discrete event simulation & queueing theory, following Dr. Harchol-Balter's Performance Modeling book and writing a series of simulators. I wrote a few blog posts with simulations & visualizations - [rc].

In the summer of 2012 I cycled from Baltimore, MD, to Portland, OR with the 4K for Cancer. The 4K was a unique experience in my life; I met amazing folks and saw kindness across the country that I had never dreamed of. I recorded and photographed a small part of our journey, and of later 4K journeys, in an online journal; please take a look to follow along!


July 16, 2018

Recurse Center - First steps w/ Queues

I am currently attending Recurse Center in New York City, where I’m studying discrete event simulation & queueing theory for the next 12 weeks. I plan to study and gain intuition about queueing theory by working through Mor Harchol-Balter’s "Performance Modeling and the Design of Computer Systems" textbook and through simulation, playing with knobs and asking “what if” questions about scenarios in the book and from outside experience. What are the boundaries of the universe of parameters and what happens at the edges?

A queue is the summation of the difference between an arrival process and a service process. A queue can appear anytime an arrival rate is higher than a service rate at a point - a short-term difference leads to a transient queue, a long-term difference leads to a standing or unbounded queue.

Imagine a road - draw a box around a stretch of the road. A fixed n cars/sec arrive at the box, and a service process allows n cars/sec to leave. No queue forms; the time it takes a single car to cross the box, its service time, is governed only by its speed and the size of the box.

IMG_4042.jpg

Now imagine a row of ducklings wanders across the eastbound lane.

IMG_4043.jpg

The maximum service rate transiently drops and cars queue to wait patiently for the ducklings to cross. The sojourn time, the time from when a car arrives at the box until it leaves, rises to include time spent waiting.
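
To make the duckling example concrete, here is a minimal sketch in Python that tracks queue depth second by second. The numbers are assumptions for illustration, not taken from the road above: 2 cars/sec arrive and the road can pass up to 3 cars/sec, except while the ducklings cross. (If capacity exactly equaled the arrival rate, a queue formed by the ducklings would never drain in this simple model, so the sketch assumes a little spare capacity.)

```python
def queue_depth_over_time(arrivals_per_sec, capacity_per_sec):
    """Return the number of cars waiting at the end of each second."""
    depth = 0
    history = []
    for arrived, capacity in zip(arrivals_per_sec, capacity_per_sec):
        # Cars that cannot leave during this second stay in the box and wait.
        depth = max(0, depth + arrived - capacity)
        history.append(depth)
    return history


# Hypothetical numbers: 2 cars/sec arrive for 12 seconds; the road can pass
# 3 cars/sec, except during seconds 3-5 when the ducklings block the lane.
arrivals = [2] * 12
capacity = [3, 3, 3, 0, 0, 0, 3, 3, 3, 3, 3, 3]
depths = queue_depth_over_time(arrivals, capacity)
print(depths)  # [0, 0, 0, 2, 4, 6, 5, 4, 3, 2, 1, 0]
```

The queue builds for three seconds and then takes twice as long to drain, because only the spare capacity (1 car/sec here) works the backlog down.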

Schematically, queueing systems are represented by interconnections of queues, processors, and switches. The road example would be -

IMG_4038.jpg

Queueing theory seeks to understand how queues form and behave, and how they should be controlled. It also seeks to understand and control how units of work flow and wait in systems, in order to maximize throughput or minimize average/worst-case delays.

Queueing theory generally assumes a system’s long-term arrival rate is less than the long-term service rate - otherwise a system would diverge and queues could grow without bound.

Queueing theory allows arrival and service processes to be statistical ‘counting’ processes - for example, on average λ units of work/sec may arrive, but the arrivals may not be uniformly spaced in time and the inter-arrival time may follow many distributions. Units of work may also take non-uniform ‘resources’/time to service, again following some distribution.

Tero Parviainen (@teropa) wrote a fantastic interactive tool to experiment with Markov arrival processes, a common arrival process where inter-arrival times are independent - A Dash of Queueing Theory. It is worth spending some time looking at both the cumulative arrival and service processes in his examples - it is important to understand that queues can form even when average arrival and service rates are matched.
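
A small sketch of that last point, separate from Tero's tool: one FIFO server, with waiting times computed by the Lindley recursion (the next job waits as long as the previous job waited, plus the previous job's service time, minus the gap between the two arrivals, floored at zero). The constant service time and the rate of 1 job/sec are assumptions chosen to keep the example small.

```python
import random


def waiting_times(gaps, service_times):
    """Lindley recursion for a single FIFO server.

    gaps[i] is the time between the arrivals of job i and job i + 1;
    service_times[i] is the service time of job i.
    Returns how long each job waits before its service starts.
    """
    waits = [0.0]  # the first job finds an empty system
    for gap, service in zip(gaps, service_times):
        waits.append(max(0.0, waits[-1] + service - gap))
    return waits


rng = random.Random(1)
n = 10_000
rate = 1.0  # assumed: 1 job/sec arrives on average, 1 job/sec can be served

service = [1.0 / rate] * n
even_gaps = [1.0 / rate] * (n - 1)                            # evenly spaced
random_gaps = [rng.expovariate(rate) for _ in range(n - 1)]   # exponential

even_waits = waiting_times(even_gaps, service)
random_waits = waiting_times(random_gaps, service)

print(max(even_waits))        # 0.0 - no job ever waits
print(max(random_waits) > 0)  # True - bursts of arrivals create a queue
```

Both arrival streams have the same average rate; only the spacing differs.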

Via queueing theory, we can approach questions like - "Should we have one processor with a service rate of μ or have multiple processors N, each with a service rate μ/N?" or "When should we power on/power off additional processors to service work?" or even "When/how should a queue push back on a source to change an arrival process?"

Some of these questions may have other approaches - control theoretic approaches to ‘when to power on/off additional processors’ using the first derivative of queue depth have been fruitful, for example.
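
As a rough illustration of the idea only, a derivative-based rule might look like the sketch below. The function name, the thresholds, and the use of a simple finite difference are all hypothetical; this is not a description of any particular production controller.

```python
def scaling_decision(depth_samples, sample_interval_sec,
                     grow_threshold=1.0, shrink_threshold=-1.0):
    """Return +1 (power on a processor), -1 (power one off), or 0 (hold).

    The decision uses the slope of queue depth over the sample window,
    estimated as a finite difference. Thresholds are in jobs/sec and are
    made-up values for illustration.
    """
    if len(depth_samples) < 2:
        return 0  # not enough history to estimate a slope
    elapsed = sample_interval_sec * (len(depth_samples) - 1)
    slope = (depth_samples[-1] - depth_samples[0]) / elapsed
    if slope > grow_threshold:
        return 1
    if slope < shrink_threshold:
        return -1
    return 0


print(scaling_decision([0, 4, 8], sample_interval_sec=1.0))  # 1
print(scaling_decision([8, 4, 0], sample_interval_sec=1.0))  # -1
print(scaling_decision([3, 3, 3], sample_interval_sec=1.0))  # 0
```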

Mor Harchol-Balter’s text starts with a few design questions - I’d like to highlight two:

  • Imagine a system where work arrives at a server, following a Markov process (inter-arrival times follow an exponential distribution, a memoryless process); on average, 3 units of work arrive each second. The server is able to service up to 5 units/sec.

    Imagine the arrival rate doubles; how much faster do you need to make the server to maintain the average sojourn time?

    It turns out that the server needs to get less than twice as fast to maintain the average sojourn time! I found this result fairly counterintuitive.

    I wrote a simple discrete event simulator to work through this scenario, in two steps. First I started with a system where 3 units/sec arrived following a Markov process and 4 units/sec could depart. Then I turned up the service rate from 4 units/sec -> 8 units/sec; the distribution of sojourn times shifted:

    This is a stacked violin plot of the 3 units/sec // 4 units/sec scenario (blue) and the 3 units/sec // 8 units/sec scenario (orange). In a violin plot, the width of a horizontal slice represents the probability density at that y-axis value. The means and extrema are highlighted by horizontal bars.

    In both the blue and orange violins, we see most of the PDF at ‘zero sojourn time’ (no delay). The blue distribution has substantial density at 1 and 2 units of delay, whereas the orange (faster server) distribution does not. The mean of the orange distribution is just a hair above zero; doubling the service rate has more than halved the average sojourn time.

    Let’s now double the arrival rate as well, from 3 units/sec -> 6 units/sec:

    The blue distribution is the same as the prior plot - 3 units/sec arrive, 4 units/sec service rate.

    The orange distribution is for 6 units/sec arriving, 8 units/sec service rate. The density function of the faster server’s sojourn time remains better than in the original scenario!

    Looking at the time series of sojourn times of the initial case (3 units/sec // 4 units/sec) was crucial to understanding this result.

    Here the x-axis is job sequence #, the y-axis is sojourn time.

    There is underlying structure here! If job N experiences a delay, it is more likely that job N + 1 experiences a delay. Intuitively, the system is stateful - a job can see an ‘unloaded’ or ‘loaded’ system, based on recent history.

    Doubling the service rate halves the amount of time required for a single job to be processed; it also reduces the probability that a job will see the system in a bad state.

    Combining the two effects more than halves the average sojourn time.

    (6 units/sec // 8 units/sec scenario - the x scale is different from above because a higher arrival rate results in more jobs arriving within a given time)

  • Imagine a system where work arrives following a Markov process. You can feed this work into either one server with a service rate μ or into N servers with a service rate of μ/N. Under what circumstances would you choose one server or N servers?

    img4048b.png

    Let us start with the assumption that there is sufficient work to fully utilize all N servers; if there is insufficient work, one server would be preferable - the multi-server configuration would have idle capacity.

    The result is “it depends on the variance of the job size distribution” - high variance favors multiple slower servers over one fast server.

    Let’s start with the simple case where there is zero variance in job size - in this case, the fast server will finish a job every 1/μ seconds, the slow servers will finish N jobs every N/μ seconds. Both configurations should perform identically.

    In the high variance case, however, we can end up with an expensive job in front of a cheap job - a classic “head of line blocking” scenario. The one server configuration will force the cheap job to wait for the expensive job whereas the N server configuration would allow the cheap job to ‘slip’ around the expensive job.

    We will come back to this configuration in future examples.
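
Both design questions can be explored with one small simulator. The sketch below is not the simulator used for the plots above; it is a minimal stand-in that assumes a single shared FIFO queue, exponential inter-arrival times, and job sizes with mean 1 unit of work. The names `mean_sojourn`, `exponential`, `constant`, and `high_variance` are made up for this example. For the single-server exponential case, the textbook M/M/1 result is mean sojourn time = 1/(μ - λ), which gives 1.0, 0.2, and 0.5 sec for the three scenarios in the first question.

```python
import heapq
import random


def mean_sojourn(lam, mu_total, n_servers, n_jobs, job_size, seed=0):
    """Mean sojourn time with arrivals at rate lam and n_servers servers,
    each running at mu_total / n_servers, fed from one shared FIFO queue.

    job_size(rng) returns the work in a job (mean 1 unit).
    """
    rng = random.Random(seed)
    per_server_rate = mu_total / n_servers
    free_at = [0.0] * n_servers  # min-heap: when each server next goes idle
    heapq.heapify(free_at)
    now = 0.0
    total = 0.0
    for _ in range(n_jobs):
        now += rng.expovariate(lam)                # next arrival time
        start = max(now, heapq.heappop(free_at))   # earliest available server
        finish = start + job_size(rng) / per_server_rate
        heapq.heappush(free_at, finish)
        total += finish - now
    return total / n_jobs


def exponential(rng):
    return rng.expovariate(1.0)


def constant(rng):
    return 1.0


def high_variance(rng):
    # Assumed mix: 99% small jobs (mean 0.5), 1% huge jobs (mean 50.5);
    # the overall mean is still 1 unit of work.
    if rng.random() < 0.99:
        return rng.expovariate(1 / 0.5)
    return rng.expovariate(1 / 50.5)


# First question: 3 // 4, then 3 // 8, then 6 // 8 (arrival // service rate).
base = mean_sojourn(3, 4, 1, 100_000, exponential)
faster = mean_sojourn(3, 8, 1, 100_000, exponential)
doubled = mean_sojourn(6, 8, 1, 100_000, exponential)
print(base, faster, doubled)

# Second question: one server at rate 4 vs. four servers at rate 1 each.
for name, sizes in [("constant", constant), ("exponential", exponential),
                    ("high variance", high_variance)]:
    one_fast = mean_sojourn(3, 4, 1, 50_000, sizes)
    four_slow = mean_sojourn(3, 4, 4, 50_000, sizes)
    print(name, one_fast, four_slow)
```

The 1/(μ - λ) formula also answers the first question directly: doubling both rates halves the mean sojourn time, so holding it constant takes less than a doubling of the service rate. For the second question, the comparison depends on the load and on the job size distribution chosen, so I would treat the printed numbers as something to explore rather than as a fixed result.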

Thank you to many peers at the Recurse Center for reviewing this post.

May 12, 2018

opengrok.net

opengrok.net is a source code cross-reference I'm hosting for the Linux kernel and DPDK packet processing libraries. The cross-reference has both git history and tag information for indexed projects. It is running the opengrok indexer from Sun, originally for Solaris.

You can access annotation and revision tag information from the history tab of a particular source file, for example: linux/net/core/dev.c.
opengrok.png

March 21, 2018

Organizations I'm supporting in 2018

Early in 2017, I committed to support a number of nonprofits and civil society organizations; I wrote about the organizations I planned to support and why: Organizations I'm supporting this year - broadly I planned to focus on civic and environmental groups. Last December, I looked back at the year and where I followed or deviated from plan: Organizations I've supported in 2017.

This year, I continue to support a broad set of nonprofits/organizations, primarily in civic and environmental sectors. New this year, I have added nuclear nonproliferation and science & society organizations (ex: Federation of American Scientists, Bulletin of Atomic Scientists) and the Center for Civilians in Conflict. Through the year, I plan to support the organizations below proportionally and to highlight specific work each/any does (such as policy papers/testimony/etc) that I think is important.

2018q1.png

Union of Concerned Scientists: 15%
American Civil Liberties Union: 12%
Pro Publica: 8% - Nonprofit newsroom; also serves as a resource for traditional news organizations
EarthJustice: 7%
Environmental Defense Fund: 7%
Brennan Center for Justice: 5%
National Immigration Law Center: 4%
Natural Resource Defense Council: 4%
Northwest Immigrant Rights Project: 4%
Ploughshares Fund: 4% - Grant-making nonprofit, focused on nuclear nonproliferation
University of Washington Climate Impacts Group, various: 4%
Rocky Mountain Institute: 4% - Studies & writes on climate change, energy, and decarbonization
Center for Arms Control & Non-proliferation: 2%
Center for Democracy and Technology: 2%
Center for Civilians in Conflict: 2%
Federation of American Scientists: 2% - Runs extraordinary Government Secrecy Project; provided recent comments on autonomous weapon systems
International Refugee Assistance Project: 2%
United to Protect Democracy: 2%
...17 others, <= 1%...

This does not include donations to candidates for office / PACs; raw data for those contributions is available on the FEC website. This does not include my carbon offset purchases at scale either.

I think public lists & commitments are valuable to cast light on work at civic nonprofits and to mobilize further support. If you know of any interesting civic, nonproliferation, or climate change nonprofits doing good work, I'd be interested to hear!

December 28, 2017

Organizations I've supported in 2017

"I have a little hope that America’s amazingly robust and wealthy civil society, which is unlike any other civil society in the world, ever, will change the situation, or will make it progress differently." - Masha Gessen, Apr 2017

Early in 2017, I committed to supporting a number of nonprofits; I wrote about the organizations I planned to support and why. I focused on civil society and on climate change organizations, with a bias towards legal groups. (I was inspired by DJ Capelis's post earlier).

2017 brought disruptions; natural triggers were amplified into logistical and human disasters. 2017 brought old fears into new focus - nuclear proliferation, civil-military relationship strains, use and control of force questions. Progress addressing climate change (emissions and decarbonization) has been disappointing. I found it difficult to sustain attention and focus on particular problems.

I did not foresee how 2017 would and did proceed.

The year also brought the incredible strength and depth of our civil society, and its role in the public sphere, into focus - legal organizations via lawsuits as expected, but policy and journalism organizations to educate and focus attention too. I did not know about whole classes of organizations in the policy and public education spaces just ten months ago; now I am deeply grateful for their work.

Breaking down my donations ---
By "area": image2.png
By organization: image1.png

Top organizations:

Union of Concerned Scientists: 13.6%
Natural Resource Defense Council: 9.3%
EarthJustice: 7.3%
Pro Publica: 5.6%
Electronic Frontier Foundation: 5.0%
Ploughshares Fund: 4.9%
American Civil Liberties Union: 4.9%
National Immigration Law Center: 4.5%
World Food Program: 3.6%
UNICEF: 3.0%
Global Zero: 3.0%
University of Washington Climate Impacts Group: 2.9%
Environmental Defense Fund: 2.75%
Northwest Immigrant Rights Project: 2.6%
International Medical Corps: 2.5%
Rocky Mountain Institute: 2.2%
International Rescue Committee: 1.9%
Freedom of the Press Foundation (Signal): 1.6%
Team Rubicon: 1.3%
International Refugee Assistance Project: 1.25%
Brennan Center for Justice: 1.2%
Médecins Sans Frontières: 0.9%
Mercy Corps: 0.9%
World Resources Institute: 0.9%
Let America Vote: 0.6%
Airwars: 0.6%
Washington National Parks Fund: 0.6%

For the most part, I donated to the organizations I committed to. Standout differences --

  • I did not expect to have to think about weapons control and nonproliferation; and I did not know about the work of Ploughshares or Global Zero at the start of the year
  • I did not know about smaller "think-tank" environmental organizations, such as Rocky Mountain Institute (RMI), Resources for the Future, Ceres, etc. at the start of the year

I did not include donations to political candidates or organizations above.

I also began purchasing carbon offsets in bulk this year, as a direct means to mitigate unavoidable personal emissions and in excess of what zeroth-order calculators claim. I purchased 110 metric tons of offsets from the Bonneville Environmental Foundation, from cooleffects, from the Colorado Carbon Fund, and from a private provider. (Not included above). I am not sure about the effectiveness of carbon offsets in bulk to mitigate climate change and would appreciate any data or arguments either way. Without additional arguments, I plan to purchase offsets for ~1000 metric tons through the end of 2018.

March 21, 2017

Detect. Transmit.

2017 is the 40th anniversary of the launch of the twin Voyager spacecraft. I found this essay about Voyager on the Internet years ago - I don't remember where and I can't find any breadcrumbs. I'm sharing it, lightly edited, as one of the finest space program essays I've ever found.

Voyager sketch by @TychoGirl

To imagine what being the Voyager probe would be like, consider the following:

Your life begins, conceived during the mid-60s golden years of the space program.

The core concepts of your design are settled during the first years of that decade, and refined for fifteen years as different attempts are made to extend the reach of man's knowledge first to the skies, then to our nearest neighbors.

Your idea forms in an era of slide-rules and pencils, as astronomical calculations reveal a particularly fortuitous alignment of the outer planets in the coming decade, one that will slingshot you to the outer reaches of the solar system, hopping from planet to planet.

Continue reading "Detect. Transmit."

