Wednesday, March 22, 2017

Building a Roster: What is Rare?

K-1-d #404 with World of Mirth train in New Britain.
August 1940. Kent Cochrane 
A question was posted some time ago in response to a post by Marty McGuirk on the Modeling Steam Era Freight Cars blog. The post breaks down, by road, the box cars recorded passing through White River Junction, VT over several days in 1954, giving each road's percentage of the total.

The statistics are quite interesting, with CN accounting for 50% of the 3,605 box cars documented over those days. Marty included a pie chart along with a list of almost all of the reporting marks and their individual percentages.

Simon Dunkley questioned why the list showed that 50% of the cars were CN, but the pie chart shows at least 60% of the cars were CN.

The answer is simple: Marty only included the Top 10 roads in the pie chart, in which case the CN accounts for nearly 68% of the cars.

Ignoring the rest of the roads, no matter how small of a percentage of the total, skews the numbers. 

Individually, the roads represent what might be viewed as "statistically insignificant" in many cases. But collectively, they account for almost 27% of the cars during those days. I think it's important to remember that.
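
The arithmetic behind the two CN figures is easy to check. Here is a minimal Python sketch using only the rounded numbers quoted in this post (50% for CN, roughly 27% for everything left out of the pie chart). The function name is my own invention, not something from Marty's post.

```python
def share_within_subset(road_pct, omitted_pct):
    """Return a road's percentage once the omitted roads are dropped.

    Both arguments are percentages of the full data set.
    """
    kept_pct = 100.0 - omitted_pct  # the portion the pie chart actually shows
    return 100.0 * road_pct / kept_pct


# Rounded figures from this post: CN is 50% of all cars, and the roads
# outside the Top 10 total roughly 27% (an approximation).
cn_in_pie = share_within_subset(50.0, 27.0)
print(f"CN share of the Top 10 pie: {cn_in_pie:.1f}%")
```

With those rounded inputs the result lands a little over 68%, in line with what the pie chart shows. The same function works for any road and any cutoff.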

Freight Car Syndromes
I think it was John Nehrich who coined the "Pickle Car Syndrome." His point was that people like unusual cars (like pickle cars), and so they are overrepresented on our layouts. In general, I agree with this.

A related syndrome is "RTR Syndrome" in that we also overrepresent cars that are most easily acquired. I have far too many 1932 ARA Box Cars. Atlas made it simple by releasing the model with 10 different prototypical variations, covering nearly every one of the 26 roads that rostered the cars. So I have at least 10, probably more.

But for a prototype with around 15,000 total copies, I probably should only have 1 or 2 on the layout at any time. But because the current size of my roster is barely enough to cover a given operating session, almost all of them are on the layout during that session.

Many modelers combat this by determining how many cars they need of a given prototype. Of the roster of box cars in 1950, for example, the 1932 ARA box car accounts for about 2% of the entire fleet of North American box cars, so out of 100 box cars on the layout, two of them should be 1932 ARA box cars. So they decide which two to purchase, and that's it.

This approach is often combined with a similar look at a given road's roster. For example, you might determine that you only need 1 box car owned by the New Haven railroad. They accounted for less than 2% of the national box car fleet. The obvious choice would be a 1937 AAR box car, since that was the most numerous box car owned by the New Haven Railroad in 1950. But they also owned PS-1 box cars, and more interestingly, 10' IH PS-1 box cars. What do you do?

This is where I think we fall victim to the "36-foot Double-Sheathed Box Car Syndrome."

In our quest to build a more "realistic" roster for our layouts, and ignoring budgetary concerns, we often restrict ourselves from purchasing rare cars, or too many cars of a given road, prototype, whatever. A number of studies have been done on 36-foot Double Sheathed box cars, which would seem to indicate that in 1950 your typical layout should have only one or two.

In 1950 only 8% of the national box car fleet were less than 40' in length, and of those, 71% were CN or CP Fowler single sheathed box cars. So in theory, out of 100 box cars, you need 8 cars less than 40-feet in length, and 5 or 6 of them should be Fowler cars, and 1 or 2 of them double-sheathed box cars.
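
For anyone who wants to rerun that calculation for a different roster size, here is a rough Python sketch. The 8% and 71% figures are the ones quoted above; the function and variable names are just for illustration.

```python
def short_car_targets(box_cars, short_pct=8.0, fowler_pct_of_short=71.0):
    """Split a box car roster into short-car targets.

    Returns (all cars under 40 feet, Fowler cars, other short cars)
    as fractional targets; rounding is left to the modeler.
    """
    short = box_cars * short_pct / 100.0
    fowler = short * fowler_pct_of_short / 100.0
    other = short - fowler  # double-sheathed and any other short cars
    return short, fowler, other


short, fowler, other = short_car_targets(100)
print(f"under 40 ft: {short:.2f}, Fowler: {fowler:.2f}, other: {other:.2f}")
```

The fractional results are why the text says "5 or 6" and "1 or 2": the targets don't fall on whole cars, so some rounding judgment is always involved.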

The problem is, for many of us those specific 36-foot Double Sheathed box cars are always on the layout. Let's say it's a D&H car; F&C makes four variations of this car. There were about 1,000 still in service in 1950, or 0.1% of the national fleet. 

In our attempt to show that short double-sheathed cars are rare by only having one or two in our roster, we have instead given the impression that the D&H short double-sheathed car is relatively common in small numbers. Why? Because it's always on the layout.

This is a similar problem to reducing a list of approximately 60 road names with little representation to a Top 10 list. We've skewed the numbers. In fact, attaching hard numbers to several days' worth of data undoubtedly skews the numbers as well, almost certainly in favor of the CN and CV in that particular study.

A Cure?
So what do we do? Well, in terms of a study such as the one Marty references, that's among the best we have. So it's a great starting point. If we're modeling a specific prototype like New Britain, I can also look at the industries themselves to help determine appropriate roads based on prototypical offline industries. I can go further and look at the traffic on the through trains, and their origination and destination.

With the Waybill studies that have been done by others, plus other resources (such as American Commodity Flow published in the early '50s) I can see where the bulk of inbound cars are coming from with loads for CT. All of this helps.

But what do we do with this actual data? First, understand that nearly every box car that is owned in 1950 is fair game. This is part of your roster. If you want one 1932 ARA box car for each road that owned one, that's great. The more the better. 

Because what you roster doesn't have to be the same as what you run.

Sure, if you have different operators at every session, then you can run the same cars representing the same day every session. I would like more variety than that. I'm already accounting for changes over the years by building structures that can be swapped out as needed. Autos can be changed, and the era was a busy time for building new freight cars, so the roster itself shows the evolution of freight cars in this period. And the New Haven not only retired steam in this era, but their diesel acquisitions were so rapid that the motive power changes on almost an annual basis during this time too.

The data from studies like the one Marty references is fantastic for populating your operating session from your larger roster of freight cars. I classify cars as common, uncommon, and rare.

Common cars are the ones that should not only appear in every session, but probably in quantities greater than one. In my case that includes PRR X29 box cars, NYC USRA-design steel box cars, and CN/CP Fowler box cars, among others. 1937 AAR box cars in general also fit this category. This also includes roads themselves, such as PRR, B&O, NH, CN, CP, NYC, etc. in my case. For this handful of prototypes I need to figure out roughly how many (usually 2-3) would appear in any given operating session and make sure I have enough quantity to meet this demand. These cars account for about 50% of those in a session.

Uncommon cars are those that should appear in most sessions, but rarely in quantities of more than one or two. In Marty's post this comprises the roads that have a greater than 1% distribution. These cars account for about 25%. The quantity refers to the roads as a whole; individual cars may be different prototypes from the same road, or even the same prototype if there is a large number of a single type of car on that road. Most of these cars will appear every few sessions.

Rare cars are those that shouldn't appear in every operating session: everything that is less than 1%. Of a given road or prototype, there probably shouldn't be more than 1 per session. This is about 25% of the mix based on Marty's post. These can be just about anything. While these make up a sizable portion of what's in the mix on the layout, a given car might appear very, very rarely.
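
The three tiers can be written down as a simple rule of thumb. The Python sketch below uses the 1% line from Marty's data for rare cars. The 5% floor for "common" is purely my assumption for illustration (the post doesn't give a number), and `ROAD_A` and `ROAD_B` are made-up placeholders. A road with a 0% share is treated as absent rather than rare.

```python
def classify(share_pct, common_floor=5.0, rare_ceiling=1.0):
    """Sort a road (or prototype) into a tier by its share of traffic.

    common_floor is an assumed cutoff, not a figure from the study.
    """
    if share_pct <= 0:
        return "absent"    # never documented, so not part of the mix
    if share_pct < rare_ceiling:
        return "rare"      # shouldn't appear in every session
    if share_pct < common_floor:
        return "uncommon"  # most sessions, one or two cars
    return "common"        # every session, usually in multiples


# CN (50%) and MKT (0.3%) are from Marty's data as quoted in this post;
# ROAD_A and ROAD_B are hypothetical placeholders.
shares = {"CN": 50.0, "MKT": 0.3, "ROAD_A": 2.0, "ROAD_B": 0.0}
tiers = {road: classify(pct) for road, pct in shares.items()}
print(tiers)
```

Where the cutoffs belong depends on the locale and the data you have, so treat both thresholds as knobs to adjust.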

DER-1 (DL-109) locomotives with Strates train at Whiting Street Yard.
June 1949. Kent Cochrane.
The examples in this post are circus trains in New Britain. Various books also have photos of the Ringling Brothers train in New Britain. But there won't be a circus train every session, and maybe not more than once or twice a year. Most likely one will be used for my more seasoned operators, since it will tie up a lot of yard tracks, making regular work more complicated. Otherwise they will be display and photograph models.

Another example is the PRR X30 box car. These were built specifically for servicing American LaFrance in Elmira, NY to deliver fire engines. Sometime c. 1950 New Britain received new fire trucks, at least two, a pumper and an aerial from American LaFrance, which were probably delivered in one of these box cars. The pumper is very similar to the Sylvan Scale Models 700-series open cab model. Corgi makes (made?) an American LaFrance aerial that could be a good starting point. The Jordan fire truck looks a lot like an earlier truck that was still in service as well. There is a fire station next to the police station on Commercial Street across from New Britain Station, so these are models that I'll be able to use. Of course, the station itself will be a flat, but the models being delivered in an X30 box car would make for an interesting photo too.

So buy or build whatever you want. Personally, as a freight car enthusiast, I want pretty much anything that is appropriate for my era that I can get. That's great. Because a larger roster allows you to more accurately populate your operating sessions. 
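
For those who like to automate this sort of thing, here is one possible way to pull a session's cars from a larger roster. It is only a sketch under my own assumptions: the car IDs are made up, the roster is cut down to two roads, and the "round up by chance" step is just one way to handle fractional targets like 0.3 of a car. The CN and MKT percentages are the ones quoted from Marty's data.

```python
import math
import random


def draw_session(roster, shares_pct, session_size, rng):
    """Pick cars for one session, road by road.

    roster: dict of road -> list of car IDs owned (hypothetical IDs).
    shares_pct: dict of road -> percent of traffic for that road.
    """
    picked = {}
    for road, cars in roster.items():
        expected = shares_pct.get(road, 0.0) / 100.0 * session_size
        count = math.floor(expected)
        # Round up by chance: an expectation of 0.3 cars means the road
        # shows up in roughly 3 sessions out of 10.
        if rng.random() < expected - count:
            count += 1
        count = min(count, len(cars))  # can't run cars you don't own
        picked[road] = rng.sample(cars, count)  # no car picked twice
    return picked


roster = {
    "CN": [f"CN-{n}" for n in range(60)],
    "MKT": ["MKT-1", "MKT-2", "MKT-3"],
}
shares_pct = {"CN": 50.0, "MKT": 0.3}
session = draw_session(roster, shares_pct, 100, random.Random(1940))
print(len(session["CN"]), "CN cars;", session["MKT"] or "no MKT car this time")
```

Because each road's cars are chosen at random from what's on the shelf, owning three MKT cars instead of one means the same car doesn't turn up every time MKT's number comes up.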


  1. Devil's advocate... if I model the Southern Pacific in Southern California during the 1950s, a fleet of PFE wood and steel reefers and a ton of tank cars would look right. A Norfolk & Western coal hopper would be out of place in Los Angeles.

  2. Yes, but there's a difference between a rare car, that is one that has been documented as having run in a given location, and a car that never made such a movement.

    So if your data shows a 0% occurrence, that's not a rare car, it's a nonexistent car. I suppose I should have said anything less than 1% but greater than 0%. The threshold might be greater than 1% for the ceiling, depending on the locale and prototype, and on Marty's pie chart the threshold was 1.5%.

    Eliminating the rare cars creates a non-prototypical mix. If Marty had lumped everything less than 1.5% as rare cars, the data would show that roughly 25% of the roads/cars are rare, and the CN cars in the chart would have been 50%. So in that data set, the number of rare cars is about 50% of the most common cars. For every 2 CN cars, you'd have one car from a road that accounts for less than 1.5%.

    So let's say your railroad has 100 cars on the layout in an operating session and you decide you only need 100 cars on your total roster. Everything is on the layout all the time. For one of your rare cars, you're going to pick a New Haven hopper because you have a photo of one in southern California. If you had such a photo, it was probably a very rare occurrence. But if the same hundred cars are on your layout all the time, and that's one of them, it gives the impression that it was relatively common since it's there every session.

    My point is that rostering more rare cars helps balance your mix, provided that you have a large enough roster that you aren't running a particular rare car every session. In the data Marty provided, MKT cars accounted for 0.3% of cars. In other words, you'd see one for every 300 cars or so. So if you have 100 cars on your layout during an ops session, you would expect to see an MKT car every 3 or so sessions. If you have a frequent group of operators, though, at some point they might notice it's always the SAME MKT car that runs through every 3 or 4 sessions. That might be relatively accurate (maybe MKT auto cars accounted for 90+% of MKT cars through White River Junction, for example). Or it may not be, if the mix should be greater, meaning you would probably do well with a couple, or even a few MKT cars, to keep that sort of pattern from occurring.

    So yes, that N&W hopper would look out of place. But if you have documentation of such a movement, you can show that such oddities existed and it's a lucky day that you happened to be railfanning the layout on that day!

  3. A really thoughtful, detailed post with lots of fodder for additional discussion. We should get Marty to weigh in. And maybe cross post to MRH - would love to see how the resulting thread would evolve!

  4. Yep, it's already there:

  5. For most of us having multiples of rare cars is somewhat of a luxury, especially for prototypes which would have to be kit built. It would certainly be a good thing to do this, but not that practical. I know in my case, it will be a long time before I get close to obtaining the cars I have determined ought to be in my roster, never mind coming up with extras.

    Another issue to consider is the distance a foreign road is from your layout's locale. If ATSF had 50,000 boxcars and the PRR also had 50,000 boxcars as a NH modeler I would not want nearly as many ATSF boxcars as PRR boxcars.

    I generally agree with your results, but have gotten to a similar place by a route that I find logical and that yields a mix of cars which would allow someone observing those cars to conclude that they are on the NH.

  6. Thanks, Bill!

    Yes, budget and modeling skill are always factors for all of us. It's going to take me a while to pick up all I need as well.

    I definitely agree that as you get to individual trains (especially when we have to reduce 100+ car trains down to 20 cars) that you have to take into account the regional impacts to get the right "feel."

    There are plenty of approaches to getting the right mix, of course, and as you know I'm pretty familiar with all the research you've done (which continues to be quite helpful to me!). I just think that Marty's post highlighted one of the common pitfalls I've seen modelers run into, by setting an arbitrary cutoff, for both major car types, and smaller roads.

  7. As someone who has used statistics professionally - which includes a period of teaching the subject - I have to say that a general ("national") survey is a good starting point, but you have to adjust for the locality. As Marty shows, there was a lot of CN traffic on the CV. This is no surprise, given the history and heritage of the CV. But it also means that a non-CN line in New England would see less CN traffic.

    1. Hi, Simon -

      Yes, absolutely. There was a lot of CN traffic coming off of the CV onto the New Haven too. For CV, CN traffic is almost home-road. Via the CV, CN was almost a direct connection to the NH.

      The mix is also different within a certain road, depending whether you're on a mainline with through traffic - in my case I'm not quite a mainline, but in terms of Maybrook to Hartford it's very similar to an actual mainline - or a terminal branchline where the specific industries will have a different, more specific mix.

      Unfortunately, I don't have the sort of data for the NH that Marty has for the CV. But photos confirm that CN traffic was very common on the NH - both destination traffic and through traffic.

      I don't have the data (or the skills) to assess WHAT mix is appropriate for a given line on a given railroad. That's the sort of research that each modeler has to do for themselves. Like so many things, specific information (that is, data from your road) is always better than general data (that is, the nation, or continent, as a whole).

      But whatever the data set you've got, not using the whole data set skews the results.

  8. I handled this in the following way for boxcars: The top 90% of roadnames rounded up to whole numbers. 90-95% - I used about every other name, favoring the Northeast. Beyond that I picked one name from each remaining 1% bracket. This wound up being LV, AC&Y, WAG, BCK and P&E. P&E is home road, as I model NYC; the other ones have a regional bias. Not saying I was right or wrong, just a way to include some of the little guys.
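
The MKT arithmetic from the second reply above (0.3% of cars, 100 cars per session) can be checked with a few lines of Python. This is a rough sketch that assumes each car on the layout is an independent draw from the overall mix, which real traffic certainly isn't, so treat the numbers as ballpark figures. The function names are mine.

```python
def expected_per_session(share_pct, cars_per_session):
    """Average number of cars from a road in one session."""
    return share_pct / 100.0 * cars_per_session


def chance_of_at_least_one(share_pct, cars_per_session):
    """Chance a session includes the road at all (independence assumed)."""
    p = share_pct / 100.0
    return 1.0 - (1.0 - p) ** cars_per_session


mkt_expected = expected_per_session(0.3, 100)
sessions_between = 1.0 / mkt_expected  # average gap between appearances
mkt_chance = chance_of_at_least_one(0.3, 100)
print(f"{mkt_expected:.1f} MKT cars per session on average")
print(f"about one appearance every {sessions_between:.1f} sessions")
print(f"chance of at least one MKT car in a session: {mkt_chance:.0%}")
```

The averages agree with the "every 3 or so sessions" estimate, and the same two functions work for any road's percentage and any session size.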