Monday, October 27, 2014

Experience Spotlight - Andrew Distler, CBS Sports


In this semester's Experience Spotlight series, the blog will be featuring Cornell ILR SBS members who have excelled in positions in the sports industry. Many talented Cornell students are making impressions all across the sports world, and this is their chance to showcase their experiences.

This week's Spotlight focuses on Andrew Distler '15. Andrew is a senior in Cornell's College of Arts and Sciences, where he majors in Sociology and minors in Policy Analysis and Management, Demography, and Music. Andrew has been an active member of the club since his freshman year, attending several conferences and many other events. He has also played a critical role in the success of the Big Red Sports Network. He can be reached at abd76@cornell.edu.

Andrew worked this summer with CBS Sports Network, serving as an Intern in the Programming Division. He was kind enough to answer some questions about his experience.

What were some day-to-day responsibilities of the position?
My day-to-day responsibilities included maintaining, and helping create, the 2014-15 college basketball and football schedules, fact checking documents and presentations, and compiling research for CBS Sports Network’s programming, which included providing team previews for several college basketball, football and hockey teams and conferences.

How were you able to get the Internship?
I had originally applied for this internship after my sophomore year. After finding out I needed more experience in order to be considered for a position, I was fortunate enough to assist in the launching of Cornell’s Big Red Sports Network last year, where I gained lots of experience in sports marketing, research, and journalism. At my interview, my boss seemed impressed by how much of the sports world I was able to comprehend, and working with BRSN was a major part of that.

How has this experience shaped your career plans?
This internship definitely made me realize that going into sports television is something I would enjoy. Before this summer, I had only really focused on careers in either league or team offices, but I discovered how much fun working for a sports network can be!

What advice would you give another student interested in a similar experience?
If I had any advice to give students interested in a similar path, it would be to get involved with sports in any way possible (e.g., write for your school’s newspaper, join a sports business club, work for one of your school’s teams). Every week, the interns met with an executive from a different department, and each one said that in this industry, experience counts more than anything, so look to get involved any way you can (which is why having BRSN was so helpful for me!). Another important note: Most people in this industry, regardless of position, WANT to help you. I cannot say enough about how my bosses eagerly gave me career advice every chance they got, and how willing they were to put me in touch with other people they knew in sports. Never be afraid to reach out to anyone you know in this, or any other, industry for advice (this also helps with networking, which was mentioned quite a bit!).

What was your favorite aspect of the experience?
My favorite part was definitely our college basketball “draft”, in which we “drafted” college basketball games from certain conferences, with other networks such as ESPN on the phone, waiting to make their picks. It felt like being in the War Room for an NFL team, and they really took my suggestions of which games to air to heart. What’s cool is I can see my imprint on this season’s college basketball schedule now! I also loved being in CBS’s main office (Black Rock), as I was able to see firsthand how a lot of the network is run.

Thank you to Andrew and CBS Sports for allowing us to share this awesome experience. We hope you have learned about some of the wonderful opportunities that Cornell, the ILR School, and the ILR Sports Business Society can provide in the sports world. We hope to feature many more stories from students and employers this fall!


Wednesday, May 7, 2014

Part 2: How to Predict Postseason Success in Baseball

Wouldn't it be nice to predict the next time your team will hoist the Commissioner's Trophy?

While Part 1 looked at driving in runs without hitting home runs, the second hypothesis has more to do with hitting the league's most elite pitchers in the postseason. Will this hypothesis lead to some statistically significant results?


Performance Against Top Pitchers

Hypothesis

Against top-line starters and relievers, it is very difficult to hit home runs, so my theory is that teams with a simpler batting approach will have a better opportunity against these very good pitchers. I have categorized “top pitchers” as those who finish in the top 20 in ERA minus, or ERA-, as calculated by FanGraphs. Because a team is very likely to face great pitching in the postseason, I also hypothesize that teams that face these pitchers more often and/or have more success against them (in terms of runs scored per nine innings) are more likely to have playoff success.

Results

By hand, I compiled the top 20 starting pitchers in terms of ERA- every year from 2003 to 2012, and then used Baseball Almanac to record every game these pitchers played against teams that made the playoffs that year. For each team in each year, I compiled total innings, total runs scored, total games, and runs scored (not just earned runs) per 9 innings. I looked at all runs, and not just earned runs, because runs of any kind are so hard to come by in the postseason or when facing a top pitcher. Even if a run is unearned, most of the time the opposing team would still need to string together a couple of hits to allow that unearned run to score.
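The runs-per-9-innings figure described above can be sketched as follows. This is a minimal illustration in Python, not the author's actual workflow (the data were compiled by hand). It assumes innings are recorded in the usual box-score notation, where "6.1" means six innings plus one out; the function names and the sample games are hypothetical.

```python
def innings_to_outs(innings_text):
    """Convert box-score innings notation ("6.1" = 6 innings + 1 out) to outs."""
    whole, _, partial = innings_text.partition(".")
    partial_outs = int(partial) if partial else 0
    if partial_outs not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {innings_text!r}")
    return int(whole) * 3 + partial_outs


def runs_per_nine(total_runs, total_outs):
    """Runs scored (earned or not) per nine innings; nine innings = 27 outs."""
    if total_outs <= 0:
        raise ValueError("need at least one out recorded")
    return 27 * total_runs / total_outs


# Hypothetical games for one team against top-20 pitchers:
# (runs the team scored off the pitcher, innings the pitcher threw)
games = [(2, "7.0"), (0, "6.2"), (3, "8.1")]
outs = sum(innings_to_outs(ip) for _, ip in games)
runs = sum(r for r, _ in games)
print(round(runs_per_nine(runs, outs), 2))
```

Working in outs rather than decimal innings avoids treating "6.1" as 6.1 innings, which would slightly distort the rate.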

When I finished compiling data on team performances against top 20 pitchers, I ran individual regression analyses with PV (playoff value, defined in Part 1) as the outcome variable and these new statistics as the predictors. However, no single statistic was significantly correlated with PV. Even when using multiple predictors from the top 20 pitching stats, there was still no significant correlation.
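A single-predictor regression of this kind can be sketched as below. This is an illustrative Python version using only the standard library, not the author's R code, and the sample numbers are made up. It returns the slope, the correlation r, and the t statistic for the slope; judging significance would still require comparing that t statistic against a t distribution with n - 2 degrees of freedom.

```python
import math


def simple_regression(x, y):
    """Ordinary least squares fit of y on a single predictor x.

    Returns (slope, intercept, r, t_stat), where t_stat tests slope == 0
    with len(x) - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples of at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # A perfect fit makes the t statistic unbounded, so guard that case.
    if abs(r) >= 1:
        t_stat = math.copysign(math.inf, r)
    else:
        t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    return slope, intercept, r, t_stat


# Hypothetical data: runs per 9 innings against top pitchers, and PV.
runs_per_9 = [3.1, 4.2, 2.8, 3.9, 4.5, 3.3, 2.5, 4.0]
pv = [1, 2, 1, 1, 4, 2, 1, 3]
slope, intercept, r, t_stat = simple_regression(runs_per_9, pv)
print(f"slope={slope:.3f} r={r:.3f} t={t_stat:.3f}")
```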

Conclusion

Based on the results of my tests of these two hypotheses, I unfortunately did not find any significant regression models that could predict PV from any of these statistics. I was not hugely surprised by this outcome for a few reasons. Because I only looked at playoff teams in the past ten years (many of the statistics I used in these models were not compiled before then), my sample size was smaller than ideal to start with. Also, there is high multicollinearity among so many of these statistics. This means that it was difficult to interpret the individual coefficients.

Also, having too many predictors, or controlling for too many variables, makes it extremely difficult to find a model that is both significant and sensible from a baseball perspective. There were a few interesting findings, such as how LDp (line drive percentage) is marginally correlated with playoff wins (but not correlated with playoff series wins), but for the most part, no major discoveries were made.

Possible Improvements

One change I could have made is in how I calculated the top 20 pitcher statistics. I chose the number 20 arbitrarily, and I also compiled the top 20 pitchers regardless of league. In hindsight, I probably should have compiled the top 20 pitchers from each of the American and National Leagues in each year. There may also be a better statistic than “runs per 9 innings” to gauge how well teams do against these top pitchers. In addition, when my second hypothesis failed, I started to compile 28 new statistics from FanGraphs’ “high leverage situations” split. I originally tried this because essentially all playoff batting situations can be considered “high leverage.”

However, these statistics were compiled from late and close game situations and do not measure the ability to drive in runs without hitting home runs, which is what my two hypotheses were related to. My time might have been better spent looking at statistics with runners in scoring position. Those kinds of statistics would have been more relevant to my hypotheses, as driving in runners in scoring position is not only the most effective way to score off top pitchers, but also a skill that requires the batter to shorten his swing and take a simpler batting approach. As I continue this research in the future, I will take into account all of these factors in my quest to find a formula for postseason success in Major League Baseball.


Tuesday, May 6, 2014

Part 1: How to Predict Postseason Success in Baseball



Just how did the Red Sox get past the Rays and Tigers in 2013?

Introduction

“They got hot at the right moment.” “They’re just lucky they peaked in October.” “It was just meant to be.”

These are all things that have been said about recent World Series winners. Ever since Major League Baseball switched to its current three-division system, and then added a second wild card in 2012, it has been more difficult for teams with the best records to win it all. This is because, probably more than in any other sport, baseball’s playoffs are very different from its regular season.

Baseball’s 162-game regular season is a marathon of endurance and mental toughness. The playoffs, on the other hand, are a sprint, and the winner is often a team that by all traditional metrics (such as wins and winning percentage) is inferior. However, it is extremely difficult to predict when such a team will go on a World Series run. Even though there are several metrics to measure a player’s overall value to his team (such as WAR, or Wins Above Replacement), there is little in the way of statistics or groups of statistics that can predict postseason success.

Michael Lewis’s Moneyball introduced the importance of on-base percentage (OBP) to many baseball fans, but I have determined through a simple regression analysis that this statistic alone does not correlate with team postseason success. The general consensus among fans, commentators, and analysts is that having dominant pitching, particularly starting pitching, is the key to advancing far in the playoffs.

I agree that the most important variable on a playoff team is its starting pitching, but pitching alone doesn’t win you the World Series either. In the 2013 ALCS, the Boston Red Sox beat the Detroit Tigers, a team that had what was considered to be the most dominant starting rotation in baseball. This came after the Red Sox beat another team with excellent pitching, the Tampa Bay Rays, in the previous series. In a sport that has metrics to measure everything from speed on the base paths to the strength of an outfielder’s arm, there is no accepted metric that can accurately and consistently predict postseason success based on regular season performance. My goal was to see if I could find such a measure.

This is not a simple task. In an October 2013 article for ESPN’s Grantland, Rany Jazayerli wrote, “Trying to find the magic formula for postseason success has been the sabermetric community's version of trying to turn lead into gold: Many have tried, but none have entirely succeeded.” I first came up with the idea for this project after angrily watching the New York Yankees, over the past decade, consistently rank among the best teams in the league but then lose in the postseason (often in the division series).

Most fans and analysts pointed to the Yankees’ lack of quality starting pitchers post-2003 as the reason they couldn’t win in the playoffs after winning four World Series in five years from 1996 to 2000. However, the Atlanta Braves, led by their dominating pitching trio of Greg Maddux, Tom Glavine and John Smoltz, had even more trouble in the postseason, winning only one World Series title despite winning fourteen consecutive division titles from 1991 to 2005 (not counting the strike-shortened 1994 season, which had no division champions). It amazed me how these teams could consistently dominate their respective divisions and leagues for 162 games, only to come out flat in a five- or seven-game series. It made me wonder if there were hints in a playoff team’s regular season statistics that could predict a successful postseason run.

For this research, I have defined postseason success as “playoff value” or PV. A PV of 1 means losing in the division series, 2 means losing in the Championship Series, 3 means losing in the World Series, and 4 means winning the World Series. To find statistics that can predict postseason success, I ran hundreds of linear regression models with PV as the outcome variable and many different predictors.
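The PV scale defined above can be written as a simple lookup. This is a minimal Python sketch; the result labels and the function name are hypothetical, and the sketch only covers the four outcomes the definition names (it does not say how a wild card game loss would be scored).

```python
# Playoff value (PV) as defined in the text, keyed by how a team's
# postseason ended. The key names are hypothetical labels for this sketch.
PLAYOFF_VALUE = {
    "lost_division_series": 1,
    "lost_championship_series": 2,
    "lost_world_series": 3,
    "won_world_series": 4,
}


def playoff_value(result):
    """Look up the PV for a postseason result label."""
    try:
        return PLAYOFF_VALUE[result]
    except KeyError:
        raise ValueError(f"unknown postseason result: {result!r}") from None


# The 2013 results mentioned in the text: Boston won the World Series after
# beating Tampa Bay in the division series and Detroit in the ALCS.
results_2013 = {
    "Red Sox": "won_world_series",
    "Tigers": "lost_championship_series",
    "Rays": "lost_division_series",
}
for team, result in results_2013.items():
    print(team, playoff_value(result))
```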

Ability to Drive in Runs Without Hitting Home Runs
 
Hypothesis

For my research, I decided to focus mainly on regular season batting statistics of playoff teams from the past ten years (2003-2012). I did this for a few reasons. First off, as previously mentioned, it is widely accepted that good pitching beats good hitting in the playoffs. However, I think this only holds true when looking at conventional measures of “good” hitting, such as batting average and runs scored. Instead, it could be more important to look at team batting patterns and tendencies. It is my hypothesis that teams that have simpler batting approaches, or those that emphasize contact and putting the ball in play and deemphasize over-swinging to try to hit home runs, will be more successful in the postseason. The reasoning behind this is that the pitchers in the postseason are so dominant (the number of off days in the postseason means that teams usually only use three or four of their best starters) that a team might only get one or two chances a game to get a rally going or drive in runs. And because the top pitchers in the playoffs are usually less likely to give up home runs, it is important that, when given the proper opportunity, teams are able to drive in runs without hitting home runs.

Results

I started by using the stepwise regression function in R, in which I predicted PV from the original 38 statistics I gathered. These statistics ranged from simple, such as hits and home runs, to advanced, such as weighted on-base average (wOBA) and weighted runs created plus per 600 plate appearances, to contact-based, such as ground ball percentage and home run to fly ball ratio. The stepwise function took all possible predictors and entered and removed them from the regression model until all predictors in the model had a p-value of less than .1.

The stepwise function gave me the following: PV ~ H + HR + BABIP + GBFB + LDp + HRFB + BUH + Swingp + Contactp. This meant that playoff value could be predicted by the combination of hits, home runs, batting average on balls in play, ground ball to fly ball ratio, line drive percentage, home run to fly ball ratio, bunt hits, swing percentage and contact percentage. After examining the summary of this model, I found that it was statistically significant, with a p-value of .038.
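For readers less familiar with R's formula notation, the model above corresponds to a linear equation of the following form. This is a sketch of the general form only: R includes an intercept by default, the subscript i (one playoff team-season) is notation introduced here, and the fitted coefficient values are not reported in this post.

```latex
\begin{aligned}
\mathrm{PV}_i ={}& \beta_0 + \beta_1\,\mathrm{H}_i + \beta_2\,\mathrm{HR}_i
  + \beta_3\,\mathrm{BABIP}_i + \beta_4\,\mathrm{GBFB}_i + \beta_5\,\mathrm{LDp}_i \\
 &+ \beta_6\,\mathrm{HRFB}_i + \beta_7\,\mathrm{BUH}_i
  + \beta_8\,\mathrm{Swingp}_i + \beta_9\,\mathrm{Contactp}_i + \varepsilon_i
\end{aligned}
```

Each coefficient describes the expected change in PV for a one-unit change in that statistic with the other eight held fixed, which is why correlated predictors make the individual coefficients hard to read.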

I was not surprised by a few aspects of the formula. Teams with higher LDp (line drive percentage) and GBFB (ground ball to fly ball ratio) usually have simpler hitting approaches, as higher rates of hitting line drives and ground balls mean that they aren’t over-swinging or trying only to hit home runs as much. However, it is very difficult to interpret these individual coefficients, due to the multicollinearity of the model.

This multicollinearity is caused by the high correlation between the variables in this model. For example, teams that have more hits usually also have more home runs and a higher batting average on balls in play. After trying several other models that included variables that I thought would be significant (such as contact percentage, line drive percentage and zone contact percentage), I was still unable to find another model that was statistically significant, so I came up with another idea.
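One simple way to spot this kind of overlap is to check the pairwise correlations among the predictors before fitting a model. The sketch below is an illustration in Python rather than the author's R analysis; the team totals are invented, and the 0.8 cutoff is an assumed rule of thumb, not a value from the post.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples of at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy / math.sqrt(sxx * syy)


def flag_collinear_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    names = sorted(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(predictors[a], predictors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical season totals for six playoff teams.
predictors = {
    "H": [1450, 1510, 1395, 1480, 1530, 1420],
    "HR": [160, 195, 140, 175, 210, 150],
    "BUH": [22, 9, 15, 30, 12, 18],
}
for a, b, r in flag_collinear_pairs(predictors):
    print(f"{a} and {b}: r = {r:.2f}")
```

Pairs flagged this way are candidates for dropping or combining, since a regression cannot cleanly separate the effects of two statistics that move together.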

Be sure to check back tomorrow for Part 2 of Andrew's analysis.
