Past MS-QA Research Projects

Each student is required to work closely with at least one faculty member to develop and report research with a significant quantitative component, or perform an acceptable application and analysis of quantitative methodologies. MS-QA Research Projects are highly variable in nature. Topics of interest include mathematical programming, statistical analysis, simulation, operations management, and many other disparate application and methodological areas. The underlying commonality of all projects is that they must have a significant component of quantitative analysis, as defined by the candidate's committee. The following list of projects is presented in reverse chronological order. The Committee for each project is listed with the Chair of the Committee first. The date provided for each of the Projects is the date of the Project Presentation; final graduation could have been somewhat later. Where available, abstracts of the projects are provided as well.


Edmund A. Berry, National Estimates of the Inpatient Burden of Pediatric Bipolar Disorder in an Inpatient Setting. An Analysis of the 2003 and 2006 Kids Inpatient Databases (KID) Data, September 25, 2009 (Martin S. Levy, Pamela Heaton)
Bipolar disorder (BPD) is a debilitating, recurrent, chronic mental illness characterized by cycling states of depression, mania, hypomania, and mixed episodes. The disease has a tremendous societal and economic impact, is associated with a high degree of morbidity and mortality, and is particularly costly and debilitating in pediatric patients.
The objectives of this study were 1) to calculate national estimates of the annual burden of inpatient hospitalizations of children and adolescents with BPD, where burden is measured specifically in terms of charges, cost, and length of stay; 2) to describe and compare the burden across various demographic characteristics, hospital characteristics, and key comorbidities associated with BPD; and 3) to determine the independent effects of these demographic, hospital-type, and comorbidity factors on hospitalization costs. To accomplish these objectives, we examined 2003 and 2006 data from the Kids' Inpatient Database (KID).
National estimates of the mean and its standard error for cost, charges, and length of stay for inpatient pediatric bipolar disorder (BPD) were computed using the complex sample design of the 2003 and 2006 KID data, which includes weighting, stratification, and clustering variables. Two ordinary least squares regression models, one on the 2003 KID data and one on the 2006 KID data, were used to determine key predictors of cost among demographic characteristics, hospital characteristics, and comorbidities. Finally, the Chow test was used to determine whether the underlying regression models estimated for 2003 and 2006 were the same.
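
As a rough illustration of the Chow test mentioned above, the sketch below compares a pooled regression with two separate regressions through the statistic F = [(SSR_pooled - (SSR_1 + SSR_2)) / k] / [(SSR_1 + SSR_2) / (n_1 + n_2 - 2k)]. It is a minimal sketch under stated assumptions, not the study's analysis: it uses a single predictor, ignores the survey weights and design variables, and runs on made-up numbers. All function and variable names are hypothetical.

```python
def ols_ssr(x, y):
    """Fit y = a + b*x by ordinary least squares; return the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for two regressions with k parameters each (intercept + slope)."""
    ssr_pooled = ols_ssr(x1 + x2, y1 + y2)          # one model fitted to both years
    ssr_split = ols_ssr(x1, y1) + ols_ssr(x2, y2)   # one model per year
    dof = len(x1) + len(x2) - 2 * k
    return ((ssr_pooled - ssr_split) / k) / (ssr_split / dof)


# Illustrative data only (not KID data): the same predictor values in both "years".
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise_a = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
noise_b = [-0.05, 0.1, -0.1, 0.05, -0.1, 0.1]
y_2003 = [2 + 3 * xi + e for xi, e in zip(x, noise_a)]
y_2006_same = [2 + 3 * xi + e for xi, e in zip(x, noise_b)]      # same relationship
y_2006_shift = [10 + 0.5 * xi + e for xi, e in zip(x, noise_b)]  # different relationship

f_same = chow_f(x, y_2003, x, y_2006_same)
f_shift = chow_f(x, y_2003, x, y_2006_shift)
print(f_same, f_shift)
```

A large F relative to the F(k, n_1 + n_2 - 2k) reference distribution suggests that the two years do not share one regression model.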

Deepankar Arora, A Decision Support Methodology for Distribution Networks in a Stochastic Environment using Mixed Integer Programming in Spreadsheets, September 11, 2009 (Jeffrey D. Camm, Kipp Martin)
In an effort to reduce the distribution costs from distribution centers to customer locations, a company is considering opening a set of five distribution centers to serve all of its customer locations. The main problem that the company faces is demand uncertainty at the customer locations, which can have an adverse effect on its transportation costs.
The main goal of this project is to introduce a decision support methodology for identifying a robust distribution network that will lead to minimized transportation and handling costs under stochastic demands using VBA (Visual Basic for Applications) in Excel. Specifically, the methodology helps the decision maker narrow down the choices by providing cost distributions corresponding to a candidate solution, an efficient frontier, the cost value that corresponds to the best worst-case scenario, the value at risk (VaR), and the expected loss below a certain value.
The project aims to develop a concept that can aid the decision maker when parameters in an optimization model are stochastic. The final call in such cases is subjective, and a "good" decision depends on the choice of the decision maker, but this methodology aims to give the decision maker tools to facilitate and inform the decision-making process.

Bethany Harding, Safety Stock Level Analysis for Replenishment Planning using Actual-to-Forecast Demand Ratios, September 4, 2009 (Uday Rao, Amitabh Raturi)
Senco Brands, Inc., currently stocks approximately 10,000 items at one or all of their domestic distribution locations. The planning and replenishment for these items is performed using a basic MRP planning system. Forecasts are created for each item and prorated for each distribution center based on its historical percentage of total corporate demand. Desired ending inventory levels are set using a safety-time factor expressed in weeks. The current model requires a safety-time level defined for each item at each distribution center. Desired ending inventory is calculated in each weekly time period of the planning horizon by accumulating the demand forecasts over the contiguous future weeks specified by the safety-time. Using the company's data, an Excel-based tool was developed to: (1) recreate the MRP planning system's approach to setting planned order releases using the desired ending inventory approach and the input safety-time; (2) apply an actual-to-forecast demand (A/F) ratio approach to determine the probability distribution of demand over the planning horizon; (3) simulate various scenarios for future demand; (4) use the simulated demand scenarios to determine the performance of a chosen safety-time (or desired ending inventory) using key performance indicators such as expected customer fill rate, inventory investment, and working capital; and (5) calculate an optimized safety-time that achieves satisfactory performance (e.g., a 95% fill rate), as determined by the company. Various applications of the Excel-based tool are illustrated.
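
The A/F ratio idea in the abstract above can be sketched in a few lines. This is a simplified illustration, not the Excel tool itself: it treats a single item and a single week, assumes the stock available equals the week's forecast plus the safety-time's worth of forecast at the same weekly rate, and draws demand scenarios by resampling historical A/F ratios. The function names and all numbers are hypothetical.

```python
import random


def af_ratios(actuals, forecasts):
    """Historical actual-to-forecast demand ratios, one per past period."""
    return [a / f for a, f in zip(actuals, forecasts)]


def simulate_fill_rate(forecast, ratios, safety_weeks, n_scenarios=10000, seed=0):
    """Estimate the unit fill rate for one week under resampled A/F ratios."""
    rng = random.Random(seed)
    # Assumption: stock covers this week's forecast plus safety_weeks more weeks
    # of forecast at the same weekly rate.
    stock = forecast * (1 + safety_weeks)
    filled = 0.0
    demanded = 0.0
    for _ in range(n_scenarios):
        demand = forecast * rng.choice(ratios)  # simulated demand scenario
        filled += min(demand, stock)            # units served from stock
        demanded += demand
    return filled / demanded


# Illustrative history only: five past weeks with a constant forecast of 100 units.
ratios = af_ratios([80, 100, 150, 90, 120], [100, 100, 100, 100, 100])
low = simulate_fill_rate(100.0, ratios, safety_weeks=0.0)
high = simulate_fill_rate(100.0, ratios, safety_weeks=0.5)
print(low, high)
```

Searching over `safety_weeks` for the smallest value whose simulated fill rate meets a target (for example 95%) would correspond loosely to the optimization step described in the abstract.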

Andreas Kuncoro, Empirical Study of Supply Chain Disruptions' Impact on the Financial and Inventory Performance of Manufacturing and Non-manufacturing Firms, August 28, 2009 (Amitabh Raturi, Uday Rao)
Supply chain disruptions are unanticipated events in the supply chain, caused by internal or external factors, that cause a firm to deviate significantly from its original plans and consequently affect its performance. This work assesses the relationship between supply chain disruptions and overall firm performance as measured by financial (return on assets and leverage) and operational (inventory turnover) metrics. We first chronicle 75 supply disruptions in 47 firms as reported in the business press over a three-year period (2005-2007). We then categorize these disruptions by causal factors as internally versus externally caused, and across several origin sources. The performance metrics are then observed from Compustat quarterly data from one year before through one year after the disruption announcements. The impact of such disruptions is first analyzed by firm size, firm type (manufacturing versus non-manufacturing), reason, and responsibility. In multivariate analysis of covariance tests, firm size showed a significant positive association with overall firm performance while disruption event announcement showed a significant negative association with overall performance. Consistent with previous studies, our findings indicate that supply chain disruptions negatively impact both financial and operational performance. Firm size significantly moderates this impact. One year after the event announcement, the firms are able to recover their performance.

James Andrew Kirtland III, Simulation Efficiency of the Finitized Logarithmic Power Series, August 27, 2009 (Martin S. Levy, W. David Kelton)
It is often appropriate or desired to limit a distribution's support. This can be due to the actual environment that an analyst is trying to model or to increase the efficiency of simulating random variates from a model. This can be done using traditional truncation. However, when truncation is used, undesired and often unpredictable effects occur to the moments of the parent distribution. Finitization is a method of limiting a power series distribution's support while preserving its moments up to the order of finitization, n. The logarithmic power series distribution will be used to discuss properties of theoretical, truncated, and finitized distributions. Four algorithms designed to generate random variates from a theoretical logarithmic power series distribution are compared to an alias method designed to generate random variates from a finitized logarithmic power series distribution. The variates created from these four algorithms as well as the alias method will be tested against a theoretical logarithmic power series to check whether the moments hold. Finally, a horse race is used to test whether the finitized logarithmic distribution using an alias method is more efficient at generating random variates than the four other algorithms based on an infinitely supported logarithmic distribution.
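
To make the alias method concrete, here is a minimal sketch of the standard alias-table construction (Vose's variant) applied to a finitely supported distribution. One loud assumption: the probabilities below come from a logarithmic series distribution that is simply truncated and renormalized, which is a stand-in and not the finitized distribution studied in the project. The names `log_series_pmf`, `build_alias_table`, and `alias_draw` are hypothetical.

```python
import math
import random


def log_series_pmf(theta, k):
    """Logarithmic series probability P(X = k) = -theta**k / (k * ln(1 - theta))."""
    return -theta ** k / (k * math.log(1.0 - theta))


def build_alias_table(probs):
    """Build the probability and alias arrays for O(1) sampling from probs."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]                       # keep column s with this probability
        alias[s] = l                              # otherwise redirect to column l
        scaled[l] = scaled[l] + scaled[s] - 1.0   # mass that column l has left
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:                       # leftovers are full columns
        prob[i] = 1.0
    return prob, alias


def alias_draw(prob, alias, rng):
    """Draw one index: pick a column uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Assumption: plain truncation at n_max with renormalization (not finitization).
theta, n_max = 0.5, 10
raw = [log_series_pmf(theta, k) for k in range(1, n_max + 1)]
total = sum(raw)
pmf = [p / total for p in raw]

prob, alias = build_alias_table(pmf)
rng = random.Random(42)
sample = [alias_draw(prob, alias, rng) + 1 for _ in range(5)]  # support is 1..n_max
print(sample)
```

Each draw costs one uniform integer and one uniform real regardless of the support size, which is the efficiency property that makes the alias method attractive once the support is finite.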

Shannon Peterson, Development of a Long Range Capacity and Purchasing Plan for a Manufacturing Environment, August 24, 2009 (Jeffrey Camm, Uday Rao)
Long range capacity planning is an essential part of business planning.  This can be complicated by seasonality of products, varying material pricing plans and supplier capacities, criticality and substitutions of raw materials, and multiple production sites and bills of materials.  This project develops a flexible tool that reveals an optimal, high-level long range production schedule and purchasing plan to satisfy customer demand and identify potential outages.

Ndanatsiwa Anne Chambati, Locating an Optimal Site for a New Natorp's Garden Center, August 21, 2009 (Michael Magazine, Uday Rao)
A well known aphorism states, "the most important attributes of stores are location, location and location". The area of research for optimal store location has grown rapidly in the last decade. Most of the research in this area has been undertaken by marketing researchers, urban geographers and economists with applied mathematicians recently entering the field. Applied mathematicians have become involved in the study of retail location theory through the development of algorithms and mathematical models applicable to location problems. At the mathematical level, the problem is abstract and exact, removed from the practical problems of the real estate developer or marketing expert.
Natorp is a family owned business that has been around since 1916. They currently have two Garden Center locations, a nursery, and landscaping services. They would like to open an additional Garden Center in the Ohio, Kentucky, and Indiana (OKI) region and need to know where the optimal location for it would be. First, we review the current literature on optimal store location, then look at the most important factors for Natorp to consider in the expansion. Next, we evaluate each of the 8 counties in the OKI region using a multi-factor site location rating system and come up with potential sites for the new Garden Center. These potential sites will be evaluated based on population projections over the next 30 years, median household income, median home value, and proximity to competitors.

Ashutosh Mhasekar, Application of Statistical Procedures to Target Specific Segments for Upgrading Marginally Sub-par Members to Rewards-eligible Level in a Retail Loyalty Environment, August 19, 2009 (Michael Magazine, Uday Rao, Marc Schulkers)
The retail industry has become extremely competitive, with loyalty programs constantly used to monitor customer behavior and engage customers for incremental sales and revenue. Retailer R runs a points-based loyalty program. Members can earn rewards certificates which are good towards future purchases. With the current economy and stiff competition, the Retailer is using targeted bonus offers to members that need additional points to earn a reward certificate. In this project we use various statistical tools to efficiently target members that need additional points to earn a reward certificate and to maximize certificate redemption, which results in incremental sales to the company. Also, a test and control group approach is employed to monitor and measure the incremental behavior and performance of this “Bonused” group during the promotional period and the post period as well. Using the targeted segmentation approach, an increase in redemption rate was noted. There was a significant increase in revenue during the promotional period, without impacting post-period sales.

Shaonan Tian, Data Sample Selection Issues for Bankruptcy Prediction, August 12, 2009 (Yan Yu, Martin S. Levy)
Bankruptcy prediction is of paramount interest to both academics and practitioners. This paper devotes special care to an important aspect of bankruptcy prediction modeling: the data sample selection issue. We first explore the effect of different data sample selection methods by comparing the out-of-sample predictive performances using a Monte Carlo simulation study under the logit regression model. The simulation study suggests that if forecasting the probability of bankruptcy is of interest, the complete data sampling technique provides more accurate results. However, if a binary bankruptcy decision or corporate rating is desired, the choice-based sampling technique may still be suitable. In particular, within the logit regression context, a simple remedy could be applied to justify the cut-off probability, such that the choice-based sampling technique and the complete data sampling technique display the same explanatory power in forecasting the bankruptcy classification. We also find that appropriate adjustment of the cut-off probability is complementary if taking into account different misclassifications. Finally, we contextualize the proposed recommendations by applying them to an updated bankruptcy database. We further investigate the effect of the different data selection methods on this corporate bankruptcy database with a non-linear classification method, Support Vector Machines (SVM), which has recently gained some popularity in applications.
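
One standard way to relate cut-off probabilities under choice-based and complete-data sampling in a logit model is the prior (intercept) correction: the slopes are unaffected by sampling on the outcome, and the odds shift by the ratio of the population odds of bankruptcy to the sample odds. The sketch below shows that textbook correction; it is offered as an illustration and is not necessarily the remedy used in the paper. The function names and the rates are hypothetical.

```python
def population_probability(p_sample, sample_rate, population_rate):
    """Map a probability fitted on a choice-based sample to the population scale.

    sample_rate is the share of bankrupt firms in the estimation sample;
    population_rate is the share of bankrupt firms in the full population.
    """
    odds = p_sample / (1.0 - p_sample)
    correction = (population_rate / (1.0 - population_rate)) / (
        sample_rate / (1.0 - sample_rate)
    )
    adjusted = odds * correction
    return adjusted / (1.0 + adjusted)


def sample_cutoff(cutoff_population, sample_rate, population_rate):
    """Cut-off on the choice-based scale equivalent to a population-scale cut-off."""
    # The inverse mapping swaps the roles of the two rates.
    return population_probability(cutoff_population, population_rate, sample_rate)


# Illustrative rates only: a matched sample with 50% bankrupt firms and a
# population in which 1% of firms go bankrupt.
p_pop = population_probability(0.5, sample_rate=0.5, population_rate=0.01)
cut = sample_cutoff(0.01, sample_rate=0.5, population_rate=0.01)
print(p_pop, cut)
```

Under these assumed rates, a fitted probability of 0.5 on the matched sample corresponds to 0.01 in the population, so classifying at 0.5 on the matched sample and at 0.01 on the complete data yields the same decisions.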

Xinhao Yao, Option Pricing: A Comparison Between Black-Scholes-Merton Model and Monte Carlo Simulation, August 7, 2009 (Martin S. Levy, Uday Rao)
An option, a kind of financial derivative, is a special contractual arrangement giving the owner the right to buy or sell an asset at a fixed price on a given date. In this project, we focus on a comparison between two option pricing methods: the Black-Scholes-Merton model and Monte Carlo simulation. The results from both methods can be considered equivalent, and an equivalence test is applied to determine the number of iterations of the Monte Carlo simulation. We also try some modifications of the Monte Carlo simulation to see how to improve the pricing method when rare events happen.
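
The comparison described above can be sketched for the simplest case. The code below prices a European call on a non-dividend-paying asset two ways: with the closed-form Black-Scholes-Merton formula and with a plain Monte Carlo simulation of the terminal price under geometric Brownian motion. The option type, the parameter values, and the function names are assumptions chosen for illustration; the project's own setup may differ.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(s0, k, r, sigma, t):
    """Black-Scholes-Merton price of a European call (no dividends)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def mc_call(s0, k, r, sigma, t, n_paths=200000, seed=1):
    """Monte Carlo price: discounted average payoff over simulated terminal prices."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))  # risk-neutral terminal price
        total += max(st - k, 0.0)                              # call payoff
    return math.exp(-r * t) * total / n_paths


# Illustrative parameters: at-the-money call, one year to expiry.
closed_form = bsm_call(100.0, 100.0, 0.05, 0.2, 1.0)
simulated = mc_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(closed_form, simulated)
```

The Monte Carlo standard error shrinks in proportion to one over the square root of the number of paths, which is why the number of iterations matters when judging whether the two prices are equivalent.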

Wei Huai, Bankruptcy Prediction: A Comparison between Simple Hazard Model and Logistic Regression Model, July 27, 2009 (Yan Yu, Uday Rao)
As a serious issue for both firms and individuals, bankruptcy has recently drawn increased attention from society, making its prediction an important topic. In this research project, two popular bankruptcy forecasting models, Shumway's (2001) simple hazard model and the logistic regression model, are studied and compared. Three different measures (decile ranking, area under the ROC curve, and the Hosmer-Lemeshow goodness-of-fit test) are implemented to evaluate and compare the bankruptcy forecasting results. The conclusion that the simple hazard model is superior to the logistic regression model in accuracy of bankruptcy forecasting is reconfirmed.

Mayur Bhat, Study of Uplift Modeling and Logistic Regression to increase ROI of Marketing Campaigns, June 5, 2009 (Uday Rao, Amitabh S. Raturi)
In this research project, we study a technique known as Uplift Modeling which uses control groups judiciously to measure the true lift in sales that a marketing campaign generates. In addition, Uplift Modeling proposes customer segmentation to achieve better campaign results by way of selective targeting. The results show how using test versus control groups helps in measuring true lift. We also demonstrate that selective targeting of customers using Uplift Modeling increases incremental revenue when compared to the existing alternative called Traditional Response Modeling. Logistic Regression, using categorical attitudinal data, is also used to further strengthen and complement the results seen from Uplift Modeling.

Venu Silvanose, Developing and Assessing a Multiple Logistic Regression Model on Mortgage Data to Determine the Association of Different Predictor Variables and Borrower Default, June 3, 2009 (Martin S. Levy, Norman Bruvold, Yan Yu)
The purpose of this paper is to develop and assess a logistic regression model to determine the association of different predictor variables and mortgage borrower default. In the current housing market, where none of the widely used models in the industry were able to predict with some certainty the high level of default by borrowers, models are still used albeit with a sense of extreme caution to identify good and bad credit risks. 

Manish Kumar, Intelligent Allocation of Safety Stock in Multi-item Inventory System to Increase Order Service Level and Order Fill Rate, June 3, 2009 (Amitabh S. Raturi, Michael J. Magazine)
In this study, we propose a model to establish safety stock in a multi-item inventory system to increase order fill rate and order service level based on the correlation between the demands of multiple products. A customer order to a multi-item inventory system consists of several different products in different quantities. The rate at which a manufacturer is able to fulfill the demand for all products in a customer's order within a specified time is termed the order fill rate (OFR), whereas the statistical picture of how successful the manufacturer is in fulfilling all orders completely by the required date is termed the order service level (OSL). The OFR and OSL are very important indices in measuring the performance of the manufacturer and customer satisfaction. We evaluated the order fill rate and order service level performance of the inventory system in a model in which the total customer order demand process is based on normally distributed but correlated demands. We show that if the safety stock level is adjusted in accordance with the level of correlation in product demand, both the order fill rate and order service level can be improved.
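
A small simulation can show why demand correlation matters for order-level service. The sketch below is a deliberately simplified stand-in for the model in the abstract: two products with standardized, jointly normal demands, the same safety factor for each, and an order counted as complete only when both demands are covered by stock. The two-product setup, the function name, and the parameter values are assumptions.

```python
import math
import random


def order_service_level(rho, safety_factor, n_orders=50000, seed=7):
    """Share of simulated orders for which both products are fully in stock.

    Demands are standardized (mean 0, standard deviation 1) with correlation rho;
    stock for each product sits safety_factor standard deviations above the mean.
    """
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho ** 2)
    complete = 0
    for _ in range(n_orders):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + root * rng.gauss(0.0, 1.0)  # correlated with z1
        if z1 <= safety_factor and z2 <= safety_factor:
            complete += 1
    return complete / n_orders


osl_independent = order_service_level(rho=0.0, safety_factor=1.0)
osl_correlated = order_service_level(rho=0.8, safety_factor=1.0)
print(osl_independent, osl_correlated)
```

With the same safety factor, positively correlated demands give a higher chance that both items are available together, so a safety stock rule that ignores correlation sets stock for a different order-level target than it intends.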

Larisa Vaysman, Quantifying the Impact of Draft Round on Draft Pick Quality Using Non-Parametric Median Comparison, June 2, 2009 (Michael Fry, Jeffrey Ohlmann, Geoff Smith) 
At the beginning of each season, NFL teams take turns selecting rookies to add to their rosters in a days-long process known as the NFL Draft. The NFL Draft consists of seven rounds. Since each team wants to have the strongest possible roster, players who are thought to have the potential to be outstanding are chosen early, and less desirable players are generally chosen later in the process or not at all. We seek to quantify the “cost,” in terms of player quality, that is incurred when a team chooses to wait until a later round to draft a player at a particular position. We also examine a number of position-specific metrics to measure player quality.

We use the Kruskal-Wallis test, a non-parametric comparison of medians, to determine which draft rounds are likely to offer picks of equivalent quality, and which draft rounds are likely to offer picks of significantly better or worse quality. Our analysis is meant to assist teams during the decision-making process of drafting players by quantifying the tradeoffs inherent in each potential decision.
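
For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic from pooled ranks: H = 12 / (N(N + 1)) * sum(R_j^2 / n_j) - 3(N + 1), where R_j is the rank sum of group j. It is a minimal version that assigns average ranks to ties but omits the tie correction factor, and the groups are arbitrary numbers rather than player-quality data. The function name is hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Positions i+1 .. j share one value; give them their average rank.
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    total = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


# Arbitrary illustrative groups, e.g. one quality metric for three draft rounds.
h_separated = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_mixed = kruskal_wallis_h([[1, 5, 9], [2, 6, 7], [3, 4, 8]])
print(h_separated, h_mixed)
```

Larger values of H indicate that the groups' rank distributions differ more; with reasonably large groups, H is compared against a chi-square distribution with one fewer degree of freedom than the number of groups.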

Michael D. Platt, Distribution Network Model Using Mixed Integer Programming and a Combination of Distribution Centers and Cross-Dock Terminals, June 2, 2009 (Jeffrey Camm, Michael Fry)
In an effort to reduce manufacturing costs, a company is considering moving its manufacturing facilities from the United States to Mexico. Though the facility costs and labor costs will be much lower at the Mexico facility, the company is concerned that the move could have an adverse effect on its transportation costs.

The goal of this project is to determine the distribution network that will result in the lowest transportation and material handling cost while maintaining desired customer service levels. Specifically, the project will focus on incorporating cross-docking terminals in the solution in conjunction with fully stocked distribution centers. At a cross-docking terminal, product is moved directly from a receiving dock to a shipping dock, spending very little time in the facility. This process eliminates the need to hold these finished goods in inventory, thus reducing inventory costs and material handling costs.

Taylor W. Barker III, The Expected Box Score Method: An Objective Method for NFL Power Rankings, May 29, 2009 (Martin S. Levy, Michael J. Magazine, co-chairs)
One of the more interesting pages on ESPN.com during the NFL season is the NFL “Power Rankings” that they compile each week. This is basically a ranking of the relative strengths (during that week) of the NFL teams based on the votes of several panel members (ESPN.com NFL writers/bloggers). While the results take into account the subjective rankings of each of the panel members, it would be interesting to see if there is an "objective" method to develop weekly power rankings, based on current season statistics to date.

An objective method for weekly power rankings is found through a process I have named the Expected Box Score (EBS) Method. The EBS Method determines expected box scores between two teams with a given venue based on current season data and then plugs them into a linear regression model based on 20 years of data to get a current estimated point differential between the two teams. This process is repeated for every team playing against every other team exactly twice (once at home and once away), and the results are used to determine how many of those games each team would be expected to win. The team with the most wins is ranked #1 and so on. Shortcomings of other methods are addressed and then considered in the development of the EBS method. Validation for this method is provided via comparisons with Las Vegas point spreads and NFL.com Power Rankings.

Lori Mueller, Norwood Fire-Department Simulation Models: Present and Future, May 28, 2009 (David Kelton, Jeffrey Camm) 
The Norwood Fire Department (NFD) currently operates one fire station, serving approximately 22,000 people. In 2008, the NFD made approximately 4,400 runs, which averages to about 12 runs per day. With an increase in retail and business development in the city, there has been a subsequent increase in the number of emergencies the department responds to each year. If the development in the city continues over the next few years, the NFD will have to grow along with the city.

The NFD has a few options for expansion. One option is to open a second fire station at a location currently owned by the city, which used to be the Norwood fire station before a new station was opened at its current location. Another option is to expand their current station, which is located near the geographical center of the city, so that they could increase the amount of equipment and firefighters. Using simulation modeling, these different options were explored to determine which option is best for Norwood, when the time comes for expansion.

Vinod Iyengar, Call Volume Forecasting and Call Center Staffing for a Financial Services Firm, March 13, 2009 (Uday Rao, Martin S. Levy)
 
In this project, we use statistics and data analytics to build scalable and robust models for call center forecasting and staffing. The core of the problem involves predicting call volumes with lead times of a few months, when conditions are dynamic and there is high variability with multiple types of calls. We use data from a US-based prepaid debit card vendor with two types of calls: application calls and customer service calls. We predict application calls using a model of historical effectiveness of marketing dollars and incorporate data on card activation history and customer attrition. We predict customer service calls from active cardholders using time series analysis and regression to capture trend, seasonality, and cyclicity. Call volume predictions are then input into a stochastic newsvendor model to set a staffing level that effectively trades off staffing costs with lost-sales penalty costs for unsatisfied calls. The impact of different staffing level choices on expected costs is explored by simulating call center volume. Performance improvement resulting from this work includes more accurate forecasts with increased service levels and agent occupancy.
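
The newsvendor step in the abstract above has a compact closed form when call volume is treated as normally distributed: staff to the demand quantile at the critical ratio c_u / (c_u + c_o), where c_u is the penalty per unserved call and c_o is the cost per unit of unused capacity. The sketch below assumes normal demand and made-up costs; the project's stochastic model and cost structure may well differ, and the function name is hypothetical.

```python
from statistics import NormalDist


def newsvendor_staffing(mean_calls, sd_calls, underage_cost, overage_cost):
    """Capacity (in calls) that balances lost-call penalties against idle staffing."""
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    # Assumption: call volume is normally distributed around the forecast.
    return NormalDist(mean_calls, sd_calls).inv_cdf(critical_ratio)


# Illustrative numbers only: forecast of 1,000 calls with a standard deviation of 100.
balanced = newsvendor_staffing(1000.0, 100.0, underage_cost=5.0, overage_cost=5.0)
cautious = newsvendor_staffing(1000.0, 100.0, underage_cost=15.0, overage_cost=5.0)
print(balanced, cautious)
```

When an unserved call costs more than idle capacity, the critical ratio rises above one half and the recommended capacity moves above the forecast mean.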

Lei Yu, A Comparison of Portfolio Optimization Models, March 13, 2009 (Martin S. Levy, Uday Rao) 
Applications of portfolio optimization models have developed rapidly. One issue is determining which model should be followed as a guide for investors to make an informed portfolio decision. In this paper, five optimization models: classical Markowitz model, MiniMax, Gini's Mean Difference, Mean Absolute Deviation, and Minimizing Conditional Value-at-Risk, are presented and compared. Solutions generated by different models applied to the same data sets provide insights for investors. The data sets employed include real world data and simulated data. MATLAB, VBA (Excel as host), and COIN-OR software were employed. Some observations about alternative selection, similarities, and discrepancies among these models are found and described.

Moumita Hanra, Assessing ultimate impact of Brand Communication on market share using Path Models and its comparison to Ridge regression, March 12, 2009 (Martin S. Levy, Uday Rao)
Path modeling, based on structural equation modeling, is a widely used technique in the market research industry to analyze interrelationships between various measures and to determine which ones are really significant in driving sales. In this study, the objective is to find the best-fitting path model to assess which attributes are really important to a consumer in terms of sales, using respondent-level survey data. This model would also predict the best media sources on which companies should focus their brand advertising to gain maximum public awareness of the brand, how this awareness drives the way one thinks about the brand in different dimensions, and its effect in turn in driving sales. The second half of this study compares the results of the path model to ridge regression to assess which model yields a better fit and more intuitive results. Ridge regression mitigates the effect of multicollinearity among independent variables by modifying the X'X matrix used in ordinary least squares regression using a ridge control parameter. The results indicate that the path model gives a much better fit than ridge regression, especially when multicollinearity is not extreme.
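
The ridge modification of X'X mentioned above amounts to solving (X'X + kI) b = X'y instead of X'X b = X'y, where k is the ridge control parameter. The sketch below does this by hand for two centered predictors with no intercept, so that it needs no linear algebra library. The data are invented to be nearly collinear, and the function name is hypothetical.

```python
import math


def ridge_two_predictors(x1, x2, y, k):
    """Solve (X'X + kI) b = X'y for two centered predictors (no intercept)."""
    a11 = sum(v * v for v in x1) + k          # x1'x1 plus ridge parameter
    a22 = sum(v * v for v in x2) + k          # x2'x2 plus ridge parameter
    a12 = sum(u * v for u, v in zip(x1, x2))  # x1'x2
    g1 = sum(u * v for u, v in zip(x1, y))    # x1'y
    g2 = sum(u * v for u, v in zip(x2, y))    # x2'y
    det = a11 * a22 - a12 * a12
    return ((a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det)


# Invented, nearly collinear data (all three series sum to zero).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.2, 1.8, 4.1]

b_ols = ridge_two_predictors(x1, x2, y, k=0.0)    # k = 0 is ordinary least squares
b_ridge = ridge_two_predictors(x1, x2, y, k=1.0)
print(b_ols, b_ridge)
print(math.hypot(*b_ols), math.hypot(*b_ridge))
```

Adding k to the diagonal moves the matrix away from singularity, which stabilizes the coefficients at the price of shrinking them toward zero.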

Man Xu, Forecasting Default: A comparison between Merton Model and Logistic Model, March 11, 2009 (Yan Yu, Uday Rao) 
The Merton default model, which is based on Merton's (1974) bond pricing model, has been widely used in both academic research and industry to forecast bankruptcy. This work reexamines the Merton default model, as well as the relationship of default risk with equity returns and the firm size effect, using an updated database covering 1986 to 2006 obtained from Compustat and CRSP. We concur with most of the findings in Vassalou and Xing (2003). We find that both default risk and size have an impact on equity returns. The highest returns come with the smallest firms with the highest default risk.

We then focus on the comparison between the Merton model (a financial model) and a logistic regression model (a statistical model) for default forecasting. We compare the default likelihood indicator (DLI) from the Merton model with the estimated default probability from the logistic model using rank correlation and decile rankings based on out-of-sample prediction. We find that the functional form of the Merton model is very useful in determining default. The structure of the Merton model captures important aspects of default probability. However, if bankruptcy forecasting is desired, our empirical results show that the logistic model seems to provide a better prediction. We also add distance to default (DD) from the Merton model as a covariate in our best logistic model and find that it is not a significant predictor.
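
The two Merton quantities named above have simple closed forms once the firm's asset value and asset volatility are known: DD = [ln(V / D) + (mu - sigma^2 / 2) T] / (sigma sqrt(T)) and DLI = N(-DD), with N the standard normal distribution function. The sketch below takes asset value, debt, drift, and volatility as given inputs, which is a strong simplification: in practice asset value and volatility are unobserved and must be backed out from equity data. The function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def distance_to_default(asset_value, debt, mu, sigma, horizon=1.0):
    """Number of standard deviations by which expected log assets exceed log debt."""
    return (
        math.log(asset_value / debt) + (mu - 0.5 * sigma ** 2) * horizon
    ) / (sigma * math.sqrt(horizon))


def default_likelihood_indicator(asset_value, debt, mu, sigma, horizon=1.0):
    """DLI = N(-DD): model-implied chance that assets fall below debt at the horizon."""
    dd = distance_to_default(asset_value, debt, mu, sigma, horizon)
    return NormalDist().cdf(-dd)


# Illustrative firms only: same drift and volatility, different leverage.
dli_safe = default_likelihood_indicator(200.0, 100.0, mu=0.05, sigma=0.25)
dli_risky = default_likelihood_indicator(110.0, 100.0, mu=0.05, sigma=0.25)
print(dli_safe, dli_risky)
```

Because the DLI is a monotone function of DD, ranking firms by either quantity gives the same ordering, which is what rank-based comparisons with another model's probabilities rely on.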

Luke Robert Chapman, A Current Review of Electronic Medical Records, March 11, 2009 (Michael Magazine, Craig Froehle) 
In this project, we research the imminent installation of Electronic Medical Records (EMR) in all hospitals and clinics throughout the United States. This project was motivated by our interaction with the Cincinnati Department of Health (CDH) via a project that focused on persuading the Cincinnati council that EMR should be immediately invested in at all six of the CDH clinics. We review the advantages of EMR and also recognize the disadvantages, some of which were overlooked in the original project with CDH. The current growth of Electronic Medical Records in the US and what the future holds for EMR is reviewed. The main analysis will review the claim that EMR helps to reduce medical errors. The analysis will use multivariate techniques such as factor and cluster analysis.

Chetan Vispute, Improving a Debt-Collection Process by Simulation, March 9, 2009 (David W. Kelton, Norman T. Bruvold)
The Auto-Search Process is an automated business process flow that has been designed by Sallie Mae for its in-house collection agency; it works sequentially to procure good phone numbers of delinquent borrowers. The process involves outsourcing of data to private vendors wherein the failed data from one vendor are sent to the next vendor until we have tested against all. Also, the process is governed by time-related business rules that allow the data to be sent to the next vendor only after a certain period. Keeping the cheapest vendor first, the process aims at reducing the cost while increasing the procurement of good phone numbers. Before this process could go live, the analytical team was required to analyze the process by building a time-related model and to make recommendations. This thesis explores the building of this time-based model using dynamic discrete-event simulation with Arena, and then discusses the findings and recommendations developed while working on the project, which helped the company improve its annual revenue position by over $440,000.

Cary Wise, Cincinnati Children's Hospital Block-Schedule Optimization, February 10, 2009 (Kipp Martin, Craig Froehle, Michael J. Magazine)
Cincinnati Children's Hospital is implementing an automated process to schedule clinical and surgical patient visits. The goal is to create a program that allocates operating rooms to requests submitted by individual doctors for clinical time and surgical time. The schedule creation process takes place in two phases: the first phase schedules spaces for specialties (Ortho, Cardio, etc.); the second phase allocates doctors to the specialty schedule. The program that generates the specialty allocation is named the Space Request Feasibility Solver (SRFS). The inputs of the SRFS are a set of specialty requests and information about the operating rooms; the output is the schedule of specialty assignments. The problem is formulated as a mixed-integer linear program (MILP) that minimizes the number of unfulfilled spaces requested. A very large number of potential assignments may be generated depending on whether the request parameters are very specific or general. Indeed, the instance quickly becomes intractable for a realistic problem. We implement a branch-and-price column generation algorithm to overcome the problem of an intractable number of variables. The SRFS invokes a COIN-OR solver named “bcp” to perform the procedures of branching, solving the LP at each node, and managing the search tree. The scope of this master's project is to implement a column generation scheme in the SRFS. Testing of the SRFS was performed by verifying that the column selected had the minimum reduced cost, and by verifying the results of the LP relaxation and IP against the solution of the exhaustive enumeration of all columns. The performance of the SRFS in terms of the number of columns and nodes created to arrive at a solution was also investigated.

F. Alan Shukairy, NFL Fourth Down Decision Making: 2002 - 2007, November 21, 2008 (Dr. Michael J. Fry, James A. Deddens, Richard M. Males)
This paper uses categorical data analysis and logistic regression to explore National Football League (NFL) fourth-down decision making using data from the 2002 through 2007 seasons. The focus of the analysis is on game situations where Romer's (2006) dynamic programming model predicts that teams should go for it. The likelihood of going for it on fourth down is examined, including factors such as game time, score differential, yards to go, and field position. The impact of going for it on game outcome is also reviewed. Conversion rates and play calling for both third and fourth downs are examined. The impact of home field advantage and momentum, the latter defined as an increase in the probability of scoring after a successful fourth-down conversion, are also considered. Results indicate that teams deviate from Romer's optimal policies. We also find that teams employ play calling that appears contrary to that which would maximize the likelihood of a successful conversion. We find that the home field advantage is real but that the home team's advantage decreases as the game goes on. We also find some momentum benefits with fourth-down conversions.

Fred Ahrens, A Build to Forecast Model using Real Options, September 26, 2008 (Amitabh S. Raturi, Jeffrey D. Camm)
The Build to Forecast (BTF) production strategy described by Meredith, Camm, Raturi, et al., is a response mechanism to the divergent requirements of a long lead time product with high customization and a short customer accepted lead time. The BTF model, developed in the 1990s, addresses this challenge by initiating production of a product prior to receipt of actual sales order functional requirements. The 'Build to Forecast with Real Options' strategy proposes to achieve the same objective, while also increasing engineering and procurement flexibility, by delaying fundamental design intent decisions. Using the original BTF model as an inspiration, the new model defers both component allocation and final design configuration until late in the build cycle. This is achieved by using 'real options', an adaptation of the investment concept to an operational environment. Options enable, but do not obligate, the selection of a product design parameter at a later point in time.
While the original BTF model selects components based on a forecast then attempts to match product to sales orders, the new model only selects options based on a forecast that allow components (if the option was enabled) to be added later. The specific component would be selected based on an actual sales order (if its option was enabled), as in a traditional Make to Order model. This paper describes the original BTF model and studies the new BTF with Real Options concept.

Linda Kay Kromer, Design and Development of a Data Acquisition Application for CCHMC Scheduling Optimization, August 25, 2008 (Michael Magazine, Kipp Martin)
Rising health care costs continue to increase the demand for greater efficiency, creating tighter constraints on physical and human resources. The Cincinnati Children's Hospital Medical Center (CCHMC) is trying to address its existing scheduling inefficiencies while also addressing the additional resource demands of a new satellite location. CCHMC currently uses a manual scheduling method based on legacy schedules, believed to be optimal, and maintained by each distinct specialty. As part of an ongoing project with UC MSQA faculty, several students have made attempts at optimizing the scheduling process across all locations and specialties. However, the only request data available are for the new location. All other data consist of schedules, which do not allow for optimization. The problem addressed here is the acquisition of request data, allowing for optimization. The data obtained will include requests by each specialty for clinic space and surgical space, as well as doctors' individual requests. These data will provide the opportunity for feasible, and possibly optimal, allotment of physical and human resources. Further, this application must be user-friendly and PC-based in order to get administrative buy-in.

Weiqun Wu, An Empirical Study of Corporate Bankruptcy Prediction Using Hazard Models, August 25, 2008 (Yan Yu, Martin S. Levy)
This project investigates proportional hazard model approaches to corporate bankruptcy prediction using multi-period accounting data. One of the critical issues in the use of bankruptcy prediction models is poor out-of-sample forecasting accuracy, especially when the bankruptcy rate is extremely low. Recently developed corporate bankruptcy prediction models adopt Cox proportional hazard analysis to create dynamic models that incorporate time-dependent covariates. Shumway (2001) developed a simple hazard model; by using logit estimation to calculate maximum likelihood estimates, the hazard model can be interpreted either as a logit model done by firm-year or as a discrete accelerated failure-time model. In this project, we investigate US IT market data, obtained yearly from the COMPUSTAT database from 1986 to 2006. We find that both the traditional Cox proportional hazard model and Shumway's hazard model estimated with the logistic regression approach perform well with time-dependent covariates in dynamic models and yield almost exactly the same estimated coefficients. Shumway's method performs well in forecasting out-of-sample data. Incorporating categorical variables in the Cox model and a baseline function in Shumway's hazard model are also explored.

Luchan Byrd III, Statistical Analysis of Testing and Production and Yield-To-Complete Data for Reactors at SUMCO, August 25, 2008 (Uday S. Rao, James Evans)
SUMCO Phoenix Corporation (SUMCO) manufactures electronic-grade silicon wafers for the semiconductor industry and employs about 1,500 people at three manufacturing facilities in the United States. They are currently analyzing components of enhanced production planning systems at the Cincinnati facility to optimize the scheduling of their Reactors – high value assets used to deposit a thin film of material on silicon wafers. The scheduling problem is multifaceted. Dependencies exist based on relationships of part number (finished product type) to reactor type, reactor model, reactor capability, availability of material, qualification process, and several other factors.

This project focuses on the analysis of recent testing and production data to help determine drivers for delivery performance. More specifically, batch testing and production data are analyzed using descriptive statistics to determine if there are differences in performance between different reactors, product type, work shifts, testing reason codes, etc. Additional analysis was performed to determine yield-to-complete (YTC) statistics for each stage of production. Due to lack of normality in the data, the nonparametric analogue to ANOVA, the Kruskal-Wallis test, is used to determine if certain differences identified are statistically significant. Results indicate that different reactors show significant differences in the amounts of finished product and scrapped parts. Knowledge of these differences and how to schedule particular product types (parts) with the reactors that are most efficient at producing them can lead to improved machine maintenance, resulting in decreased scrap and increased cost savings.

Andrew Faehnle, Estimation of Radiology Patient Wait Times, June 3, 2008 (Craig Froehle, Michael J. Magazine) 
Making accurate estimates of patient waiting times within the Radiology queue at Cincinnati Children's Hospital is a non-trivial exercise. Herein three different methods for predicting the time a patient will wait are investigated: a heuristic “Simple Algorithm”, linear regression, and iterated logistic regression. We find that while the iterated logistic regression performs the best of the approaches tested, the performance of the approaches depends on the homogeneity of the dataset.

Zhiyuan Dong, A Matrix Approach for Comparing Estimates of a Population Total Under a Many-to-Many Frame Structure, April 11, 2008 (Martin S. Levy, Yan Yu) 
We propose a matrix approach for comparing estimates of a population total under a many-to-many structure, an improved method for calculating the second-order inclusion probability of the Horvitz-Thompson method in this many-to-many structure context, an improved method for characterizing the eigenstructure of the Arc-Weight method, and a Mathematica-based package for doing the corresponding analysis.

Praveen Singaraju, How the Plant Closing Announcement Affects the Stock Price of a Firm, March 31, 2008 (Amitabh S. Raturi, Uday S. Rao) 
Plant closings are widespread throughout the US economy. The affected businesses are not limited by industry, size, or any other factor. This work tries to understand the impact of plant closing announcements on the stock market. We propose that there are two antithetical perceptions of a plant closing announcement. Sometimes the market sees it as positive news and sometimes as negative. The results tend to support an overall negative reaction on the stock market; at the same time, firms that experience a positive effect possess certain identifiable characteristics. We find that all the inclines are associated with optimistic announcements and all declines with pessimistic announcements. By examining the quarterly financial statements of all companies, we identified the variables that best discriminate between the inclines and declines. The results validate the argument that there are indeed two types of plant closings.

Sangeetha Mallya, Applied Bayesian Forecasting of U.S. Medicaid Program Expenditure on Antidepressant, March 4, 2008 (Martin S. Levy, Jeff J. Guo, Christina Kelton)
Expenditure on mental health drugs, especially on prescription medicine for depression, has been on a steady rise. Depression is among the most prevalent major mental disorders today, with about 10% of the US population suffering from depression. The Social Security Act established Medicaid as a jointly funded, Federal-State health insurance program. Medicaid plays a fundamental role in the provision of prescription drugs to over 42 million low-income and disabled beneficiaries. The state Medicaid programs spent altogether approximately $2 billion on antidepressant drugs in the US in 2005, across three categories of antidepressants.

To better understand this spending and to safeguard the Medicaid program from excessive expenditure on mental health drugs, state-of-the-art forecasting models can be of great aid. Here, we focus on exploring, building and interpreting forecasting models for Medicaid's expenditure using applied Bayesian modeling methodology. The synthesis of the routine model output with dynamic assimilation of external information is the centerpiece of Bayesian forecasting. Further, a comparative assessment of the forecasts is performed with prior results from classical time-series models. The results from these forecasting processes can be leveraged by Medicaid for research, planning, optimization and inferential purposes.

Rudranil Manna, Development of a Predictive Model for Food Consumption in USA, March 3, 2008 (Norman Bruvold, David Rogers)
The accuracy of the prediction of a household's expenditure on food is a major concern for retailers and manufacturers engaged in food-marketing campaigns. The purpose here is to develop a model to predict household spending in the major food categories, based on geographic location and household demographics. The modeling is done with "consumer expenditure diary survey data" obtained from the public domain of the US Department of Labor. A mixed modeling methodology is adopted, which includes a mixture of the fixed effects of the socio-economic characteristics of the household and random effects in the form of household-specific intercepts. This model takes into account the correlation between the household expenditures for the different food categories. Finally, the model predictions are benchmarked against a univariate tobit regression model, widely available in the literature for similar predictions of household food consumption.

Qiuhong Zhang, Empirical Verification of Optimal-Portfolio-Based Foreign Exchange Rate Theory, February 29, 2008 (Srdjan Stojanovic, Yan Yu) 
The recent optimal-portfolio-based Foreign Exchange Rate theory, introduced by S. Stojanovic in Foreign exchange rates, is implemented and verified using the market data for the economies of Canada, Japan, the UK, and the US. The key parameter in the implemented theory is the market (relative) risk aversion parameter (or the market sentiment). Therefore, one of the main goals of this empirical study was to estimate the value of the relative risk aversion parameter for the pairs of the considered economies, and to conclude whether it has the same or a similar value for all of them. Finally, the statistical hypothesis on whether the Foreign Exchange Rate data conform to the theoretical model is tested as well.

Feng Yu, A simple discrete-time hazard model for forecasting bankruptcy in construction companies, December 19, 2007 (Martin S. Levy, Jeffrey Camm, Uday Rao)
The construction industry has played a powerful role in sustaining economic growth and helping the recovery. This industry is inherently very fragile and extremely risky, and the failure of construction firms has had a serious impact on the economy and society. Consequently, the prediction of the failure of construction firms is essential not only for the economy, but also for society. To date, many bankruptcy prediction models have been developed to predict the probability of failure of construction firms based on company financial information and economic information. However, these models have limitations and disadvantages, which are reviewed in this study. There is a need to develop prediction models capable of forecasting long-term failure for construction firms of different sizes. In this study, a discrete-time hazard model is proposed to predict the probability of bankruptcy for construction firms over a long time frame. The research is based on a statistical analysis of good and bankrupt construction firms and related financial and economic data over a time frame of about 10 years. A prediction model using survival analysis is developed through this study.

Balkrishna Apte, Worldwide Desktop Computer Supply Chain Complexity and Performance Models for the Hewlett Packard Company, November 30, 2007 (David Rogers, Amitabh Raturi, Michael Stephenson)
This project quantifies supply chain complexity for different business regions across the world for the personal computer desktop business, and examines its correlation with supply chain performance parameters. Regional supply chain performance is consolidated and quantified with parameters for order cycle time, forecast accuracy, inventory cost, excess, and/or obsolescence. Statistical techniques are utilized to determine if there is a correlation between product line complexity and key supply chain performance measures. Statistical models indicate the impact of change in supply chain complexity on various supply chain performance parameters. Results provide guidelines for management for determining the impact of product line complexity on various supply chain performance measures and ultimately upon profit. Changes to decisions about offering additional products, informed by the impact of complexity, are proposed.

Jeremy Scheidt, Clinical and Surgical Scheduling Across Multiple Facilities Using Integer Linear Programming, November 28, 2007 (Michael Magazine, Craig Froehle, Jeffrey Camm)
Rising health care costs are a complicated issue. Health care organizations must delicately balance scarce resources against high standards for care and service. Large-scale operations can have several advantages for efficiency and service, but the coordination of so many resources using manual methods can be cumbersome and time-consuming, and carries a risk of being less than optimal. Scheduling doctors at several facilities in a metropolitan area is an example of such a problem. Cincinnati Children's Hospital has several locations that share many resources, such as doctors and administrators. The problem considered here is how to efficiently coordinate the scheduling of doctors at various facilities for consistency in quality and service while minimizing the already heavy demands on personnel. The proposed model uses integer programming to choose the schedule that best meets the multiple objectives of a good schedule in this situation. It handles a wide variety of scheduling requests in an automated manner that reduces manual work, minimizes the number of schedule requests that cannot be met, minimizes the travel between facilities, minimizes the changes required to accommodate ongoing schedule updates, and provides a consistent space for each doctor to use.

Feng Ji, An Introduction to Credibility Theory With An Actuarial Frequency Case Study, November 21, 2007 (Martin S. Levy, Jeffrey Camm, Yan Yu)
Credibility theory is a set of quantitative tools that allows an insurer to perform prospective experience rating (adjusting future premiums based on past experience) on a risk or group of risks. There is a manual rate which is designed to reflect the expected experience of the entire rating class and implicitly assumes that the risks are homogeneous. However, no rating system is perfect, and there always remains some heterogeneity in the risk levels after all the underwriting criteria are accounted for. Credibility theory provides models that are a compromise between the historical observations and the manual rate, and thereby a more credible premium. In this paper, three classic credibility approaches are discussed: Bayesian methodology, Bühlmann credibility, and non-parametric empirical credibility. A case study with real claims experience from Humana Inc. then shows that credibility premiums outperform either the manual rate or the estimate based on the historical observations.

Yanping Chen, A Case Study on the Linear Modeling Fitting with Outlier, November 14, 2007 (Martin S. Levy, Norman Bruvold, Jeffrey Camm)
In the application of ANOVA for hypothesis testing, assumptions such as the homogeneity of errors or normality are often violated because of scale effects, design of the experiments, outliers, and the nature of the measurements. This experiment deals with the design and statistical analysis of the balance control capability of obese workers. Functional Reach (FR) is a measure of how far a person can reach without losing balance. The hypothesis is that obese workers, because of their larger body mass, may not be able to reach as far as non-obese people without losing body balance. In addition to obesity level (obese and non-obese), gender is chosen as another primary factor in the hypothesis testing. However, the plots of the residuals arising from fitting the 2x2 ANOVA show heteroscedasticity due to the fact that one subject seems to be an outlier. Remedial measures are applied in the project to cure the heteroscedasticity, such as removal of the apparent outlier; log, square root, inverse, and Box-Cox transformations; evaluation of the model adequacy and inadequacy; verification; and the Rank ANOVA. The consequences of these techniques are compared, and the ANCOVA model succeeds in reducing the variance and removing the heteroscedasticity for the hypothesis testing.

Yann Ferrand, Forecasting U.S. Medicaid Program Expenditures on Antidepressant Drugs, November 14, 2007 (Christina Kelton, Jeffrey Guo, Martin S. Levy, Yan Yu)
Healthcare costs and drug prices have been on the rise, and the state Medicaid programs spent altogether approximately $2 billion on antidepressant drugs in 2005. Our goal is to build forecasting models that can be used to predict U.S. Medicaid's future spending on antidepressants. We gather quarterly data (1991-2004, Centers for Medicare & Medicaid Services) on Medicaid national antidepressant expenditure. We use Box-Jenkins forecasting techniques on expenditure time series for specific antidepressants including Prozac®, Zoloft®, Wellbutrin®, Paxil®, Effexor®, and amitriptyline. Intervention analysis is used to determine the effects of patent expiration, new branded-drug entry, and new indication approval. Forecasts are computed and compared to a holdout sample, comprised of the 2005 data, to assess the performance of the models. The Prozac® and Paxil® models incorporate an intervention term corresponding to patent expiration. The model for Wellbutrin® has a pulse with decay intervention term for the increase in Direct-to-Consumer advertising. The model for Zoloft® has an autoregressive factor, and for Effexor® both an autoregressive and a moving average factor. For amitriptyline, the final model is a random walk. Maximum likelihood was used for estimations. Usual checks on the residuals proved to be satisfactory. We find that the drugs studied are affected differently by generic entry. We found no effect of either new branded-drug entry or newly approved indications.

Claudia Rosales, Optimal Inbound Trailer Allocation at a Crossdock - Optimizing Operations and Balancing Workload, August 29, 2007 (Michael Fry, Jeffrey Camm, Rajesh Radhakrishnan)
Transfreight, LLC is a third-party logistics provider that supports Toyota's lean manufacturing operations in North America. Our work provides the optimal allocation of inbound trailers to docks at a crossdocking facility operated by Transfreight. We focus on improving the efficiency of operations as well as balancing workload among crossdock workers. We compare two different implementation tools for our models: a spreadsheet-based solver and CPLEX. Since 2006, Transfreight has successfully used our implementation model for its inbound trailer assignments, leading to considerable cost savings and growth opportunities.

Bhaskar Narayanaswamy, Impact of Interruption and Forgetting in a Knowledge-Intensive Environment on Productivity, August 21, 2007 (Craig Froehle, Jeffrey Camm, Uday Rao)
With the rise of telephone, email, and ubiquitous connectivity, one increasingly common barrier to productivity in professional and knowledge-intensive environments is interruptions. Interruptions cause stoppage of the current task and often induce forgetting on the part of the worker. Beyond the direct delay caused by the interruption, the induced forgetting also causes rework; in order to complete the interrupted task, additional effort and time is required to return to the same level of task-specific knowledge the worker had attained prior to the interruption. Together, these phenomena – interruptions, forgetting, and rework – create significant barriers to productivity in knowledge-intensive work environments. In service environments, interruptions pose an especially significant problem due to the “interruption conundrum” of facing negative consequences from both ignoring and accommodating interruptions. When customer relationships are damaged by both addressing and ignoring a potential interruption, there is no obvious best recourse. This research employs observational and process data gathered from a hospital radiology department as inputs into a simulation model in order to better understand the impact of interruptions, forgetting, and rework. To help mitigate the deleterious effect of interruption-induced rework, we introduce and test the operational policy of sequestering, where one of the service resources is protected from interruptions. Our results suggest two key conclusions. First, sequestering can improve overall productivity and cost performance of the system, but the decision to implement a sequestering policy must consider the costs associated with delaying both interruptions and production work as well as the forgetting rate of the system's human workers. Second, if interruption-induced forgetting is not explicitly considered, the model's results tend to substantially underestimate the benefits of a sequestering policy.

Hsin-Chih Kao, Asymmetric-Response Study among Stock Markets of South Korea, Japan, China, and the US, July 9, 2007 (Martin S. Levy, Norman Bruvold, David Kelton, Weihong Song)
This project investigates whether asymmetric responses exist among stock-price indices of South Korea, Japan, and China. Magnitude asymmetry and pattern asymmetry are the two main foci of the project and are tested by using regression analysis and vector autoregression (VAR) models, respectively. The main findings are as follows. First, magnitude asymmetry exists as the Japanese index affects the South Korean index. Second, by analyzing impulse response functions derived from VAR models, we find that pattern asymmetry exists among the three Asian stock indices. When the possible US effect is accounted for in the analysis, the results show that the movement of index returns of US stocks influences those of South Korea and Japan, but not that of China.

Hua Zou, Developing a Predictive Model for Targeting Potential Donors: Application of Logistic Regression, Classification Trees, and Support Vector Machines in Analysis of Responses to Direct Mailing, May 29, 2007 (Yan Yu, Martin S. Levy, David Kelton)

Direct-mail campaigns are employed as a core marketing strategy by various organizations, from catalogue-order companies and direct retailers to credit-card and insurance institutions.  As the response of a given random selection of prospects is uncertain, many data-mining techniques are used to target good prospects and improve the likelihood of response.  In this study, we compare the performance of models built with binary logistic regression, classification trees, and support vector machines (SVMs), and show that lift and gain tables are better than ROC curves and areas under the curve (AUC) for identifying the optimal model and selecting the target size, because they take profit and gain into account.  Finally, support vector machines stand out from the other classification algorithms for understanding customer behavior and maximizing profit in this case.

Raja Nooti, Analyzing Search-Engine Server Patterns, May 25, 2007 (David Kelton, Jeffrey Camm, Uday Rao)
This paper deals with resequencing of server patterns in a search engine with the objective to increase resource utilization and decrease the time taken per query in the search process.  A query is a request for information from a database.  A server is a computer that holds information and responds to requests for information from it (based on the query).  Server patterns refer to the allocation of queries to servers based on query type or frequency.  This problem is motivated by the highly competitive search-engine market where each second saved is massive, and there are many potential ways to improve the search process.  A base search-engine model is simulated in Arena with a real-world time distribution input to reflect the current search engines' server patterns.  Real times are obtained from AOL search logs to develop the model as accurately as possible.  Building on this base, an alternate remodeled model is developed incorporating logical constraints on query flow within the model to improve the resource utilization and reduce time taken per search.  In addition to proving to be amenable to implementation, this remodeled scenario has several significant advantages over the base scenario, all of which are analyzed.  Furthermore, a new model is developed and analyzed that features the enhancements possible and is proved to be more effective than the remodeled scenario.

Guoxiang Xu, What Factors Explain Investor Sentiment?, March 1, 2007 (Brian Hatch, Martin Levy, David Kelton)

The sentiment index recently reported by Baker and Wurgler (2006) reveals dramatic cross-sectional performance patterns in stock returns based on a variety of factors such as Firm Size (ME), Earnings-book Ratio (E/BE), Book-to-Market Ratio (BE/ME), and Sales Growth (GS).  When the sentiment index is negative, the subsequent returns are relatively high on small stocks, young stocks, high volatility stocks, unprofitable stocks, non-dividend-paying stocks, extreme-growth stocks, and distressed stocks.  Because the sentiment index has some ability to forecast stock returns, it would be valuable to know if there are any factors that explain this sentiment index.  Initial efforts reveal that macro-economic factors have little correlation with this sentiment index; however lagged equal-weighted stock index (EW) returns have a strong correlation.  Equal-weighted stock index returns annualized from the previous six years (EWP6Y) explain a majority of the variance of the sentiment index.  I discuss two possible explanations for this phenomenon, the business cycle and how fund management is evaluated.  For further investigation, I used a Hodrick-Prescott Filter to decompose the sentiment index into a general trend and the deviation from the trend.  My analyses reveal that the trend and the deviation are composed of different groups of the six variables initially used to synthesize the sentiment index.  Logistic regression reveals that EWP6Y has strong predictive power on the sign of the sentiment index.

Prido Lumbantoruan, Univariate and Multivariate Time Series Modeling Application on the Unseasonally Adjusted US Index of Industrial Production, December 8, 2006 (Martin Levy, Norman Bruvold, David Kelton)
The Index of Industrial Production is an important indicator that can be used as a barometer of a country's economic level.  In this project we used three monthly economic series, the S&P 500 index (SP), the Unemployment Rate (UR), and the Money Stock Measures (M2), as the input series to model the Index of Industrial Production (IP).  Two multivariate frameworks, a dynamic regression with transfer function and a multiequation time-series model, were built to model the Industrial Production Index.  Dynamic regression and multiequation time-series models are immensely useful in examining the relationship of past values from multiple time series with each other.  Additionally, a univariate time-series model was examined and built using the Box-Jenkins method as a baseline model for comparison with the multivariate models' results.

Jonathan Healey, A New Model for the Cost- and Priority-Based Carrier-Selection Problem, November 29, 2006 (Jeffrey Camm, Michael Fry, David Rogers)
The three-dimensional bin-packing problem is to pack all or a subset of boxes into one or more bins.  In the three-dimensional singular bin-packing problem, the objective is to minimize the wasted bin volume.  In the three-dimensional multiple bin-packing problem, the objective is to minimize either some type of bin cost or the number of bins used.  Bin-packing problems have many important theoretical and practical implications.  On the theoretical side, they have challenged computer scientists and discrete mathematicians for decades because they are NP-hard, and there is no universal algorithm to find the exact solution in a reasonable amount of time.  For this reason, heuristics have been developed for attempting to find approximate solutions in a smaller amount of time, some of which I describe.  On the practical side, they have many applications in industry, including scheduling and loading cargo into trucks (Sweep 2004).  However, many approaches to the three-dimensional bin-packing problem that appear in the operations-research literature are applicable to only a portion of all situations encountered in practice due to their assumptions.  The objective of this work is to develop a model for the cost- and priority-based carrier-selection problem and determine the problem sizes that are solvable to optimality in a reasonable amount of time.  This model is a more practical approach to the three-dimensional bin-packing problem and builds upon a model by Chen, Lee, and Shen (1995).  After showing how the new model works, I present results, observations, and statistical analyses from testing it.

Rossana Bandyopadhyay, A Two-Stage Newsvendor Problem for a Call Center with Downward Substitution, October 12, 2006 (Amitabh Raturi, Uday Rao, Jeffrey Camm)
The call-center industry has a constant demand problem whereby it is difficult to assess the inventory of seats that need to be maintained in order to meet demand and yet not have an overflow of idle seats.  Many studies have been done to explore this.  Call centers also have different seat types, and customers are specific to certain seat types.  So while one seat type experiences idle hours, another may have a demand surge and be unable to fulfill all customer calls.  This paper explores how revenue may be affected when we allow substitution of seats between classes.  We evaluate the case of a two-stage call center offering high-service-level and low-service-level seats to customers.  Upward substitution of seats to callers is generally not a concern.  We explore how the effect of downward substitution of seats can affect overall revenue.  An integer-programming model is first created to define the process and identify the parameters.  Three scenarios are presented for studying the problem.  The first determines the effect of variance of demand on the level of substitution.  The next experiment evaluates how downward substitution may vary by relative profit rate between the seat classes.  Finally, the influence of the differential target service level is evaluated.  We use simulation with Crystal Ball to evaluate the process.  From the results, we derive the conclusions that downward substitution does contribute to increasing the overall revenue of the firm and so it is a viable option that can be considered.  We also find that downward substitution gives marginally decreasing returns and hence we recommend that managers at call centers implement a policy on the extent of downward substitution a priori based on the additional value generated by this flexibility and the marginal cost (such as goodwill losses in either market or the cost of transferring demand).

Andrew W. Lundberg, Modeling a Sports Draft Using Dynamic and Linear Programming, August 25, 2006 (Michael Fry, Jeffrey Ohlmann [University of Iowa], Jeffrey Camm)
We model a professional sports draft using dynamic and linear programming.  Our goal is to determine the best drafting strategy for a team competing in a multiple-round sports draft.  We formulate the problem first as a stochastic dynamic program using a team's needs at each player position and the current pool of available players to be drafted as the state of the dynamic program.  However, this formulation is not generally solvable for reasonably sized problems.  Therefore, we introduce a number of additional assumptions and relaxations that result in a more tractable deterministic dynamic program.  To solve our models, we reformulate the problem as a linear program.  We develop an easy-to-use application in Microsoft Excel that allows the user to implement our algorithm to determine drafting strategies under a variety of conditions.  The application allows the user to change a number of parameters including player rankings and valuations, length of draft, and the team's initial drafting needs.  We then compare our algorithm to several competing draft strategies by measuring the performance of each in a fantasy football draft for the 2005 season.  Our results indicate that our drafting strategy outperforms these competing strategies in every instance.

Rachel M. LaRosa, Optimal Sequencing in a Multiple Machine Job Shop, August 21, 2006 (Jeffrey Camm, Michael Fry, David Rogers)
In this paper I present an optimization of a specific deterministic Job Shop Scheduling Problem (JSSP).  The JSSP studied involves six machines performing a total of eight processes on ten jobs in a real-world company.  The schedule was obtained through a model developed with Premium Solver Platform Version 6.5 for Microsoft Excel.  Comparison with the current scheduling practices of this job shop revealed many points, including insights into bottlenecking and downtime of machines and operators.  As described in the literature, this type of problem is extremely hard and time-consuming to solve.  This model may be further developed in the future for implementation in the job shop's schedule planning.

Dongmei Yang, Comparison of Import Vector Machines with Support Vector Machines to Make Predictions in Marketing, July 21, 2006 (David Curry, Martin Levy, Yan Yu)
Many marketing problems require accurately predicting the outcome of a future event.  In today's business environment, analysts often face datasets with hundreds of variables related in complex ways so that outcome classes are not linearly separable.  In the 1990s, the support vector machine (SVM) was developed for problems of this type by using kernel transformations to transform a highly nonlinear problem (in the original attribute space) into a linear problem in a higher dimensional "feature" space.  The SVM performs well (Cui and Curry 2005, 2003) but is limited by the fact that it does not naturally produce probability estimates, it cannot be easily extended to multi-class problems, and it may be computationally "expensive," depending on the kernel selected.  In this project, we propose and test a new technique, the import vector machine (IVM), that also employs kernel transformations but overcomes the shortcomings of the SVM.  The IVM provides classification probability estimates, it naturally generalizes to the multi-class case, and it requires less computation than the SVM.  We compare the SVM and IVM using data from two sources: (1) a discrete-choice problem based on simulated data, and (2) a large-scale field study involving the prediction of the incidence of client repeat business in the marketing-research industry.  Each new technique is also benchmarked against logistic regression.  Results indicate that the IVM performs (nearly) as well as the SVM on these problems and that both machine learning techniques significantly outperform logistic regression.  Because the IVM provides class-membership probabilities, it leads to deeper understanding than the SVM in both problems.

Kartheek K. Reddy, Regression and Time Series Modeling of the United States Civilian Unemployment Rate, July 6, 2006 (Martin Levy, Norman Bruvold, David Rogers)
The unemployment rate (UER) is an important indicator of the economic performance of a country, and there are many ways of forecasting the UER.  Economic indicators like the gross domestic product (GDP), the inflation rate (IR), the civilian labor force (LF), and the industrial production index (IPI) may have statistically significant influence upon the UER.  The relationship among various economic indicators was examined.  Regression and time-series models were developed for the UER.  Ordinary least-squares regression methodology was adopted to develop the regression model, and univariate autoregression (proc autoreg) and multivariate vector autoregression (VAR) procedures were used to develop the univariate and multivariate time-series models, respectively.  The forecasting abilities of regression, univariate, and multivariate time-series models were compared by performing static and dynamic forecasts of the UER.

Elena Bichescu, Bankruptcy Prediction using Logistic Regression and Multiple Imputation, June 29, 2006 (Martin Levy, Jeffrey Camm, Timothy Keyes [General Electric], Yan Yu)
Altman (1968) notes that bankruptcy represents a serious financial distress state that not only affects the bankrupt company, but also has negative social and macroeconomic ramifications.  In this context, models that could accurately predict the probability of a company filing for bankruptcy have wide applications, e.g., criteria for bank loans and financial investments, financial turnaround measures, etc.  This work proposes the use of logistic regression models and multiple imputation techniques to predict bankruptcy.  Our analysis is based on a dataset created by the author that contains 165 companies, of which 55 have been declared bankrupt.  We formulate a logistic regression model where the bankruptcy state is a binary dependent variable and the predictors are continuous financial ratios.  Model building is performed on the dataset that results after applying listwise deletion on the initial input data.  The models thus obtained are then validated using two approaches: train/test, where the models are validated on separate test sets, and cross-validation.  The misclassification rates returned by our logistic regression models average around 10%, a performance similar to models proposed by Altman and Beaver.  The proposed logistic models show that among the best predictors of bankruptcy are the financial ratios obtained based on total or current liabilities and on total or current assets.  This result verifies both previous work by Altman and Beaver and the intuition that a company's financial health depends crucially on the delicate balance between assets and debt.

Jeremy Jesse, Optimal Warehouse Delay for a Supply Chain Backorders Optimization Model, May 26, 2006 (David Rogers, Amitabh Raturi, Jeffrey Camm)
In this paper a multi-level retailer inventory distribution model with backorders is considered.  It is a periodic review system where the optimal base stock levels are determined by minimizing the total penalty cost of backorders subject to delay time constraints.  Lead times are deterministic with possible delays, lateral shipments are not allowed, and shipment times are integer constrained to model situations where a fleet of trucks is only able to make one delivery per day.  A highly nonlinear mathematical programming model was adapted for this setting.  The case of non-identical retailers created a formidable challenge for standard software to yield reliable results.  Interval search techniques and optimal selection were utilized within Excel and VBA to provide numerical results for the case of multiple identical retailers.

Yue Wu, An Empirical Study of the Post-Deregulation Electric Utility Wholesale Market, May 23, 2006 (Yan Yu, Martin Levy, Norman Bruvold)
This work explores the volatility structure of daily electricity price returns for 6 markets across the US. Based on daily data from 1998 to 2005, we examine the wholesale electricity prices for Cinergy, Entergy, PJM, Chicago, Michigan, and Ercot with parametric modeling methodology.  A family of GARCH type models is implemented to model the return behavior, in which exogenous explanatory variables, seasonality, and asymmetric effects are taken into account.  The behavior of electricity prices exhibited a strong tendency to stabilize as a common commodity after deregulation at the end of last century.  Several misspecification tests are conducted to evaluate model appropriateness.  Different back testing techniques are applied to identify the best model.  Finally, a bootstrap simulation methodology is applied to evaluate the model performance of an updated model using data from 2001 to 2005 and an overall model from 1998 to 2005.  The updated model turns out to generate a much narrower prediction interval and is more accurate.  This supports the conclusion that a structural change happened around 2001.

Robert E. Carter, Estimating Tuition Elasticity Using a Dynamic Discrete Choice Model, May 19, 2006 (David Curry, Jeffrey Camm, Michael Fry)
Prior research on tuition elasticity for institutions of higher learning has consistently found a downward sloping demand curve. That is, as tuition increases, enrollment decreases. However, most published studies relied on aggregate data covering multi-year time frames. Elasticities estimated in prior research reflect the likelihood that a student will attend any college or university. The research does not provide guidance on the choice of college that an individual student may choose to attend. The research presented in this thesis is unique because it employs discrete choice experiments on an individual student basis in order to determine the tuition elasticity for 12 colleges within the University of Cincinnati. Additionally, web-based survey software containing a unique "rules engine" was developed (as none were available commercially) so that the list of competitive schools in the choice set could be dynamic and, hence, reflect the college consideration set for each student. Thus the discrete choice experiment employed here uses a data collection format personalized for each respondent in the study.  Results are consistent with prior research in that we identified a downward sloping demand curve. However, our estimated elasticities are considerably greater than those reported in previous research due to the focus on individual student level data as compared with aggregate level analysis.  Furthermore, within the University of Cincinnati, we found that students attending the Colleges of Pharmacy, Medicine, and The College Conservatory of Music (CCM) exhibited the lowest tuition elasticity, while students from Business, Engineering, and the College of Education, Criminal Justice, and Human Services (CECH) displayed the highest relative elasticity.

Mayank Seksaria, Portfolio Risk Management Techniques for Electricity Generating Companies, May 19, 2006 (Yan Yu, David Rogers, Martin Levy)
In the past decade electricity markets have been deregulated all around the world. In this new environment energy is traded like any other commodity. Price volatility in deregulated electricity markets is higher than that of any other commodity. Confronted with this extreme price volatility, market participants and traders face enormous risks and hence need risk management in electricity markets. With the volatility that fuel prices have encountered in the recent past, price risk becomes paramount in electricity companies' risk management. In this thesis I start by calculating the volatility of electricity spot prices using historical simulation methods. I then use time-series models to determine the characteristics of spot price returns and to produce comparative forecasts of electricity spot prices. Risk cannot be avoided in any market. The modern theory of utility is an approach to decision making under uncertainty. I develop an optimal portfolio consisting of bilateral contracts and spot pricing. I also use sequential optimization to determine the effect of various factors on the allocation ratio in the portfolio. Besides MPT, I also use value at risk (VaR) as a risk-control technique and calculate its values. I calculate the VaR for individual assets as well as for the portfolio and compare those values to illustrate the diversification provided by developing an optimal portfolio. I provide an overall framework of risk management for generating companies in the competitive electricity market. The proposed energy allocation model provides an analytical and quantitative approach to energy trading.

Zhouzhou Peng, A Dynamic Self-Adaptive Algorithm and Simulation Study for Warehouse Organizing, May 18, 2006 (David Rogers, Amitabh Raturi, Uday Rao)
How well the contents in a warehouse, i.e., the variety of items stored in it, are organized is among the most important factors that determine productivity and efficiency.  Current organizing methods are inadequate and cost-prohibitive when facing volatile warehouses where a huge variety of goods are frequently transferred in and out in large and unpredictably fluctuating numbers.  The reason for that incapability is twofold: first, current methods tend to focus only on the storing process and ignore the impact of the order-picking function; second, current methods often use a top-down approach and lack the flexibility needed for the ever-changing environment.  This thesis presents a new algorithm that integrates both the storing and the order-picking activities and employs a bottom-up perspective to solve the problem, utilizing only basic information readily available within a modern computerized warehouse management system (WMS).  A simulation study based upon a real-life case is used to show the algorithm's dynamics and analyze its improved performance over the current method.

Andrew R. Remington, A Study of Unsupervised Learning, April 20, 2006 (Yan Yu, Martin Levy, David Kelton)
Unsupervised learning is a collection of methods that are extremely effective in producing accurate summaries of relationships in a data set. With the recent evolution of computing power and the free implementation of the statistical programming language R, these powerful methods are now readily available to anyone interested in data mining. This project studies association rule analysis, cluster analysis, self organizing maps, principal components, independent component analysis, and multidimensional scaling, offering summaries of each method, descriptions of each method's implementation in R, examples of the application of the method to a real data set, and an assessment of the attributes of each method. Due to the new nature of the field and fragmented documentation of each method, this project crystallizes the process of usage and understanding of each method in a freely available software language to provide novice data miners with a structure of understanding and instructions on the application of each method. This project summarizes journal publications, textbooks, and R code that deal with each method individually. The results of this project show that many unsupervised learning methods are easy to apply, execute quickly, and provide similar results among differing methods. Furthermore, the results demonstrate the redundancy of different methods concerning gene tumor data and the effectiveness of unsupervised learning as exploratory analysis. The significance of the finding is that because the methods are freely available and are easily applicable to a data set, it is prudent that data miners or statisticians apply unsupervised learning methods during their initial exploration of a data set in order to define their starting assumptions more accurately.

Honghua Shang, A Model for Profiling Asian American Association Telecom Services Customers Using Logistic Regression, February 21, 2006 (Martin Levy, Norman Bruvold, James Evans)
Data mining is an information-extraction activity whose goal is to discover hidden facts contained in databases.  Using a combination of machine learning, statistical analysis, modeling techniques, and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results.  Typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit-risk analysis.  This project attempts to develop a model for profiling potential customers using statistical methods, such as logistic regression for a given data set.  That is, the relationship of some responses and explanatory variables will be explored so that we can determine which variables are the most and least correlated with the response variable.  The goal is to segment data provided by the Asian American Association Telecommunication Services into potential customers and non-interested customers.  Logistic regression was chosen mainly because of its ability to analyze categorical data.  Gender, language, age, dwelling, household income, location, and time zone were variables found to be statistically significant and are therefore important contributors in determining the potential AAATS customers.  AAATS will adjust their future marketing campaign based on these findings.

Guohua Wu, A Study of Value-at-Risk Methods, February 7, 2006 (Martin Levy, Jeffrey Camm, Norman Bruvold)
Value at risk (VAR) is a method widely used in financial corporations to measure the risk of holding a portfolio over a period. Three basic methods to obtain VAR are the delta normal method, the historical method, and Monte Carlo simulation. Among these three methods, Monte Carlo simulation is the most powerful, while the delta normal method is the most popular because it is economical. However, these methods have a serious drawback if VAR is forecasted over a volatile period because they assume common variance. Univariate and multivariate ARCH/GARCH models are discussed to deal with heterogeneous data. Since the software for the time-varying covariance ARCH/GARCH model is not available currently, the common correlation multivariate GARCH model was used. The vector autocorrelation model is based on the idea that the conditional variances of the portfolio components have autocorrelation not only with themselves but also with other components. Thus, VAR can improve the GARCH model further.
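As a rough illustration of two of the three basic methods named above, the following Python sketch computes a one-period VAR from a series of portfolio returns. It is not taken from the project: the function names, the convention of reporting VAR as a positive loss fraction, and the simple order-statistic quantile rule are all assumptions made for this example.

```python
import math
from statistics import NormalDist, mean, stdev


def historical_var(returns, confidence=0.95):
    """Historical method: empirical quantile of past losses.

    Assumption: VAR is reported as a positive loss (negative return),
    using a simple order-statistic rule for the quantile.
    """
    losses = sorted(-r for r in returns)
    index = math.ceil(confidence * len(losses)) - 1
    return losses[index]


def delta_normal_var(returns, confidence=0.95):
    """Delta normal method: returns assumed normal with constant variance."""
    z = NormalDist().inv_cdf(confidence)
    return z * stdev(returns) - mean(returns)


if __name__ == "__main__":
    # Hypothetical daily returns, for illustration only.
    sample = [0.012, -0.008, 0.003, -0.021, 0.007, -0.002, 0.015, -0.011]
    print(historical_var(sample, 0.95))
    print(delta_normal_var(sample, 0.95))
```

Both functions use a single variance (or a single empirical distribution) for the whole sample, which is the common-variance assumption the abstract identifies as a drawback over volatile periods.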

Ying Huang, Development of RFID Technology Measurement Scales, December 2, 2005 (Craig Froehle, Michael Fry, Suzanne Masterson)
Over the past few decades, radio frequency identification (RFID) technology has been used to track and identify goods, assets, and even living things. It is gaining momentum in supply-chain management.  Compared with barcodes, it is a more powerful tracking tool in many aspects and can provide more detailed and accurate information in a more timely manner.  As the most promising ID technology that might revolutionize the industrial world, it has drawn a lot of interest from supply-chain participants.  Millions of dollars have been invested into research to examine its potential and improve its features and benefits.  Although a number of surveys have been conducted to explore people's concerns about this hot topic, it is important that RFID technology as a concept be subject to the same serious and careful academic study that has been focused on the technology itself.  This could help reveal current and potential RFID users' interests and expectations.  Perceptions of RFID are not well understood, likely due in part to a lack of valid measurement instruments.  In this paper, we summarize the current state of RFID application. We then propose four important attributes of RFID - reliability, durability, flexibility, and security - and develop multi-item scales to assess the importance of each to managers.  Employing a combination of primary field (internet survey) and artificial datasets, we perform reliability and validity analyses using the SAS and AMOS statistical tools.  The results of the iterative reliability and confirmatory factor analyses suggest that two of the tested items should not be employed in further applications of the instrument.  The results and limitations of the research are then discussed.

Ying Li, Using Bayes Estimation Under BLINEX Loss to determine the Mailing Size for a Direct Mail Marketing Campaign, November 22, 2005 (Martin Levy, Norman Bruvold, David Kelton)
Direct mail marketing is a growing area of marketing practice.  Many corporations use a data-mining technique, called a scoring model, to estimate the response probability of each household in the mailing list.  The selection of targets is based on the assigned probabilities in descending order.  The problem that remains unsolved is the size of the mailing.  In practice, direct marketers make the decision based on either budget or maximized response rate, both of which are suboptimal for profit-driven firms.  Bayes estimation, which takes cost into consideration, has been applied to find the optimal mailing size.  Traditionally, the point estimates are often derived by implicitly assuming a squared error loss (SEL) function, but the SEL may not reflect the actual loss in a direct-marketing problem.  This paper uses Bayes estimation under bounded linear exponential loss (BLINEX) to find the response rate that corresponds to the optimal mailing size leading to maximized profits.  A case study with real data sets from a catalogue company demonstrates the BLINEX loss structure and the financial advantage of the BLINEX method over the SEL and the mailing-to-all scenarios.

Pooja Singh, Application of Linear and Non-Linear Modeling with Random effects to Analyze Biomechanical Data, November 21, 2005 (Martin Levy, Jeffrey Camm, Norman Bruvold)
This project deals with the design and statistical analysis of biomechanical data.  The biomechanical data pertain to a tissue-engineering experiment that aims at accelerating tissue repair.  Repair of tendons, ligaments, and capsular structures is common given that these injuries represent almost 45% of the 32 million musculoskeletal cases in the US each year.  As a consequence, surgeons and basic scientists have sought to identify new approaches like tissue engineering for tissue repair and returning the patient to pre-injury activities.  This experiment sought to understand how the cell-to-collagen ratio affects contraction kinetics of mesenchymal stem cells (MSC) as they mature around posts in a culture.  A split-plot design was successfully applied to the experiment and hypotheses were tested using the model.  Also, a nonlinear model was fit between the response variable, contraction factor, and time.  The model allowed the random effect in the experiment to enter nonlinearly.  The analysis was implemented using Proc Mixed and Proc Nlmixed in SAS.

Vinutha Nagesh, Clinical Data Mining: Frozen Shoulder, November 18, 2005 (Martin Levy, Yan Yu, Norman Bruvold)
Data mining, an interdisciplinary research area including artificial intelligence, statistics, and databases, is the science of extracting useful information from large databases.  In this research project, techniques of data mining were used to analyze the relationships in a clinical condition called frozen shoulder.  The data set, derived from the clinical database of a shoulder surgeon at Cincinnati Sportsmedicine and Orthopedic Center, consists of 65 patients' records.  Records include patients' demographics and clinical diagnoses information.  The severity of the frozen-shoulder problem is measured in terms of the Simple Shoulder Test score (SST), the range of motion of the aggravated arm in different elevations, and the American Shoulder Elbow Score (ASES), which is calculated from the patients' responses regarding the functionality of their shoulder.  Treatment included physical therapy or surgery.  The data were used to do comparative analyses on pre-treatment and post-treatment measurements using paired and unpaired methodologies.  Predictive studies were performed to predict which treatment group a patient is assigned to as a function of demographics, pre-treatment scores, and clinical diagnoses.

Steven Harrod, Numerical Methods for Realizing Nonstationary Poisson Processes with Piecewise-Constant Instantaneous-Rate Functions, October 24, 2005 (David Kelton, Uday Rao, Martin Levy)
Nonstationary Poisson processes are appropriate in many applications, including disease studies, transportation, finance, and social policy.  We review the risks of failing to model nonstationary Poisson processes properly and discuss three algorithms for the generation of Poisson processes with piecewise-constant instantaneous rate functions.  We test these algorithms in C programs and make comparisons of accuracy, speed, and stability across disparate rate functions and microprocessor architectures.  Choice of optimal algorithm could not be predicted without knowledge of microprocessor architecture.
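The abstract does not name its three algorithms, so the following Python sketch should be read only as an illustration of one standard way to realize such a process: thinning, in which candidate events are generated at the maximum rate and each is kept with probability equal to the local rate divided by the maximum rate. The function name, argument layout, and example rates are assumptions made for this sketch, and the project's own implementations were written in C.

```python
import random
from bisect import bisect_right


def nspp_thinning(breakpoints, rates, rng):
    """Generate arrival times on [breakpoints[0], breakpoints[-1]).

    rates[i] is the constant rate on [breakpoints[i], breakpoints[i + 1]).
    """
    lam_max = max(rates)
    if lam_max <= 0:
        return []
    t = breakpoints[0]
    end = breakpoints[-1]
    arrivals = []
    while True:
        # Candidate events form a stationary Poisson process at rate lam_max.
        t += rng.expovariate(lam_max)
        if t >= end:
            break
        # Find the segment containing t and accept with prob rate / lam_max.
        segment = bisect_right(breakpoints, t) - 1
        if rng.random() * lam_max < rates[segment]:
            arrivals.append(t)
    return arrivals


if __name__ == "__main__":
    # Hypothetical rate function: 1 per unit time, then 0, then 5.
    times = nspp_thinning([0.0, 10.0, 20.0, 30.0], [1.0, 0.0, 5.0], random.Random(1))
    print(len(times))
```

Thinning is simple but wastes candidates when the rate is far below its maximum for long stretches, which is one reason alternative algorithms are worth comparing on speed.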

Vishva Raj Bangad, Bioequivalence and Sample-Size Determination in the Pharmaceutical Industry, October 5, 2005 (Martin Levy, Jeffrey Camm, Uday Rao)
Assessing bioequivalence between the bioavailability of a generic drug product and the innovator drug product has gained a lot of importance in recent years since the generic-drug manufacturer does not need to perform costly clinical trials to demonstrate the safety and efficacy of the generic product if the bioavailabilities of the two drug products are demonstrated to be bioequivalent.  However, this bioequivalence must be demonstrated in a statistically sound way to protect the consumer from ineffective and unsafe drugs.  Until the 1970s the statistical test of hypothesis of no difference between the bioavailabilities of two drug formulations, usually supplemented by an assessment of what the power of the statistical test would have been if the true averages had been bioequivalent, was used in the statistical analysis of bioequivalence studies.  Westlake proposed a new approach based on a confidence interval for the difference between the true means.  During the same period, Schuirmann proposed a two-one-sided-test (TOST) method.  Anderson and Hauck proposed a new test and claimed that their test was always more powerful than the above two tests.  Wilcoxon, Mann, and Whitney proposed a nonparametric version of TOST if the assumption of normality or lognormality is not valid.  We will discuss and compare these methods in this paper.  We will also determine the power and sample size of Schuirmann's TOST.  In the end, we will briefly discuss some of the new approaches that have been proposed in the last decade and define population bioequivalence and individual bioequivalence.

Bogdan Bichescu, Channel Power: Its Implication on Supply-Chain Performance, September 1, 2005 (Michael Fry, Amitabh Raturi, Pradyot Sen, George Polak [Wright State University])
Our work, comprising two essays, examines decentralized supply chains composed of one supplier and one retailer facing stochastic customer demand.  We develop models for both periodic (1st Essay) and continuous review (2nd Essay) inventory policies when the decision-making rights are split between supply-chain agents.  We seek to answer: 1) when does decentralized decision making result in the greatest loss in supply-chain performance and 2) what effect does the distribution of channel power have on system and individual agent performance.  In our first essay, we assume the retailer is responsible for choosing order sizes and the supplier chooses delivery frequency.  We find that performance losses from decentralized control are strongly influenced by the relative holding and penalty costs, but somewhat invariant to demand uncertainty due to risk pooling.  Furthermore, our numerical results suggest that concentrating channel power with the supplier can lead to supply-chain profits that are very close to a centralized scenario, but also results in lower customer-service levels.  Our second essay studies supply-chain performance under a vendor-managed inventory (VMI) agreement where the supplier controls delivery sizes and the retailer sets customer-service levels.  Within the VMI setting, we model various power scenarios: equally powerful retailer and supplier, powerful retailer, and powerful supplier.  According to our numerical results, the best system performance is achieved when the supplier acts as the Stackelberg leader. Furthermore, somewhat contrary to intuition, we find that individual agent performance is greatest when the agent acts as a follower.

Mohammad Rouholiman, Evaluating Mezzanine Finance in Real Estate: A Monte Carlo Simulation Approach, August 26, 2005 (Jim Clayton, James Evans, David Kelton)
Mezzanine finance has emerged as an important source of financing in commercial real estate.  It helps to complete the market by bridging the gap between what equity investors are willing to put down and what conventional senior lenders provide.  The mezzanine position is structured as a junior debt piece or preferred equity share that takes the first loss after the equity investor, in the event of a default.  Due to the riskiness of the position a more rigorous analysis of the property's future cash flows (pro forma) is warranted.  Traditional property valuation relies on a static ten-year pro forma.  A more risk-adjusted approach is very timely given the aggressive pricing of equity and debt in property markets over the past few years.  Real-estate prices have soared and spreads on debt have contracted, leaving investors and bankers with very little room for error.  This paper aims to provide a methodology for using Monte Carlo simulation to evaluate the riskiness of a property and aid the mezzanine lender in the decision-making process.  The goal is to use Crystal Ball software to provide the mezzanine lender with a better picture of the possible outcomes for the property and see if it meets their initial underwriting criteria.  Then OptQuest is used to search for the set of loan attributes that meet the lender's IRR and default risk requirements.

Guoqiang Zhang, Numerical Methods in Valuation of American Options, July 29, 2005 (Michael Ferguson, David Kelton, Martin Levy)
Unlike European options, which can only be exercised at the time of maturity and can be priced with the explicit Black-Scholes formula, American-style options can be exercised at any time before maturity and there is no closed-form formula to price them.  American Asian options, such as arithmetic average American Asian options and geometric average American Asian options, pose more difficulties in valuation since their values depend not only on the underlying assets, but also on the arithmetic or geometric averages of the underlying asset over a certain time interval.  Numerical methods such as binomial trees, least-squares Monte Carlo simulation, and finite differences must be used to value American options.  The binomial tree method proposed by Cox et al. (1979) provides a simplified numeric approach for valuing options and assumes that the price of the underlying can go up or down by fixed multiples.  Each price jump is assigned a probability and a tree of possible underlying prices is built.  Working from the tree points or nodes at the option maturity date, the worth of the option can be back calculated until the option can be valued at the desired date.  Least-squares Monte Carlo simulation (Longstaff and Schwartz 1997) uses regression to estimate the conditional expected payoff to the option holder from continuation, and is readily applicable to path-dependent and multifactor financial instruments.  The finite-difference method transforms the partial differential equation into a difference equation that can be solved numerically, and is the most commonly used numerical method for solving differential equations.  In this project, we discuss the explicit, implicit, and Crank-Nicolson finite-difference methods for the one-factor model and the explicit and ADI methods for the two-factor model such as arithmetic average American Asian options and geometric average American Asian options.
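The backward calculation on a binomial tree described in this abstract can be illustrated with a minimal sketch.  It is not the project's code: it assumes the standard Cox-Ross-Rubinstein parameterization (up factor exp(sigma*sqrt(dt)), down factor equal to its reciprocal), a plain American put with no dividends, and hypothetical function and parameter names.

```python
import math


def american_put_binomial(s0, strike, rate, sigma, maturity, steps):
    """Price an American put on a Cox-Ross-Rubinstein binomial tree.

    All names and the CRR parameterization are illustrative assumptions.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))      # fixed upward multiple
    down = 1.0 / up                           # fixed downward multiple
    disc = math.exp(-rate * dt)               # one-step discount factor
    p_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral probability

    # Payoffs at maturity; node j has experienced j up-moves.
    values = [
        max(strike - s0 * up**j * down**(steps - j), 0.0)
        for j in range(steps + 1)
    ]

    # Work backward through the tree, checking early exercise at every node.
    for step in range(steps - 1, -1, -1):
        for j in range(step + 1):
            continuation = disc * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
            exercise = strike - s0 * up**j * down**(step - j)
            values[j] = max(continuation, exercise)
    return values[0]


# Hypothetical inputs: at-the-money put, 5% rate, 20% volatility, one year.
price = american_put_binomial(100.0, 100.0, 0.05, 0.20, 1.0, 200)
```

The only difference from a European tree is the `max(continuation, exercise)` comparison at each node, which is what captures the early-exercise right.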

Paul Bessire, Measuring Individual and Team Effectiveness in the NBA Through Multivariate Regression, June 3, 2005 (Michael Fry, Jeffrey Ohlmann [University of Iowa], David Kelton)

At the conclusion of the 2003-04 National Basketball Association season, the Detroit Pistons, without one player among the NBA's top ten scoring leaders, found themselves atop the NBA with a championship ring.  Conversely, Team USA, composed of the most individually talented players in the world, failed to win Gold in the 2004 Olympics.  How could this happen?  We believe that much of the variation found in a basketball team's success can be explained mathematically through looking at the interactions of the five players on the court and not just individual player abilities.  We examine several methods for rating individual NBA players and we utilize multivariate regression analysis to assist in building successful NBA teams.  We seek to predict the success of an NBA lineup consisting of the five players on a court at any time.  We measure success as the lineup's average scoring margin per minute.  In order to predict a lineup's success we consider a set of individual player attributes that serve as our explanatory variables.  We use two-way interactions between player abilities to help explain teamwork in the NBA.  Applications of the model include examining which players should play at each position, predicting the lineups that should have the greatest team success, and specifying which skill areas the coaching staff should seek to improve through the annual NBA draft, free agency, and trades.

Jason Crabtree, Construction and Tests of an Interactive Genetic Algorithm for New Product Design, June 3, 2005 (David Curry, David Kelton, Yan Yu)
Affinova IDEA(TM) is a commercial software product with marketing applications in the area of new product design.  At its core is an interactive genetic algorithm (GA), which provides certain advantages over traditional product design methods, such as conjoint analysis.  These advantages include the ability to handle products with many design features and many levels of each feature, as well as nonlinear consumer utility functions involving complex effects.  The goal of this project is to construct and test an interactive genetic algorithm similar to Affinova.  The analysis portion of the project will test the GA over a variety of operating conditions and illuminate the strengths and weaknesses of a genetic-algorithm-based approach to product design.

Neelima M. Reddy, A Route-Sharing Tool for Optimization of Resource Allocation in Logistics Planning, June 3, 2005 (Uday Rao, Michael Fry, David Kelton)
The optimum allocation of resources is one of the biggest challenges faced by a third-party logistics firm during the planning phase of operations.  The problem becomes complicated with uncertainty of demand, outsourcing of resources, and dynamic constraints on the availability of resources.  The resources in this particular problem are tractors and drivers and they must be allocated to pre-designed routes such that all the routes are run at the design-specified times using a minimum of tractors.  Traditionally it has been a slow manual process taking a logistics planner about 2-3 days to come up with a feasible allocation of tractors, let alone an efficient allocation.  Also, every time a new route or a set of routes is added or route specifications are changed, the tractors have to be entirely reallocated.  The long, cumbersome process does not allow comparative studies between scenarios or the possibility of choosing the most cost-effective scenario.  In this project, I have developed a software tool called the 'Route-Sharing Tool' for one such logistics firm (Transfreight) that uses a heuristic approach to the resource-allocation problem and provides a good solution in minutes.  It creates a weekly tractor-route flow schedule and is all the more valuable when route specifications change frequently and the resources have to be reallocated.  The tool is also useful for comparative studies and can be used during route design to develop an efficient set of routes within the constraints, which reduces the idle time of resources.  The tool also gives a visual representation of tractor usage and idle time, which makes it easy to understand and implement the desired changes.

Kanampully Sunny Paul, Analysis of Some Finitized Distributions for Use in Simulation, May 27, 2005 (Martin Levy, David Kelton, Norman Bruvold)
Simulation modeling helps us to replicate real processes using computer programs that are helpful in determining various important parameters of the process.  As simulation modeling assumes greater significance today and finds applications in numerous fields, emphasis is being laid on developing accurate, efficient, and fast random-variate-generating algorithms.  A new methodology called finitization that converts an infinitely supported discrete power series distribution into another distribution having support of specified finite size has been proposed by Levy and Golnabi.  An essential feature of the finitized version is that it preserves the moments of the parent distribution up to the order of finitization.  In this paper we seek to explore the possible advantages of using such a finitized distribution in simulating random variates that belong to the family of discrete power series distributions.  We also check the accuracy of distributions derived by using the method of finitization compared to the theoretical distribution.  We have studied the various methods of simulating random variates and the relative advantages with respect to computational times.  We have carried out the simulation in SAS and compared the computational speed with respect to whatever conventional methods SAS uses to generate these distributions.  After analyzing the various processing times required for simulation, we could conclude that the method of finitization is advantageous in reducing the processing times by reducing an infinitely supported series into its finitely supported form.  We also could conclude that the advantages in processing times may also depend on other factors like the software used, the operating system, and the hardware configuration of the computers used for carrying out simulations.

JianJian Cheng, Projecting the Charge-Off Rate for Consumer Loan Products at HSBC Household, May 23, 2005 (Martin Levy, Norman Bruvold, Yan Yu)

Consumer loan portfolios comprise millions of dollars of receivables at HSBC Household.  The ability to understand what the loss, mainly the charge-off, is going to be has become essential.  Yet today there are few models available that address this area at HSBC Household.  The focus of this paper is primarily on the consumer loan charge-off rate forecasts.  The goal is to predict monthly performance from two months ahead to four months ahead.  This paper seeks to answer the question faced by the senior management of HSBC Household: 'how can we better project the charge-off for consumer loans?'  Given the absence of a formal forecasting model, this paper presents the forecasts of six models including cohort average, Winters' method, linear regression, simple ARIMA time series models, ARIMA intervention models, and ARIMAX models.  This case study concludes that, overall, the ARIMA intervention model and Winters' method provide very good forecasting for both two months ahead and four months ahead and they are recommended.  ARIMAX model forecasting accuracy is not stable.  It produces the best forecasting result for the two-months-ahead window, but is the second worst for the four-months forecasting window.  So this model should be used carefully.  Linear regression provides good results with stable accuracy.  It can be used as a benchmark for other alternative forecasting models, if the delinquency data are accessible.

Peter G. Donley, Intervention Forecasting: How to Forecast Appropriately for Categorical Demand when a New Wal-Mart Superstore Enters a Retail-Dominated Market, May 20, 2005 (Martin Levy, Norman Bruvold, David Rogers)
As Wal-Mart continues to saturate the retail market, other competitive retailers are trying to find ways to adjust for the inevitable changes that they will face in the future.  Consumers now have a wider selection of retailers to choose from than the usual local grocery store down the street.  As a new Wal-Mart Supercenter enters the marketplace, there is an obvious change, an intervention, in consumers' shopping patterns.  This project is focused on one appropriate method of forecasting consumer demand in a particular category, given that a Wal-Mart Supercenter has entered the marketplace.  Using ARIMA intervention modeling, the appropriate steps will be taken in finding an accurate model for forecasting categorical purchases when a Wal-Mart enters and the direct effects on consumer demand are sought.

Nelly Louise Jorgensen Shapero, Human Resources Forecasting Models for Small Companies, March 11, 2005 (David Rogers, Norman Bruvold, James Evans)
Small companies should make data collection for human resource measures a routine task.  A trend and regression analysis may work well for short-term forecasts of manpower requirements, even though it may be difficult to get a detailed forecast using these models.  A Markov model may be useful for analysis of how many people will be in each position at some future time.  The models are fitted to conditions at Transfreight LLC.  Two curves are fitted in the trend analysis, an exponential and a linear curve.  The trend analysis provided very reliable forecasts using both of the models.  The analysis provided a forecast with an R-square of 0.981 for the linear model and 0.989 for the exponential.  A multiple regression analysis may work well for many small companies, but for Transfreight the results were not as good as the trend analysis.  Using stepwise regression, the only variable entered was time, and an F-test of the single-variable linear model and the multiple-variable regression models does not favor the more complex model.  A Markov model was developed and used to describe the system but was not used for forecasting the employee numbers.  Many of the transition probabilities are very small in this model.  The distribution of the standard errors, therefore, becomes very skewed and the normal assumptions necessary for accurate predictions were unreasonable.  Predictions made with this model may therefore contain large errors.  Several qualitative and quantitative models for human-resource planning are briefly described and evaluated for fit to small companies.
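To make concrete how a Markov model answers "how many people will be in each position at some future time," the following is a minimal sketch, not the project's model.  The three states, the headcounts, and the transition probabilities are hypothetical, chosen only for illustration; the expected headcount after each period is the current headcount vector multiplied by the transition matrix.

```python
def project_headcount(headcount, transition, periods):
    """Return expected headcount per state after `periods` transitions.

    headcount:  list of current counts, one per state.
    transition: square matrix; transition[i][j] is the probability that a
                person in state i is in state j one period later.
    """
    current = list(headcount)
    n = len(current)
    for _ in range(periods):
        current = [
            sum(current[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return current


# Hypothetical states: associate, manager, left the company (absorbing).
start = [100, 20, 0]
probabilities = [
    [0.85, 0.05, 0.10],  # associate stays, is promoted, or leaves
    [0.00, 0.90, 0.10],  # manager stays or leaves
    [0.00, 0.00, 1.00],  # people who left stay gone
]
after_one_period = project_headcount(start, probabilities, 1)
```

When many transition probabilities are very small, as the abstract notes, their estimates rest on few observed moves, so projections of this kind should be read as rough expectations rather than precise forecasts.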

Anand Mathew, Work-In-Process Inventory Entitlement for the Aircraft-Engine Industry, March 11, 2005 (David Rogers, Amitabh Raturi, Uday Rao)
Understanding, visualizing, and controlling inventory flow is one of the challenges faced by the modern manufacturing industry.  Too much or too little inventory in any form - raw materials, work-in-process (WIP) or finished goods - is undesirable.  Of these three types of inventories, work-in-process inventory is an indication of the lack of coordination within the organization.  By constantly monitoring and properly managing the work-in-process inventory levels, an organization can substantially reduce its operating costs.  Most of the parameters that affect work-in-process inventory are within the organization and hence projects related to work-in-process inventory require a significant amount of impetus and organizational restructuring to succeed.  Complexities of modern machinery, unstable and seasonal demand patterns, constant design alterations, and widely dispersed manufacturing locations have made visualization, analysis, and optimization of the work-in-process inventory flow cumbersome and time consuming.  This project was undertaken in order to develop a scenario-analysis platform for evaluating the impact of various design parameters upon work-in-process inventory.  This new application provides the user the ability to alter the demand schedule, bill of material, product cost, assembly levels, or cycle time of each component in order to analyze its impact on the work-in-process inventory levels.  Currently this tool is being used for inventory forecasting and resource allocation at one of the world's largest aircraft-engine manufacturers.

Kelly Herrmann, Optimal Portfolio Allocations for Hedge Funds with Asymmetric Returns, November 24, 2004 (Yan Yu, David Rogers, Norman Bruvold)
'Hedge fund' is a phrase describing a broad range of alternative investment strategies.  What they all have in common is a goal to create positive returns in any market environment.  They are unregulated and privately organized, allowing for very flexible investment styles (e.g., using leverage).  Non-normality and asymmetric returns are usually observed, which make traditional quantitative studies based on Gaussian symmetric assumptions difficult to justify.  Portfolio allocation, for example, is greatly affected by asymmetric returns.  The goal of this project is to determine the optimal allocation for a portfolio of hedge funds.  The hedge-fund universe is divided into eight strategy categories, and recommendations of the percentage of wealth invested in each strategy are given.  Strategies are represented through indices developed by Hedge Fund Research.  Also, the non-normality of returns will be accounted for using two unique optimization methods, the modified value at risk through the Cornish-Fisher expansion, and Duarte's unifying formulation.  These methods will be explained and the portfolios they produce will be compared.

Sujan Balachandran, Bayes Estimation under Bounded Asymmetric BLINEX Loss in a Direct-Mail Decision Problem, November 24, 2004 (Martin Levy, Norman Bruvold, David Rogers)
While unbounded symmetric loss functions, such as squared-error loss, are widely used in Bayesian statistical decision theory because of their mathematical convenience, there are many situations where a bounded and asymmetric loss, such as the BLINEX loss, is more desirable.  The aim of a direct-mail marketing problem is to maximize the profitability by increasing the order size and also to increase the market share by familiarizing the potential customers with our products.  However, we restrict our problem to the quantitative realm and present an application of Bayes estimation under BLINEX loss to a direct-mail decision problem in which maximum profit is the main decision goal and mailing size is the decision variable.  Our aim is to recreate, using a real data set, a scenario very similar to what was previously done using simulation: one that demonstrates and quantifies how the profitability of Bayesian estimation can be improved by incorporating the intrinsic boundedness and asymmetry features of the direct-mail loss structure.  An algorithm is used to fit BLINEX based upon information elicited from decision makers in general circumstances.

Ning Shao, Semiparametric Estimation for Credit Scoring, November 16, 2004 (Liang Peng, Martin Levy, David Kelton)
Credit scoring is a statistical system used for assessing credit worthiness of potential borrowers and classifying customers into 'good' or 'bad' risk classes.  With the explosive growth in the consumer credit market, credit scoring methods have become increasingly important.  They are now standard tools of credit card companies, banks, mortgage companies, etc. to assess loan applications, minimizing companies' costs of failure over risk groups.  Common classification and regression methods for credit scores are usually linear in the explanatory variables.  However, in many applications, there is not always evidence of a generalized linear relationship.  Data-driven nonparametric/semiparametric modeling techniques such as the generalized additive models, generalized partially linear models, and generalized single-index models, emerge as promising alternatives that offer the flexibility of fitting the curvature and yet retain the ease of interpretability.  They are often considered important data-mining techniques in the initial stage of exploratory data analysis.  This project investigates various semiparametric modeling techniques on a French bank's credit-scoring data: generalized linear models (GLM), generalized additive models (GAM), generalized partially linear models (GPLM), generalized partially linear single-index models by P-splines (GPLSIM-P), and generalized partially linear single-index models by kernel smoothing (GPLSIM-K).  The response variable of interest is a binary variable indicating default/no default of a loan.  The predictors are variables based on the customers' information, credit history, etc.  The goal of this project is to study different semiparametric models from the most recent research using credit-scoring data, to reveal the relationship between variables, and to capture the curvature if any non-linearity exists.  Alternative methods such as classification and regression trees (CART) and neural networks are also discussed.

Keli Feng, Identical Jobs Cyclic Scheduling: Formulation and Solution, October 8, 2004 (Uday Rao, Amitabh Raturi, Norman Bruvold)
We study the computationally-hard, re-entrant flow, cyclic scheduling problem considered by Graves et al. (1983) and Roundy (1992).  We present two problem formulations to minimize job flow time (work-in-process), given a target cycle length (throughput).  We describe an efficient method to solve the problem to optimality; in computational experiments this method was significantly faster than commercial optimization software (CPLEX 8.0) and solved 40% more of the test instances to optimality within the specified run time and memory limits.  We also develop a new ImproveAlignment (IA) heuristic algorithm, which we test against the optimal solution or bounds.  Numerical experiments indicate that the proposed IA heuristic quickly produced solutions whose flow times were, on average, (i) 22% better than those from the Graves et al. heuristic and (ii) within 14% of the optimal flow time.

Vladimir V. Pashkevich, The Role of Culture-Level Factors in Shaping Online Purchase Intentions: A Cross-Country Comparison, August 17, 2004 (David Curry, James Evans, Yan Yu)
The primary goal of this research is to enhance our understanding of the moderating role that culture-specific variables - individualism/collectivism and cultural context - play regarding an individual's intentions to use the Internet for obtaining product information and shopping.  Specifically, this research (a) operationalizes the concept of cultural context by constructing an index with formative indicators, (b) develops reliable and valid scales for measuring constructs comprising the theory of planned behavior (TPB), and (c) examines the boundary conditions and generalizability of the TPB in Internet-mediated consumption settings.  The proposed model is used to examine effects of variables, at the culture level, on the strength of relationships among individual attitudes, experience, subjective norms, and purchase intentions.  Predictions under TPB are evaluated across two samples drawn from the United States and Belarus.  Findings reveal that subjective norms tend to influence decisions in high context/high collectivist cultures, but not in high individualist/low context cultures.  The effects of attitudes and past behaviors on intentions were equal for the American and Belarusian cultures.  Results of the proposed study are expected to yield implications for marketing practices across cultures.

Rachna Jaison, Volatility of Demand and its Operational Consequences: A Simulation Study of Systems Dynamics in the Machine-Tool Industry, August 12, 2004 (Amitabh Raturi, David Rogers, Jeffrey D. Camm)
The machine-tool industry, a small but vital sector in U.S. manufacturing, suffers high volatility in demand due to a combination of factors.  Several trillion dollars' worth of inventory lie wasted in the supply-chain pipelines when demand recedes; alternatively, major opportunity losses in business are incurred when firms are unable to deliver during periods of high demand.  Machine-tool firms, furthermore, have a severe organizational problem of maintaining a skilled labor force in this highly volatile scenario.  Many studies have tried to understand the sources of the volatility and to test alternate policies to reduce volatility, such as reducing order lead time, information lead time, and capacity-planning lead time, altering the work force, and encouraging smoother customer ordering policies.  In this study, I use systems dynamics and dynamic simulation to model the non-linear causal, delay, and feedback loops in the machine-tool industry.  A simple model of a machine-tool maker and a customer is created using Vensim to test various strategies that firms can implement to mitigate the effect of volatility on the industry.  From my simulations, I conclude that: (1) the bullwhip effect and the investment-accelerator effect are the two main factors responsible for the extreme amplification of volatility in the machine-tool industry, (2) a decrease in the volatility in product orders by the customer increases the average productivity of the machine-tool builder significantly, (3) an increase in the customer-order volatility leads to a significant decrease in the average experience level of the machine-tool maker's employees, (4) reducing the production lead time reduces the backlog for the machine-tool maker and benefits the entire supply chain, although the sensitivity tests reveal that reduction in lead time can have unexpected effects on the machine-tool maker's production level and capacity, and (5) smoother customer order policies are the most effective vehicle for reducing order volatility significantly compared to other changes in the machine-tool operating policies or parameters.

Severine Renault, Forecasting Residual Value Insurance Using Logistic Regression, May 24, 2004 (Martin Levy, Jeffrey Camm, Norman Bruvold)
The purpose of this paper is to develop and assess a logistic regression model to predict the probability of claim for a Residual Value Insurance (RVI) portfolio.  This type of insurance is a highly specialized asset-management tool through which an insurance provider assumes the market risk associated with end values of leased assets, automobiles in this case.  In the past, the vehicle type, either at the make or model level, has been used to segment data into different groups, for each of which a separate model was built.  The focus here is to include a categorical variable representing these groups in the model itself in order to fit a single regression for the entire portfolio.  Fitting the model involves looking for confounding and interaction between the categorical variables and other independent variables, testing the significance of each input variable in the model, and finally deciding whether the vehicle make or the vehicle model should represent the vehicle type as a risk factor.  The log-likelihood ratio test and the Wald chi-square statistic were used at this stage, the former to compare different regression models and the latter to test individual coefficient estimates.  Once a satisfactory set of variables has been defined, the next step is to assess the model.  For this we relied on commonly used statistics for logistic regression, namely the c statistic for the area under the ROC curve, the Hosmer and Lemeshow statistic for goodness-of-fit, and the Osius and Rojek normal approximation to the distribution of the Pearson chi-square statistic.  Since this second stage led us to conclude that the model was not a good fit, this paper ends with a brief comparison with results obtained from models where the data were partitioned by vehicle type and the corresponding categorical variable removed.

Karen L. Bickel, Evaluating Intensive Care Unit Mortality: A Comparison of Risk-Adjustment Methods, April 2, 2004 (Norman Bruvold, David Rogers, David Kelton)
Adjusting for differences in patient characteristics present on admission to the intensive care unit (ICU) is essential when comparing ICU outcomes.  Mortality risk-prediction models adjust variation in patient outcomes for severity of illness and predicted risk of death.  Much of the literature refers to the utilization of risk-prediction models to evaluate clinical performance and cost-effectiveness of ICUs.  Computerization of commonly used laboratory variables, in conjunction with the often extraordinary costs associated with manual data entry, presents an opportunity for the development of an automated, risk-adjusted ICU mortality model.  We compare the performance characteristics of two different risk-adjusted ICU mortality models: the National Veterans Administration (VA) Surgical Quality Improvement Program (NSQIP) surgical risk model, a partially manual data-collection process that identifies pre-surgical risk factors and uses those risk factors in the development of a 30-day mortality model for major surgical procedures, and the Veterans Administration Intensive Care Unit Risk Adjustment model (VIR).  Assessment of model fit was completed using the Hosmer-Lemeshow goodness-of-fit statistic, sensitivity and specificity measures, and the c-statistic as performance metrics in evaluating the behavior of each model.  Our results indicate that the VIR automated mortality risk-prediction model produced similar, if not improved, results in model performance vs. the widely used manual data-collection method of the NSQIP model.  These results demonstrate that the VIR computerized mortality risk-prediction method yields comparable results to the NSQIP mortality risk-prediction model for these data and warrants further study.

Piu Bose, Analysis of Covariance Model to Evaluate the Impact of a $40 Million Ad Campaign in a Test Market, Using Retailer-Level Data, March 19, 2004 (Norman Bruvold, Jeffrey Camm, Martin Levy)
Market researchers are concerned with the effects of different interventions or experimental conditions (treatments) on a set of consumers.  These experiments are used to reject or affirm a hypothesis and, in case of rejection, provide support for an alternative conclusion.  But in the real world, these treatments often get convoluted due to extraneous factors that constantly play in the marketplace.  As a result, the impact on consumers is a function of both the test treatment and the external factors.  Therefore it becomes impossible for the researcher to evaluate the true impact of the test treatment and thereby accept or reject the hypothesis.  This paper attempts to understand the methodology called ANCOVA, or Analysis of Covariance, that is used to evaluate a test treatment while eliminating the influences of extraneous non-test factors.  ANCOVA combines two statistical techniques, Regression Analysis and ANOVA.  Here the dependent variable scores and treatment conditions constitute the data, but the model includes not only experimental conditions, but also one or more quantitative predictor variables.  These quantitative predictors, known as covariates, represent sources of variation that are thought to influence the dependent variable but have not been controlled by the experimental procedures.  ANCOVA determines the covariation (correlation) between the covariate(s) and the dependent variable and then removes the variance associated with the covariate(s) from the dependent variable scores, prior to determining whether the differences between the experimental condition (dependent variable score) means are significant.

Yahong Cui, Application of Multivariate Adaptive Regression Splines (MARS) in Direct Marketing, March 15, 2004 (Martin Levy, Jeffrey Camm, Yan Yu)
Increasing costs of direct marketing campaigns coupled with declining response rates have prompted many direct marketers to turn to more sophisticated techniques to model response behavior. The underlying premise is that even a small improvement in prediction accuracy can have significant implications for the bottom line. This study investigates the use of a recently developed technique, Multivariate Adaptive Regression Splines (MARS), together with logistic regression in the context of modeling direct response. In this study, we report a performance analysis among MARS models, logistic regression models, and expectation models, i.e. MARS and logistic regression combined. The MARS procedure builds flexible regression models by fitting separate splines to distinct intervals of the predictor variables. Specifically, our goal is to assess the relative effectiveness of MARS models vis-à-vis logistic regression with original predictor variables in modeling direct response behavior. Our analyses show that the expectation models and the MARS models outperform the logistic model in general, leading us to conclude that MARS offers a number of advantages over a logistic model and MARS can improve the performance of logistic regression models. Direct marketing strategy implications in variable selection, model evaluation, and error variation stabilization are also discussed in this study.

Hui Hui, Comparing Logistic Regression, Classification Trees, and Hybrid Tree-Logit Models on Building Scoring Models for Catalog Mailing Campaign Data, March 10, 2004 (Martin Levy, Norman Bruvold, Jeffrey Camm)
In the last few decades, the direct mailing campaign has become an important field of direct marketing. An effective direct mailing campaign aims at selecting those target groups, offer and communication elements (at the right time) that maximize the profits. Of these four components the list of customers to be selected is considered to be the most important. Therefore, a large amount of direct marketing research focuses on list segmentation or target selection techniques. The scoring model is an effective methodology to realize the purpose of target selection. It assigns every observation in a database a score indicating how likely someone is to respond to a mailing campaign. Thus, according to these scores, the direct marketer can pick a specific number of people to receive a particular offer so that the response to the mailing is maximized. The objective of this project is to compare the performance of three predictive methodologies, Logistic Regression, Classification Tree, and Hybrid Tree-Logit model on building the scoring models to distinguish between the likely responders and nonresponders. By applying these three methodologies to a catalog mailing campaign data set which has 106,284 records and 47 fields, I came to the conclusion that the hybrid model is the best one to create the scoring model in this project, since it can better fit the data, while maintaining similar good performance properties as logistic regression. From the analysis results, I also found that while a classification tree is not as good for building the scoring model, it is the best choice for the classification task here.

Rajesh Radhakrishnan, Interactive Route Builder for Logistics Planning, February 27, 2004 (Jeffrey Camm, Michael Fry, David Curry, Robert Martichenko)
Logistics can be described as the planning, organizing and managing of activities that provide goods or services. Route designing is at the core of the 'planning' phase of operations for a 3PL (Third Party Logistics) company. The first step involves the plotting of all the locations on a map to identify clusters of suppliers (based on their location and freight information). Then routes can be designed that send freight from the suppliers either directly to the plant or to a consolidation center called a 'crossdock.' Route design has traditionally been a slow manual process done on an Excel worksheet with the use of mapping software to print a map of the supplier locations. The designer also has to rely on his or her experience in coming up with a good design on the very first attempt, since historically the process has usually taken too long to allow for multiple designs and comparative studies among them to choose the best one. In this project, I have designed a software tool named the 'Interactive Route Builder' (IRB) to facilitate route-grouping and route design in general. It has made the route design process quicker and more efficient (locations are added to a route by simply clicking on them on an embedded map). The IRB allows the route designer to quickly generate a number of routing scenarios and compare them based on different parameters (such as total cost, cost per cube, etc.). This report also includes mathematical and simulation analysis of the parameters to use when geo-locating a crossdock. The solution that minimizes the sum of the products of cube and distance from each supplier to the proposed crossdock is recommended over the cube-center-of-gravity solution.
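
The contrast between the two crossdock-location rules mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not taken from the report: it assumes straight-line (Euclidean) distance, uses made-up supplier coordinates and cube values, and uses the Weiszfeld iteration as one common way to minimize a weighted sum of distances.

```python
import math

# Hypothetical suppliers: (x, y, cube). Coordinates and cube values are made up.
suppliers = [(0.0, 0.0, 10.0), (10.0, 0.0, 10.0), (0.0, 10.0, 10.0), (100.0, 100.0, 5.0)]

def weighted_cost(point, suppliers):
    """Sum over suppliers of cube * Euclidean distance to the candidate crossdock."""
    px, py = point
    return sum(c * math.hypot(x - px, y - py) for x, y, c in suppliers)

def center_of_gravity(suppliers):
    """Cube-weighted average of the supplier coordinates."""
    total = sum(c for _, _, c in suppliers)
    return (sum(x * c for x, _, c in suppliers) / total,
            sum(y * c for _, y, c in suppliers) / total)

def weiszfeld(suppliers, iterations=200, eps=1e-9):
    """Weiszfeld iteration: repeatedly re-average with weights cube / distance."""
    px, py = center_of_gravity(suppliers)  # start from the center of gravity
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for x, y, c in suppliers:
            d = max(math.hypot(x - px, y - py), eps)  # guard against division by zero
            w = c / d
            num_x += w * x
            num_y += w * y
            denom += w
        px, py = num_x / denom, num_y / denom
    return px, py

cog = center_of_gravity(suppliers)
weber = weiszfeld(suppliers)
print("center of gravity:", cog, "cost:", weighted_cost(cog, suppliers))
print("min cube-distance:", weber, "cost:", weighted_cost(weber, suppliers))
```

In this made-up data, one distant supplier pulls the center of gravity away from the main cluster, while the distance-minimizing point stays close to it and gives a lower cube-distance total.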

Samir Kulkarni, An Exploration of the Resource Constrained Scheduling Capabilities of Microsoft Project, February 13, 2004 (Amitabh Raturi, Jeffrey Camm, Michael Fry)
The resource constrained scheduling problem (RCSP) is a significant challenge because of the mathematical complexities that exist within the problem's formulation. Over time, software packages have been developed to aid practitioners with solving the RCSP, and programs became increasingly user-friendly and versatile in how much data they could incorporate. As the software became more complex, it is claimed that the mechanisms used to determine the best resource constrained schedule began to deviate from what had been proven academically. In this paper, we explore the gap between academic research and the capabilities of scheduling software, specifically in the software's ability to produce a schedule optimal to certain objectives. We study the RCSP literature and analyze the leveling capabilities of Microsoft Project to gain insight into the aforementioned gap. The exact goals of this paper are to: 1) Discuss the major developments in RCSP that have brought the field to where it is today, 2) Discuss the leveling capabilities of Microsoft Project, a leading scheduling software, and the methods that it uses to obtain a feasible resource constrained schedule, 3) Provide insight into the effectiveness of Microsoft Project's leveling algorithm by comparing the results of several problems implemented in both MSP and as mixed integer programs in AMPL/CPLEX.

Huiqing Zhou, Response Models In Direct Marketing, January 21, 2004 (Martin Levy, Jeffrey Camm, Norman Bruvold)
Direct marketing (DM) is a key area where scientific methods are often applied to analyze a massive amount of business data. The core of the decision process in DM is a response model which is applied to assess the purchase propensity of each customer in the list prior to the mailing. A variety of approaches have been developed in the direct marketing industry to model response, e.g., RFM (Recency, Frequency, Monetary) variables, tree-structured automatic segmentation methods such as AID (Automatic Interaction Detection), CHAID (Chi-squared Automatic Interaction Detection) and CART (Classification and Regression Tree), and linear statistical models such as logistic regression. In this paper, two popular models (the logistic regression model and the RFM model) are introduced, built and evaluated. It is shown that logistic regression slightly outperforms the RFM model, while each model has its own specific advantages.

Hong Gu, Using Data Mining Technology to Build a Predictive Model and to Gain Understanding of Customer Characteristics for a Multi-division Catalog Company, December 1, 2003 (Martin Levy, Jeffrey Camm, Yan Yu)
Data mining techniques enable companies to evaluate historical transaction data from consumer databases and to develop a good consumer model, grouping customers based on visit frequency, profitability, etc. In this project, the data are the catalog purchases from a multi-division company that mails different catalogs to a unified customer base. The dataset contains 96,551 customer records and each record has 163 fields, including life-to-date orders, dollars, items, payment method, and very minimal demographics. All the customers receive the Division D catalog. The project is aimed at identifying the characteristics of would-be responders and the construction of a model that can predict which customers are most likely to respond to their Division D catalog solicitation. The outcome variable, “buying from division D”, is binary, while the predictor variables are either continuous or categorical. Logistic regression, CHAID, and CART approaches are employed. Since there are 163 variables involved, reducing the variables to a manageable number prior to model building is an essential and substantial step. Due to the dominant number of non-responders in our dataset and the limitations of the software Answer Tree 1.0, logistic regression contributes substantially to variable screening. The final logistic regression model can correctly predict 60.8% of the total would-be responders; the CHAID can correctly predict 54.09% of the would-be responders, and the CART algorithm in Answer Tree 1.0 can correctly predict 66.58% of the would-be responders. In terms of prediction, the CART outperforms the other two. Furthermore, the tree maps provide an intuitive understanding of why certain segments respond better than others. However, the 15-node CART tree can only provide 15 different estimated probabilities, whereas the logistic regression model yields a distinct predicted probability for every record.

Snehlata Bomma, Conjoint Bridging and Optimization Project, November 24, 2003 (David Curry, Jeffrey Camm, Uday Rao)
Political polling, whether of public opinion about issues, such as gun control, or direct preference polling for political candidates, has traditionally relied on very distinct survey methodology. Respondents are asked to select their preferred candidate in a mock election or to answer “yes or no” regarding a specific issue. However, most of the supercharged issues of today are multidimensional. Their complexity is “dumbed down” by standard methods, a disservice to political constituents most affected by polling results. This thesis suggests an alternative technique for assessing public opinion that deals well with complexity. The basic method, called conjoint analysis, has been employed in marketing and psychological research for several decades. However, recent developments in conjoint bridging designs and conjoint optimization enhance the applicability of the overall “technique package” to political polling, yielding many insights unavailable with today's standard approaches. In this thesis, we analyze results from an online survey that involves conjoint analysis. We test a theory of “conjoint bridging” that pools parameter estimates between two conjoint exercises. Respondents are asked to react to various hypothetical candidates for US president based on the candidate's positions on several dimensions of Homeland Security policy. Output from the conjoint analysis is then used in a conjoint optimization phase to find an “optimal position on Homeland Security”. Optimal means that even though individual voters weight attributes differently and prefer different levels, there is a single combination of levels that will please the most voters.

Xuming Yang, Framingham Heart Study Data Analysis -- A Case Study for GLM, GPLSIM and GAM, November 12, 2003 (Yan Yu, Jeffrey Camm, Norman Bruvold)
One of the most important techniques in statistics is regression analysis. Applications lie in a variety of fields, such as finance, marketing, and many medical fields. Linear regression can provide useful and interpretable descriptions in the linear relationship between response and predictor variables. The generalized linear models are powerful in fitting the linear relationship between variables when the response is from a general exponential family, for instance, binomial or Poisson. Unfortunately, in many applications, there is not always evidence of a generalized linear relationship. Other data-driven nonparametric modeling techniques, e.g., the generalized single-index models, generalized additive models, emerge as promising alternatives that offer the flexibility of fitting the curvature and yet retain the ease-of-interpretability. This project focuses on the application of several different modeling techniques -- generalized linear models (GLM), generalized partially linear single-index models (GPLSIM) and generalized additive models (GAM) -- on Framingham Heart Study data. The response variable of interest is a binary variable indicating the occurrence of coronary heart disease. The predictors are the patients' age, cholesterol level, systolic blood pressure, and their smoking status. The objective of this project is to apply and compare different models using Framingham Heart Study data to reveal the relationship between variables and to capture the curvature if any non-linearity exists. From this case study we conclude that the logistic or probit regression performs well to fit the linear relationship. When the nonlinear relationship exists, generalized additive models and generalized partially linear single index models are better in terms of capturing the non-linearity. GAM are very helpful for a visual inspection of non-linearity. GPLSIM fit the Framingham data best and retain ease-of-interpretability.

Neil D. Eisner, A Daily Replenishment Production Scheduling and Inventory Minimization Simulation, October 15, 2003 (Michael Fry, David Kelton, David Rogers)
General Cable Corporation is a $1.6 billion manufacturer of industrial and specialty cable products, spread over seven major product groups. Within a major product group, products are initially subdivided into families, termed by management “product lines.” The Portable Cord major product group is manufactured exclusively at the company's facility in Lincoln, Rhode Island. While the firm is relatively early in its implementation of more modern manufacturing practices, several cells are currently in operation at the Lincoln plant, with each cell dedicated to the manufacture of a particular group of product lines. This study addresses demand planning and production scheduling for a single cell involved in the manufacture of product lines 40, 42, 43, 46, P5, and Q5. A well-known advantage of cellular manufacturing configurations is the enhanced capability for quick and more effective response to highly variable demand. The daily aggregate bookings for these product lines, aggregated across the company's five distribution centers, demonstrate extreme variability (i.e., a coefficient of variation of 79.16). Current demand planning is simplistic and leads to excessively high inventory carrying costs. Using a quarterly planning horizon, the mean plus 2.06 standard deviations (corresponding to a 98% service level) of the previous quarter's demand is calculated, and production is scheduled for the upcoming quarter at a fixed daily rate sufficient to equal last quarter's demand. A simulation model is developed using the most recent two years of historical booking data. We provide an estimate of the inventory levels the firm would need to carry if a daily replenishment production scheduling system were to be implemented, maintaining the same 98% service level to customers. The product lines under investigation exhibit strong commonality in both their manufacturing processes and in their bills of material.
Manufacturing cycle times are on the order of magnitude of hours; therefore, the cell dedicated to a particular product line is capable of a one-day turnaround time in response to bookings. Additionally, the individual distribution centers demonstrate their own individual and unique characteristics. The mean demands at the distribution centers (RDC's) differ widely (all with similar high variability), implying that the relative contribution of each is very different with respect to meeting the overall service level goal. Neither shipping lead times, nor shipping frequency, are the same for any two RDC's. Another complicating factor is the occasional need to make a large shipment from the plant directly to a customer. This study first identifies the forecasting method which will drive the daily production schedule. The proposed process through which products are distributed, manufactured, and replenished is mapped in detail. A simulation model of this system is built using Arena® discrete event simulation software. Variants of the model are explored, such as the sequence of RDC fulfillment, the daily production control limits, and the pallet (lot) size. Lastly, the potential effect of forecast accuracy on inventory levels is evaluated and described.
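
The "mean plus 2.06 standard deviations" rule for a 98% service level described in this abstract can be checked with a short sketch. It assumes normally distributed demand (an assumption made here, not stated in the abstract) and uses made-up daily bookings. The standard normal quantile at 0.98 is about 2.05, close to the 2.06 multiplier quoted.

```python
from statistics import NormalDist, mean, stdev

def service_level_target(demands, service_level=0.98):
    """Mean demand plus z standard deviations, where z is the standard normal
    quantile for the requested service level (normality is an assumption)."""
    z = NormalDist().inv_cdf(service_level)
    return mean(demands) + z * stdev(demands)

# Standard normal multiplier for a 98% service level.
z_98 = NormalDist().inv_cdf(0.98)

# Hypothetical daily bookings, for illustration only.
daily_bookings = [120, 40, 310, 95, 15, 260, 180, 60]
target = service_level_target(daily_bookings)

print(f"z for 98%: {z_98:.4f}")
print(f"mean: {mean(daily_bookings):.1f}, target level: {target:.1f}")
```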

Marione P. Gonzales, A Model for Profiling Radio-Stations Listeners, Using Logistic Regression, CART, and CHAID for a Given Data Set, August 28, 2003 (Martin Levy, Norman Bruvold, Jeffrey Camm)
Data mining has drawn much attention in the business, marketing, and medical fields. Data-mining techniques can be used to find relationships and patterns in historical data for the purpose of predicting or classifying future observations. Practical applications include detecting credit worthiness of loan applicants and predicting a patient's risk of developing an illness. This paper attempts to develop a model for profiling radio-station listeners using statistical methods like logistic regression, CART (Classification And Regression Tree), and CHAID (Chi-Square Automatic Interaction Detection) for a given data set. More specifically, data provided by “RadioX” (the radio station of interest) were segmented into listeners of RadioX and listeners of other radio stations. The ability to analyze categorical data is the primary reason for choosing logistic regression, CART, and CHAID. Results from analysis using these three methods are compared and combined to profile RadioX listeners. The predictive performance of each statistical method can be measured by the misclassification rate, which measures the rate of incorrectly classifying the data. In terms of accurately distinguishing RadioX listeners from non-RadioX listeners, logistic regression, CART, and CHAID give almost the same total misclassification rate; but logistic regression gives a better misclassification rate for segmenting specific RadioX listeners. However, the results of interactions in a logistic regression analysis are difficult to interpret. So, we use logistic regression mainly to isolate important variables, and then use CART and CHAID to determine categorical values.
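
As a small illustration of the performance measure named in this abstract, the sketch below computes a total misclassification rate and a class-specific error rate. The labels are made up (1 = RadioX listener, 0 = listener of another station), and the function names are hypothetical, not from the project.

```python
def misclassification_rate(actual, predicted):
    """Share of all records whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = sum(a != p for a, p in zip(actual, predicted))
    return errors / len(actual)

def class_error_rate(actual, predicted, label):
    """Share of records truly in class `label` that were assigned to another class."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a == label]
    return sum(a != p for a, p in pairs) / len(pairs)

# Made-up labels: 1 = RadioX listener, 0 = listener of another station.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 0]

overall = misclassification_rate(actual, predicted)  # 2 errors out of 8 records
listeners = class_error_rate(actual, predicted, 1)   # 1 error among 3 true listeners
print(overall, listeners)
```

Two models can share nearly the same total rate while differing on the class of interest, which is the distinction the abstract draws for logistic regression.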

Yanrong Cao, Penalized Spline Estimation For Functional Coefficient Regression Models for Nonlinear Time Series, July 25, 2003 (Yan Yu, Martin Levy, David Kelton)
A penalized spline approach is proposed to estimate functional coefficient regression models for nonlinear time series. The functional coefficient regression models assume the regression coefficients vary with certain lower dimensional covariates, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called “curse of dimensionality” in multivariate nonparametric time series estimation. One of the appeals of the proposed model lies in the efficiency in estimation of the coefficient functions via the global smoothing method. In addition, different smoothness is allowed for different functional coefficients, which is enabled by assigning different penalty accordingly. The penalty terms, selected by minimizing generalized cross validation scores (GCV), balance the goodness-of-fit and smoothness. The number and location of knots are no longer crucial if the minimum number of knots is reached. The consistency and asymptotic normality of the penalized least squares estimators are obtained. Our penalized spline approach also enables multi-step-ahead forecasting with an explicit model expression in contrast to the local smoothing method. The proposed approach is demonstrated by both simulation examples and a real data application.

Sara Dziech, Exploratory Analysis of Horse Racing Data, May 27, 2003 (Martin Levy, Norman Bruvold, Yan Yu)
Gambling or wagering is big business and is becoming even bigger in the greater Cincinnati area. State lotteries and the gambling boats have brought legal betting back into the spotlight. This surge of renewed interest in gambling has brought more attention to one of the oldest forms of wagering, horse racing. With the increase in the use of home computers and the internet, now more than ever, an overwhelming amount of data is available on individual horse performance, track entries, and results. Using these data, an exploratory statistical study was conducted to look for trends in the data and to create predictive models to help select the “winners.” The study also examined whether the results were truly random, or if there were commonalities that would allow the astute handicapper to have an advantage over the common bettor. The ability to predict which horses would finish in the money (first, second, or third) would be key to actually making money at the track, since the bigger payoffs come from the exotic bets such as the Daily Double or Exacta. The models and analysis presented here may prove useful in successfully selecting the horses that will finish in the top three or in the money. Basic statistics were reviewed and key elements presented. Weighted general linear models were created using the percent of finishes in the money as the dependent variable. The logistic models were developed using a binary dependent variable -- finished in the money or did not finish in the money. CHAID analysis using Answer Tree was also performed. Each type of analysis was conducted from two views -- all tracks combined with the emphasis on overall trends, between-track differences, and track-specific models. The resulting models and their appropriateness were compared.
The final part of the project involved testing the predictive ability of the logistic model against the selection performance of a few average people to see if the model was more successful than random guesses.

Thushan Wijesinghe, A Comparison of Two Heuristic Solutions for Scheduling Time-Shared Jets, May 27, 2003 (Jeffrey Camm, Michael Magazine, David Rogers)
Fractional ownership of jets has been increasing in popularity at an exponential rate over the last 3-4 years. Under the concept of "fractional ownership," customers become partial owners of an aircraft, which in return allows them to fly a predetermined number of hours per year. If the requesting customer has enough flying hours left, it then becomes the task of the scheduler of the airline company to assign a jet to fulfill the demand. There has been a rather limited amount of academic research done in the area of scheduling time-shared jets. So far, an Integer-Programming solution and a minimum cost-flow heuristic solution have been put forward. In this paper two heuristic approaches are presented for solving this problem. The two heuristic methods will be compared to an IP approach for a base case scenario. The first heuristic minimizes the relocation times for jets (which is the objective of the problem) by using a “one-step look ahead” rule (a “greedy heuristic”). The second heuristic allocates trips to jets based on the number of remaining trips that each jet should serve. The prime advantages of such a heuristic solution are the ease of formulation, minimum user intervention, fewer variables (compared to the IP solution), and the non-necessity of sophisticated software such as CPLEX.
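
A greedy rule of the general kind described in this abstract can be sketched as follows. This is an illustrative simplification, not the project's heuristic: trips are handled in departure order, each is given to the jet that is currently cheapest to relocate, and jet availability times are ignored. The airports, positions, and jet names are hypothetical.

```python
# Hypothetical airport positions on a line; relocation time is the gap in hours.
positions = {"A": 0, "B": 3, "C": 5, "D": 8}

def reloc_time(src, dst):
    """Illustrative relocation time between two airports."""
    return abs(positions[src] - positions[dst])

def greedy_assign(jet_locations, trips):
    """Assign each trip, in order, to the jet with the smallest relocation
    time to the trip's origin; that jet then ends up at the trip's destination."""
    locations = dict(jet_locations)  # copy so the caller's data is untouched
    schedule = []
    total_relocation = 0
    for origin, destination in trips:
        jet = min(locations, key=lambda j: reloc_time(locations[j], origin))
        total_relocation += reloc_time(locations[jet], origin)
        schedule.append((jet, origin, destination))
        locations[jet] = destination
    return schedule, total_relocation

jets = {"J1": "A", "J2": "D"}
trips = [("B", "C"), ("D", "A"), ("C", "B")]  # listed in departure order
schedule, total_relocation = greedy_assign(jets, trips)
print(schedule, total_relocation)
```

Because each choice looks only at the current trip, a rule like this is easy to state and needs no solver, but it carries no guarantee of matching an integer-programming optimum.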

Qiang Zhu, Building Credit-Scoring Models Using Logistic Regression, CART, and CHAID, May 27, 2003 (Martin Levy, Jeffrey Camm, Norman Bruvold)
Classification methods have been widely used in the identification of respondent profiles. One of the most important applications is in credit scoring, which is a method used by lenders to help decide whether or not an applicant is a good candidate for a loan. In this work, three classification techniques are applied and compared to analyze a complex data set of credit risks. The dependent variable is a binary variable indicating whether an applicant defaulted on a loan or not. The three classification techniques are logistic regression, Classification and Regression Tree (CART), and Chi-squared Automatic Interaction Detection (CHAID). It turns out that, in a given test sample, the logistic regression technique achieves the best predictive performance. An additional two-step CHAID and logistic regression analysis is applied to produce a combined prediction in order to determine if the combination of two techniques will achieve better performance than one single technique. This prediction turns out to be slightly worse than the logistic regression model, but yields better performance than the CHAID and CART models. Therefore, we propose CHAID as a method of enhancing interpretation of a logistic regression model through the examination of the significant predictors and interaction terms.

Wensui Liu, A Case Study Comparing CART and Neural Networks, March 21, 2003 (Martin Levy, Jeffrey Camm, Norman Bruvold)
Traditionally, statistical methods, such as logistic regression and discriminant analysis, have been widely used to do the classification analysis. However, when assumptions for statistical analysis are not met, alternatives need to be considered. In my project, two popular nonstatistical methods for classification, namely Classification and Regression Tree (CART) and Neural Networks (NNs), have been discussed. For Neural Networks, two widely-used paradigms in classification, which are Feed-forward Network trained by Back Propagation algorithm (BPN) and Generalized Regression Neural Network (GRNN), are covered. In order to evaluate the performance of these methods, I apply them to benchmark data for classification, build the models to make prediction, and then make comparison. After comparing six models from these two methods, we find that BPN outperforms the other methods for prediction performance. However, it needs more computational effort and longer training times. Moreover, its results are difficult to interpret. CART is easier to use and can be interpreted intuitively. And its result is almost as good as the ones from BPN. Therefore, we conclude that none of these models is apparently superior over the others and a possible compromise is to combine these two methods and make improvement in classification analysis.

WenWen Wu, A Study of Tree Models, November 26, 2002 (Yan Yu, Martin Levy, Sung-Eun Kim)
Tree models, which recursively partition data into more homogeneous subsets, are widely used for data mining. The seven most popular methods are discussed in this work: Classification and Regression Tree (CART), Bayesian Classification and Regression Tree (Bayesian CART), FACT, Quick, Unbiased, Efficient Statistical Tree (QUEST), Treed models, Bayesian Treed models, and Multiple Additive Regression Trees (MART). CART, one of the exhaustive search methods, is considered as a base model. The best split in CART is on the variable that can minimize the impurity of the nodes. Mean values in terminal nodes are calculated as predicted values. Bayesian CART adds stochastic methods of parameter estimation and model selection to CART. FACT and QUEST only split nodes by selected variables with unbiased variable selection and fast computational speed. Treed models and Bayesian Treed models put a subset of the original data set in the terminal nodes and fit different statistical models for them. MART is an additive model of many small regression tree models, such as CART. Its strengths are its robustness and predictive power. We apply the above-discussed methods to two simulated data sets (categorical and continuous response, respectively) and one real data set, the Boston Housing data. However, under the restriction of software availability, only CART, Bayesian Treed model and QUEST are applied in the project. Traditional statistical methods, Linear Regression and Logistic Regression, are also used for comparison purposes. When the response variable is categorical, the misclassification rate is used as the criterion for model comparison. When the response variable is numerical, root mean squared error (RMSE) is used. Residual plots are also used to compare model fitting performance. From the applications in this project, CART and QUEST are shown to be the best for the categorical response variable case.
However, CART and Bayesian Treed model are optimal for the numerical response variable case. For the more complicated Boston Housing data, the Bayesian Treed model has the best performance. The results indicate that the strength of Bayesian Treed and QUEST is more obvious for data with complicated structure. In the general case, CART is strongly recommended. CART is not only easy to use (by built-in program in software, such as S-Plus) but also yields good results.

Yuan Yao, Analysis of Volatility Time Series Models and an Evaluation of their Forecasting Performance, November 8, 2002 (Martin Levy, Norman Bruvold, Yan Yu)
The coefficient of variation is a statistical measure of volatility. For example, it measures the standard deviation of the closing price from its simple moving average. In finance, volatility is an important input to some predictive or pricing models like Black-Scholes, CAPM, and so on. Moreover, an accurate estimation of return volatilities may shed some light on the generating process of the returns. We develop a methodology capable of good estimation and forecasting of volatility. This paper examines three types of time-series models for their performance in volatility forecasting of economic data and tries to compare and evaluate their forecasting performance. For the US ten-year bond return, a set of Simple Moving Average models (SMA(M)) with different values of M are provided to estimate and capture its volatility. We introduce GARCH, a well-known time series model for economic volatility analysis, to produce another measure of volatility. Based on the strong similarity between the volatilities created by the two models, we find that SMA(M) is a good replacement for GARCH in some special situations since GARCH is more sophisticated than SMA(M). For the S&P 500 monthly excess return and its volatility with non-linear features, we build both linear time series models and nonlinear GARCH for estimation and forecasting. Although volatility has some non-linear features, some good linear models can still be developed to describe it because the statistical theory and computational tools for linear models are well developed. Finally we determine which model performs better in volatility forecasting based on RMSE (Root Mean Square Error).
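
The two quantities this abstract leans on, a moving-average-based volatility and the RMSE used to rank forecasts, can be sketched as below. The exact SMA(M) definition used in the project is not given, so the rolling standard deviation around an M-period simple moving average (with divisor M) is an assumption, and the return series is made up.

```python
import math

def sma_volatility(returns, m):
    """Rolling volatility: standard deviation of the last m returns around
    their simple moving average (divisor m is an assumption of this sketch)."""
    out = []
    for i in range(m - 1, len(returns)):
        window = returns[i - m + 1 : i + 1]
        avg = sum(window) / m
        out.append(math.sqrt(sum((r - avg) ** 2 for r in window) / m))
    return out

def rmse(forecast, actual):
    """Root mean square error between two equally long sequences."""
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual))

# Made-up return series, for illustration only.
returns = [0.010, -0.020, 0.015, 0.000, -0.010]
vols = sma_volatility(returns, 3)
print(vols)
print(rmse([1.0, 2.0], [1.0, 4.0]))
```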

Vivek Kalpande, On the Mean Length of Two-Component Systems Under Some Bivariate Survival Functions, October 22, 2002 (Martin Levy, Jeffrey Camm, David Rogers)
We examine the expected survival time of series and parallel systems whose components have bivariate distributed lifetimes. The mean lifetime of such systems is a function of the dependence structure of the component lifetimes. Results are extended to multi-component systems.
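
For reference, the standard identities behind mean lifetimes of two-component systems can be written out. Here $X$ and $Y$ denote the nonnegative component lifetimes and $\bar F$ their joint survival function; the notation is introduced for this note and is not taken from the project. The independent exponential case at the end is only an illustration, not one of the bivariate survival functions studied.

```latex
\bar F(x, y) = P(X > x,\ Y > y)

% Series system: fails at the first component failure.
E[T_{\text{series}}] = E[\min(X, Y)] = \int_0^\infty \bar F(t, t)\, dt

% Parallel system: fails at the last component failure, using max = X + Y - min.
E[T_{\text{parallel}}] = E[\max(X, Y)] = E[X] + E[Y] - E[\min(X, Y)]

% Illustration: independent exponential lifetimes with rates \lambda_1, \lambda_2.
\bar F(t, t) = e^{-(\lambda_1 + \lambda_2) t}
\quad\Rightarrow\quad
E[T_{\text{series}}] = \frac{1}{\lambda_1 + \lambda_2},
\qquad
E[T_{\text{parallel}}] = \frac{1}{\lambda_1} + \frac{1}{\lambda_2} - \frac{1}{\lambda_1 + \lambda_2}
```

The dependence structure enters only through $\bar F(t, t)$, which is why the mean system lifetime changes with the bivariate model even when the marginal lifetimes are held fixed.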

Derek H. Wang, Return Analysis, Volatility Estimation and Trading on the Shanghai Stock Market, July 26, 2002 (Yan Yu, Martin Levy, Michael Ferguson)
In this project, the intraday return behavior of the Shanghai stock market with five-minute index data is first examined. Some interesting intraday seasonal patterns are found. The standard variance ratio test is used to test the random-walk hypothesis in order to understand the Shanghai stock market's microstructure efficiency. Three volatility models, including a continuous-time model, a GARCH model, and a time-dependent coefficient diffusion model with kernel regression estimation, are applied to estimate and compare the expected returns and volatilities of five-minute data. In addition, this work presents a penalized spline approach to estimate time-dependent drift and volatility in term structure dynamics. The drift and diffusion (volatility) components are estimated iteratively with weighted least squares. Two other methods, moment matching and maximum likelihood estimation, are described. The new time-dependent diffusion model can be considered to be the extension of most term-structure models and a special case of the general time-dependent diffusion model. Compared with other estimation methods, the penalized spline estimation method is easy to implement and less time-consuming. Moreover, there is no problem of discontinuous coefficient estimates when estimating the time-dependent coefficients with logsplines. With different volatility estimates, a de-volatilization technique is used to resample the data into different de-volatilized series for trading.

Natasha Lukiantseva, Using Inverse Optimization for Calculating Link Penalties for Traffic Flow, July 2, 2002 (Jeffrey Camm, David Rogers, George Polak [Wright State University])
In this paper the application of the Inverse Optimization Technique to optimally adjust link penalty factors to closely match the historical multicommodity traffic flow is presented. When using various tools to simulate railway traffic flow, network link cost factors (impedance) are introduced to reflect preferred routes often times different from shortest paths. The Inverse Optimization Technique is illustrated with an example. The application of the technique to a real problem at CSX Transportation is discussed.

Usha Viriyala, Bayesian and Classical ARIMA Analysis of Time Series and Forecasting – A Comparison Study, June 28, 2002 (Martin Levy, Uday Rao, David Rogers)
The field of time series analysis and forecasting is gaining increasing recognition in today's business world. It is becoming increasingly important for firms to learn how they performed in the past in order to plan ahead and, more importantly, to predict the future with maximum accuracy. Many techniques and software tools have evolved over the past few years in time series analysis and forecasting. The objective of this research project was to consider the Bayesian method of time series analysis and forecasting, and draw a comparison with the conventional Box & Jenkins' ARIMA modeling. An authentic data set from a department store in Florida was used for the research. The analysis was performed using both the above-mentioned modeling techniques, and results were compared. Recommendations are provided for further possible explorations of the problem at hand. Part of the project involved exploring a new software package, BATS (Bayesian Applied Time Series), which was used for the Bayesian modeling and, it is felt, will be valuable for future use in classroom instruction. The ARIMA modeling was done using SAS system software.

Junying Wu, A Heuristic Approach to a Product Design Problem Under Conjoint and Hybrid-Conjoint Analysis, June 3, 2002 (Jeffrey Camm, Uday Rao, George Polak [Wright State University])
Product design, where the objective is to design a single product that will maximize the market share of the producer by selecting the appropriate levels of the various attributes, is one of the key factors in determining the success of a new product. Firms are giving the problem increasing attention when they decide to introduce new products. As a result, techniques that lead to an optimal product design are of great interest to every firm seeking to survive and succeed in a competitive business environment. However, because the product design problem using conjoint analysis data is NP-hard, searching for optimal solutions within a reasonable amount of time is impractical when data sets are extremely large. Consequently, heuristic techniques that try to identify sub-optimal product designs have been proposed. In this paper, the authors propose a new heuristic algorithm that can generate “good” (i.e., close to optimal) solutions for the product design problem. The paper focuses on (1) how the algorithm can be applied to the product design problem, (2) the overall performance of this algorithm in generating solutions to the product design problem and its results compared with those of the GA heuristic (Balakrishnan and Jacob [1996]), and (3) limitations of and further improvements to the algorithm.

Raymond Mapuranga, The Factors Affecting Contrived Collegiality Among Teachers: An Exploratory Study Through a Path Analysis, May 29, 2002 (Wei Pan, Martin Levy, Jeffrey Camm)
This paper involves an empirical study of factors influencing contrived collegiality among teachers in schools. The research entails the development of a path analytic model that incorporates the core constructs of Hargreaves' (1961) model of combined features that contribute to contrived collegiality. The model links 4 exogenous variables (administrative regulation, compulsory in nature, implementation and time orientation) and 2 endogenous variables (predictable outcomes and collegiality). In order to construct this model, fifty teachers, education students and professionals were surveyed at random. The CALIS procedure in SAS and the AMOS package are the two statistical software programs used in this project. The study found that administrative regulation does not encourage collegiality and that when teachers are required to work together, the outcomes, in terms of collegiality, are not predictable.

Vikas Sharma, Revenue Maximization by Capacity Rationing in an Uncertain Environment, March 5, 2002 (Amitabh Raturi, Jeffrey Camm, David Rogers)
The paper discusses and implements a capacity rationing policy that allows manufacturing firms encountering expected total demand less than available capacity to discriminate between two classes of products, one yielding a higher profit contribution than the other. The result is a selective rationing of orders, yielding an increase in total profit when compared to the base case that implements no capacity rationing. Implementation of the policy requires forecasts of demand parameters. The results indicate that, on average, the rationing policy is quite robust in improving profits.

Aaron M. Freed, Empirical Test of a Stock Portfolio Optimization Model, February 11, 2002 (Jeffrey Camm, Martin Levy, Brian Hatch)
This empirical study examined a stock-portfolio optimization model, which did not use mean-variance statistics like the classic Markowitz Model. Instead, this optimization model optimized the weights of a stock portfolio by maximizing the number of periods a portfolio meets or beats a stock market index. Using randomly sampled data sets generated from the population of Standard & Poor's 500 (S&P 500) stock components, this study empirically tested the model by benchmarking the optimized weighted portfolios against evenly weighted portfolios. Further, the study explored the difference between using 12 monthly periods and 60 monthly periods of stock return data in the optimization of weights procedure. This study employed the use of standard statistical hypothesis tests for paired data to determine significance at an alpha level of five percent. With respect to the difference of success between optimized and evenly weighted portfolios with 'future' data, the tests indicated no significance for 12 monthly periods and significance for 60 monthly periods at a p-value of .024.

Chris Christopherson and Nicole Howerton, Increasing Profits while Decreasing Scrap, August 21, 2001 (Michael Magazine, Jeffrey Camm, Robert Gould)
Increasing Profits while Decreasing Scrap is a project designed to assist Technicote, Inc. in minimizing trim loss. Technicote purchases large rolls of raw adhesive-backed labels, hereafter referred to as master rolls, and cuts them into smaller labels according to customer specifications. Customers, in turn, graphically enhance and affix these labels either to finished products or to packaging materials. The inherent challenge in this industry is minimizing the trim loss associated with the series of cuts that make up a customer order. In a true cutting-stock problem, roll length is a variable as well as roll width. Since master rolls may be spliced together, roll length is not a factor in this problem. As a result, this problem is a slight variation of the cutting-stock problem. The solution to this minimization problem comes in the form of an Excel spreadsheet. Inputs include the master roll widths and the customer specifications, or ordered widths. Utilizing the Solver Add-In, the spreadsheet returns the widths that are to be cut from each master roll, giving the least amount of trim loss.

Feng Jiao, Reexamination of Schumpeter's Hypothesis: Market Concentration and R&D Expenditure, June 5, 2001 (Norman Miller, Martin Levy, Yan Yu)
Governments generally favor a competitive environment over a monopolistic structure, with good reason. However, technological advancement requires a more concentrated market structure. Schumpeter's hypothesis holds that a monopolistic structure is more conducive to technology development. Previous studies do not conclusively show a significant relationship between market concentration and technological advancement. In this study, I find supporting evidence for Schumpeter's hypothesis. Moreover, I show that a more concentrated market structure is more conducive to technology development, as measured by firms' R&D expenditure. Looking further, I show that product technology is more significantly related to market concentration. This conclusion contrasts with the known study. Looking at industry characteristics, I find evidence that entry barriers are an important determinant of why some industries have a more concentrated market structure. The causality tests show that market concentration is significantly related to firms' R&D expenditure.

Xiaoling Sun, A Study of the Use of Multiple Additive Regression Trees for Caravan Insurance Policy Prediction, April 27, 2001 (Yan Yu, Norman Bruvold, David Kelton)
Multiple Additive Regression Trees (MART) is a novel methodology for predictive data mining and can be applied in many areas such as in credit card companies, insurance companies, as well as mortgage companies. MART makes additive expansions in decision-trees and realizes the numerical optimization in function space instead of parameter space. A major advantage of MART over other classical methods (logistic regression and Classification and Regression Trees (CART)) is its robustness, accuracy, and immunity to the adverse effects of wide tails and outliers in the distribution of the predictor variables. This work focuses on an application to caravan insurance policy prediction with MART. This problem is initially motivated by direct mailing problems faced by many companies and raised by a competition aiming at finding out why customers have a caravan insurance policy and how these customers are different from other customers. Companies desire to have a better understanding of their potential customers so that they could target the customers more accurately and reduce the waste and expenses. In this thesis, we propose to apply MART, logistic regression, CART, two-step logistic and CART, and two-step logistic and MART to the caravan insurance policy prediction problem and to compare the results.

Sriram Kannan, Finding all Optimal Solutions to Covering Problems, April 2, 2001 (Jeffrey Camm, James Cochran [Louisiana Tech University], Dennis Sweeney)
Set covering and maximal covering problems are widely used by managers to model decision-making problems. These two problems are closely related and are both modeled as binary integer programs. Some common applications are in reserve site selection, location of facilities, and the list selection problem in direct mail advertising. Managers are often interested in obtaining multiple optimal solutions to these problems when they exist. The advantage of having multiple optimal solutions is that managers have flexibility in choosing an optimal solution based on factors not considered in the model. Such factors, when built into these models, can make them difficult to solve to optimality, and in many cases it may not be possible to build a model with all the important factors. Existing methods rely on the cut generation approach to obtain all the alternate optimal solutions to a given problem instance. These methods are not efficient and frequently fail to generate all the alternate optimal solutions in cases where numerous optimal solutions exist. We propose an algorithm that works in two phases. In Phase I, the principle of divide and conquer is employed to reduce the size of the problem. In Phase II, a backtracking algorithm strengthened by Lagrangian and logic-based bounds is used to generate all the alternate optimal solutions. We apply this algorithm to the generalized set cover and maximal cover data sets that we have available from facilities location problems.

Timothy J. Shockley, A Simulation Analysis of Supply Chain Fill-Rate Models, March 15, 2001 (David Rogers, David Kelton, Michael Magazine)
In today's business environment, a large emphasis is being placed on making the supply chain more efficient. As E-Commerce becomes more prevalent, many companies understand that remaining competitive relies upon how quickly they can get their product to the end-user. The efficiency of the supply chain concerns not only the delivery of product to the consumer, but also how much inventory to hold and where along the supply chain to hold it. Many mathematical programs may be employed to define the amount and location of inventory to hold based upon certain parameters such as the demand distribution at the retailer level and the various holding and penalty costs along the supply chain. However, many real-world situations do not satisfy the assumptions made by the mathematical model. Therefore, it is necessary to test the optimal solutions of mathematical programs prior to implementing them in an organization. Simulating a real-world scenario allows an objective view of the effects of a variety of inputs. Simulation generally does not seek to find an optimal solution, but it will allow for the testing of an optimal solution from a perhaps oversimplified mathematical programming model within a real-world environment. It is important to build the simulation to represent the real world as closely as possible in order to establish the validity of the model. When a simulation attempts to recreate a mathematical program, it is important to make the same assumptions in the simulation as were made in the mathematical program. A simulation for inventories in a one-warehouse, n-retailer (non-identical) case will be performed and the results compared to those from a common nonlinear mathematical programming model.

Svetlana Nikolaeva, Trading Rules and Stock Returns: A Simulation Analysis, July 21, 2000 (David Kelton, David Rogers, Gary Raines)
This paper tests three popular trading rules used for technical analysis of securities trading: Moving Averages, Relative Strength Index, and Lane's Stochastics. Trading indicators are applied to simulated stock-price time series generated for six different market environments. Standard statistical analysis is used to test stock returns following buy and sell signals. Overall, the results provide support for all studied trading strategies: the returns following buy signals are higher than returns following sell signals. Moreover, the absolute difference between the sell and buy returns is higher for more volatile markets. The method developed in the paper can be used for preliminary testing of any stock-trading rule in any specific market environment.

Qiang Lin, A Survey of Power Analysis in Design of Experiments, July 21, 2000 (Martin Levy, David Rogers, Jeffrey Camm)
This is a technical report summarizing the book 'Statistical Power Analysis for the Behavioral Sciences' by Jacob Cohen.  The power of a statistical test is the probability of rejecting the null hypothesis, based on the sample results, when the null hypothesis is false.  We want the power to be high so that, when we cannot reject the null hypothesis based on the sample results, we know that the probability of having failed to reject a false null hypothesis is very low.  For some statistical tests, power analysis and sample-size analysis can be very complicated.  This report summarizes the methods of computing power values and the sample sizes needed to obtain the desired power for different tests.  For each test, the definitions of important parameters and the computational methods are followed by illustrative examples.  SAS IML programs are provided for each example.  The power tables are not reproduced in the report because using SAS programs to compute power can be much easier than looking at a table.  This report can be used as a handbook to obtain power and sample-size values for different statistical tests.
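As an illustration of the kind of calculation the report describes, the following Python sketch computes the power of a one-sided, one-sample z-test with known standard deviation, and the smallest sample size that reaches a target power. It is not the SAS IML code from the report; the choice of test, the function names, and the example effect size are assumptions made only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def z_test_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided, one-sample z-test with known sigma.

    effect_size is (mu1 - mu0) / sigma, the standardized shift under the
    alternative hypothesis (assumed non-negative here).
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    # Under the alternative, the test statistic is N(effect_size * sqrt(n), 1).
    return 1.0 - STD_NORMAL.cdf(z_crit - effect_size * sqrt(n))


def required_sample_size(effect_size: float, target_power: float = 0.80,
                         alpha: float = 0.05) -> int:
    """Smallest n for which the one-sided z-test reaches target_power."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_beta = STD_NORMAL.inv_cdf(target_power)
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical example: a medium standardized effect of 0.5.
n_needed = required_sample_size(0.5, target_power=0.80, alpha=0.05)
power_at_n = z_test_power(0.5, n_needed)
```

With no true effect (effect size 0), the computed power reduces to the significance level, which is a quick sanity check on the formula.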

Boris A. Orlov, An Analysis of Impact of Price Protection on Supply Chain Profits in Short-Cycle Industries, June 6, 2000 (Nikhil Jain, Michael Magazine, Sean Willems)
In short life cycle industries such as the personal computer industry, the price and cost of a product decline rapidly over the product life cycle. Declining prices increase the cost of holding a unit of inventory. Without price protection, distributors would hold less inventory, increasing unsatisfied demand. Under price protection, the manufacturer compensates the distributor for the difference in price if the price declines within a specified period of time, or for a proportion of unsold units. A two-period inventory model is developed in order to measure the influence of price protection on channel profitability in short life cycle industries. The level of price protection that maximizes the individual profits of the manufacturer and the distributor is different from the optimal level of price protection that maximizes the total profit of the supply chain. Examples are given to illustrate the impact of change in profit margins on the optimal level of price protection. Some implications for supply chain management are discussed briefly.

Reeja Marath, Integration of E-Business into the Supply Chain, June 6, 2000 (Ram Ganeshan, Michael Magazine, Nikhil Jain)
The idea of doing business electronically has been around for some time. Today many companies are moving away from phone and fax to the Internet. Companies have started using the Web to communicate and to achieve real business value by incorporating Internet technology into their core businesses. E-Business is about better customer service, integration with suppliers and partners, and the ability to expand the physical market through electronic means. It is about streamlining current business processes, which in turn adds to the value provided to customers. Organizations that succeed in grasping and adopting the new elements of Web-based E-Business have an edge over their competitors. This study focuses on the integration of E-Business into the supply chain. Case studies of various companies that market varied products, offer distinct services, and have implemented Web-based E-Business are presented. In this project, we conducted a case study analysis of Company X, a distributor of specialty goods located in Cincinnati. We examined the implementation of Web-based E-Commerce in the firm by analyzing the existing system and proposing a new system. The objectives are to ensure that front-end order entry and back-end functions (inventory management, supplier and customer relationship management, and forecasting) are coordinated effectively through efficient integration of information systems. Another goal is to transact business directly between the customer and the company, with minimum response time and negligible overhead costs. To achieve these objectives, a detailed study of the existing bottlenecks was conducted and a proposal to overcome them is presented. A cost comparison between the existing and proposed systems, based on the required resources and the software and hardware requirements, is provided. The methodology for implementing the proposed system and the areas where benefits would be achieved are also described.

Eric W. Kramer, A Heuristic Method for the Honors Plus Program Interview Scheduling at the University of Cincinnati, June 2, 2000 (Michael Magazine, Norman Baker, Jeffrey Camm)
Timetabling is an area in which solutions have often been difficult to generate. Many universities have difficulty scheduling courses, exams, and other activities that require assigning various entities to specified time periods. The Interview Scheduling System addresses a problem that requires scheduling a series of interviews between employers and Honors Plus students at the University of Cincinnati's College of Business Administration. The students are freshmen who will have completed their first year of study in June and are filling positions as interns with companies in the greater Cincinnati area. In January of each school year, scheduling of companies to interview students begins. After the companies have been assigned interview times, the students are assigned interview times with the companies. Finally, after the interviews are conducted in February, students are assigned as interns with the various companies. A heuristic method based on a set of integer programs is developed for solving the interview-scheduling problem. The problem is formulated in terms of reducing conflict between interview times and student course schedules. The heuristic leads to a solution of an otherwise difficult problem.

Gautam Dalvi, Finding All Optimal Solutions for the Generalized Set Covering Problem, May 19, 2000 (Jeffrey Camm, David Kelton, James Cochran [Louisiana Tech University])
The generalized set covering problem (GSP) occurs in the development of an optimal network of land sites for the conservation of natural and biological resources. Since development of a conservation network may involve purchasing or leasing sites from existing owners, an optimal solution obtained by solving the GSP may not always be feasible to implement within budget constraints. Consequently, during negotiations with site owners, the decision-makers must be aware of alternative ways of developing the network and of the relative importance of the sites in ensuring an optimal network, which is represented by their irreplaceability indices (IRI). Since the IRI of a site is the percentage of all optimal solutions in which that site is present, we must determine all optimal solutions to the GSP to compute the IRI for any site. In this project, we study the percent reservation problem, which is formulated as a GSP, for the New South Wales National Parks and Wildlife Service of Australia. We first explore computational issues involved in determining all optimal solutions for the percent reservation problem. We then present a problem reduction technique and two algorithms, namely the restrictive enumeration algorithm and the replacement site algorithm, for estimating IRI. Problem reduction decreases the solution space and identifies some sites that are absolutely essential for the optimal network. The restrictive enumeration algorithm allows generation of new optimal solutions in a controlled way. The replacement site algorithm algebraically generates a large number of optimal solutions in a very short time. We present computational results using the above three techniques for data sets provided by the park service and evaluate the efficacy of the algorithms in determining all optimal solutions.
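The IRI definition above can be illustrated with a small Python sketch. It uses brute-force enumeration on a toy instance of the ordinary (unit-cost) set covering problem, which is a simplifying assumption: it is not the percent reservation formulation, nor the restrictive enumeration or replacement site algorithms of the project. The site names and species labels are hypothetical.

```python
from itertools import combinations


def all_optimal_covers(sites: dict[str, set[str]]) -> list[frozenset[str]]:
    """Return every minimum-cardinality set of sites covering all species.

    Brute force: try subsets of increasing size and stop at the first size
    for which at least one subset covers everything.
    """
    universe = set().union(*sites.values())
    names = sorted(sites)
    for k in range(1, len(names) + 1):
        covers = [
            frozenset(combo)
            for combo in combinations(names, k)
            if set().union(*(sites[s] for s in combo)) == universe
        ]
        if covers:
            return covers
    return []


def irreplaceability(sites: dict[str, set[str]]) -> dict[str, float]:
    """IRI of each site: percentage of optimal solutions that contain it."""
    optimal = all_optimal_covers(sites)
    return {
        s: 100.0 * sum(s in cover for cover in optimal) / len(optimal)
        for s in sorted(sites)
    }


# Hypothetical toy instance: five candidate sites and five species.
toy_sites = {
    "A": {"sp1", "sp2"},
    "B": {"sp2", "sp3"},
    "C": {"sp3", "sp4"},
    "D": {"sp1", "sp4"},
    "E": {"sp5"},
}
toy_optimal = all_optimal_covers(toy_sites)
toy_iri = irreplaceability(toy_sites)
```

In this toy instance site E is the only one holding species sp5, so it appears in every optimal solution and has an IRI of 100, while each of the other sites appears in half of them. Enumeration of this kind grows exponentially with the number of sites, which is why specialized techniques are needed for realistic data sets.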

Linda A. Hirsch, Telephone vs. Internet Interviewing - A Comparison of Scale Usage, May 19, 2000 (Norman Bruvold, David Rogers, Martin Levy)
For many years, telephone interviewing has been the cornerstone for data collection on countless marketing research studies. Now, however, with the influx of telephone management options (voice mail, answering machines, Caller ID, etc.) and rising refusal rates among those who can be reached, the research community must explore alternative means for collecting quality data. The Internet has the potential to be at the forefront of the next generation of data collection for the marketing research industry. As such, it is important to assess the quality of the results obtained from this medium. This research, which was conducted among individuals with access to the Internet, examines the similarities and differences between data collected via the Internet and data collected via telephone interviewing. Specifically, it explores participation rates, scale usage, and the impact of offering respondents a 'Don't Know' response on Internet surveys. This study also compares Internet and telephone interviews from the respondent's perspective by examining the extent to which they enjoyed the interview experience and their likelihood to participate in similar studies in the future.

Dapeng Cui, Archetypal Analysis and Its Applications in Business Research, February 7, 2000 (James Cochran [Louisiana Tech University], Jeffrey Camm, Martin Levy)
There are multiple statistical methods for analyzing multivariate data. This paper discusses and illustrates a recently developed multivariate technique, archetypal analysis, and explores its applications to business problems. Archetypal analysis, developed by Cutler and Breiman (1994), results from the need to find archetypal patterns, a mixture of which could well represent each observation in a data set. It also requires that the archetypal patterns themselves be mixtures of the observations in the same data set. Archetypes are constructed by minimizing the squared error that results from representing each individual as a mixture of archetypes. Two applications to survey data in this paper show that archetypal analysis is valuable because it aids in identifying archetypal patterns in the data and in analyzing and understanding the heterogeneity of consumers in a market. Another application of archetypal analysis to conjoint data, however, indicates that archetypal analysis is not always helpful in analyzing respondents' data, probably due to the vast heterogeneity of consumer behavior. Limitations of archetypal analysis are analyzed and discussed.

Zaizai Lu, Infant Feeding Behavior and its Impact on Child's Health in China, January 27, 2000 (Martin Levy, Marcia Bellas, Jeffrey Camm)
This study examined the factors affecting children's feeding behavior in China, and the impact of children's feeding behavior on their health and growth conditions, using the 1993 China Health and Nutrition Survey Data. I selected 330 children aged 3 or younger for the final sample. I used the 222 children with feeding information in the final analysis. The data showed that living area, household income, and the mother's age, educational level, occupation, and smoking or drinking habits do not have any significant effect on a child's feeding behavior. The father's smoking habit and occupation have a significant effect on feeding behavior. A child's gender also significantly affects feeding behavior in the expected direction. The data do not support the argument that breastfed children have a lower body weight and height, nor do they support the argument that breastfed children are healthier than non-breastfed children. The data show that a child's feeding behavior or duration of breastfeeding does not have any significant effect on his/her health status or growth indices. I recommend that a more representative sample with more complete and clean data be used in similar future studies.

Kenneth W. Schmahl, Application of an Unconstrained Multi-Product Newsboy Model for a Style Goods Business, September 8, 1999 (Amitabh Raturi, David Rogers, Michael Magazine)
Inventory analysis is critical to the profit and loss of many businesses; this is especially true in the style goods retail market. The fickle nature of fashion and fads makes it important to accurately estimate inventory requirements in order to be successful in this industry. Because of the seasonality of fashion goods, it is critical to find a balance between overestimating and underestimating the demand for each season's inventory. A common technique for such an analysis is the newsboy problem. This paper examines the inventory requirements of a maternity wear rental business, Classic Maternity Sales and Leasing. An applied analysis of the multi-product, unconstrained newsboy inventory model has been utilized to examine the inventory needs of Classic Maternity. I will discuss how the newsboy model has been modified to meet the criteria necessary for the inventory analysis of such a business. I will also provide some sensitivity analysis to show the effects of overestimating or underestimating the demand for the maternity wear and its salvage value. Included in the paper is a brief comparative study of other literature relating to the newsboy problem and the extensions that have been made to it.
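A minimal Python sketch of the classic unconstrained multi-product newsboy calculation follows. Without a shared constraint, each product can be solved on its own using the critical ratio. The sketch assumes normally distributed demand, which the abstract does not state, and it is not the modified rental model developed in the project; the product names and figures are hypothetical.

```python
from statistics import NormalDist


def newsboy_quantity(mean: float, sd: float, price: float, cost: float,
                     salvage: float = 0.0) -> float:
    """Order quantity for one product under normally distributed demand.

    The optimal quantity is the demand quantile at the critical ratio
    underage / (underage + overage).
    """
    if not salvage < cost < price:
        raise ValueError("expected salvage < cost < price")
    underage = price - cost    # margin lost per unit of unmet demand
    overage = cost - salvage   # loss per unit left over at season end
    critical_ratio = underage / (underage + overage)
    return NormalDist(mean, sd).inv_cdf(critical_ratio)


# Hypothetical products; with no joint constraint each is solved separately.
products = {
    "dress": dict(mean=100.0, sd=20.0, price=80.0, cost=40.0, salvage=0.0),
    "gown": dict(mean=50.0, sd=15.0, price=120.0, cost=60.0, salvage=30.0),
}
order_quantities = {
    name: newsboy_quantity(**params) for name, params in products.items()
}
```

Raising the salvage value lowers the cost of overstocking, so the critical ratio and the order quantity both increase; rerunning the function with different salvage values gives a simple form of the sensitivity analysis mentioned above.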

Molina Beck, Approaches to Handling Missing Data, August 31, 1999 (Martin Levy, Norman Bruvold, Jeffrey Camm)
Missing data occur in statistical analysis in most practical situations. They present a problem since the units with missing data represent an absence of information, so that overall there is a loss of information. For example, model selection and estimation for time series is based on the assumption that the time series is complete. However, in practice, this is not usually the case. Incomplete series should not be fitted with models as this can lead to a serious lack of fit, especially when the number of missing observations is large. For the same reason, it is also not advisable to simply omit the missing observations from the series. Further, most common software packages that are used for estimation, such as SAS, SPSS, or RATS, will cause errors in data analysis with missing observations, since their procedures expect input data sets to contain observations for a contiguous time sequence. This poses the question of how to estimate a model for such data and how to estimate the missing observations if these values are of interest in themselves. Historically, missing data have been estimated in an 'ad hoc' manner. The traditional approaches to estimation consist of either discarding the observations with missing values, or imputing them by replacing these values with the means of available observations, or by regressing the missing values on the observed values for a case and replacing the missing values with the predicted values thus obtained. In recent years, researchers have advocated the use of model-based procedures. A model is defined for the missing data, and inferences are based on the likelihood under that model, with parameter estimation being done by such procedures as maximum likelihood. This approach has the advantage of flexibility and the avoidance of ad hoc methods, since model assumptions are known and can be evaluated.
In order to maximize the likelihood function for these models, several iterative algorithms such as the Newton-Raphson algorithm, the EM algorithm and the Kalman filter are discussed and evaluated, both for univariate and multivariate data. The application of the EM algorithm in evaluating means and covariance matrices, in multiple regression, and in time-series data is also discussed. This project compares the various methods of estimating missing data for the purpose of statistical analysis. The first part of the project is a discussion and comparison of the different ways of estimating missing data, and the latter part consists of the practical application of one or more of these methods to the available data.

Kemal H. Sahin, Development of Scheduling and Waste Minimization Techniques for Batch Processing Plants, August 27, 1999 (Amitabh Raturi, Jeffrey Camm, Amy Ciric)
Batch processing is the preferred option for industries that produce a wide range of products in small amounts. The scheduling of the available processing equipment to satisfy demand for all products has been investigated in detail in operations research. Many of these methods have concentrated on optimizing economic performance. However, waste recovery, which can contribute to very large costs, has not been analyzed in detail for a combination of both economic and scheduling concepts. The aim of this project is to develop a method that will include waste recovery considerations in the scheduling of batch processes. Two different approaches will be used to analyze the effect of waste treatment costs. An aggregated approach will simultaneously determine the optimal schedule for both processing and waste treatment, while disaggregated methods will develop waste recovery schedules for processes after the optimization of the production section is completed. Both simultaneous and continuous approaches will be used for comparison purposes. In case of simultaneous operation, every time a product is generated, waste has to be treated, while the continuous operation examines the common practice of continuous waste treatment. The models are developed for a single product/single waste process, as well as a multi-product/multi-waste operation. Case studies have been used for determining the efficiency of both methods. The aggregate approach results in cost savings in the range of 6% over the disaggregated approaches but takes seven times longer for even small problems. For larger problems, aggregate approaches may be too complex and time consuming for realistic implementation.

Meghna Sinha, An Evaluation of Combined Ranking, Selection and Multiple Comparison Procedures in an Industrial Application, August 25, 1999 (David Kelton, Jeffrey Camm, James Cochran [Louisiana Tech University])
In the simulation literature, ranking and selection procedures have often been recommended for comparing system designs, particularly when the goal is to select the best design.  However, in empirical research multiple comparison procedures are commonly employed.  For example, the researcher interested in making pair-wise comparisons among the groups can do so by constructing a confidence interval for the difference between the performance measures of the pair of results.  The difference between ranking and selection procedures and multiple comparison procedures is analogous to the difference between hypothesis testing and interval estimation.  The former results in a decision, rather than an estimate, so it is less informative.  Typically, ranking and selection procedures provide inference only about the design selected as the best or one of the best in some sense.  Two-stage sampling or sequential sampling is needed to attain a pre-specified probability of selecting the best design.  In contrast, multiple comparison procedures provide inference about relationships among all system designs and can be implemented in a single stage of sampling, but they do not guarantee a decision.  However, when using simulation experiments to estimate the expected performance, the best system can neither be selected nor the differences between the systems be bounded with certainty.  In 1995, Nelson and Matejcik presented procedures that simultaneously control the error in selecting the best and in bounding the differences.  These procedures combine the standard indifference-zone selection procedures, that control the error when choosing the best, and the standard multiple-comparison procedures that control the error in making simultaneous comparisons.  The procedures assume that data are normally distributed, but they do not assume known or equal variances across systems.  
In this paper we apply the simulation ranking technique and the multiple comparison procedure simultaneously, as proposed by Nelson and Matejcik, to compare three product-mix scenarios in a manufacturing plant.  The objective is to determine the optimal mix, where an optimal mix is one that allows all machines to remain idle for a minimum amount of time.  The results will also determine how much better the best mix is relative to each alternative.  We also compare the Bonferroni selection procedure to Nelson and Matejcik's new procedure, NM.  Both procedures exploit the use of Common Random Numbers (CRN) to reduce variance and hence reduce computational effort.

Xinxin Liu, A Comparative Study of Neural Networks and Statistical Models for Customer Choice Modeling, June 18, 1999 (David Rogers, David Kelton, Norman Bruvold)
This paper is an empirical study intended to be a bridge between the behavioral and statistical lines of research in customer choice behavior.  The relationship between retail store characteristics and customer buying behavior from a choice set of two stores is explored using the following approaches: the conditional logit model and the neural network (NN) model.  Using a data set of 400 survey responses, a NN was created using store characteristic variables and its accuracy checked with a holdout sample.  The same was done for the conditional logit model.  The comparison of results revealed that the NN outperformed the conditional logit model in terms of predictive accuracy.  Sensitivity analysis was conducted for the NN model and managerial implications were outlined.

Brian L. Sersion, An Application of Optimization for Establishing a Landfill Sampling Network, June 4, 1999 (Jeffrey Camm, Amitabh Raturi, David Rogers)
Waste-management operations require significant capital expenditure for ground-water sampling of sanitary landfills. High costs associated with outsourcing make internalization of this service an attractive proposition. The facilities-location problem, in this context, involves determining the optimal number and location of sampling teams to service landfill customers. The solution process includes the completion of a customer survey and linear regression to estimate demand for a two-stage mixed integer linear program. The results of this study support a managerial recommendation for Browning-Ferris Industries' landfill-sampling network.

Sanjay Chadha, Analysis of Salaries of College of Business Administration Professors at the University of Cincinnati, March 25, 1999 (Martin Levy, Norman Bruvold, David Rogers)
In this project a regression model has been formulated that explains 80% of the variation in the salaries in CBA with some exceptions. The following three research hypotheses have been tested using regression: 1) Newly hired professors in the College of Business Administration are being offered higher salaries than professors who have been serving for the last 5-15 years. 2) Salaries of professors differ by national origin. 3) Salaries of professors differ by gender. The results are that the first hypothesis is accepted, while the other two hypotheses are rejected.

Lubov Skurina, Exchange Rates and the Value of Foreign Operations, March 18, 1999 (Yong Kim, David Rogers, Martin Levy)
In this study I examine the effect of exchange rates on the value of foreign operations.  I perform a pooled cross-sectional and time-series regression analysis of company data and include exchange-rate trend and volatility as independent variables.  The results indicate that the exchange-rate trend does not have a significant effect on the value of foreign operations, but the volatility of exchange rates has a significant negative effect.

Chay Hoon Lee, The Relationship of Team Members' Cognitive Decision Styles and Team Performance, January 12, 1999 (Charles Matthews, David Rogers, Martin Levy)
In most organizations, teams play a central role in planning and strategic decision making (Gilad & Gilad, 1986). Although many studies have examined the influence of demographic characteristics on team performance, few have examined the cognitive decision-making styles of team members that can also influence team performance. Hackman and Morris (1975) proposed that the extent to which the team uses the knowledge and skills of its members can influence the quality of a team's performance. Therefore, understanding the team members' cognitive decision-making styles that influence the team's effectiveness seems critical because teams can shape an organization's future through the decisions they make. The challenge for any organization is to try to maximize the level of teams' effort and knowledge brought to bear on the team performance. Thus, this paper explores the influence of cognitive decision-making styles of team members on performance.

Jeffrey D. Rieder, Estimating Store-Level Promotion Effects from Market-Level Data, December 10, 1998 (Norman Bruvold, Martin Levy, David Rogers)
The debiasing procedure outlined in "Using Market-Level Data to Understand Promotion Effects in a Nonlinear Model" (Christen, et al., Journal of Marketing Research, August, 1997) attempts to quantify both the direction and magnitude of the bias associated with market-level promotion effects. Since merchandising response functions are typically non-linear, and market-level data are aggregated linearly over a set of heterogeneous stores, market-level estimates of these response functions are often severely biased. Christen, et al. claim to be able to estimate the bias and provide a mechanism for reducing the bias through the application of regression analysis. This research applies the methodology outlined by Christen, et al. to a real world data set and, after some modifications and assumptions are incorporated to fit the methodology to the available data parameters, produces some encouraging results. Using regression analysis, the market-level bias is found to be a function of the marketing environment. The resulting regression model is then used to predict future merchandising responses.

Girish Kulkarni, Determination of the Optimal Routing for the Consumer Products Division of the University of Cincinnati, October 16, 1998 (Ram Ganeshan, Jeffrey Camm, George Polak [Wright State University])
The Master Plan for the University of Cincinnati envisages a pedestrian-friendly campus with open spaces to affect positively the quality of student life on campus by creating an environment for an educational experience. One of the major issues is to reduce the conflict zones between pedestrians and service-vehicle traffic. The Consumer Products Division supplies soft-drink cans to vending machines in nearly 40 buildings on the West Campus. Duties include servicing these machines via three trucks on a predetermined schedule. This division, therefore, needed to re-investigate and realign its servicing and routing scheme to the new Master Plan. Using quantitative techniques, we helped the Consumer Products Division by: (1) performing an efficiency analysis of the available vending-machine-demand data and making recommendations for a servicing schedule; and (2) using optimization techniques to recommend a servicing route that works with the above schedule, resulting in shorter travel times for the vehicles.

Shailesh Kulkarni, An Optimal Clustering Model for Cellular Manufacturing, August 31, 1998 (David Rogers, Jeffrey Camm, James Cochran [Louisiana Tech University])
In this paper the problem of simultaneous clustering of parts into part families and machines into machine cells in a cellular manufacturing context is addressed. A mixed integer linear programming model is developed for addressing the problem. This model is then solved using conventional branch-and-bound procedures for small-sized problems. Considering the NP-complete nature of this class of problems, a genetic algorithm-based solution procedure is developed to solve realistically-sized problems of larger dimensions. Two problems from the literature are solved using the genetic algorithm. The attractiveness of the proposed model and the solution procedure to provide simultaneous grouping of parts and machines is evaluated on the basis of grouping efficacy.

Amanda R. Angle, Fill-Rate Optimization Models for Supply Chain Systems, June 24, 1998 (David Rogers, Michael Magazine, Ram Ganeshan)
Multi-echelon inventory management is very important when attempting to influence the performance of a supply chain. Formulation of a complete inventory model often requires more than just attempting to achieve minimal inventory levels to reduce holding costs. Customer satisfaction must be taken into consideration or the cost of lost sales could outweigh any inventory costs. In this paper, four models of multi-echelon inventory systems in which several finished goods are produced from a common component are considered. These models are for optimizing base stock levels when there is a penalty cost of having a backorder. Fill-rate consideration is employed to measure the customer service level. The first model is for maximizing the fill-rate subject to a budget constraint on holding costs. The second model is for minimizing the expected number of backorders subject to a budget constraint and a fill-rate constraint. The third model is for minimizing the penalty costs of backorders and inventory holding costs subject to a fill-rate constraint. In the last model, penalty costs of backorders and inventory holding costs are minimized subject to a budget constraint on holding costs and a fill-rate constraint. In the results section we assume that demand is normally distributed in all the models, and a non-linear optimization model is used to determine base stock levels.
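As a rough illustration of how a fill rate can be computed under normally distributed demand, the sketch below evaluates a single item at a single location using the standard normal loss function. It is a textbook simplification and not the multi-echelon models of the project; the function names and the example numbers are assumptions made for illustration only.

```python
from statistics import NormalDist


def expected_backorders(base_stock, mean, sd):
    """Expected units short when demand ~ Normal(mean, sd) and stock is base_stock."""
    std_normal = NormalDist()
    z = (base_stock - mean) / sd
    # Standard normal loss function: L(z) = pdf(z) - z * (1 - cdf(z))
    loss = std_normal.pdf(z) - z * (1.0 - std_normal.cdf(z))
    return sd * loss


def fill_rate(base_stock, mean, sd):
    """Approximate fraction of demand met from stock (single item, single period)."""
    return 1.0 - expected_backorders(base_stock, mean, sd) / mean


# Hypothetical demand parameters: mean 100 units, standard deviation 20 units.
for stock in (100, 110, 120, 130):
    print(stock, round(fill_rate(stock, 100.0, 20.0), 4))
```

Raising the base stock level increases the fill rate but also the holding cost, which is the trade-off the four optimization models in the abstract balance in different ways.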

Joga R. Palutla, Minimizing Maximum Lateness In a Family Single Machine Scheduling Problem, June 11, 1998 (Michael Magazine, James Cochran [Louisiana Tech University], Amitabh Raturi)
This paper studies the problem of scheduling jobs on a single machine in order to minimize the maximum lateness.  The jobs are grouped according to processing requirements in families.  The problem is NP-hard and computationally intensive.  Heuristics are the only feasible means of solving large problems.  The paper describes several existing heuristics and analyzes heuristic performance relative to one another and the optimal.  Lower bounds are developed in place of the optimal solution in this analysis.  The paper attempts to determine the best heuristic for a given set of problem parameters and its closeness to the optimal solution.

Nawal K. Roy, Risk Management: Exploration of Value at Risk, May 29, 1998 (David Rogers, Martin Levy, Ram Ganeshan)
Risk management is the fastest growing field in the investment and financial industry. This paper covers the most sophisticated methodology of risk management, i.e. Value at Risk modeling. As an overview paper, it deals with all the issues related to Value at Risk modeling: different methodologies for estimating the VaR parameters, its highlights and shortfalls, and the regulatory status. It also discusses the statistical model of J.P. Morgan's RiskMetrics and the expected developments (the course of future research) in the field of Value at Risk modeling.

Ronald N. Gnau, A Comparison of Logistic Regression and Discriminant Analysis as Classification Techniques, May 26, 1998 (Martin Levy, David Rogers, Norman Bruvold)
Strategic marketing in modern business organizations involves three key elements: segmenting, targeting, and positioning. The development of a sound marketing strategy in today's competitive environment is barely possible without the use of multivariate statistical analysis. Two multivariate techniques that can be useful in assigning customers to the most appropriate market segment are logistic regression analysis and discriminant analysis. Each technique makes assumptions about the type of data used in variables. The independent variables in logistic regression models can be categorical, whereas discriminant models generally require that the data for independent variables come from normal populations with identical covariance matrices. This empirical study applies both techniques to the same data to classify customers to market segments, and compares the performance of each technique on the basis of classification accuracy.

Glenn A. Dahl, Core Carrier Selection: A Comparison of Solution Approaches, May 8, 1998 (Jeffrey Camm, Martin Levy, David Rogers)
In this work the problem of choosing preferred transportation companies for shipping, called core carriers, is examined. Optimal selection is treated as an extension of the Maximal Set Covering Problem. Three versions are examined. In the first model the desired core coverage is expressed as a percentage of total coverage, and all decision variables are binary. The second model is a relaxation of the binary restriction in the first model for the variables that represent lane assignments. A simple rounding heuristic is used to convert fractional solutions to integer. The third version is a goal-programming weighting method: total core load is treated as a goal, allowing for the removal of the coverage constraint from the model.

Detelina Marinova, Between Strategic Intent and Inertia: Tracing Individual Knowledge Structure Evolution in Organizations, April 15, 1998 (Martin Levy, Murali Chandrashekaran, David Rogers)
Though organizations often employ multifunctional teams in strategic decision making to ensure maximal information dissemination in the organization, the alleged benefits of teams are seldom realized.  The central objective of this paper is to explore the process underlying individual learning in group settings, and to secure an understanding of why groups often do not produce extensive collaborative efforts.  Accordingly, we develop a conceptual model that traces individual behavior as well as knowledge structure evolution in group settings.  Our central thesis is that despite the strategic intent of each decision maker to make a 'good' decision and choose the 'best' course of action from a set of alternatives, communication with group members is likely to be shaped by the balance of intent and inertia.  As a result, communication flow in groups and, hence, individual learning is likely to proceed in a selective fashion.  We further identify possible drivers of inertia and propose hypotheses about their effect on individual knowledge structure evolution as well as on communication and influence in groups.  Econometric analyses of data obtained from a longitudinal field experiment converge to strongly support our conceptual model.

Jun Zhou, Low Birth Rate Prediction Models for the State of Ohio, April 3, 1998 (David Rogers, Martin Levy, Edward Donovan)
Low Birth Weight (LBW) prediction models were built based on the Ohio birth certificate data from 1993 and 1994. Maternal age, education level, smoking, alcohol consumption, pre-pregnancy weight, race, fetus gender, marital status, and pre-term medical complication were found significant in the logistic model. The study also showed that some interaction terms between the main factors made a significant contribution to LBW. Two logistic models were built and validated with the 1995 birth certificate data. The models provide a quantitative tool for directing limited resources to the population at high risk of LBW in order to achieve a more cost-efficient and economical prevention result.

Srilatha S. Sekaripuram, Distribution Planning in Supply Chains - The Equal Periods of Supply (EPOS) Approach, March 26, 1998 (Ram Ganeshan, David Rogers, Michael Magazine)
One of the important challenges facing a distribution manager is the effective control of inventory.  Inventory is necessary and useful but too much inventory is expensive.  If improperly managed, inventories become a significant liability, resulting in a reduction of profit and possible erosion of the competitive advantage of the firm.  Hence, determining the proper inventory-management technique is important for the firm.  Distribution resource planning (DRP) is a computerized tool that has been aiding distributors for planning and solving some of the inherent problems in statistical ordering techniques.  Equal periods of supply (EPOS) is a DRP approach to schedule replenishments for multiple products.  Use of EPOS with DRP helps to reduce overall costs, keep inventory in check, and make planning convenient.  In this paper a heuristic method by which a distribution planner can incorporate the EPOS approach into DRP will be presented.  Using this method results in optimal costs in situations where the transportation costs dominate.

Timothy J. Cantor, Evaluating A Taxonomy of Supply Chain Management Research, November 7, 1997 (Ram Ganeshan, Michael Magazine, Amitabh Raturi)
As we approach the twenty-first century, the evolution of emerging management practices continues to unfold. Supply-chain management is one of the more rigorously debated movements. Supply-chain management covers the flow of goods from supplier through manufacturing and distribution chains to the end user. While not difficult to define, its complexity makes for uncertain boundaries and abstract scope. The areas where the discipline has been researched and those where opportunities exist must be identified. By providing a taxonomical understanding, it is determined that at least four such opportunities exist. The taxonomy is of a hierarchical nature consisting of two principal levels. At the strategic level, papers generally deal with the means by which objectives and policies should be developed. At the operational level, authors explore the efficient operation of an established aspect of the chain. Both of these principal levels can then be divided horizontally. At the strategic level, this split resulted in the sub-levels designated explanatory essays and system representations. At the operational level, the sub-levels coordination analysis and material flow analysis resulted. Finally, each of these sub-levels can be further segregated into categories by which a biased selection of current supply chain management literature is classified.

William Pordan, Evaluating NFL Quarterback Performance Efficiency Using Data Envelopment Analysis, July 10, 1997 (Michael Magazine, Jeffrey Camm, James Evans)
Managers are often faced with evaluating the performance of numerous operating units which produce multiple products and services. Comparison analyses can be desirable for identifying which units are performing at an efficient level, and which units are utilizing resources in an inefficient manner. This task becomes difficult when there exists no proper valuation mechanism for determining the worth of one product relative to another, or when expended resources are not readily priceable. A mathematical programming method known as data envelopment analysis (DEA) has been applied to such situations in performance assessment. DEA allows each operating unit to assign a unique set of weighting factors to its outputs and inputs so as to maximize its efficiency ratio. Constraints on the weight selections lead to the identification of relatively efficient and relatively inefficient units. This research project presents an overview of the theory and formulation of data envelopment analysis, and offers an application of its use in evaluating the performance efficiency of 1996 National Football League (NFL) quarterbacks. The production of each is ranked based on his DEA efficiency score, and a comparison is made with the NFL passer rating system currently used by the league.

Christopher M. Lynd, Heuristic Solution to a Baseball Scheduling Problem, July 2, 1997 (Michael Magazine, Jeffrey Camm, James Evans)
Heuristic techniques and mathematical programming have often been at odds with one another. The mathematical-programming camp preaches global optimization, whereas the heuristic camp preaches tradeoffs. The question of which method to use should be decided on an individual problem basis. Some problems, especially large combinatorial problems, lend themselves to heuristic techniques. For instance, mathematical-programming techniques such as branch and bound and dynamic programming perform essentially no better than does complete enumeration for NP-hard problems like the traveling salesman problem, which has (N-1)!/2 distinct tours for N cities. Users and developers must weigh the costs of global optimization, whether computing time, software, or development dollars, against the resulting benefits. In this paper, I define an NP-hard baseball-scheduling problem. I outline three different approaches to solving the problem: two heuristic techniques and one mathematical-programming technique. The two heuristics employed are tabu search and genetic algorithms. The mathematical-programming technique being used is integer programming. I present the results and outline the advantages and disadvantages of each technique.
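The (N-1)!/2 figure quoted in the abstract above can be checked for small N by brute force. The following is a minimal sketch and is not taken from the project; the function names are hypothetical, and it assumes a symmetric problem in which a tour and its reversal count as the same tour.

```python
from itertools import permutations
from math import factorial


def count_distinct_tours(n):
    """Count distinct tours over n cities by enumeration (symmetric case)."""
    if n < 3:
        raise ValueError("need at least 3 cities")
    seen = set()
    # Fix city 0 as the start so rotations of a tour are not counted twice.
    for rest in permutations(range(1, n)):
        # A tour and its reversal are the same tour; keep one canonical form.
        seen.add(min(rest, rest[::-1]))
    return len(seen)


def tour_count_formula(n):
    """Closed-form count (N-1)!/2, valid for n >= 3."""
    return factorial(n - 1) // 2


# Compare enumeration with the formula for a few small instances.
for n in range(3, 9):
    print(n, count_distinct_tours(n), tour_count_formula(n))
```

The factorial growth is what makes complete enumeration impractical beyond small instances and motivates the heuristic approaches the abstract describes.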

Ian Clough, Body Image2: Data Analyses, July 1, 1997 (Martin Levy, Terri Byczkowski [Cincinnati Children's Hospital], David Rogers)
This document is a report of a number of statistical analyses performed on a variety of data sets. Programs and computer output have been included in the appendices. The work was performed over a ten-month period.

Amy M. Anneken, Applying GIS and Benders' Partitioning to the Uncapacitated Facility Location Problem, June 12, 1997 (Dennis Sweeney, Jeffrey Camm, David Curry)
Facility location problems are very important and practical in business decision making today. The facility location model concerns finding locations to serve customers in an economical and high quality way. This project aims at providing a way to solve these types of problems in a manner that incorporates both objective and subjective means. The objective of this project is to explore the potential for an algorithm that involves both human and mathematical iterations. The problem studied is the Uncapacitated Facility Location problem. A Geographical Information System is used to assist the human decision maker in selecting good solutions. A Benders' partitioning algorithm is used to generate bounds and to suggest alternatives for the decision maker. A geographic computer interface that serves as a front end to an Operations Research algorithm has many advantages. Finding an optimal solution to a problem is the best alternative, but many companies never do this because they do not have the time or the expertise to do so. The results from this project can provide many benefits to both the business and OR community.

Angela Bansal, Discounts/Premiums on Country Funds - Time Series and Multivariate Analysis, June 3, 1997 (Martin Levy, David Rogers, Yong Kim)
In this paper, the time-series behavior of discounts/premiums of closed-end country funds is examined by using the models of Hardouvelis, La Porta, and Wizman (1993). The results show that most of the funds of emerging markets trade at a premium. This premium has predictive power for fund return but not for its net asset value returns. Results also show that country funds are a good diversification tool for US investors and at least three local stock markets are cointegrated with the US market.

Rajdeep Grewal, The Long Run Advertising-Sales Relationship: Incorporating the Impact of Economic and Political-Legal Environments, May 12, 1997 (Martin Levy, Jeff Mills, Raj Mehta)
A methodological framework for investigating marketing parameter functions with time varying coefficients is adopted, to investigate the relationship between market performance (e.g. sales, market share), marketing effort (e.g. advertising, sales promotion), and environmental conditions (e.g. market growth, inflation). The nine-step framework relies on recent methodological developments in the econometric and time series (ETS) literature to present a sequence of statistical tests and estimation techniques. The authors elaborate on the framework to provide a rationale for expecting specific behavior by marketing performance variables, marketing effort variables, and environmental variables. Further, the authors illustrate the framework for the famous case of the Lydia Pinkham Medicine Company.

Jennie Bao Jin, A Markov Chain Analysis of the New York Stock Exchange Composite Index, May 2, 1997 (David Rogers, Martin Levy, Norman Bruvold)
The behavior of stock-market prices has been researched extensively via different empirical methods (Fama 1970, Poterba and Summers 1988, Fama and French 1988, Fama 1991). Whether certain price trends and patterns exist to enable the investor to make better predictions of the expected values of future stock market prices is still debatable. A number of researchers have shown that both the relative strength of a security in the market and the nature of its successive price movements may be interpreted within the framework of Markov theory (Dryden 1969, Fielitz and Bhargava 1973, Fielitz 1975, McQueen and Thorley 1991) and these studies are modeled in such a way as to provide useful information to the individual investors and portfolio managers concerning stock-market movements. While most of the previous work in the area has been done in the individual-security setting, I investigate the relevant Markovian behavior with the entire stock market, which is represented by the NYSE Composite in this project. Relatively new data (from 1985 to 1995) are used to test and formulate both a first-order three-state (up, unchanged, and down) and a first-order two-state (up and down) Markov-chain model based on daily price changes of the NYSE Composite. Statistical inferences are conducted to test whether the NYSE Composite movements are random, which means the probabilities for the stock-market price's going up or down on a daily basis are the same. The organization of the paper is as follows: Section II is a brief review of the literature on Markov chain analysis of security prices. Section III is a description of the methodology and data used in this project. In Section IV the three-state Markov chain model is formulated and estimated. In Section V the two-state Markov chain model is estimated and a statistical inference test regarding the hypothesis of randomness of stock market movements is conducted. Section VI is a summary and conclusion of the paper.
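To make the first-order Markov-chain idea concrete, the sketch below estimates transition probabilities from a sequence of daily prices by counting state-to-state moves and normalizing each row. It is an illustrative sketch only: the function names and the short price series are invented, and dropping unchanged days in the two-state case is an assumption, not necessarily the treatment used in the project.

```python
from collections import Counter


def classify_changes(prices, states=3):
    """Label each daily change as 'up', 'down', or (three-state only) 'unchanged'."""
    labels = []
    for prev, curr in zip(prices, prices[1:]):
        if curr > prev:
            labels.append("up")
        elif curr < prev:
            labels.append("down")
        elif states == 3:
            labels.append("unchanged")
        # Assumption: in the two-state case, unchanged days are simply dropped.
    return labels


def transition_matrix(labels):
    """Estimate first-order transition probabilities from observed counts."""
    counts = Counter(zip(labels, labels[1:]))
    states = sorted(set(labels))
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0) for t in states
        }
    return matrix


# Invented price series, for illustration only.
prices = [100, 101, 102, 101, 101, 103, 102]
for state, row in transition_matrix(classify_changes(prices)).items():
    print(state, row)
```

Under the randomness hypothesis described in the abstract, the estimated rows of a two-state matrix would not differ significantly from one another, which is what a formal test would examine.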

Himani Mohan, Application of Simulation Techniques in Operations Analysis and Facility Design, December 4, 1996 (David Kelton, David Rogers, Jeffrey Camm)

Marie D. Lane, Capacity Planning in the Machine Tool Environment: A Case Study of Ahaus Tool & Engineering, Inc., August 15, 1996 (David Rogers, Jeffrey Camm, Amitabh Raturi)
Issues that affect resource and production planning in the machine-tool industry are discussed in this paper. One company and its particular operating characteristics will be the focus of the paper. Suggestions are made on improvement possibilities to their production-planning system. These suggestions are made based on a literature search of the variety of production-planning systems and models that are available, as well as this researcher's opinions from observations of the company's operating practices and discussions with the company's management. A goal-programming model was developed that can be used as a part of the production-planning process.

Laura Miser, Enrollment Projection Models at the University of Cincinnati, August 12, 1996 (Martin Levy, Jeffrey Camm, Corey Brewer)

Gregory A. Graman, The Effect of Variation in the Intermediate Delay on the Solution to the Multi-Echelon Inventory Problems with Newsboy-Style Results and Backorder Optimization, March 4, 1996 (David Rogers, Jeffrey Camm, Martin Levy)
The statistics of variance and standard deviation are used in many disciplines to provide a measure of the level of uncertainty that exists in a wide variety of situations and studies. The uncertainty of the intermediate delay in a multi-level inventory problem with newsboy-style results and an objective of minimizing backorders is examined. An expression for the standard deviation is derived, and implementation of these results is revealed.

Thomas Osterhus, Development and Testing of an Integrated Model of Conservation Behavior, July 1995 (Martin Levy, Jeffrey Camm)

Mary J. Frey, A Discussion and Analysis of Mathematical Modeling Techniques for the Location of Retail Establishments Using Geodemographic Data, Autumn Quarter 1994 (Jeffrey Camm, David Curry, Dennis Sweeney)

Bernard B. O'Bryan, An Evaluation of Software System Designs Using Data Flow Diagrams, Data Dictionaries and Mini-Specifications, 1993 (Roger Pick)
An evaluator for an experiment involving software engineering discusses his part in the project.  The experiment had the evaluator -- without prior knowledge of the experiment (blind) -- rate some data-flow diagrams, data dictionaries, and mini-specifications of some software project performed in teams.  The 'blind' evaluations were then used to rate the effectiveness of using computer-assisted software engineering (CASE) technologies.  Ten three-person teams, composed of undergraduate information-systems majors, independently developed a software product -- a Pascal pretty printer.  Four teams used the same automated CASE software, while the remaining teams did not use automated CASE software.  The major results of this experiment were (1) those teams that used the automated CASE software were able to code the programs in less time than those who did not, (2) all of the teams using automated CASE software were able to meet more of the requirements than those who did not use the software, and (3) the quality measures of the CASE-group designs were rated superior to the non-CASE-group design.  Also, some literature is reviewed to give the reader a point of reference on data-flow diagrams, data dictionaries, and CASE tools in general.  Further, some biographical data on the 'blind' evaluator (the author) is also included.

Patricia Laber, ACL Knee Brace Design Study: Data Analysis, September 3, 1993 (Martin Levy, Jeffrey Camm)

Stephen E. Kelley, A Multiple Regression Model Used to Predict Indicated Airspeed, May 24, 1993 (Jeffrey Camm, Martin Levy, David Rogers)

William Milligan, Assessment of Collective Bargaining Issues with Sample Survey Methods - Design, August 3, 1992 (Martin Levy, Thomas Innis [Adjunct Associate Professor])

Karen Averbeck, Assessment of Collective Bargaining Issues with Sample Survey Methods - Analysis, August 3, 1992 (Martin Levy, Thomas Innis [Adjunct Associate Professor])

Deryck Lampe, On the Analysis of a Repeated Measures Design, June 11, 1992 (Martin Levy, David Rogers)

Jo A. Gallagher, Rating of Designs for a Study on Computer Assisted Software Engineering, July 17, 1991 (Roger Pick, Jeffrey Camm, Timothy Sale)

Barbara C. Zellner, Using Aggregation Methods to Solve Single-Commodity Transportation Problems, May 1990 (James Evans, David Rogers, Jeffrey Camm)
Many companies must routinely solve transportation problems.  However, because of time and hardware constraints, these problems are often not solved to optimality.  In many cases, the problems are not modeled.  This paper examines single-commodity transportation problems solved to optimality using a personal computer.  Aggregation is used to convert the original problem into a two-source transportation problem.  After solving the modified problem to optimality, the solution is disaggregated and used as a starting solution to the original problem.  The time to reach optimality using this two-step method is compared to the computational time of using a poor starting solution in the original problem and solving to optimality in one step.  Various methods of aggregation are used and discussed.

Calvin Taylor, Multivariate Analyses of Telephone Company Data, August 2, 1989 (John Bryant, Martin Levy)

Yiching Lee, Bayesian Approach to Testing Equilibrium in a Segmented Line Model, 1987 (John Bryant, Martin Levy, Jeffrey Camm)

Steve Nielsen, Cost Reduction of Paper Manufacturing Through Quality Control of Pulp Production, December 11, 1987 (Jeffrey Camm, David Anderson)

Mark Kleinhenz, An Algorithm Using Aggregation to Solve a Large Scale Linear Programming Problem to Optimality, June 26, 1986 (James Evans, David Rogers, Jeffrey Camm)
As Lasdon has remarked, the solution of linear-programming problems is often hampered by size -- the problem is simply too big. The cost of supercomputers and technological limitations are two reasons for the difficulty in solving such problems. Supercomputers, such as those manufactured by Cray of Minnesota, can cost between 5 and 15 million dollars. But even supercomputers are limited in the size of problems that they can solve, although these limitations are continually being extended by technological advances. The need for solutions to large-scale linear-programming problems has inspired the development of several solution strategies. One such strategy is that of aggregation. A smaller problem, similar to the original, is developed by clustering the original problem's columns or rows; this aggregate problem is then solved, yielding a solution that is 'close' to that of the original problem. Techniques have been described for reformulating the aggregate problem and improving its accuracy of solution. In this paper the technique of aggregation is applied as a step in an algorithm to solve a large-scale general linear-programming problem to optimality. The format of this paper is as follows. The experimental design and the method of problem generation are presented in Section I. Section II describes the algorithm, and a sample problem is solved to illustrate the working of the algorithm. Section III details the computer software and hardware employed in the research project; computational results are presented as well. Section IV is an evaluation of the algorithm using the quality of the basis as the criterion. A related issue is addressed in Section V: the question of whether the use of even-weighting or the use of weighting provides better quality of solution after solving the aggregate problem. In the final section, conclusions are drawn and further work is suggested.
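
For readers unfamiliar with column aggregation, the construction mentioned in the abstract above can be written out as follows. This is a generic textbook-style formulation, not necessarily the one used in the project: the inequality form of the problem, the symbols, and the reading of "even-weighting" as equal weights within each cluster are assumptions.

```latex
% Original problem (assumed form), with columns A_j of A:
\max_{x \ge 0} \; c^{\top} x \quad \text{subject to} \quad A x \le b, \qquad x \in \mathbb{R}^{n}.

% Partition the column indices \{1,\dots,n\} into clusters S_1,\dots,S_K and
% choose weights g_j \ge 0 with \sum_{j \in S_k} g_j = 1 for each k. Define
\bar{c}_k = \sum_{j \in S_k} g_j \, c_j, \qquad
\bar{A}_k = \sum_{j \in S_k} g_j \, A_j, \qquad k = 1,\dots,K.

% Aggregate problem with K variables instead of n:
\max_{X \ge 0} \; \bar{c}^{\top} X \quad \text{subject to} \quad \bar{A} X \le b.

% Fixed-weight disaggregation of an aggregate solution X^{*}:
x_j = g_j \, X^{*}_k \quad \text{for } j \in S_k,
\qquad \text{so that } A x = \bar{A} X^{*} \le b \text{ and } c^{\top} x = \bar{c}^{\top} X^{*}.

% Even weighting (assumed meaning): g_j = 1 / |S_k| \text{ for all } j \in S_k.
```

Because the disaggregated point is feasible for the original problem and has the same objective value as the aggregate optimum, it gives a bound on the original optimum and a natural starting point for solving the original problem.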

Joel I. Kahn, Analysis of Automatic Warehousing System Operating Policies, June 1981 (James Evans)

Sharon Hannig-Smith, An Airport Passenger Processing Simulation Model, January 1981 (James Evans)
