Working papers

Itai Gurvich and Jan A. Van Mieghem, "Collaboration and multitasking in networks: Aligning task priorities and collaboration levels,"
Motivated by the trend towards more collaboration in team work, we study networks where some tasks require the simultaneous processing by multiple types of multitasking human or indivisible resources. The capacity of such networks is generally smaller than the bottleneck capacity. In Gurvich and Van Mieghem (2015) we proved that both capacities are equal in networks with a hierarchical collaboration architecture, which defines a collaboration level for each task depending on how many types of resources it requires relative to other network tasks. This paper studies how task prioritization impacts the capacity of such hierarchical networks using a conceptual queuing framework that formalizes coordination and switching idleness. To maximize the capacity of a team, highest priority should be given to the tasks that require the most collaboration. Otherwise, a mismatch between priority levels and collaboration levels inevitably inflicts a capacity loss. We demonstrate this essential tradeoff between task prioritization and capacity in a basic collaborative network and in parallel networks. To manage this tradeoff, we present a hierarchical threshold priority policy that balances switching and coordination idleness. View paper.

Lu Wang, Itai Gurvich, Jan A. Van Mieghem and Kevin O'Leary "Productivity Losses due to Coordination: An Empirical Study of Generalists in a Hospital,"
Professional work often involves generalists who act as hubs, discussing and consulting with specialists to gather information and make decisions. Coordination is critical when tasks, like consultations, require the simultaneous collaboration of multiple people. Professionals typically have their individual tasks that may be preempted to switch to collaborative tasks. Task switching can introduce setup times, often mental and unobservable, when resuming the preempted task and reduce a generalist’s labor productivity. We analyze the productivity loss from coordination in a field study of hospital medicine general physicians (“hospitalists”) at Northwestern Memorial Hospital in Chicago. A hospitalist’s patient-care routine includes visiting patients and consulting with specialized care professionals to guide patient diagnosis and treatment. We introduce an episodal workflow model that captures the coordination dynamics: each switch and the episode of work it preempts. A rigorous empirical analysis is presented using a data set assembled from direct observation of physician activity and pager-log data. We show that more switches between documentation and communication causally lead to longer documenting time, and that the cause of the switch matters: when the switch is triggered by the hospitalist the setup impact is smaller. We estimate that a hospitalist incurs a total setup time of 5 min per patient per day. This is substantial: caring for 14 patients per day, a hospitalist spends more than one hour each day on mental setups, which translates to about 20% productivity loss. In contrast to a static snapshot of collaboration, our episodal workflow model could be deployed in other professional settings where resource coordination is paramount. View paper.

Junfei Wang and Itai Gurvich "Beyond heavy-traffic regimes: Universal bounds and controls for the single-server queue,"
Brownian approximations are widely studied because of their tractability relative to the original queueing models. Their stationary distributions are used as proxies for those of the original queues that they model and the convergence of suitably scaled and centered processes provides mathematical support for the use of these Brownian models. To establish convergence, one must impose assumptions directly on the primitives or, indirectly, on the parameters of a related optimization problem. These assumptions reflect an interpretation of the underlying parameters: a classification into so-called heavy-traffic regimes that specify a scaling relationship between the utilization and the arrival rate. From a heuristic point of view, though, there is an almost immediate Brownian analogue of the queueing model that is derived directly from the primitives and requires no (limit) interpretation of the parameters. In this paper we prove that for the fundamental M/GI/1+GI queue, the direct, intuitive (limitless) approach in fact works. The Brownian model is universally (i.e., across regimes and patience distributions) accurate. It maintains the tractability and appeal of the limit approximations while avoiding many of the assumptions that facilitate them. In the process of building mathematical support for the accuracy of this model, we introduce a framework built around “queue families” that allows us to treat various patience distributions simultaneously, and uncovers the role of a concentration property of the queue. View paper.

Itai Gurvich, Jan A. Van Mieghem, Lu Wang and Nicholas Soulakis "Digital and Physical Team Interaction: How Team Size and Stability Determine Individual Productivity,"
Teamwork involves interpersonal communication and digital documentation. We quantify and benchmark the impact of interruptions, team size and team stability on a team member’s productivity. We present a linear “team evolution model” of the productivity of a member who acts as an information hub. In our healthcare field study, this focal member is a hospitalist who coordinates patient care among a physical team, captured by observational study, and a digital team, captured in a patient’s Electronic Health Record. Our model predicts that 19.5 min (73%) of a hospitalist’s time to process an average patient of 26.7 min per day is determined by teamwork characteristics. As a benchmark, eliminating interruptions could improve the 26.7 min by 3.5 min (13%) while minimizing team size and maximizing team stability improves by 7.8 min (29%). Paper available upon request.

Seung Bum Soh and Itai Gurvich "Duality in staffing problems: Between holding costs and waiting constraints,"
There are two alternative ways to capture the tension between capacity expenses and customer-delay costs in staffing problems. The cost paradigm assigns a price tag to customer delay and optimizes the combined costs of staffing and waiting. The constraint paradigm, in contrast, replaces the waiting-time cost with constraints and seeks to minimize staffing costs subject to these constraints. The duality of these two formulations is important for both the implementation of delay costs through constraints (e.g., specifying constraints in a contract) and for the reverse engineering of the dollar value that a provider, solving a given constraint formulation, assigns implicitly to customer delay. In the single-class queue, this duality is a simple matter: the optimal tradeoff of capacity and delay can be implemented via a staffing problem with, for example, average waiting constraints. Given the waiting-time constraints that a provider uses, we can figure out the underlying implicit delay costs. In the multiclass case, where one must determine both the optimal staffing and the optimal prioritization, things become more involved. Strictly convex costs can be reliably implemented by any strictly convex constraints. Linear waiting constraints, while common in practice, do not provide a “safe” implementation of any simple cost structure. They can be made safe, however, by augmenting them with a variance constraint. When seeking to reverse engineer constraints to costs, strictly convex constraints are straightforward. Linear constraints are, however, not uniquely reversible, and strictly concave constraints cannot be an implementation of any strictly increasing waiting costs. Finally, since strictly convex costs have multiple implementations through constraints, it is desirable to propose a “best” implementation. We numerically study the robustness of different implementations. View paper.
Publications

Dennis J. Zhang, Itai Gurvich, Jan A. Van Mieghem, Eric Park, Mark Williams and Robert Young "Hospital Readmission Reduction Program: An Economic and Operational Analysis," forthcoming in Management Science.
 1st place in the 2014 POMS College of Healthcare Operations Management Best Paper Award.
The Hospital Readmissions Reduction Program (HRRP), a part of the US Patient Protection and Affordable Care Act, requires the Centers for Medicare and Medicaid Services (CMS) to penalize hospitals with excess readmissions. We take an economic and operational (patient flow) perspective to analyze the effectiveness of this policy in encouraging hospitals to reduce readmissions. We develop a game-theoretic model that captures the competition among hospitals inherent in HRRP's benchmarking mechanism. We show that this competition can be counterproductive: it increases the number of non-incentivized hospitals, which prefer paying penalties over reducing readmissions in any equilibrium. We calibrate our model with a dataset of more than 3,000 hospitals in the United States and show that under the current policy, and for a large set of parameters, 4% to 13% of the hospitals remain non-incentivized to reduce readmissions. We also validate our model against the actual performance of hospitals in the three years since the introduction of the policy. We draw several policy recommendations to improve this policy's outcome. For example, localizing the benchmarking process (comparing hospitals against similar peers) improves the performance of the policy. View paper.

Seung Bum Soh and Itai Gurvich "Call-center staffing: Service-level constraints and index priorities," forthcoming in Operations Research.
Services, and call centers in particular, often reflect the relative importance of different customer segments through quality-of-service (QoS) targets. These targets, in turn, affect how customers are prioritized as well as the minimal capacity required to satisfy them. Some call centers heuristically rely on the flexible family of index rules for customer prioritization: the specific parameters of the index rule are adjusted to reflect the QoS targets. In this paper we study the fit of this heuristic to QoS-constrained staffing problems. We prove that, despite their appeal, index rules are over-constraining (and, hence, suboptimal) for an interesting reason. While QoS targets are typically marginally defined (a target per segment), restricting attention to index rules is equivalent to imposing a demanding constraint on the joint distribution: requiring that a VIP customer always waits less than a regular customer who arrives at the same time. View paper.

Itai Gurvich and Jan A. Van Mieghem "Collaboration and Multitasking in Networks: Architectures, Bottlenecks and Throughput," Manufacturing and Service Operations Management (M&SOM), 17(1), 16–33.
Motivated by the trend toward more collaboration in work flows, we study networks where some activities require the simultaneous processing by multiple types of multitasking human resources. Collaboration imposes constraints on the capacity of the process because multitasking resources have to be simultaneously at the right place. We introduce the notions of collaboration architecture and unavoidable bottleneck idleness to study the maximal throughput or capacity of such networks. Collaboration and multitasking introduce synchronization requirements that may inflict unavoidable idleness of the bottleneck resources: even when the network is continuously busy (processing at capacity), bottleneck resources can never be fully utilized. The conventional approach that equates network capacity with bottleneck capacity is then incorrect because the network capacity is below that of the bottlenecks. In fact, the gap between the two can grow linearly with the number of collaborative activities. Our main result is that networks with nested collaboration architectures have no unavoidable bottleneck idleness. Then, regardless of the processing times of the various activities, the standard bottleneck procedure correctly identifies the network capacity. We also prove necessity in the sense that, for any non-nested architecture, there are values of processing times for which unavoidable idleness persists. The fundamental tradeoff between collaboration and capacity does not disappear in multi-server networks and has important ramifications to service-system staffing. Yet, even in multi-server networks, a nested collaboration architecture still guarantees that the bottleneck capacity is achievable. Finally, simultaneous collaboration, as a process constraint, may limit the benefits of flexibility. We study the interplay of flexibility and unavoidable idleness and offer remedies derived from collaboration architectures. View paper.

Itai Gurvich "Diffusion models and steady-state approximations for exponentially ergodic Markovian queues," The Annals of Applied Probability 24(6), 2527–2559.
Motivated by queues with many servers, we study Brownian steady-state approximations for continuous time Markov chains (CTMCs). Our approximations are based on diffusion models (rather than a diffusion limit) whose steady-state, we prove, approximates that of the Markov chain with notable precision. Strong approximations provide such “limitless” approximations for process dynamics. Our focus here is on steady-state distributions, and the diffusion model that we propose is tractable relative to strong approximations. Within an asymptotic framework, in which a scale parameter n is taken large, a uniform (in the scale parameter) Lyapunov condition imposed on the sequence of diffusion models guarantees that the gap between the steady-state moments of the diffusion and those of the properly centered and scaled CTMCs shrinks at a rate of √n. Our proofs build on gradient estimates for solutions of the Poisson equations associated with the (sequence of) diffusion models and on elementary martingale arguments. As a byproduct of our analysis, we explore connections between Lyapunov functions for the fluid model, the diffusion model and the CTMC. View paper.

Itai Gurvich and Amy Ward "On the dynamic control of matching queues," Stochastic Systems 4, 1–45.
We consider the optimal control of matching queues with random arrivals. In this model, items arrive to dedicated queues, and wait to be matched with items from other (possibly multiple) queues. A match type corresponds to the set of item classes required for a match. Once a decision has been made to perform a match, the matching itself is instantaneous and the matched items depart from the system. We consider the problem of minimizing finite-horizon cumulative holding costs. The controller must decide which matchings to execute given multiple options. In principle, the controller may choose to wait until some “inventory” of items builds up to facilitate more profitable matches in the future. We introduce a multidimensional imbalance process that, at each time t, is given by a linear function of the cumulative arrivals to each of the item classes. A nonzero value of the imbalance at time t means that no control could have matched all the items that arrived by time t. A lower bound based on the imbalance process can be specified, at each time point, by a solution to an optimization problem with linear constraints. While not achievable in general, this lower bound can be asymptotically approached under a dedicated item condition (an analogue of the local traffic condition in bandwidth sharing networks). We devise a myopic discrete-review matching control that asymptotically (as the arrival rates become large) achieves the imbalance-based lower bound. View paper.

Itai Gurvich, Junfei Huang and Avishai Mandelbaum "Excursion-based universal approximations for the Erlang-A queue in steady-state," Mathematics of Operations Research 39(2), 325–373.
We revisit many-server approximations for the well-studied Erlang-A queue. This is a system with a single pool of i.i.d. servers that serve one class of impatient i.i.d. customers. Arrivals follow a Poisson process and service times are exponentially distributed as are the customers’ patience times. We propose a diffusion approximation that applies simultaneously to all existing many-server heavy-traffic regimes: quality and efficiency driven, efficiency driven, quality driven, and nondegenerate slowdown. We prove that the approximation provides accurate estimates for a broad family of steady-state metrics. Our approach is “metric-free” in that we do not use the specific formulas for the steady-state distribution of the Erlang-A queue. Rather, we study excursions of the underlying birth-and-death process and couple these to properly defined excursions of the corresponding diffusion process. Regenerative process and martingale arguments, together with derivative bounds for solutions to certain ordinary differential equations, allow us to control the accuracy of the approximation. We demonstrate the appeal of universal approximation by studying two staffing optimization problems of practical interest. View paper.

Rami Atar and Itai Gurvich "Scheduling parallel servers in the nondegenerate slowdown diffusion regime: Asymptotic optimality results," The Annals of Applied Probability, 24(2), 760–810.
We consider the problem of minimizing queue-length costs in a system with heterogeneous parallel servers, operating in a many-server heavy-traffic regime with nondegenerate slowdown. This regime is distinct from the well-studied heavy traffic diffusion regimes, namely the (single server) conventional regime and the (many-server) Halfin–Whitt regime. It has the distinguishing property that waiting times and service times are of comparable magnitudes. We establish an asymptotic lower bound on the cost and devise a sequence of policies that asymptotically attain this bound. As in the conventional regime, the asymptotics can be described by means of a Brownian control problem, the solution of which exhibits a state space collapse. View paper.

Itai Gurvich and Ohad Perry "Overflow networks: approximations and implications to call center outsourcing," Operations Research, 60(4), pp. 996–1009.
Motivated by call center co-sourcing problems, we consider a service network operated under an overflow mechanism. Calls are first routed to an in-house (or dedicated) service station that has a finite waiting room. If the waiting room is full, the call is overflowed to an outside provider (an overflow station) that might also be serving overflows from other stations. We establish approximations for overflow networks with many servers under a resource-pooling assumption that stipulates, in our context, that the fraction of overflowed calls is non-negligible. Our two main results are (i) an approximation for the overflow processes via limit theorems and (ii) asymptotic independence between each of the in-house stations and the overflow station. In particular, we show that, as the system becomes large, the dependency between each in-house station and the overflow station becomes negligible. Independence between stations in overflow networks is assumed in the literature on call centers, and we provide rigorous support for those useful heuristics. View paper.

Baris Ata and Itai Gurvich "On optimality gaps in the Halfin–Whitt regime," The Annals of Applied Probability, 22(1), 407–455.
We consider optimal control of a multiclass queue in the Halfin–Whitt regime, and revisit the notion of asymptotic optimality and the associated optimality gaps. The existing results in the literature for such systems provide asymptotically optimal controls with optimality gaps of o(√n) where n is the system size, for example, the number of servers. We construct a sequence of asymptotically optimal controls where the optimality gap grows logarithmically with the system size. Our analysis relies on a sequence of Brownian control problems, whose refined structure helps us achieve the improved optimality gaps. View paper.

Sarang Deo and Itai Gurvich "Centralized vs. Decentralized Ambulance Diversion: A Network Perspective," Management Science, 57(7), pp. 1300–1319.
In recent years, growth in the demand for emergency medical services along with decline in the number of hospitals with emergency departments (EDs) has led to overcrowding. In periods of overcrowding, an ED can request the Emergency Medical Services (EMS) agency to divert incoming ambulances to neighboring hospitals, a phenomenon known as “ambulance diversion”. The EMS agency will accept this request provided that at least one of the neighboring EDs is not on diversion. From an operations perspective, properly executed ambulance diversion should result in resource pooling and reduce the overcrowding and delays in a network of EDs. Recent evidence indicates, however, that this potential benefit is not always realized. In this paper, we provide one potential explanation for this discrepancy and suggest potential remedies. Using a queueing game between two EDs that aim to minimize their own waiting time, we find that decentralized decisions regarding diversion explain the lack of pooling benefits. Specifically, we find the existence of a defensive equilibrium, wherein each ED does not accept diverted ambulances from the other ED. This defensiveness results in a depooling of the network and, in turn, in delays that are significantly higher than when a social planner coordinates diversion. The social optimum is, itself, difficult to characterize analytically and has limited practical appeal as it depends on problem parameters such as arrival rates and length of stay. Instead, we identify an alternative solution that is more amenable to implementation and can be used by the EMS agencies to coordinate diversion decisions even without the exact knowledge of these parameters. We show that this solution is approximately optimal for the social planner’s problem. Moreover, it is Pareto improving over the defensive equilibrium whereas the social optimum, in general, might not be. View paper.

Itai Gurvich, James Luedtke and Tolga Tezcan "Staffing Call-Centers With Uncertain Demand Forecasts: A Chance-Constraints Approach," Management Science, 56(7), pp. 1093–1115.
We consider the problem of staffing call-centers with multiple customer classes and agent types operating under quality-of-service (QoS) constraints and demand rate uncertainty. We introduce a formulation of the staffing problem that requires that the QoS constraints are met with high probability with respect to the uncertainty in the demand rate. We contrast this chance-constrained formulation with the average-performance constraints that have been used so far in the literature. We then propose a two-step solution for the staffing problem under chance constraints. In the first step, we introduce a Random Static Planning Problem (RSPP) and discuss how it can be solved using two different methods. The RSPP provides us with a first-order (or fluid) approximation for the true optimal staffing levels and a staffing frontier. In the second step, we solve a finite number of staffing problems with known arrival rates, namely the arrival rates on the optimal staffing frontier. Hence, our formulation and solution approach has the important property that it translates the problem with uncertain demand rates to one with known arrival rates. The output of our procedure is a solution that is feasible with respect to the chance constraint and nearly optimal for large call centers. View paper.

Itai Gurvich "Validity of heavy-traffic steady-state approximations in multiclass queueing networks: The case of queue-ratio disciplines," Mathematics of Operations Research, 39(1), 121–162.
A class of stochastic processes known as semi-martingale reflecting Brownian motions (SRBMs) is often used to approximate the dynamics of heavily loaded queueing networks. In two influential papers, Bramson (1998) and Williams (1998) laid out a general and structured approach for proving the validity of such heavy-traffic approximations, in which an SRBM is obtained as a diffusion limit from a sequence of suitably normalized workload processes. However, for multiclass networks it is still not known in general whether the steady-state distribution of the SRBM provides a valid approximation for the steady-state distribution of the original network. In this paper we study the case of queue-ratio disciplines and provide a set of sufficient conditions under which the above question can be answered in the affirmative. In addition to standard assumptions made in the literature towards the stability of the pre- and post-limit processes and the existence of diffusion limits, we add a requirement that solutions to the fluid model are attracted to the invariant manifold at linear rate. For the special case of static-priority networks such linear attraction is known to hold under certain conditions on the network primitives. The analysis elucidates interesting connections between stability of the pre- and post-limit processes, their respective fluid models and state-space collapse, and identifies the respective roles played by all of the above in establishing validity of heavy-traffic steady-state approximations. View paper.

Gad Allon, Achal Bassamboo and Itai Gurvich "'We Will be Right with You': Managing Customers with Vague Promises," Operations Research, 59(6), pp. 1382–1394.
Delay announcements informing customers about anticipated service delays are prevalent in service-oriented systems. How delay announcements can influence customers in service systems is a complex problem that depends on both the dynamics of the underlying queueing system and on the customers’ strategic behavior. We examine this problem of information communication by considering a model in which both the firm and the customers act strategically: the firm in choosing its delay announcement while anticipating customer response, and the customers in interpreting these announcements and in making the decision about when to join the system and when to balk. We characterize the equilibrium language that emerges between the service provider and her customers. The analysis of the emerging equilibria provides new and interesting insights into customer-firm information sharing. We show that even though the information provided to customers is non-verifiable, it improves the profits of the firm and the expected utility of the customers. The robustness of the results is illustrated via various extensions of the model. In particular, studying models with incomplete information on the system parameters allows us also to highlight the role of information provision in managing customer expectations regarding the congestion in the system. Further, the information could be as simple as “high congestion”/“low congestion” announcements, or it could be as detailed as the true state of the system. We also show that firms may choose to shade some of the truth by using intentional vagueness to lure customers. View paper.

Itai Gurvich and Ward Whitt "Queue-and-Idleness-Ratio Controls in Many-Server Service Systems," Mathematics of Operations Research, 34(2), pp. 363–396.
Motivated by call centers, we study large-scale service systems with multiple customer classes and multiple agent pools, each with many agents. We propose a family of routing rules called Queue-and-Idleness-Ratio (QIR) rules. A newly available agent next serves the customer from the head of the queue of the class (from among those he is eligible to serve) whose queue length most exceeds a specified state-dependent proportion of the total queue length. An arriving customer is routed to the agent pool whose idleness most exceeds a specified state-dependent proportion of the total idleness. We identify regularity conditions on the network structure and system parameters under which QIR produces an important state-space collapse (SSC) result in the Quality-and-Efficiency-Driven (QED) many-server heavy-traffic limiting regime. The SSC result is applied here to prove stochastic-process limits and in subsequent papers to solve important staffing and control problems for large-scale service systems. View paper.

Itai Gurvich and Ward Whitt "Scheduling Flexible Servers with Convex Delay Costs In Many-Server Service Systems," Manufacturing and Service Operations Management (M&SOM), 11(2), pp. 237–253.
In a recent paper we introduced the fixed-queue-ratio (FQR) family of routing rules for many-server service systems with multiple customer classes and server pools. A newly available server next serves the customer from the head of the queue of the class (from among those he is eligible to serve) whose queue length most exceeds a specified proportion of the total queue length. Under fairly general conditions, FQR produces an important state-space collapse as the total arrival rate and the numbers of servers increase in a coordinated way. That state-space collapse was previously used to delicately balance service levels for the different customer classes. In this sequel, we show that a special version of FQR stochastically minimizes convex holding costs in a finite-horizon setting when the service rates are restricted to be pool-dependent. Under additional regularity conditions, the special version of FQR reduces to a simple policy: Linear costs produce a priority-type rule, in which the least-cost customers are given low priority. Strictly convex costs (plus other regularity conditions) produce a many-server analogue of the generalized-cµ (Gcµ) rule, under which a newly available server selects a customer from the class experiencing the greatest marginal cost at that time. View paper.

Gad Allon and Itai Gurvich "Pricing and Dimensioning Competing Large-Scale Service Providers," Manufacturing and Service Operations Management (M&SOM), 12(3), pp. 449–469.
The literature on many-server approximations provides significant simplifications towards the optimal capacity sizing of large-scale monopolists but falls short of providing similar simplifications for a competitive setting in which each firm’s decision is affected by its competitors’ actions. In this paper, we introduce a framework that combines many-server heavy-traffic analysis with the notion of epsilon-Nash equilibrium and apply it to the study of equilibria in a market with multiple large-scale service providers that compete on both prices and response times. In an analogy to fluid and diffusion approximations for queueing systems, we introduce the notions of fluid game and diffusion game. The proposed framework allows us to provide first-order and second-order characterization results for the equilibria in these markets. We use our results to provide insights into the price and service-level choices in the market and, in particular, into the impact of the market scale on the interdependence between these two strategic decisions. View paper.

Itai Gurvich and Ward Whitt "Service-Level Differentiation in Many-Server Service System Via Queue-Ratio Routing," Operations Research, 58(2), pp. 316–328.
Motivated by telephone call centers, we study large-scale service systems with multiple customer classes and multiple agent pools, each with many agents. To minimize staffing costs subject to service-level constraints, where we delicately balance the service levels (SLs) of the different classes, we propose a family of routing rules called Fixed-Queue-Ratio (FQR) rules. With FQR, a newly available agent next serves the customer from the head of the queue of the class (from among those he is eligible to serve) whose queue length most exceeds a specified proportion of the total queue length. The proportions can be set to achieve desired SL targets. The FQR rule achieves an important state-space collapse (SSC) as the total arrival rate increases, in which the individual queue lengths evolve as fixed proportions of the total queue length. In the current paper we consider a variety of service-level types and exploit SSC to construct asymptotically optimal solutions for the staffing-and-routing problem. The key assumption in the current paper is that the service rates depend only on the agent pool. View paper.
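The queue-ratio selection step described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the function name `fqr_next_class`, its arguments, the skipping of empty queues and the tie-breaking (first eligible class wins) are all hypothetical choices made here.

```python
from typing import Optional, Sequence


def fqr_next_class(
    queue_lengths: Sequence[int],
    proportions: Sequence[float],
    eligible: Sequence[int],
) -> Optional[int]:
    """Pick the class a newly available agent serves next (illustrative sketch).

    Among the eligible classes with a non-empty queue, return the class whose
    queue length most exceeds its target proportion of the total queue length.
    Returns None when no eligible class has a waiting customer (assumption:
    the agent then stays idle).
    """
    total = sum(queue_lengths)
    # Only classes the agent may serve and that have someone waiting.
    candidates = [i for i in eligible if queue_lengths[i] > 0]
    if not candidates:
        return None
    # Excess of class i over its target share: Q_i - p_i * (total queue length).
    return max(candidates, key=lambda i: queue_lengths[i] - proportions[i] * total)


if __name__ == "__main__":
    # Three classes with target proportions 0.5, 0.2 and 0.3 of the total queue.
    print(fqr_next_class([4, 5, 1], [0.5, 0.2, 0.3], eligible=[0, 1, 2]))
```

With queue lengths (4, 5, 1), the total is 10 and the excesses over the targets are -1, 3 and -2, so an agent eligible for all classes takes the head of queue 1; an agent eligible only for classes 0 and 2 would take class 0.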

Itai Gurvich, Mor Armony and Costis Maglaras "Cross-Selling in a Call Center with a Heterogeneous Customer Population," Operations Research, 57(2), pp. 299-313.
Cross-selling is becoming an increasingly prevalent practice in call centers, due, in part, to its unique capability to allow firms to dynamically segment their callers and customize their product offerings accordingly. This paper considers a call center with cross-selling capability that serves a pool of customers that are differentiated in terms of their revenue potential and delay sensitivity. It studies the operational decisions of staffing, call routing, and cross-selling under various forms of customer segmentation. It derives near-optimal controls in each of the settings analyzed, and characterizes the impact of a more refined customer segmentation on the structure of these policies and the center’s profitability. View paper.

Itai Gurvich and Mor Armony "When Promotions Meet Operations: Cross Selling and Its Effect on Call-Center Performance," Manufacturing and Service Operations Management (M&SOM), 12(3), pp. 470-488.
We study cross-selling operations in call centers. The following question is addressed: How many customer-service representatives are required (staffing) and when should cross-selling opportunities be exercised (control) in a way that will maximize the expected profit of the center while maintaining a prespecified service level target? We tackle this question by characterizing control and staffing schemes that are asymptotically optimal in the limit, as the system load grows large. Our main finding is that a threshold priority (TP) control, in which cross-selling is exercised only if the number of callers in the system is below a certain threshold, is asymptotically optimal in great generality. The asymptotic optimality of TP reduces the staffing problem to a solution of a simple deterministic problem, in one regime, and to a simple search procedure in another. We show that our joint staffing and control scheme is nearly optimal for large systems. Furthermore, it performs extremely well even for relatively small systems. View paper.
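
The threshold priority (TP) decision described in this abstract amounts to a single comparison, sketched below. The function and parameter names are hypothetical, and the abstract does not say how the threshold value is chosen, so it is passed in as an input.

```python
def should_cross_sell(callers_in_system, threshold):
    """Return True if a cross-selling opportunity should be exercised.

    Cross-selling is exercised only when the number of callers in the system
    is strictly below the threshold, matching the abstract's "below a certain
    threshold". How the threshold is set is outside the scope of this sketch.
    """
    return callers_in_system < threshold


# Hypothetical threshold of 5 callers.
for n in (3, 5, 8):
    print(n, should_cross_sell(n, threshold=5))
```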

Itai Gurvich, Mor Armony and Avishai Mandelbaum "Service Level Differentiation in Call Centers with Fully Flexible Servers," Management Science, 54(2), pp. 279-294.
We study large-scale service systems with multiple customer classes and many statistically identical servers. The following question is addressed: How many servers are required (staffing) and how does one match them with customers (control) in order to minimize staffing cost, subject to class level quality of service constraints? We tackle this question by characterizing scheduling and staffing schemes that are asymptotically optimal in the limit, as system load grows to infinity. The asymptotic regimes considered are consistent with the Efficiency Driven (ED), Quality Driven (QD) and Quality and Efficiency Driven (QED) regimes, first introduced in the context of a single class service system. Our main findings are: a) Decoupling of staffing and control, namely (i) Staffing disregards the multiclass nature of the system and is analogous to the staffing of a single class system with the same aggregate demand and a single global quality of service constraint, and (ii) Class level service differentiation is obtained by using a simple Idle server based Threshold-Priority (ITP) control (with state-independent thresholds), b) Robustness of the staffing and control rules: Our proposed Single-Class Staffing (SCS) rule and ITP control are approximately optimal under various problem formulations and model assumptions. Particularly, although our solution is shown to be asymptotically optimal for large systems, we numerically demonstrate that it performs well also for relatively small systems. View paper.