Gary Qiurui Ma
Ph.D .in Computer Science , Harvard University
qiurui_ma [at] g.harvar.edu
Ph.D .in Computer Science , Harvard University
qiurui_ma [at] g.harvar.edu
We model a delivery platform facilitating transactions among three sides: buyers, stores, and drivers. In addition to buyers paying store-specific purchase prices and drivers receiving store-buyer-specific delivery compensation from the platform, each buyer has the option to directly tip for delivery from a specific store. An equilibrium consists of prices, compensations, tips, and transactions that clears the market, such that buyers receive deliveries from preferred stores considering prices and tips they pay, and drivers deliver preferred orders considering the compensations and tips they receive.
We illustrate the role of tips in pricing: with tips an equilibrium always exists, but without tips an equilibrium is only guaranteed to exist when drivers outnumber the minimum number of buyers and stores. From an efficiency perspective, the optimal with-tip equilibrium welfare is always weakly larger than the optimal without-tip equilibrium welfare. However, we show that even with tips, efficient equilibria may not exist, and that calculating the optimal equilibrium welfare is NP-hard. To address these challenges, we identify conditions on market structure that ensure the existence of efficient with-tip equilibria and allow these efficient equilibria to be computed in polynomial time.
We model the role of an online platform disrupting a market with unit-demand buyers and unit-supply sellers. Each seller can transact with a subset of the buyers whom she already knows, as well as with any additional buyers to whom she is introduced by the platform. Given these constraints on trade, prices and transactions are induced by a competitive equilibrium. The platform's revenue is proportional to the total price of all trades between platform-introduced buyers and sellers.
In general, we show that the platform's revenue-maximization problem is computationally intractable. We provide structural results for revenue-optimal matchings and isolate special cases in which the platform can efficiently compute them. Furthermore, in a market where the maximum increase in social welfare that the platform can create is ΔW, we prove that the platform can attain revenue Ω(ΔW/log(min{n,m})), where n and m are the numbers of buyers and sellers, respectively. When ΔW is large compared to welfare without the platform, this gives a polynomial-time algorithm that guarantees a logarithmic approximation of the optimal welfare as revenue. We also show that even when the platform optimizes for revenue, the social welfare is at least an O(log(min{n,m}))-approximation to the optimal welfare. Finally, we prove significantly stronger bounds for revenue and social welfare in homogeneous-goods markets.
We introduce the theoretical study of a Platform Equilibrium in a market with unit-demand buyers and unit-supply sellers. Each seller can join a platform and transact with any buyer or remain off-platform and transact with a subset of buyers whom she knows. Given the constraints on trade, prices form a competitive equilibrium and clears the market. The platform charges a transaction fee to all on-platform sellers, in the form of a fraction of on-platform sellers' price. The platform chooses the fraction to maximize revenue. A Platform Equilibrium is a Nash equilibrium of the game where each seller decides whether or not to join the platform, balancing the effect of a larger pool of buyers to trade with, against the imposition of a transaction fee.
Our main insights are: (i) In homogeneous-goods markets, pure equilibria always exist and can be found by a polynomial-time algorithm; (ii) When the platform is unregulated, the resulting Platform Equilibrium guarantees a tight Θ(log(min(m,n)))-approximation of the optimal welfare in homogeneous-goods markets, where n and m are the number of buyers and sellers respectively; (iii) Even light regulation helps: when the platform's fee is capped at α∈[0,1), the price of anarchy is 2-α/1-α for general markets. For example, if the platform takes 30 percent of the seller's revenue, a rather high fee, our analysis implies the welfare in a Platform Equilibrium is still a 0.412-fraction of the optimal welfare. Our main results extend to markets with multiple platforms, beyond unit-demand buyers, as well as to sellers with production costs.
We study the behavior of an economic platform (e.g., Amazon, Uber Eats, Instacart) under shocks, such as COVID-19 lockdowns, and the effect of different regulation considerations imposed on a platform. To this end, we develop a multi-agent Gym environment of a platform economy in a dynamic, multi-period setting, with the possible occurrence of economic shocks. Buyers and sellers are modeled as economically-motivated agents, choosing whether or not to pay corresponding fees to use the platform. We formulate the platform’s problem as a partially observable Markov decision process, and use deep reinforcement learning to model its fee setting and matching behavior. We consider two major types of regulation frameworks: (1) taxation policies and (2) platform fee restrictions, and offer extensive simulated experiments to characterize regulatory tradeoffs under optimal platform responses. Our results show that while many interventions are ineffective with a sophisticated platform actor, we identify a particular kind of regulation—fixing fees to optimal, pre-shock fees while still allowing a platform to choose how to match buyer demands to sellers—as promoting the efficiency, seller diversity, and resilience of the overall economic system.
In empirical game-theoretic analysis (EGTA), game models are extended iteratively through a process of generating new strategies based on learning from experience with prior strategies. The strategy exploration problem in EGTA is how to direct this process so to construct effective models with minimal iteration. A variety of approaches have been proposed in the literature, including methods based on classic techniques and novel concepts. Comparing the performance of these alternatives can be surprisingly subtle, depending sensitively on criteria adopted and measures employed. We investigate some of the methodological considerations in evaluating strategy exploration, defining key distinctions and identifying a few general principles based on examples and experimental observations. In particular, we emphasize the fact that empirical games create a space of strategies that should be evaluated as a whole. Based on this fact, we suggest that the minimum regret constrained profile (MRCP) provides a particularly robust basis for evaluating a space of strategies, and propose a local search method for MRCP that outperforms previous approaches. However, the computation of MRCP is not always feasible especially in large games. In this scenario, we highlight consistency considerations for comparing across different approaches. Surprisingly, we find that recent works violate these considerations that are necessary for evaluation, which may result in misleading conclusions on the performance of different approaches. For proper evaluation, we propose a new evaluation scheme and demonstrate that our scheme can reveal the true learning performance of different approaches compared to previous evaluation methods.
Autonomous driving systems have a pipeline of perception, decision, planning, and control. The decision module processes information from the perception module and directs the execution of downstream planning and control modules. On the other hand, the recent success of deep learning suggests that this pipeline could be replaced by end-to-end neural control policies, however, safety cannot be well guaranteed for the data-driven neural networks. In this work, we propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning. This hybrid framework can preserve the merits of the classical pipeline such as the strict enforcement of physical and logical constraints while learning complex driving decisions from data. To circumvent the ambiguous annotation of human driving decisions, our method learns high-level driving decisions by imitating low-level control behaviors. We show in the simulation experiments that our modular driving agent can generalize its driving decision and control to various complex scenarios where the rule-based programs fail. It can also generate smoother and safer driving trajectories than end-to-end neural policies. Demo and code are available at https://decisionforce.github.io/modulardecision/.
May, 2025, starting internship at Microsoft Research New England, EconCS group
May, 2025, presenting our paper at Tenth Marketplace Innovation Workshop
May, 2025, presented our paper at Harvard EconCS seminar
July, 2024, presented our papers at 25th-ACM conference on Economics and Computation
May, 2024, attended the Nineth Marketplace Innovation Workshop
Dec 11, 2023: attended workshop "From Matching to Markets. A tale of Mathematics, Economics and Computer Science", CIRM, Marseille, France
Reviewers for ACM EC'25, ACM TheWebConf' 25, ACM TheWebConf' 24
Fall 2023, Teaching Fellow for CS136 Economics and Computation, Harvard
Fall 2023 until now, Teaching Fellow for Data-Driven Marketing, Harvard Business Analytical Program, Harvard Business School