Gary Qiurui Ma

Gary Qiurui Ma
Ph.D .in Computer Science , Harvard University
qiurui_ma [at] g.harvard.edu

I am a rising 5-th year PhD Student at Harvard University, where I am fortunate to be advised by Prof. David C. Parkes and Prof. Yannai A. Gonczarowski.

My research lies at the intersection of game theory, computer science and economics. I love using theoretical computer science frameworks like graph theory and stable matching theory to analyze economic systems. A topic that particularly interests me is the gig economy, where a digital platform mediates an online buyer-seller-driver market. I am looking for a postdoc position from Fall 2026.

Papers and Publications (* for alphabetical order)

Gonczarowski, Y.A., Ma, G. Q., & Parkes, D.C. *(2025). Pricing with Tips in Three-Sided Delivery Platforms
presented at Harvard EconcS seminar, 05/2025; Marketplace Innovation Workshop, 05/2025; Revenue Management and Pricing Conference 07/2025;
To be presented at Informs'25; presentation

We model a delivery platform facilitating transactions among three sides: buyers, stores, and couriers. In addition to buyers paying store-specific purchase prices and couriers receiving store--buyer-specific delivery compensation from the platform, each buyer has the option to directly tip for delivery from a specific store. An equilibrium consists of prices, compensations, tips, and transactions that clear the market, such that buyers receive deliveries from preferred stores considering the prices and tips they pay, and couriers deliver preferred orders considering the compensations and tips they receive.

We illustrate the role of tips in pricing: Without tips, an equilibrium is only guaranteed to exist when there are at least as many couriers as buyers or stores. In contrast, with tips an equilibrium always exists. From an efficiency perspective, the optimal with-tip equilibrium welfare is always weakly larger than the optimal without-tip equilibrium welfare. However, we show that even with tips, efficient equilibria may not exist, and calculating the optimal equilibrium welfare is NP-hard. To address these challenges, we identify natural conditions on market structure that ensure the existence of efficient with-tip equilibria and allow these efficient equilibria to be computed in polynomial time.

Gonczarowski, Y.A., D’Amico-Wong, L., Ma, G. Q., & Parkes, D.C. (2024). Disrupting Bipartite Trading Networks: Matching for Revenue Maximization
Accepted at the Twenty-Fifth ACM Conference on Economics and Computation (EC'24), 2024
presented at EC'24, Yale, US, 07/2024;
Earlier Version: D’Amico-Wong, L., Ma, G. Q., & Parkes, D.C. (2024).
Strategic Recommendation: Revenue Optimal Matching for Online Platforms,
Best Student Abstract In AAAI-24 Student Abstract and Poster, 3min presentation

We model the role of an online platform disrupting a market with unit-demand buyers and unit-supply sellers. Each seller can transact with a subset of the buyers whom she already knows, as well as with any additional buyers to whom she is introduced by the platform. Given these constraints on trade, prices and transactions are induced by a competitive equilibrium. The platform's revenue is proportional to the total price of all trades between platform-introduced buyers and sellers.

In general, we show that the platform's revenue-maximization problem is computationally intractable. We provide structural results for revenue-optimal matchings and isolate special cases in which the platform can efficiently compute them. Furthermore, in a market where the maximum increase in social welfare that the platform can create is ΔW, we prove that the platform can attain revenue Ω(ΔW/log(min{n,m})), where n and m are the numbers of buyers and sellers, respectively. When ΔW is large compared to welfare without the platform, this gives a polynomial-time algorithm that guarantees a logarithmic approximation of the optimal welfare as revenue. We also show that even when the platform optimizes for revenue, the social welfare is at least an O(log(min{n,m}))-approximation to the optimal welfare. Finally, we prove significantly stronger bounds for revenue and social welfare in homogeneous-goods markets.

Eden, A., Ma, G. Q., & Parkes, D. C. *(2023). Platform Equilibrium: Analyzing Social Welfare in Online Market Places
Accepted at the Twenty-Fifth ACM Conference on Economics and Computation (EC'24), 2024
presented at EC'24, Yale, US, 07/2024; CIRM workshop "From Matching to Markets." Marseille, France, 12/2023;
Invited for Presentation at MSOM Supply Chain Management SIG 2025;

We introduce the theoretical study of a Platform Equilibrium in a market with unit-demand buyers and unit-supply sellers. Each seller can join a platform and transact with any buyer or remain off-platform and transact with a subset of buyers whom she knows. Given the constraints on trade, prices form a competitive equilibrium and clears the market. The platform charges a transaction fee to all on-platform sellers, in the form of a fraction of on-platform sellers' price. The platform chooses the fraction to maximize revenue. A Platform Equilibrium is a Nash equilibrium of the game where each seller decides whether or not to join the platform, balancing the effect of a larger pool of buyers to trade with, against the imposition of a transaction fee.
Our main insights are: (i) In homogeneous-goods markets, pure equilibria always exist and can be found by a polynomial-time algorithm; (ii) When the platform is unregulated, the resulting Platform Equilibrium guarantees a tight Θ(log(min(m,n)))-approximation of the optimal welfare in homogeneous-goods markets, where n and m are the number of buyers and sellers respectively; (iii) Even light regulation helps: when the platform's fee is capped at α∈[0,1), the price of anarchy is 2-α/1-α for general markets. For example, if the platform takes 30 percent of the seller's revenue, a rather high fee, our analysis implies the welfare in a Platform Equilibrium is still a 0.412-fraction of the optimal welfare. Our main results extend to markets with multiple platforms, beyond unit-demand buyers, as well as to sellers with production costs.

Wang, X., Ma, G. Q., Eden, A., Li, C., Trott, A., Zheng, S., & Parkes, D. (2023, April). Platform Behavior under Market Shocks: A Simulation Framework and Reinforcement-Learning Based Study
In Proceedings of the ACM Web Conference 2023 (pp. 3592-3602).

We study the behavior of an economic platform (e.g., Amazon, Uber Eats, Instacart) under shocks, such as COVID-19 lockdowns, and the effect of different regulation considerations imposed on a platform. To this end, we develop a multi-agent Gym environment of a platform economy in a dynamic, multi-period setting, with the possible occurrence of economic shocks. Buyers and sellers are modeled as economically-motivated agents, choosing whether or not to pay corresponding fees to use the platform. We formulate the platform’s problem as a partially observable Markov decision process, and use deep reinforcement learning to model its fee setting and matching behavior. We consider two major types of regulation frameworks: (1) taxation policies and (2) platform fee restrictions, and offer extensive simulated experiments to characterize regulatory tradeoffs under optimal platform responses. Our results show that while many interventions are ineffective with a sophisticated platform actor, we identify a particular kind of regulation—fixing fees to optimal, pre-shock fees while still allowing a platform to choose how to match buyer demands to sellers—as promoting the efficiency, seller diversity, and resilience of the overall economic system.

Wang, Y., Ma, Q., & Wellman, M. P. (2022, May). Evaluating Strategy Exploration in Empirical Game-Theoretic Analysis
In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (pp. 1346-1354).

In empirical game-theoretic analysis (EGTA), game models are extended iteratively through a process of generating new strategies based on learning from experience with prior strategies. The strategy exploration problem in EGTA is how to direct this process so to construct effective models with minimal iteration. A variety of approaches have been proposed in the literature, including methods based on classic techniques and novel concepts. Comparing the performance of these alternatives can be surprisingly subtle, depending sensitively on criteria adopted and measures employed. We investigate some of the methodological considerations in evaluating strategy exploration, defining key distinctions and identifying a few general principles based on examples and experimental observations. In particular, we emphasize the fact that empirical games create a space of strategies that should be evaluated as a whole. Based on this fact, we suggest that the minimum regret constrained profile (MRCP) provides a particularly robust basis for evaluating a space of strategies, and propose a local search method for MRCP that outperforms previous approaches. However, the computation of MRCP is not always feasible especially in large games. In this scenario, we highlight consistency considerations for comparing across different approaches. Surprisingly, we find that recent works violate these considerations that are necessary for evaluation, which may result in misleading conclusions on the performance of different approaches. For proper evaluation, we propose a new evaluation scheme and demonstrate that our scheme can reveal the true learning performance of different approaches compared to previous evaluation methods.

Huang, J., Xie, S., Sun, J., Ma, Q., Liu, C., Lin, D., & Zhou, B. (2021, October). Learning a Decision Module by Imitation Driver's Control Behaviors
In Conference on Robot Learning (pp. 1-10). PMLR.

Autonomous driving systems have a pipeline of perception, decision, planning, and control. The decision module processes information from the perception module and directs the execution of downstream planning and control modules. On the other hand, the recent success of deep learning suggests that this pipeline could be replaced by end-to-end neural control policies, however, safety cannot be well guaranteed for the data-driven neural networks. In this work, we propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning. This hybrid framework can preserve the merits of the classical pipeline such as the strict enforcement of physical and logical constraints while learning complex driving decisions from data. To circumvent the ambiguous annotation of human driving decisions, our method learns high-level driving decisions by imitating low-level control behaviors. We show in the simulation experiments that our modular driving agent can generalize its driving decision and control to various complex scenarios where the rule-based programs fail. It can also generate smoother and safer driving trajectories than end-to-end neural policies. Demo and code are available at https://decisionforce.github.io/modulardecision/.

Events

July, 2025, presenting our paper at 2025 Informs Revenue Management and Pricing Section Conference
May, 2025, starting internship at Microsoft Research New England, EconCS group
May, 2025, presenting our paper at Tenth Marketplace Innovation Workshop
May, 2025, presented our paper at Harvard EconCS seminar
July, 2024, presented our papers at 25th-ACM conference on Economics and Computation
May, 2024, attended the Nineth Marketplace Innovation Workshop
Dec 11, 2023: attended workshop "From Matching to Markets. A tale of Mathematics, Economics and Computer Science", CIRM, Marseille, France

Professional Service

Reviewers for ACM EC'25, ACM TheWebConf' 25, ACM TheWebConf' 24

Teachings

Fall 2023, Teaching Fellow for CS136 Economics and Computation, Harvard
Fall 2023 until Spring 2025, Teaching Fellow for Data-Driven Marketing, Harvard Business Analytical Program, Harvard Business School

Page updated

Google Sites

Report abuse

Gary Qiurui MaPh.D .in Computer Science , Harvard Universityqiurui_ma [at] g.harvard.edu