Research
Groundbreaking work and published results in peer reviewed journals across disciplines.
Title
Topic
-
‘Somewhere Randomness Extraction and Security Against Bounded-Storage Mass Surveillance’
“Consider a state-level adversary who observes and stores large amounts of encrypted data from all users on the Internet, but does not have the capacity to store it all. Later, it may target certain ‘persons of interest.’ … We would like to guarantee that, if the adversary’s storage capacity is only (say) 1% of the total encrypted data size, then even if it can later obtain the decryption keys of arbitrary users, it can only learn something about the contents of (roughly) 1% of the ciphertexts.” Find the paper and the full list of authors in the Cryptology EPrint Archive.
-
‘A Map of Witness Maps: New Definitions and Connections’
“A witness map deterministically maps a witness w of some NP statement x into computationally sound proof that x is true. … A unique witness map (UWM) ensures that for any fixed statement x, the witness map should output the same unique proof for x, no matter what witness w it is applied to. … In this work, we study [compact witness maps] and UWMs as primitives of independent interest and present a number of interesting connections to various notions in cryptography.” Find the paper and the full list of authors in the Cryptology EPrint Archive.
-
‘Boosting Batch Arguments and RAM Delegation’
“We show how to generically improve the succinctness of non-interactive publicly verifiable batch argument (BARG) systems. In particular, we show (under a mild additional assumption) how to convert a BARG that generates proofs of length poly (m)· k1−є, where m is the length of a single instance and k is the number of instances being batched, into one that generates proofs of length poly (m, logk), which is the gold standard for succinctness of BARGs.” Find the paper and the full list of authors in the Proceedings of the 55th Annual ACM Symposium on Theory of Computing.
-
‘Doubly Efficient Private Information Retrieval and Fully Homomorphic RAM Computation From Ring LWE’
“A (single server) private information retrieval (PIR) allows a client to read data from a public database held on a remote server, without revealing to the server which locations she is reading. In a doubly efficient PIR (DEPIR), the database is first preprocessed, but the server can subsequently answer any client’s query in time that is sub-linear in the database size. … In this work we construct the stronger unkeyed notion of DEPIR, where the preprocessing is a deterministic procedure that the server can execute on its own.” Find the paper and full list of authors in the STOC 2023 proceedings.
-
‘Identification of Novel Anti-Amoebic Pharmacophores From Kinase Inhibitor Chemotypes’
“Acanthamoeba species, Naegleria fowleri, and Balamuthia mandrillaris are opportunistic pathogens that cause a range of brain, skin, eye, and disseminated diseases in humans and animals. These pathogenic free-living amoebae (pFLA) are commonly misdiagnosed and have sub-optimal treatment regimens which contribute to the extremely high mortality rates (>90%) when they infect the central nervous system. … Herein, we report the activity of the compounds against the trophozoite stage of each of the three amoebae, ranging from nanomolar to low micromolar potency.” Find the paper and the full list of authors at Frontiers in Microbiology.
-
‘Exploratory Thematic Analysis of Crowdsourced Photosensitivity Warnings’
“Films often include sequences of flashing lights for visual effect that may inadvertently trigger seizures when viewed by individuals with photosensitive epilepsy (PSE). Warnings about photosensitive risk in films can help people with PSE make informed decisions about their personal safety, but little is known about how to design such warnings and what information to include. To better understand the design space for photosensitive risk warnings, we conducted a qualitative analysis of 265 crowdsourced warnings about flashing lights in films.” Find the paper and the full list of authors at Conference on Human Factors in Computing Systems 2023 proceedings.
-
‘Is “Categorical Imperative” Metaversal?: A Kantian Ethical Framework for Social Virtual Reality’
“The increasing adoption of social virtual reality (VR) environments for socializing and collaborating with others has led to a growing concern about ethical issues in these immersive environments. Beyond the introduction of some practical guidelines, theoretical work on this topic has been scant. In this paper, we propose an ethical framework for social VR based on Kant’s Theory of Morality. In so doing, we argue that the Kantian concept of categorical imperative does apply to social VR.” Find the paper and the full list of authors in the Conference on Human Factors in Computing Systems 2023 proceedings.
-
‘Noise Stability Optimization for Flat Minima With Optimal Convergence Rates’
“We consider finding flat, local minimizers by adding average weight perturbations. Given a nonconvex function f:ℝd→ℝ and a d-dimensional distribution P which is symmetric at zero, we perturb the weight of f and define F(W)=𝔼[f(W+U)], where U is a random sample from P. This injection induces regularization through the Hessian trace of f for small, isotropic Gaussian perturbations. … Still, convergence rates are not known for finding minima under the average perturbations of the function F. This paper considers an SGD-like algorithm that injects random noise before computing gradients while leveraging the symmetry of P to reduce variance.” Find the paper and the full list of authors at ArXiv.
-
How cephalopods can inspire new technologies
A paper in ECS Sensors Plus details how the unique, natural sensors in cephalopod biology have inspired—and will continue to inspire—scientific innovation.
-
‘Synthesis of Distributed Protocols by Enumeration Modulo Isomorphisms’
“Synthesis of distributed protocols is a hard, often undecidable, problem. Completion techniques provide partial remedy by turning the problem into a search problem. However, the space of candidate completions is still massive. In this paper, we propose optimization techniques to reduce the size of the search space by a factorial factor by exploiting symmetries (isomorphisms) in functionally equivalent solutions. We present both a theoretical analysis of this optimization as well as empirical results that demonstrate its effectiveness in synthesizing both the Alternating Bit Protocol and Two Phase Commit.” Find the paper and the full list of authors at ArXiv.
-
‘”Who is the Right Homeless Client?”: Values in Algorithmic Homelessness Service Provision and Machine Learning Research’
“Homelessness presents a long-standing problem worldwide. Like other welfare services, homeless services have gained increased traction in Machine Learning (ML) research. Unhoused persons are vulnerable and using their data in the ML pipeline raises serious concerns about the unintended harms and consequences of prioritizing different ML values. … Unhoused persons were lost (i.e., humans were deprioritized) at multi-level ML abstraction of predictors, categories and algorithms. Our findings illuminate potential pathways forward … by situating humans at the center to support this vulnerable community.” Find the paper and full list of authors in the Conference on Human Factors in Computing Systems,…
-
‘Why, When and From Whom: Considerations for Collecting and Reporting Race and Ethnicity Data in HCI’
“Engaging diverse participants in HCI research is critical for creating safe, inclusive, and equitable technology. However, there is a lack of guidelines on when, why, and how HCI researchers collect study participants’ race and ethnicity. Our paper aims to take the first step toward such guidelines by providing a systematic review and discussion of the status quo of race and ethnicity data collection in HCI.” Find the paper and full list of authors in the Conference on Human Factors in Computing Systems, 2023, proceedings.
-
‘How to Combine Membership-Inference Attacks on Multiple Updated Machine Learning Models’
“A large body of research has shown that machine learning models are vulnerable to membership inference (MI) attacks that violate the privacy of the participants in the training data. Most MI research focuses on the case of a single standalone model, while production machine-learning platforms often update models over time, on data that often shifts in distribution, giving the attacker more information. This paper proposes new attacks that take advantage of one or more model updates to improve MI.” Find the paper and the full list of authors in the Proceedings on Privacy Enhancing Technologies Symposium.
-
‘TMI! Finetuned Models Leak Private Information from their Pretraining Data’
“Transfer learning has become an increasingly popular technique in machine learning as a way to leverage a pretrained model … to assist with building a finetuned model. … There are reasons to believe that the data used for pretraining is still sensitive, making it essential to understand how much information the finetuned model leaks about the pretraining data. In this work we propose a new membership-inference threat model where the adversary only has access to the finetuned model and would like to infer the membership of the pretraining data.” Find the paper and full list of authors at ArXiv.
-
‘Differentially Private Medians and Interior Points for Non-Pathological Data’
“We construct differentially private estimators with low sample complexity that estimate the median of an arbitrary distribution over ℝ satisfying very mild moment conditions. Our result stands in contrast to the surprising negative result of Bun et al. (FOCS 2015) that showed there is no differentially private estimator with any finite sample complexity that returns any non-trivial approximation to the median of an arbitrary distribution.” Find the paper and the full list of authors at ArXiv.
-
‘DIALITE: Discover, Align and Integrate Open Data Tables’
“We demonstrate a novel table discovery pipeline called DIALITE that allows users to discover, integrate and analyze open data tables. DIALITE has three main stages. First, it allows users to discover tables from open data platforms using state-of-the-art table discovery techniques. Second, DIALITE integrates the discovered tables to produce an integrated table. Finally, it allows users to analyze the integration result by applying different downstreaming tasks over it. Our pipeline is flexible such that the user can easily add and compare additional discovery and integration algorithms.” Find the paper and the full list of authors at ArXiv.
-
‘A Statistical Approach for Finding Property-Access Errors’
“We study the problem of finding incorrect property accesses in JavaScript where objects do not have a fixed layout, and properties (including methods) can be added, overwritten, and deleted freely throughout the lifetime of an object. Since referencing a non-existent property is not an error in JavaScript, accidental accesses to non-existent properties … can go undetected without thorough testing, and may manifest far from the source of the problem. We propose a two-phase approach for detecting property access errors based on the observation that, in practice, most property accesses will be correct.” Find the paper and full list of authors…
-
‘Safe Environmental Envelopes of Discrete Systems’
“A safety verification task involves verifying a system against a desired safety property under certain assumptions about the environment. However, these environmental assumptions may occasionally be violated due to modeling errors or faults. Ideally, the system guarantees its critical properties even under some of these violations, i.e., the system is robust against environmental deviations. This paper proposes a notion of robustness as an explicit, first-class property of a transition system that captures how robust it is against possible deviations in the environment.” Find the paper and the full list of authors at ArXiv.
-
Hoitash and Pei win 2023 Ronald Copeland Best Paper Awards
“Udi Hoitash, Lilian L. and Harry A. Cowan Professor of Accounting, and Amy Pei, Assistant Professor, Marketing, have authored papers awarded the 2023 Copeland Best Paper Awards. Each Group Research Committee nominated a paper published in 2022 by a faculty member in the Group to be considered for a Ronald Copeland Best Paper Award. Two papers were selected as this year’s winners as recommended by the DMSB Research Committee and confirmed by Dean Emery Trahan. Each D’Amore-McKim author will receive a $1,000 stipend.”
-
‘Mississippi River Low-Flows: Context, Causes and Future Projections’
“The Mississippi River represents a major commercial waterway, and periods of anomalously low river levels disrupt riverine transport. These low-flow events occur periodically, with a recent event in the fall of 2022 slowing barge traffic and generating sharp increases in riverine transportation costs. Here we … evaluate historical trends and future projections. … Model simulations from the LENS2 dataset show that, under a moderate-high emissions scenario (SSP3-7.0), the severity and duration of low-flow events is projected to decrease through to the end of the 21st century. ” Find the paper and the full list of authors in Environmental Research: Climate.
-
‘Just-in-Time for Supply Chains in Turbulent Times’
“The Covid-19 pandemic and other recent disruptions in the early 2020s led to sections in the business press blaming just-in-time (JIT) practices for operational failings. … Some scholars argue that JIT is not resilient, while others maintain that JIT can continue providing superior performance even with disruptions. Motivated by this debate, we discuss various misconceptions about JIT that underlie this debate … and argue that companies can improve their supply chain performance if JIT supply chain segments are chosen fittingly—even more so—during disruptions.” Find the paper and the full list of authors at Production and Operations Management.
-
‘Why Not Yet: Fixing a Top-k Ranking That is Not Fair to Individuals’
“This work considers why-not questions in the context of top-k queries and score-based ranking functions. Following the popular linear scalarization approach for multi-objective optimization, we study rankings based on the weighted sum of multiple scores. A given weight choice may be controversial or perceived as unfair. … We introduce various notions of such why-not-yet queries and formally define them as satisfiability or optimization problems, whose goal is to propose alternative ranking functions that address the placement of the entities of interest.” Find the paper and the full list of authors at the Association for Computing Machinery.
-
‘Fast Optimal Locally Private Mean Estimation via Random Projections’
“We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1+o(1)-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.” Find the paper and the full list of authors at ArXiv.
-
‘Online and Streaming Algorithms for Constrained k-Submodular Maximization’
“Constrained k-submodular maximization is a general framework that captures many discrete optimization problems such as ad allocation, influence maximization, personalized recommendation, and many others. In many of these applications, datasets are large or decisions need to be made in an online manner, which motivates the development of efficient streaming and online algorithms. In this work, we develop single-pass streaming and online algorithms for constrained k-submodular maximization with both monotone and general (possibly non-monotone) objectives subject to cardinality and knapsack constraints.” Find the paper and the full list of authors at ArXiv.
-
‘Poisoning Network Flow Classifiers’
“As machine learning (ML) classifiers increasingly oversee the automated monitoring of network traffic, studying their resilience against adversarial attacks becomes critical. This paper focuses on poisoning attacks, specifically backdoor attacks, against network traffic flow classifiers. We investigate the challenging scenario of clean-label poisoning where the adversary’s capabilities are constrained to tampering only with the training data—without the ability to arbitrarily modify the training labels or any other component of the training process.” Find the paper and the full list of authors at ArXiv.
-
‘Unleashing the Power of Randomization in Auditing Differentially Private ML’
“We present a rigorous methodology for auditing differentially private machine learning algorithms by adding multiple carefully designed examples called canaries. … First, we introduce Lifted Differential Privacy (LiDP) that expands the definition of differential privacy to handle randomized datasets. … Second, we audit LiDP by trying to distinguish between the model trained with K canaries versus K−1 canaries in the dataset, leaving one canary out. … Third, we introduce novel confidence intervals that take advantage of the multiple test statistics by adapting to the empirical higher-order correlations. Together, this new recipe demonstrates significant improvements in sample complexity.” Find the paper and the full list…
-
‘Freaky Leaky SMS: Extracting User Locations by Analyzing SMS Timings’
“Short Message Service (SMS) remains one of the most popular communication channels since its introduction. … In this paper, we demonstrate that merely receiving silent SMS messages regularly opens a stealthy side-channel that allows other regular network users to infer the whereabouts of the SMS recipient. The core idea is that receiving an SMS inevitably generates Delivery Reports whose reception bestows a timing attack vector at the sender. We conducted experiments across various countries, operators, and devices to show that an attacker can deduce the location of an SMS recipient.” Find the paper and the full list of authors at ArXiv.
-
‘How Creative Versus Technical Constraints Affect Individual Learning in an Online Innovation Community’
“Online innovation communities allow for a search for novel solutions within a design space bounded by constraints. Past research has focused on the effect of creative constraints on individual projects, but less is known about how constraints affect learning from repeated design submissions and the effect of the technical constraints that are integral to online platforms. How do creative versus technical constraints affect individual learning in exploring a design space in online communities? … We find that creative constraints lead to high rates of learning only if technical constraints are sufficiently relaxed.” Find the paper and full list of authors…
-
‘Homo in Machina: Improving Fuzz Testing Coverage via Compartment Analysis’
“Fuzz testing is often automated, but also frequently augmented by experts who insert themselves into the workflow in a greedy search for bugs. In this paper, we propose Homo in Machina, or HM-fuzzing, in which analyses guide the manual efforts, maximizing benefit. As one example … we introduce compartment analysis. Compartment analysis uses a whole-program dominator analysis to estimate the utility of reaching new code, and combines this with a dynamic analysis indicating drastically under-covered edges guarding that code.” Find the paper and full list of authors in the proceedings of the 2023 IEEE Conference on Software Testing, Verification and…