Research

Groundbreaking work and published results in peer-reviewed journals across disciplines.


  • ‘SABRE: Robust Bayesian Peer-to-Peer Federated Learning’

    “We introduce SABRE, a novel framework for robust variational Bayesian peer-to-peer federated learning. We analyze the robustness of the known variational Bayesian peer-to-peer federated learning framework (BayP2PFL) against poisoning attacks and subsequently show that BayP2PFL is not robust against those attacks. The new SABRE aggregation methodology is then devised to overcome the limitations of the existing frameworks. SABRE works well in non-IID settings, does not require the majority of the benign nodes over the compromised ones, and even outperforms the baseline algorithm in benign settings.” Find the paper and full list of authors at ArXiv.

  • ‘Investigating Large Language Models’ Perception of Emotion Using Appraisal Theory’

    “As more people interact with [Large Language Models like ChatGPT], improving our understanding of these black box models is crucial, especially regarding their understanding of human psychological aspects. In this work, we investigate their emotion perception through the lens of appraisal and coping theory using the Stress and Coping Process Questionnaire (SCPQ). … The results show that LLMs’ responses are similar to humans in terms of dynamics of appraisal and coping, but their responses did not differ along key appraisal dimensions as predicted by the theory and data.” Find the paper and full list of authors at ArXiv.

  • ‘An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit With Low Regret’

    “Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds. We propose a new efficient algorithm with lower regret than even previous inefficient ones.” Find the paper and full list of authors in the Proceedings of the AAAI Conference on Artificial Intelligence.

  • ‘Streaming Submodular Maximization With Differential Privacy’

    “In this work, we study the problem of privately maximizing a submodular function in the streaming setting. … When the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions.” Find the paper and full list of authors in Proceedings of Machine Learning Research.

  • ‘DomainNet: Homograph Detection and Understanding in Data Lake Disambiguation’

    “Modern data lakes are heterogeneous in the vocabulary that is used to describe data. … How can we determine if a data value occurring more than once in the lake has different meanings and is therefore a homograph? While word and entity disambiguation have been well studied, … we show that data lakes provide a new opportunity for disambiguation of data values. … We introduce DomainNet, which efficiently represents this network, and investigate to what extent it can be used to disambiguate values without requiring any supervision.” Find the paper and full list of authors in ACM Transactions on Database…

  • ‘Black-box Attacks Against Neural Binary Function Detection’

    “Binary analyses based on deep neural networks (DNNs), or neural binary analyses (NBAs), have become a hotly researched topic in recent years. DNNs have been wildly successful at pushing the performance and accuracy envelopes in the natural language and image processing domains. … [However,] in this paper, we empirically demonstrate that the current state of the art in neural function boundary detection is vulnerable to both inadvertent and deliberate adversarial attacks.” Find the paper and full list of authors in the Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses.

  • ‘Accelerating Finite Field Arithmetic for Homomorphic Encryption on GPUs’

    “Fully homomorphic encryption (FHE) is a rapidly developing technology that enables computation directly on encrypted data, making it a compelling solution for security in cloud-based systems. In addition, modern FHE schemes are believed to be resistant to quantum attacks. Although FHE offers unprecedented potential for security, current implementations suffer from prohibitively high latency. … The parallel processing capabilities provided by modern GPUs make them compelling candidates to target these highly parallelizable workloads. In this article, we discuss methods to accelerate polynomial multiplication with GPUs, with the goal of making FHE practical.” Find the paper and list of authors at IEEE…

  • ‘Mapping the Typographic Latent Space of Digits’

    “Since the advancement of handwritten text to typefaces on a computer, the human mind has evolved towards corresponding various typefaces as norms of comprehension. … Currently, the PANOSE system, developed in 1998, is the most widely used and accepted method for classifying typefaces based on 10 visual attributes. In this work, we employ Disentangled Beta-VAE’s, in an unsupervised learning approach, to map the latent feature space with a dataset of MNIST Style Typographic Images (TMNIST-Digit) of 0-9 digits across 2990 unique font styles.” Find the paper and full list of authors at Open Review.

  • ‘StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks’

    “Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach. While deep neural networks have shown impressive results in computer vision, most of the previous obstacle detection works only leverage traditional stereo matching techniques to meet the computational constraints for real-time feedback. This paper proposes a computationally efficient method that employs a deep neural network to detect occupancy from stereo images directly. … Our approach extracts the compact obstacle distribution based on volumetric representations.” Find the paper and full list of authors in the 2023 IEEE International Conference on Robotics and Automation proceedings.

  • ‘Lilac: A Modal Separation Logic for Conditional Probability’

    “We present Lilac, a separation logic for reasoning about probabilistic programs where separating conjunction captures probabilistic independence. Inspired by an analogy with mutable state where sampling corresponds to dynamic allocation, we show how probability spaces over a fixed, ambient sample space appear to be the natural analogue of heap fragments, and present a new combining operation on them such that probability spaces behave like heaps and measurability of random variables behaves like ownership. This combining operation forms the basis for our model of separation, and produces a logic with many pleasant properties.” Find the paper and authors list at ArXiv.

  • ‘Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models With Generative AI’

    “With recent advances in Generative AI, it is becoming easier to automatically manipulate 3D models. However, current methods tend to apply edits to models globally, which risks compromising the intended functionality of the 3D model when fabricated in the physical world. For example, modifying functional segments in 3D models, such as the base of a vase, could break the original functionality of the model, thus causing the vase to fall over. We introduce a method for automatically segmenting 3D models into functional and aesthetic elements.” Find the paper and full list of authors at ArXiv.

  • ‘High-Throughput Microscopy Image Deblurring With Graph Reasoning Attention Network’

    “High-quality (HQ) microscopy images afford more detailed information for modern life science research and quantitative image analyses. However, in practice, HQ microscopy images are not commonly available or suffer from blurring artifacts. Compared with natural images, such low-quality (LQ) microscopy ones often share some visual characteristics: more complex structures, less informative background, and repeating patterns. … To address those problems, we collect HQ electron microscopy and histology datasets and propose a graph reasoning attention network (GRAN).” Find the paper and full list of authors in the 2023 IEEE 20th International Symposium on Biomedical Imaging proceedings.

  • ‘MultiPL-E: A Scalable and Polyglot Approach to Benchmarking Neural Code Generation’

    “Large language models have demonstrated the ability to generate both natural language and programming language text. Although contemporary code generation models are trained on corpora with several programming languages, they are tested using benchmarks that are typically monolingual. The most widely used code generation benchmarks only target Python, so there is little quantitative evidence of how code generation models perform on other programming languages. We propose MultiPL-E, a system for translating unit test-driven code generation benchmarks to new languages.” Find the paper and full list of authors in IEEE Transactions on Software Engineering.

  • ‘NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-Shot Real Image Animation’

    “Nerf-based Generative models have shown impressive capacity in generating high-quality images with consistent 3D geometry. Despite successful synthesis of fake identity images randomly sampled from latent space, adopting these models for generating face images of real subjects is still a challenging task due to its so-called inversion issue. In this paper, we propose a universal method to surgically finetune these NeRF-GAN models in order to achieve high-fidelity animation of real subjects only by a single image.” Find the paper and full list of authors in the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition proceedings.

  • ‘Viper: A Fast Snapshot Isolation Checker’

    “Snapshot isolation (SI) is supported by most commercial databases and is widely used by applications. However, checking SI today — given a set of transactions, checking if they obey SI — is either slow or gives up soundness. We present viper, an SI checker that is sound, complete and fast. Viper checks black-box databases and hence is transparent to both users and databases. To be fast, viper introduces BC-polygraphs, a new representation of transaction dependencies.” Find the paper and full list of authors in the Proceedings of the Eighteenth European Conference on Computer Systems.

  • ‘PAC-Learning for Strategic Classification’

    “The study of strategic or adversarial manipulation of testing data to fool a classifier has attracted much recent attention. Most previous works have focused on two extreme situations where any testing data point either is completely adversarial or always equally prefers the positive label. In this paper, we generalize both of these through a unified framework by considering strategic agents with heterogenous preferences, and introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup. SVC provably generalizes the recent concept of adversarial VC-dimension (AVC).” Find the paper and full list of authors at JMLR.

  • ‘Framing Frames: Bypassing Wi-Fi Encryption by Manipulating Transmit Queues’

    “Wi-Fi devices routinely queue frames at various layers of the network stack before transmitting, for instance, when the receiver is in sleep mode. In this work, we investigate how Wi-Fi access points manage the security context of queued frames. By exploiting power-save features, we show how to trick access points into leaking frames in plaintext, or encrypted using the group or an all-zero key. We demonstrate resulting attacks against several open-source network stacks.” Find the paper and full list of authors in the 32nd USENIX Security Symposium proceedings.

  • ‘One Tree to Rule Them All: Poly-Logarithmic Universal Steiner Tree’

    “A spanning tree T of graph G is a ρ-approximate universal Steiner tree (UST) for root vertex r if, for any subset of vertices S containing r, the cost of the minimal subgraph of T connecting S is within a ρ factor of the minimum cost tree connecting S in G. … We settle [several] open questions by giving polynomial-time algorithms for computing both O(log^7 n)-approximate USTs and poly-logarithmic strong sparse partition hierarchies.” Find the paper and full list of authors at ArXiv.

  • ‘OASIS: Optimal Arrangements for Sensing in SLAM’

    “The number and arrangement of sensors on an autonomous mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximize simple heuristics like field-of-view (FOV) coverage to decide where to place exteroceptive sensors. … We conduct an information-theoretic investigation of this overlooked element of mobile robotic perception in the context of simultaneous localization and mapping.” Find the paper and authors list at ArXiv.

  • ‘La Independiente: Designing Ubiquitous Systems for Latin American and Caribbean Women Crowdworkers’

    “Since 2018, Venezuelans have contributed to 75% of leading AI crowd work platforms’ total workforce. … Few initiatives have investigated the impact of such work in the Global South through the lens of feminist theory. … We surveyed 55 LAC women on the crowd work platform Toloka to understand their personal goals, professional values and hardships. … Most participants shared a desire to hear the experiences of other women crowdworkers.” Find the paper and full list of authors in the Adjunct Proceedings of the 2023 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the International Symposium on Wearable…

  • ‘SCORE: A Second-Order Conic Initialization for Range-Aided SLAM’

    “We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain quality results. The initialization technique proposed here relaxes the RA-SLAM problem to a convex problem which is then solved to determine an initialization for the original, non-convex problem.” Find the paper and full list of authors in the IEEE International Conference on Robotics and Automation proceedings.

  • ‘Towards Automated Pain Assessment Using Embodied Conversational Agents’

    “Narrative accounts are the ultimate authoritative source for pain assessment, and face-to-face encounters provide a rich context in which nonverbal conversational behavior can be used to enrich the detail in these descriptions. Embodied Conversational Agents—animated characters that simulate face-to-face conversation—can provide a medium for automated pain assessment in which multimodal pain narratives are elicited, clarified and grounded. … We describe work towards a conversational agent that elicits various aspects of a pain experience, followed by an empathic summary.” Find the paper and full list of authors in the Companion Publication of the 25th International Conference on Multimodal Interaction.

  • ‘Improving Multiparty Interactions With a Robot Using Large Language Models’

    “Speaker diarization is a key component of systems that support multiparty interactions of co-located users, such as meeting facilitation robots. The goal is to identify who spoke what, often to provide feedback, moderate participation, and personalize responses by the robot. … We leverage large language models (LLMs) to identify speaker labels from transcribed text and observe an exact match of 77% and a word level accuracy of 90%.” Find the paper and full list of authors in the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems.

  • ‘NPM-Follower: A Complete Dataset Tracking the NPM Ecosystem’

    “Software developers typically rely upon a large network of dependencies to build their applications. … However, prior work on NPM dataset construction typically has two limitations: 1) only metadata is scraped, and 2) packages or versions that are deleted from NPM can not be scraped. … We present npm-follower, a dataset and crawling architecture which archives metadata and code of all packages and versions as they are published and is thus able to retain data which is later deleted.” Find the paper and full list of authors at ArXiv.

  • ‘Sublinear Time Algorithms and Complexity of Approximate Maximum Matching’

    “Sublinear time algorithms for approximating maximum matching size have long been studied. Much of the progress over the last two decades on this problem has been on the algorithmic side. … A more recent algorithm by [Behnezhad, Roghani, Rubinstein, and Saberi; SODA’23] obtains a slightly-better-than-1/2 approximation in O(n^(1+ε)) time (for arbitrarily small constant ε>0). … Proving any super-linear in n lower bound, even for (1−ε)-approximations, has remained elusive. … In this paper, we prove the first super-linear in n lower bound for this problem.” Find the paper and authors list in the Proceedings of the 55th Annual ACM Symposium on…

  • ‘Testing Methods of Neural Systems Understanding’

    “Neuroscientists apply a range of analysis tools to recorded neural activity in order to glean insights into how neural circuits drive behavior in organisms. … Can the tools of neuroscience be applied to artificial neural networks (ANNs) and if so what would this process tell us about ANNs, brains, and – most importantly – the tools themselves? Here we argue that applying analysis methods from neuroscience to ANNs will provide a much-needed test of the abilities of these tools.” Find the paper and full list of authors in Cognitive Systems Research.

  • ‘Flexible and Optimal Dependency Management via Max-SMT’

    “Package managers such as NPM have become essential for software development. The NPM repository hosts over 2 million packages and serves over 43 billion downloads every week. Unfortunately, the NPM dependency solver has several shortcomings. … Although existing tools try to address these problems they are either brittle, rely on post hoc changes to the dependency tree, do not guarantee optimality, or are not composable. We present Pacsolve, a unifying framework and implementation for dependency solving which allows for customizable constraints and optimization goals.” Find the paper and full list of authors in the International Conference on Software Engineering proceedings.

  • ‘Active Learning for Classifying 2D Grid-Based Level Completability’

    “Determining the completability of levels generated by procedural generators such as machine learning models can be challenging, as it can involve the use of solver agents that often require a significant amount of time to analyze and solve levels. Active learning is not yet widely adopted in game evaluations, although it has been used successfully in natural language processing, image and speech recognition, and computer vision, where the availability of labeled data is limited or expensive. In this paper, we propose the use of active learning for learning level completability classification.” Find the paper and full list of authors at ArXiv.

  • ‘Persistent Memory Research in the Post-Optane Era’

    “After over a decade of researcher anticipation for the arrival of persistent memory (PMem), the first shipments of 3D XPoint-based Intel Optane Memory in 2019 were quickly followed by its cancellation in 2022. Was this another case of an idea quickly fading from future to past tense, relegating work in this area to the graveyard of failed technologies? … Without persistent memory itself, is future PMem research doomed? We offer two arguments for why reports of the death of PMem research are greatly exaggerated.” Find the paper and authors list in the Proceedings of the 1st Workshop on Disruptive Memory…

  • ‘The Digital-Safety Risks of Financial Technologies for Survivors of Intimate Partner Violence’

    “Digital technologies play a growing role in exacerbating financial abuse for survivors of intimate partner violence (IPV). … Scant research has examined how consumer-facing financial technologies can facilitate or obstruct IPV-related attacks on a survivor’s financial well-being. … We simulated both close-range and remote attacks commonly used by IPV adversaries. We discover that mobile banking and peer-to-peer payment applications are generally ill-equipped to deal with user-interface bound (UI-bound) adversaries, permitting unauthorized access to logins, surreptitious surveillance and harassing messages and system prompts.” Find the paper and full list of authors in the 32nd USENIX Security Symposium proceedings.
