Research

Groundbreaking work and published results in peer-reviewed journals across disciplines.

  • ‘Shotgun … Synthesis Approach Enables the Discovery of Small-Molecule Inhibitors Against Pathogenic Free-Living Amoeba Glucokinases’

    “Pathogenic free-living amoebae (pFLA) can cause life-threatening central nervous system (CNS) infections and warrant the investigation of new chemical agents to combat the rise of infection from these pathogens. … Herein, we used our previously demonstrated multifragment kinetic target-guided synthesis (KTGS) screening strategy to identify inhibitors against pFLA glucokinases. … This work demonstrates the utility of KTGS to identify small-molecule binders for biological targets where resolved X-ray crystal structures are not readily accessible.” Find the paper and full list of authors at ACS Infectious Diseases.

  • ‘Mainstream News Articles Co-Shared With Fake News Buttress Misinformation Narratives’

    “Most prior and current research examining misinformation spread on social media focuses on reports published by ‘fake’ news sources. These approaches fail to capture another potential form of misinformation with a much larger audience: factual news from mainstream sources (‘real’ news) repurposed to promote false or misleading narratives. … We find that certain articles from reliable outlets are shared by a disproportionate number of users who also shared fake news on Twitter. … We show that co-shared articles contain existing misinformation narratives at a significantly higher rate.” Find the paper and full list of authors at ArXiv.

  • ‘From 5G Sniffing to Harvesting Leakages of Privacy-Preserving Messengers’

    “We present the first open-source tool capable of efficiently sniffing 5G control channels, 5GSniffer and demonstrate its potential to conduct attacks on users privacy. 5GSniffer builds on our analysis of the 5G RAN control channel exposing side-channel leakage. We note that decoding the 5G control channels is significantly more challenging than in LTE. … We devise a set of techniques to achieve real-time control channels sniffing (over three orders of magnitude faster than brute-forcing).” Find the paper and full list of authors at the 2023 IEEE Symposium on Security and Privacy.

  • ‘A Computational Model of Coping and Decision Making in High-Stress, Uncertain Situations: An Application to Hurricane Evacuation Decisions’

    “Modeling and predicting people’s behavior in [stressful, emotion-evoking situations] is a critical research topic. To that end, we propose a computational model of coping that casts Lazarus’s theory of coping into a Partially Observable Markov Decision Process (POMDP) framework. … We evaluated the model’s assumptions in the context of a high-stress situation, hurricanes. … The results support the model’s assumptions showing that the proposed features are significantly associated with the evacuation decisions and people change their beliefs and goals to cope with the situation.” Find the paper and full list of authors at IEEE Transactions on Affective Computing.

  • ‘SABRE: Robust Bayesian Peer-to-Peer Federated Learning’

    “We introduce SABRE, a novel framework for robust variational Bayesian peer-to-peer federated learning. We analyze the robustness of the known variational Bayesian peer-to-peer federated learning framework (BayP2PFL) against poisoning attacks and subsequently show that BayP2PFL is not robust against those attacks. The new SABRE aggregation methodology is then devised to overcome the limitations of the existing frameworks. SABRE works well in non-IID settings, does not require the majority of the benign nodes over the compromised ones, and even outperforms the baseline algorithm in benign settings.” Find the paper and full list of authors at ArXiv.

  • ‘Investigating Large Language Models’ Perception of Emotion Using Appraisal Theory’

    “As more people interact with [Large Language Models like ChatGPT], improving our understanding of these black box models is crucial, especially regarding their understanding of human psychological aspects. In this work, we investigate their emotion perception through the lens of appraisal and coping theory using the Stress and Coping Process Questionnaire (SCPQ). … The results show that LLMs’ responses are similar to humans in terms of dynamics of appraisal and coping, but their responses did not differ along key appraisal dimensions as predicted by the theory and data.” Find the paper and full list of authors at ArXiv.

  • ‘An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit With Low Regret’

    “Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds. We propose a new efficient algorithm with lower regret than even previous inefficient ones.” Find the paper and full list of authors in the Proceedings of the AAAI Conference on Artificial Intelligence.

  • ‘Streaming Submodular Maximization With Differential Privacy’

    “In this work, we study the problem of privately maximizing a submodular function in the streaming setting. … When the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions.” Find the paper and full list of authors in Proceedings of Machine Learning Research.

  • ‘DomainNet: Homograph Detection and Understanding in Data Lake Disambiguation’

    “Modern data lakes are heterogeneous in the vocabulary that is used to describe data. … How can we determine if a data value occurring more than once in the lake has different meanings and is therefore a homograph? While word and entity disambiguation have been well studied, … we show that data lakes provide a new opportunity for disambiguation of data values. … We introduce DomainNet, which efficiently represents this network, and investigate to what extent it can be used to disambiguate values without requiring any supervision.” Find the paper and full list of authors in ACM Transactions on Database…

  • ‘Black-box Attacks Against Neural Binary Function Detection’

    “Binary analyses based on deep neural networks (DNNs), or neural binary analyses (NBAs), have become a hotly researched topic in recent years. DNNs have been wildly successful at pushing the performance and accuracy envelopes in the natural language and image processing domains. … [However,] in this paper, we empirically demonstrate that the current state of the art in neural function boundary detection is vulnerable to both inadvertent and deliberate adversarial attacks.” Find the paper and full list of authors in the Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses.

  • ‘Accelerating Finite Field Arithmetic for Homomorphic Encryption on GPUs’

    “Fully homomorphic encryption (FHE) is a rapidly developing technology that enables computation directly on encrypted data, making it a compelling solution for security in cloud-based systems. In addition, modern FHE schemes are believed to be resistant to quantum attacks. Although FHE offers unprecedented potential for security, current implementations suffer from prohibitively high latency. … The parallel processing capabilities provided by modern GPUs make them compelling candidates to target these highly parallelizable workloads. In this article, we discuss methods to accelerate polynomial multiplication with GPUs, with the goal of making FHE practical.” Find the paper and list of authors at IEEE…

  • ‘Mapping the Typographic Latent Space of Digits’

    “Since the advancement of handwritten text to typefaces on a computer, the human mind has evolved towards corresponding various typefaces as norms of comprehension. … Currently, the PANOSE system, developed in 1998, is the most widely used and accepted method for classifying typefaces based on 10 visual attributes. In this work, we employ Disentangled Beta-VAE’s, in an unsupervised learning approach, to map the latent feature space with a dataset of MNIST Style Typographic Images (TMNIST-Digit) of 0-9 digits across 2990 unique font styles.” Find the paper and full list of authors at Open Review.

  • ‘StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks’

    “Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach. While deep neural networks have shown impressive results in computer vision, most of the previous obstacle detection works only leverage traditional stereo matching techniques to meet the computational constraints for real-time feedback. This paper proposes a computationally efficient method that employs a deep neural network to detect occupancy from stereo images directly. … Our approach extracts the compact obstacle distribution based on volumetric representations.” Find the paper and full list of authors in the 2023 IEEE International Conference on Robotics and Automation proceedings.

  • ‘Lilac: A Modal Separation Logic for Conditional Probability’

    “We present Lilac, a separation logic for reasoning about probabilistic programs where separating conjunction captures probabilistic independence. Inspired by an analogy with mutable state where sampling corresponds to dynamic allocation, we show how probability spaces over a fixed, ambient sample space appear to be the natural analogue of heap fragments, and present a new combining operation on them such that probability spaces behave like heaps and measurability of random variables behaves like ownership. This combining operation forms the basis for our model of separation, and produces a logic with many pleasant properties.” Find the paper and authors list at ArXiv.

  • ‘Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models With Generative AI’

    “With recent advances in Generative AI, it is becoming easier to automatically manipulate 3D models. However, current methods tend to apply edits to models globally, which risks compromising the intended functionality of the 3D model when fabricated in the physical world. For example, modifying functional segments in 3D models, such as the base of a vase, could break the original functionality of the model, thus causing the vase to fall over. We introduce a method for automatically segmenting 3D models into functional and aesthetic elements.” Find the paper and full list of authors at ArXiv.

  • ‘High-Throughput Microscopy Image Deblurring With Graph Reasoning Attention Network’

    “High-quality (HQ) microscopy images afford more detailed information for modern life science research and quantitative image analyses. However, in practice, HQ microscopy images are not commonly available or suffer from blurring artifacts. Compared with natural images, such low-quality (LQ) microscopy ones often share some visual characteristics: more complex structures, less informative background, and repeating patterns. … To address those problems, we collect HQ electron microscopy and histology datasets and propose a graph reasoning attention network (GRAN).” Find the paper and full list of authors in the 2023 IEEE 20th International Symposium on Biomedical Imaging proceedings.

  • ‘MultiPL-E: A Scalable and Polyglot Approach to Benchmarking Neural Code Generation’

    “Large language models have demonstrated the ability to generate both natural language and programming language text. Although contemporary code generation models are trained on corpora with several programming languages, they are tested using benchmarks that are typically monolingual. The most widely used code generation benchmarks only target Python, so there is little quantitative evidence of how code generation models perform on other programming languages. We propose MultiPL-E, a system for translating unit test-driven code generation benchmarks to new languages.” Find the paper and full list of authors at IEEE Transactions on Software Engineering.

  • ‘NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-Shot Real Image Animation’

    “Nerf-based Generative models have shown impressive capacity in generating high-quality images with consistent 3D geometry. Despite successful synthesis of fake identity images randomly sampled from latent space, adopting these models for generating face images of real subjects is still a challenging task due to its so-called inversion issue. In this paper, we propose a universal method to surgically finetune these NeRF-GAN models in order to achieve high-fidelity animation of real subjects only by a single image.” Find the paper and full list of authors in the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition proceedings.

  • ‘Viper: A Fast Snapshot Isolation Checker’

    “Snapshot isolation (SI) is supported by most commercial databases and is widely used by applications. However, checking SI today — given a set of transactions, checking if they obey SI — is either slow or gives up soundness. We present viper, an SI checker that is sound, complete and fast. Viper checks black-box databases and hence is transparent to both users and databases. To be fast, viper introduces BC-polygraphs, a new representation of transaction dependencies.” Find the paper and full list of authors in the Proceedings of the Eighteenth European Conference on Computer Systems.

  • ‘PAC-Learning for Strategic Classification’

    “The study of strategic or adversarial manipulation of testing data to fool a classifier has attracted much recent attention. Most previous works have focused on two extreme situations where any testing data point either is completely adversarial or always equally prefers the positive label. In this paper, we generalize both of these through a unified framework by considering strategic agents with heterogenous preferences, and introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup. SVC provably generalizes the recent concept of adversarial VC-dimension (AVC).” Find the paper and full list of authors at JMLR.

  • ‘Framing Frames: Bypassing Wi-Fi Encryption by Manipulating Transmit Queues’

    “Wi-Fi devices routinely queue frames at various layers of the network stack before transmitting, for instance, when the receiver is in sleep mode. In this work, we investigate how Wi-Fi access points manage the security context of queued frames. By exploiting power-save features, we show how to trick access points into leaking frames in plaintext, or encrypted using the group or an all-zero key. We demonstrate resulting attacks against several open-source network stacks.” Find the paper and full list of authors in the 32nd USENIX Security Symposium proceedings.

  • ‘One Tree to Rule Them All: Poly-Logarithmic Universal Steiner Tree’

    “A spanning tree T of graph G is a ρ-approximate universal Steiner tree (UST) for root vertex r if, for any subset of vertices S containing r, the cost of the minimal subgraph of T connecting S is within a ρ factor of the minimum cost tree connecting S in G. … We settle [several] open questions by giving polynomial-time algorithms for computing both O(log^7 n)-approximate USTs and poly-logarithmic strong sparse partition hierarchies.” Find the paper and full list of authors at ArXiv.

  • ‘OASIS: Optimal Arrangements for Sensing in SLAM’

    “The number and arrangement of sensors on an autonomous mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximize simple heuristics like field-of-view (FOV) coverage to decide where to place exteroceptive sensors. … We conduct an information-theoretic investigation of this overlooked element of mobile robotic perception in the context of simultaneous localization and mapping.” Find the paper and authors list at ArXiv.

  • ‘La Independiente: Designing Ubiquitous Systems for Latin American and Caribbean Women Crowdworkers’

    “Since 2018, Venezuelans have contributed to 75% of leading AI crowd work platforms’ total workforce. … Few initiatives have investigated the impact of such work in the Global South through the lens of feminist theory. … We surveyed 55 LAC women on the crowd work platform Toloka to understand their personal goals, professional values and hardships. … Most participants shared a desire to hear the experiences of other women crowdworkers.” Find the paper and full list of authors in the Adjunct Proceedings of the 2023 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the International Symposium on Wearable…

  • ‘SCORE: A Second-Order Conic Initialization for Range-Aided SLAM’

    “We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain quality results. The initialization technique proposed here relaxes the RA-SLAM problem to a convex problem which is then solved to determine an initialization for the original, non-convex problem.” Find the paper and full list of authors in the IEEE International Conference on Robotics and Automation proceedings.

  • ‘Towards Automated Pain Assessment Using Embodied Conversational Agents’

    “Narrative accounts are the ultimate authoritative source for pain assessment, and face-to-face encounters provide a rich context in which nonverbal conversational behavior can be used to enrich the detail in these descriptions. Embodied Conversational Agents—animated characters that simulate face-to-face conversation—can provide a medium for automated pain assessment in which multimodal pain narratives are elicited, clarified and grounded. … We describe work towards a conversational agent that elicits various aspects of a pain experience, followed by an empathic summary.” Find the paper and full list of authors in the Companion Publication of the 25th International Conference on Multimodal Interaction.

  • ‘Improving Multiparty Interactions With a Robot Using Large Language Models’

    “Speaker diarization is a key component of systems that support multiparty interactions of co-located users, such as meeting facilitation robots. The goal is to identify who spoke what, often to provide feedback, moderate participation, and personalize responses by the robot. … We leverage large language models (LLMs) to identify speaker labels from transcribed text and observe an exact match of 77% and a word level accuracy of 90%.” Find the paper and full list of authors in the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems.

  • ‘NPM-Follower: A Complete Dataset Tracking the NPM Ecosystem’

    “Software developers typically rely upon a large network of dependencies to build their applications. … However, prior work on NPM dataset construction typically has two limitations: 1) only metadata is scraped, and 2) packages or versions that are deleted from NPM can not be scraped. … We present npm-follower, a dataset and crawling architecture which archives metadata and code of all packages and versions as they are published and is thus able to retain data which is later deleted.” Find the paper and full list of authors at ArXiv.

  • ‘Sublinear Time Algorithms and Complexity of Approximate Maximum Matching’

    “Sublinear time algorithms for approximating maximum matching size have long been studied. Much of the progress over the last two decades on this problem has been on the algorithmic side. … A more recent algorithm by [Behnezhad, Roghani, Rubinstein, and Saberi; SODA’23] obtains a slightly-better-than-1/2 approximation in O(n^(1+ε)) time (for arbitrarily small constant ε>0). … Proving any super-linear in n lower bound, even for (1−ε)-approximations, has remained elusive. … In this paper, we prove the first super-linear in n lower bound for this problem.” Find the paper and authors list in the Proceedings of the 55th Annual ACM Symposium on…

  • ‘Testing Methods of Neural Systems Understanding’

    “Neuroscientists apply a range of analysis tools to recorded neural activity in order to glean insights into how neural circuits drive behavior in organisms. … Can the tools of neuroscience be applied to artificial neural networks (ANNs) and if so what would this process tell us about ANNs, brains, and – most importantly – the tools themselves? Here we argue that applying analysis methods from neuroscience to ANNs will provide a much-needed test of the abilities of these tools.” Find the paper and full list of authors at Cognitive Systems Research.
