Research
Groundbreaking work and published results in peer-reviewed journals across disciplines.

‘From 5G Sniffing to Harvesting Leakages of Privacy-Preserving Messengers’
“We present the first open-source tool capable of efficiently sniffing 5G control channels, 5GSniffer, and demonstrate its potential to conduct attacks on users’ privacy. 5GSniffer builds on our analysis of the 5G RAN control channel exposing side-channel leakage. We note that decoding the 5G control channels is significantly more challenging than in LTE. … We devise a set of techniques to achieve real-time control channels sniffing (over three orders of magnitude faster than brute-forcing).” Find the paper and full list of authors at the 2023 IEEE Symposium on Security and Privacy.

‘A Computational Model of Coping and Decision Making in High-Stress, Uncertain Situations: An Application to Hurricane Evacuation Decisions’
“Modeling and predicting people’s behavior in [stressful, emotion-evoking situations] is a critical research topic. To that end, we propose a computational model of coping that casts Lazarus’s theory of coping into a Partially Observable Markov Decision Process (POMDP) framework. … We evaluated the model’s assumptions in the context of a high-stress situation, hurricanes. … The results support the model’s assumptions showing that the proposed features are significantly associated with the evacuation decisions and people change their beliefs and goals to cope with the situation.” Find the paper and full list of authors at IEEE Transactions on Affective Computing.

‘SABRE: Robust Bayesian Peer-to-Peer Federated Learning’
“We introduce SABRE, a novel framework for robust variational Bayesian peer-to-peer federated learning. We analyze the robustness of the known variational Bayesian peer-to-peer federated learning framework (BayP2PFL) against poisoning attacks and subsequently show that BayP2PFL is not robust against those attacks. The new SABRE aggregation methodology is then devised to overcome the limitations of the existing frameworks. SABRE works well in non-IID settings, does not require the majority of the benign nodes over the compromised ones, and even outperforms the baseline algorithm in benign settings.” Find the paper and full list of authors at ArXiv.

‘Investigating Large Language Models’ Perception of Emotion Using Appraisal Theory’
“As more people interact with [Large Language Models like ChatGPT], improving our understanding of these black box models is crucial, especially regarding their understanding of human psychological aspects. In this work, we investigate their emotion perception through the lens of appraisal and coping theory using the Stress and Coping Process Questionnaire (SCPQ). … The results show that LLMs’ responses are similar to humans in terms of dynamics of appraisal and coping, but their responses did not differ along key appraisal dimensions as predicted by the theory and data.” Find the paper and full list of authors at ArXiv.

‘An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit With Low Regret’
“Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve suboptimal regret in terms of the number of rounds. We propose a new efficient algorithm with lower regret than even previous inefficient ones.” Find the paper and full list of authors in the Proceedings of the AAAI Conference on Artificial Intelligence.

‘Streaming Submodular Maximization With Differential Privacy’
“In this work, we study the problem of privately maximizing a submodular function in the streaming setting. … When the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better tradeoffs between privacy and utility for the special case of decomposable submodular functions.” Find the paper and full list of authors in Proceedings of Machine Learning Research.

‘DomainNet: Homograph Detection and Understanding in Data Lake Disambiguation’
“Modern data lakes are heterogeneous in the vocabulary that is used to describe data. … How can we determine if a data value occurring more than once in the lake has different meanings and is therefore a homograph? While word and entity disambiguation have been well studied, … we show that data lakes provide a new opportunity for disambiguation of data values. … We introduce DomainNet, which efficiently represents this network, and investigate to what extent it can be used to disambiguate values without requiring any supervision.” Find the paper and full list of authors in ACM Transactions on Database…

‘Black-Box Attacks Against Neural Binary Function Detection’
“Binary analyses based on deep neural networks (DNNs), or neural binary analyses (NBAs), have become a hotly researched topic in recent years. DNNs have been wildly successful at pushing the performance and accuracy envelopes in the natural language and image processing domains. … [However,] in this paper, we empirically demonstrate that the current state of the art in neural function boundary detection is vulnerable to both inadvertent and deliberate adversarial attacks.” Find the paper and full list of authors in the Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses.

‘Accelerating Finite Field Arithmetic for Homomorphic Encryption on GPUs’
“Fully homomorphic encryption (FHE) is a rapidly developing technology that enables computation directly on encrypted data, making it a compelling solution for security in cloud-based systems. In addition, modern FHE schemes are believed to be resistant to quantum attacks. Although FHE offers unprecedented potential for security, current implementations suffer from prohibitively high latency. … The parallel processing capabilities provided by modern GPUs make them compelling candidates to target these highly parallelizable workloads. In this article, we discuss methods to accelerate polynomial multiplication with GPUs, with the goal of making FHE practical.” Find the paper and list of authors at IEEE…

‘Mapping the Typographic Latent Space of Digits’
“Since the advancement of handwritten text to typefaces on a computer, the human mind has evolved towards corresponding various typefaces as norms of comprehension. … Currently, the PANOSE system, developed in 1998, is the most widely used and accepted method for classifying typefaces based on 10 visual attributes. In this work, we employ Disentangled Beta-VAEs, in an unsupervised learning approach, to map the latent feature space with a dataset of MNIST Style Typographic Images (TMNIST-Digit) of 0-9 digits across 2990 unique font styles.” Find the paper and full list of authors at Open Review.

‘StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks’
“Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach. While deep neural networks have shown impressive results in computer vision, most of the previous obstacle detection works only leverage traditional stereo matching techniques to meet the computational constraints for real-time feedback. This paper proposes a computationally efficient method that employs a deep neural network to detect occupancy from stereo images directly. … Our approach extracts the compact obstacle distribution based on volumetric representations.” Find the paper and full list of authors in the 2023 IEEE International Conference on Robotics and Automation proceedings.

‘Lilac: A Modal Separation Logic for Conditional Probability’
“We present Lilac, a separation logic for reasoning about probabilistic programs where separating conjunction captures probabilistic independence. Inspired by an analogy with mutable state where sampling corresponds to dynamic allocation, we show how probability spaces over a fixed, ambient sample space appear to be the natural analogue of heap fragments, and present a new combining operation on them such that probability spaces behave like heaps and measurability of random variables behaves like ownership. This combining operation forms the basis for our model of separation, and produces a logic with many pleasant properties.” Find the paper and authors list at ArXiv.

‘Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models With Generative AI’
“With recent advances in Generative AI, it is becoming easier to automatically manipulate 3D models. However, current methods tend to apply edits to models globally, which risks compromising the intended functionality of the 3D model when fabricated in the physical world. For example, modifying functional segments in 3D models, such as the base of a vase, could break the original functionality of the model, thus causing the vase to fall over. We introduce a method for automatically segmenting 3D models into functional and aesthetic elements.” Find the paper and full list of authors at ArXiv.

‘High-Throughput Microscopy Image Deblurring With Graph Reasoning Attention Network’
“High-quality (HQ) microscopy images afford more detailed information for modern life science research and quantitative image analyses. However, in practice, HQ microscopy images are not commonly available or suffer from blurring artifacts. Compared with natural images, such low-quality (LQ) microscopy ones often share some visual characteristics: more complex structures, less informative background, and repeating patterns. … To address those problems, we collect HQ electron microscopy and histology datasets and propose a graph reasoning attention network (GRAN).” Find the paper and full list of authors in the 2023 IEEE 20th International Symposium on Biomedical Imaging proceedings.

‘MultiPL-E: A Scalable and Polyglot Approach to Benchmarking Neural Code Generation’
“Large language models have demonstrated the ability to generate both natural language and programming language text. Although contemporary code generation models are trained on corpora with several programming languages, they are tested using benchmarks that are typically monolingual. The most widely used code generation benchmarks only target Python, so there is little quantitative evidence of how code generation models perform on other programming languages. We propose MultiPL-E, a system for translating unit test-driven code generation benchmarks to new languages.” Find the paper and full list of authors at IEEE Transactions on Software Engineering.

‘NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-Shot Real Image Animation’
“NeRF-based generative models have shown impressive capacity in generating high-quality images with consistent 3D geometry. Despite successful synthesis of fake identity images randomly sampled from latent space, adopting these models for generating face images of real subjects is still a challenging task due to its so-called inversion issue. In this paper, we propose a universal method to surgically fine-tune these NeRF-GAN models in order to achieve high-fidelity animation of real subjects only by a single image.” Find the paper and full list of authors in the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition proceedings.

‘Viper: A Fast Snapshot Isolation Checker’
“Snapshot isolation (SI) is supported by most commercial databases and is widely used by applications. However, checking SI today — given a set of transactions, checking if they obey SI — is either slow or gives up soundness. We present viper, an SI checker that is sound, complete and fast. Viper checks black-box databases and hence is transparent to both users and databases. To be fast, viper introduces BC-polygraphs, a new representation of transaction dependencies.” Find the paper and full list of authors in the Proceedings of the Eighteenth European Conference on Computer Systems.

‘PAC-Learning for Strategic Classification’
“The study of strategic or adversarial manipulation of testing data to fool a classifier has attracted much recent attention. Most previous works have focused on two extreme situations where any testing data point either is completely adversarial or always equally prefers the positive label. In this paper, we generalize both of these through a unified framework by considering strategic agents with heterogeneous preferences, and introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup. SVC provably generalizes the recent concept of adversarial VC-dimension (AVC).” Find the paper and full list of authors at JMLR.

‘Framing Frames: Bypassing Wi-Fi Encryption by Manipulating Transmit Queues’
“Wi-Fi devices routinely queue frames at various layers of the network stack before transmitting, for instance, when the receiver is in sleep mode. In this work, we investigate how Wi-Fi access points manage the security context of queued frames. By exploiting power-save features, we show how to trick access points into leaking frames in plaintext, or encrypted using the group or an all-zero key. We demonstrate resulting attacks against several open-source network stacks.” Find the paper and full list of authors in the 32nd USENIX Security Symposium proceedings.

‘One Tree to Rule Them All: Poly-Logarithmic Universal Steiner Tree’
“A spanning tree T of graph G is a ρ-approximate universal Steiner tree (UST) for root vertex r if, for any subset of vertices S containing r, the cost of the minimal subgraph of T connecting S is within a ρ factor of the minimum cost tree connecting S in G. … We settle [several] open questions by giving polynomial-time algorithms for computing both O(log^7 n)-approximate USTs and polylogarithmic strong sparse partition hierarchies.” Find the paper and full list of authors at ArXiv.

‘OASIS: Optimal Arrangements for Sensing in SLAM’
“The number and arrangement of sensors on an autonomous mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximize simple heuristics like field-of-view (FOV) coverage to decide where to place exteroceptive sensors. … We conduct an information-theoretic investigation of this overlooked element of mobile robotic perception in the context of simultaneous localization and mapping.” Find the paper and authors list at ArXiv.

‘La Independiente: Designing Ubiquitous Systems for Latin American and Caribbean Women Crowdworkers’
“Since 2018, Venezuelans have contributed to 75% of leading AI crowd work platforms’ total workforce. … Few initiatives have investigated the impact of such work in the Global South through the lens of feminist theory. … We surveyed 55 LAC women on the crowd work platform Toloka to understand their personal goals, professional values and hardships. … Most participants shared a desire to hear the experiences of other women crowdworkers.” Find the paper and full list of authors in the Adjunct Proceedings of the 2023 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the International Symposium on Wearable…

‘SCORE: A Second-Order Conic Initialization for Range-Aided SLAM’
“We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as nonconvex optimization, which requires a good initialization to obtain quality results. The initialization technique proposed here relaxes the RA-SLAM problem to a convex problem which is then solved to determine an initialization for the original, nonconvex problem.” Find the paper and full list of authors in the IEEE International Conference on Robotics and Automation proceedings.

‘Towards Automated Pain Assessment Using Embodied Conversational Agents’
“Narrative accounts are the ultimate authoritative source for pain assessment, and face-to-face encounters provide a rich context in which nonverbal conversational behavior can be used to enrich the detail in these descriptions. Embodied Conversational Agents—animated characters that simulate face-to-face conversation—can provide a medium for automated pain assessment in which multimodal pain narratives are elicited, clarified and grounded. … We describe work towards a conversational agent that elicits various aspects of a pain experience, followed by an empathic summary.” Find the paper and full list of authors in the Companion Publication of the 25th International Conference on Multimodal Interaction.

‘Improving Multiparty Interactions With a Robot Using Large Language Models’
“Speaker diarization is a key component of systems that support multiparty interactions of co-located users, such as meeting facilitation robots. The goal is to identify who spoke what, often to provide feedback, moderate participation, and personalize responses by the robot. … We leverage large language models (LLMs) to identify speaker labels from transcribed text and observe an exact match of 77% and a word-level accuracy of 90%.” Find the paper and full list of authors in the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems.

‘NPM-Follower: A Complete Dataset Tracking the NPM Ecosystem’
“Software developers typically rely upon a large network of dependencies to build their applications. … However, prior work on NPM dataset construction typically has two limitations: 1) only metadata is scraped, and 2) packages or versions that are deleted from NPM cannot be scraped. … We present npm-follower, a dataset and crawling architecture which archives metadata and code of all packages and versions as they are published and is thus able to retain data which is later deleted.” Find the paper and full list of authors at ArXiv.

‘Sublinear Time Algorithms and Complexity of Approximate Maximum Matching’
“Sublinear time algorithms for approximating maximum matching size have long been studied. Much of the progress over the last two decades on this problem has been on the algorithmic side. … A more recent algorithm by [Behnezhad, Roghani, Rubinstein, and Saberi; SODA’23] obtains a slightly-better-than-1/2 approximation in O(n^(1+ε)) time (for arbitrarily small constant ε>0). … Proving any superlinear in n lower bound, even for (1−ε)-approximations, has remained elusive. … In this paper, we prove the first superlinear in n lower bound for this problem.” Find the paper and authors list in the Proceedings of the 55th Annual ACM Symposium on…

‘Testing Methods of Neural Systems Understanding’
“Neuroscientists apply a range of analysis tools to recorded neural activity in order to glean insights into how neural circuits drive behavior in organisms. … Can the tools of neuroscience be applied to artificial neural networks (ANNs) and if so what would this process tell us about ANNs, brains, and – most importantly – the tools themselves? Here we argue that applying analysis methods from neuroscience to ANNs will provide a much-needed test of the abilities of these tools.” Find the paper and full list of authors at Cognitive Systems Research.

‘Flexible and Optimal Dependency Management via Max-SMT’
“Package managers such as NPM have become essential for software development. The NPM repository hosts over 2 million packages and serves over 43 billion downloads every week. Unfortunately, the NPM dependency solver has several shortcomings. … Although existing tools try to address these problems they are either brittle, rely on post hoc changes to the dependency tree, do not guarantee optimality, or are not composable. We present Pacsolve, a unifying framework and implementation for dependency solving which allows for customizable constraints and optimization goals.” Find the paper and full list of authors at the International Conference on Software Engineering proceedings.

‘Active Learning for Classifying 2D Grid-Based Level Completability’
“Determining the completability of levels generated by procedural generators such as machine learning models can be challenging, as it can involve the use of solver agents that often require a significant amount of time to analyze and solve levels. Active learning is not yet widely adopted in game evaluations, although it has been used successfully in natural language processing, image and speech recognition, and computer vision, where the availability of labeled data is limited or expensive. In this paper, we propose the use of active learning for learning level completability classification.” Find the paper and full list of authors at ArXiv.