All Work
-
‘SantaCoder: Don’t Reach for the Stars!’
“The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This tech report describes the progress of the collaboration until December 2022, outlining the current state of the Personally Identifiable Information (PII) redaction pipeline, the experiments conducted to de-risk the model architecture, and the experiments investigating better preprocessing methods for the training data. … We find that more aggressive filtering of near-duplicates can further boost performance and, surprisingly, that selecting files from repositories with 5+ GitHub stars deteriorates performance significantly.” Find the paper and the full list of authors at ArXiv.
-
‘Do Machine Learning Models Produce TypeScript Types that Type Check?’
“Type migration is the process of adding types to untyped code to gain assurance at compile time. TypeScript and other gradual type systems facilitate type migration by allowing programmers to start with imprecise types and gradually strengthen them. … Existing machine learning models report a high degree of accuracy in predicting individual TypeScript type annotations. However, in this paper we argue that accuracy can be misleading, and we should address a different question: can an automatic type migration tool produce code that passes the TypeScript type checker?” Read the paper and see the full list of authors in ArXiv.
-
‘CHiLL: Zero-Shot Custom Interpretable Feature Extraction From Clinical Notes With Large Language Models’
“Large Language Models (LLMs) have yielded fast and dramatic progress in NLP, and now offer strong few- and zero-shot capabilities on new tasks, reducing the need for annotation. This is especially exciting for the medical domain, in which supervision is often scant and expensive. At the same time, model predictions are rarely so accurate that they can be trusted blindly. … We propose CHiLL (Crafting High-Level Latents), which uses LLMs to permit natural language specification of high-level features for linear models via zero-shot feature extraction using expert-composed queries.” Find the paper and the full list of authors in ArXiv.
-
Jornet receives Best Demo Award for ‘Adversarial Aerial Metasurfaces’ at ACM HotMobile 2023
“Electrical and computer engineering associate professor Josep Jornet received the Best Demo Award at the 24th International Workshop on Mobile Computing Systems and Applications (HotMobile) for the work titled ‘Adversarial Aerial Metasurfaces,’ with electrical engineering student Sherif Badran, PhD’26, and collaborators at Rice and Brown Universities.”
-
‘Certifiably Correct Range-Aided SLAM’
“We present the first algorithm capable of efficiently computing certifiably optimal solutions to range-aided simultaneous localization and mapping (RA-SLAM) problems. Robotic navigation systems are increasingly incorporating point-to-point ranging sensors, leading to state estimation that takes the form of RA-SLAM. However, the RA-SLAM problem is more difficult to solve than traditional pose-graph SLAM … a single range measurement does not uniquely determine the relative transform between the involved sensors, and RA-SLAM inference is highly sensitive to initial estimates.” Read the paper and see the full list of authors in ArXiv.
-
‘Why is the State of Neural Network Pruning so Confusing?’
“The state of neural network pruning has been noticed to be unclear and even confusing for a while, largely due to ‘a lack of standardized benchmarks and metrics.’ To standardize benchmarks, first, we need to answer: what kind of comparison setup is considered fair? … Meanwhile, we observe several papers have used (severely) sub-optimal hyper-parameters in pruning experiments, while the reason behind them is also elusive. These sub-optimal hyper-parameters further exacerbate the distorted benchmarks, rendering the state of neural network pruning even more obscure.” Read the paper and see the full list of authors in ArXiv.
-
‘Adaptive Test Generation Using a Large Language Model’
“Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. This paper presents TestPilot, an adaptive test generation technique that leverages Large Language Models (LLMs). TestPilot uses Codex, an off-the-shelf LLM, to automatically generate unit tests for a given program without requiring additional training or few-shot learning on examples of existing tests.” Read the paper and see the full list of authors in ArXiv.
-
‘Improving Deep Policy Gradients With Value Function Search’
“Deep Policy Gradient (PG) algorithms employ value networks to drive the learning of parameterized policies and reduce the variance of the gradient estimates. However, value function approximation gets stuck in local optima and struggles to fit the actual return, limiting the variance reduction efficacy and leading policies to sub-optimal performance. This paper focuses on improving value approximation and analyzing the effects on Deep PG primitives such as value prediction, variance reduction, and correlation of gradient estimates with the true gradient.” Read the paper and see the full list of authors in ArXiv.
-
‘Safe Deep Reinforcement Learning by Verifying Task-Level Properties’
“Cost functions are commonly employed in Safe Deep Reinforcement Learning (DRL). However, the cost is typically encoded as an indicator function due to the difficulty of quantifying the risk of policy decisions in the state space. Such an encoding requires the agent to visit numerous unsafe states to learn a cost-value function to drive the learning process toward safety. … In this paper, we investigate an alternative approach that uses domain knowledge to quantify the risk in the proximity of such states by defining a violation metric.” Read the paper and see the full list of authors in ArXiv.
-
Flood dangers rise as shipping channels deepen
Graduate teaching assistant Maqsood Mansur, assistant professor Julia Hopkins and professor Qin Jim Chen have published a study investigating whether “depth increase in a navigational channel in an estuarine region results in the amplification of the inland penetration of storm surge, thereby increasing the flood vulnerability,” concluding “that even the most conservative scenario of [sea-level rise] will cause an approximately 51% increase in flooded area in … the deepest ship channel.” Find “Estuarine Response to Storm Surge and Sea-Level Rise Associated with Channel Deepening: A Flood Vulnerability Assessment of Southwest Louisiana, USA” and the full list of authors in Natural…
-
Landherr receives American Institute of Chemical Engineers grant to create instructional comic for high schoolers
“Chemical engineering distinguished teaching professor Lucas Landherr has received a $3,500 grant from the American Institute of Chemical Engineers Foundation to create a comic that details the work of chemical engineering for high school seniors and first-year college engineering students.”
-
‘Efficient Resilient Functions’
“An $n$-bit boolean function is resilient to coalitions of size $q$ if no fixed set of $q$ bits is likely to influence the value of the function when the other $n - q$ bits are chosen uniformly at random, even though the function is nearly balanced. We construct explicit functions resilient to coalitions of size $q = n/(\log n)^{O(\log \log n)} = n^{1-o(1)}$ computable by linear-size circuits and linear-time algorithms. We also obtain a tight size-depth tradeoff for computing such resilient functions.” Read the paper and see the full list of authors at SIAM.
-
Tadigadapa joins 2023 National Academy of Inventors as Senior Member
Professor and chair of electrical and computer engineering Srinivas Tadigadapa has been named a Senior Member of the National Academy of Inventors. The academy “was founded in 2010 to recognize and encourage inventors with patents issued from the United States Patent and Trademark Office, enhance the visibility of academic technology and innovation, encourage the disclosure of intellectual property, educate and mentor innovative students, and translate the inventions of its members to benefit society,” according to its mission statement.
-
Hofmann wins Outstanding Dissertation Award for work in disability studies and human-computer interaction
Megan “Hofmann, a senior research fellow at Khoury College who will begin as an assistant professor this fall,” has been awarded the SIGCHI Outstanding Dissertation Award for her work “within the fields of human–computer interaction (HCI) and digital fabrication,” Matty Wasserman writes for the Khoury College of Computer Sciences.
-
Innovations in printed electronics: Transistors in silicon
Professor of electrical and computer engineering Ravinder Dahiya, in collaboration with researchers from the University of Glasgow, has published research that advances electronic printing. Printing “high-performance and stable transistors … remains a major challenge. This is because of the difficulties to print high-mobility semiconducting materials and the lack of high-resolution printing techniques,” they write. Crucially, the researchers now propose “silicon based … transistors to demonstrate the possibility of developing high-performance complementary metal–oxide–semiconductor… computing architecture.” Read “Printed n- and p-Channel Transistors using Silicon Nanoribbons Enduring Electrical, Thermal, and Mechanical Stress” and see the full list of authors in ACS Publications.
-
‘NapSS: Paragraph-Level Medical Text Simplification via Narrative Prompting and Sentence-Matching Summarization’
“Accessing medical literature is difficult for laypeople as the content is written for specialists and contains medical jargon. Automated text simplification methods offer a potential means to address this issue. In this work, we propose a summarize-then-simplify two-stage strategy, which we call NapSS, identifying the relevant content to simplify while ensuring that the original narrative flow is preserved. In this approach, we first generate reference summaries via sentence matching between the original and the simplified abstracts.” Read the paper and see the full list of authors in ArXiv.
-
Bajpayee spotlight speaker at Orthopedic Research Society
Associate professor Ambika Bajpayee presented as a spotlight speaker at the 2023 Orthopedic Research Society conference, held February 10-14. Her talk was titled “Bioelectricity for Cartilage Drug Delivery and Imaging.”
-
Riley receives Black Heritage Award for ‘dedicated service to Northeastern’
“Civil and environmental engineering lecturer and operations manager Rozanna Riley was selected to receive the Black Heritage Award, which is given to those Northeastern staff and administrators in recognition of their dedicated service to Northeastern, to the students, and/or to the John D. O’Bryant African American Institute.”
-
Patent for ultrasonic, underwater communication system
“Electrical and computer engineering assistant professor Francesco Restuccia, research assistant professor Emrecan Demirors and professor Tommaso Melodia were awarded a patent for ‘Underwater ultrasonic communication system and method.’”
-
‘Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion’
“Graph neural networks are widely used tools for graph prediction tasks. Motivated by their empirical performance, prior works have developed generalization bounds for graph neural networks, which scale with graph structures in terms of the maximum degree. In this paper, we present generalization bounds that instead scale with the largest singular value of the graph neural network’s feature diffusion matrix.” Read the paper and see the full list of authors in ArXiv.
-
‘How Many and Which Training Points Would Need To Be Removed To Flip this Prediction?’
“We consider the problem of identifying a minimal subset of training data $S_t$ such that if the instances comprising $S_t$ had been removed prior to training, the categorization of a given test point $x_t$ would have been different. … We propose comparatively fast approximation methods to find $S_t$ based on influence functions, and find that, for simple convex text classification models, these approaches can often successfully identify relatively small sets of training examples which, if removed, would flip the prediction.” Read the paper and see the full list of authors in ArXiv.
-
Ganguly presents ‘a personal journey’ of climate resilience
“Auroop Ganguly, professor of civil and environmental engineering at Northeastern University, will share his personal journey building climate resilience. Professor Ganguly co-founded the climate analytics startup risQ, which models the complex financial risks posed by climate change.”
-
Hajjar receives $3.1 million grant for carbon-neutral construction research
“In a new $3.1 million grant from the Department of Energy’s Advanced Research Projects Agency-Energy (ARPA-E), Northeastern department of civil and environmental engineering chair and CDM Smith Professor Jerome Hajjar will lead a multi-institution team of researchers developing a new carbon sequestration technique using cross-laminated timber composite floor systems in bolted steel construction for building structures. The new structural method aims to decrease the use of steel while increasing the use of carbon-storing timber and design for deconstruction methods.”
-
‘An Optimized Acidic Digestion for the Isolation of Microplastics From Biota-Rich Samples’
“Plastic pollution is a growing concern. To analyze plastics in environmental samples, plastics need to be isolated. We present an acidic/oxidative method optimized to preserve plastics while digesting synthetic cellulose acetate and a range of organics encountered in environmental samples.” Find the paper and the full list of authors in Environmental Pollution.
-
‘Generative Adversarial Symmetry Discovery’
“Despite the success of equivariant neural networks in scientific applications, they require knowing the symmetry group a priori. However, it may be difficult to know which symmetry to use as an inductive bias in practice. Enforcing the wrong symmetry could even hurt the performance. In this paper, we propose a framework, LieGAN, to automatically discover equivariances from a dataset using a paradigm akin to generative adversarial training.” Read the paper and see the full list of authors in ArXiv.