Title

Topic

  • ‘DIALITE: Discover, Align and Integrate Open Data Tables’

    “We demonstrate a novel table discovery pipeline called DIALITE that allows users to discover, integrate and analyze open data tables. DIALITE has three main stages. First, it allows users to discover tables from open data platforms using state-of-the-art table discovery techniques. Second, DIALITE integrates the discovered tables to produce an integrated table. Finally, it allows users to analyze the integration result by applying different downstreaming tasks over it. Our pipeline is flexible such that the user can easily add and compare additional discovery and integration algorithms.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘A Statistical Approach for Finding Property-Access Errors’

    “We study the problem of finding incorrect property accesses in JavaScript where objects do not have a fixed layout, and properties (including methods) can be added, overwritten, and deleted freely throughout the lifetime of an object. Since referencing a non-existent property is not an error in JavaScript, accidental accesses to non-existent properties … can go undetected without thorough testing, and may manifest far from the source of the problem. We propose a two-phase approach for detecting property access errors based on the observation that, in practice, most property accesses will be correct.” Find the paper and full list of authors…

    Learn more

  • ‘Safe Environmental Envelopes of Discrete Systems’

    “A safety verification task involves verifying a system against a desired safety property under certain assumptions about the environment. However, these environmental assumptions may occasionally be violated due to modeling errors or faults. Ideally, the system guarantees its critical properties even under some of these violations, i.e., the system is robust against environmental deviations. This paper proposes a notion of robustness as an explicit, first-class property of a transition system that captures how robust it is against possible deviations in the environment.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • Huselid named SIOP and NAHR fellow for ‘highest level of achievement in the human resources profession’

    “D’Amore-McKim School of Business Distinguished Professor Mark Huselid was recently named a fellow by the Society for Industrial and Organizational Psychology (SIOP) and the National Academy of Human Resources (NAHR). SIOP fellows must have made significant and sustained contributions that have shaped the field of Industrial-Organizational (I-O) psychology–the scientific study of working and the application of that science to workplace issues facing individuals, teams, and organizations. Huselid joins D’Amore-McKim professors Paula Caligiuri and Cynthia Lee who are also SIOP fellows.” NAHR fellowship, which “he was also recently inducted as … is regarded as the most prestigious honor in the human resources field.” h

    Learn more

  • Hoitash and Pei win 2023 Ronald Copeland Best Paper Awards

    ,

    “Udi Hoitash, Lilian L. and Harry A. Cowan Professor of Accounting, and Amy Pei, Assistant Professor, Marketing, have authored papers awarded the 2023 Copeland Best Paper Awards. Each Group Research Committee nominated a paper published in 2022 by a faculty member in the Group to be considered for a Ronald Copeland Best Paper Award. Two papers were selected as this year’s winners as recommended by the DMSB Research Committee and confirmed by Dean Emery Trahan. Each D’Amore-McKim author will receive a $1,000 stipend.”

    Learn more

  • Sanders awarded Sushil K. Gupta Distinguished Service Award for contributions to ‘premier professional society’

    Distinguished professor Nada Sanders has been awarded the Sushil K. Gupta Distinguished Service Award for her contributions to the Production and Operations Management Society, which she described as “our premier professional society” in a LinkedIn post.

    Learn more

  • ‘School Transitions, Peer Processes and Delinquency: A Social Network Approach to Turning Points in Adolescence’

    “Objectives: We examine how normative school transitions (e.g., moves from elementary to middle school) shape adolescents’ experiences with three network processes that inform delinquency: delinquent popularity, delinquent sociability, and friend selection on shared delinquency participation. Methods: … We compare outcomes for students who change schools between 6th and 7th grade to those who remain in the same building. Results: We find that adolescents who transition schools between these grades have significantly different experiences with delinquency-related network processes when compared to their peers.” Find the paper and the full list of authors at the Journal of Research in Crime and Delinquency.

    Learn more

    ,
  • ‘Mississippi River Low-Flows: Context, Causes and Future Projections’

    “The Mississippi River represents a major commercial waterway, and periods of anomalously low river levels disrupt riverine transport. These low-flow events occur periodically, with a recent event in the fall of 2022 slowing barge traffic and generating sharp increases in riverine transportation costs. Here we … evaluate historical trends and future projections. … Model simulations from the LENS2 dataset show that, under a moderate-high emissions scenario (SSP3-7.0), the severity and duration of low-flow events is projected to decrease through to the end of the 21st century. ” Find the paper and the full list of authors in Environmental Research: Climate.

    Learn more

    , ,
  • ‘Just-in-Time for Supply Chains in Turbulent Times’

    “The Covid-19 pandemic and other recent disruptions in the early 2020s led to sections in the business press blaming just-in-time (JIT) practices for operational failings. … Some scholars argue that JIT is not resilient, while others maintain that JIT can continue providing superior performance even with disruptions. Motivated by this debate, we discuss various misconceptions about JIT that underlie this debate … and argue that companies can improve their supply chain performance if JIT supply chain segments are chosen fittingly—even more so—during disruptions.” Find the paper and the full list of authors at Production and Operations Management.

    Learn more

  • ‘Why Not Yet: Fixing a Top-k Ranking That is Not Fair to Individuals’

    ,

    “This work considers why-not questions in the context of top-k queries and score-based ranking functions. Following the popular linear scalarization approach for multi-objective optimization, we study rankings based on the weighted sum of multiple scores. A given weight choice may be controversial or perceived as unfair. … We introduce various notions of such why-not-yet queries and formally define them as satisfiability or optimization problems, whose goal is to propose alternative ranking functions that address the placement of the entities of interest.” Find the paper and the full list of authors at the Association for Computing Machinery.

    Learn more

  • ‘Fast Optimal Locally Private Mean Estimation via Random Projections’

    “We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a 1+o(1)-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘Online and Streaming Algorithms for Constrained k-Submodular Maximization’

    “Constrained k-submodular maximization is a general framework that captures many discrete optimization problems such as ad allocation, influence maximization, personalized recommendation, and many others. In many of these applications, datasets are large or decisions need to be made in an online manner, which motivates the development of efficient streaming and online algorithms. In this work, we develop single-pass streaming and online algorithms for constrained k-submodular maximization with both monotone and general (possibly non-monotone) objectives subject to cardinality and knapsack constraints.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘Poisoning Network Flow Classifiers’

    ,

    “As machine learning (ML) classifiers increasingly oversee the automated monitoring of network traffic, studying their resilience against adversarial attacks becomes critical. This paper focuses on poisoning attacks, specifically backdoor attacks, against network traffic flow classifiers. We investigate the challenging scenario of clean-label poisoning where the adversary’s capabilities are constrained to tampering only with the training data—without the ability to arbitrarily modify the training labels or any other component of the training process.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘Unleashing the Power of Randomization in Auditing Differentially Private ML’

    “We present a rigorous methodology for auditing differentially private machine learning algorithms by adding multiple carefully designed examples called canaries. … First, we introduce Lifted Differential Privacy (LiDP) that expands the definition of differential privacy to handle randomized datasets. … Second, we audit LiDP by trying to distinguish between the model trained with K canaries versus K−1 canaries in the dataset, leaving one canary out. … Third, we introduce novel confidence intervals that take advantage of the multiple test statistics by adapting to the empirical higher-order correlations. Together, this new recipe demonstrates significant improvements in sample complexity.” Find the paper and the full list…

    Learn more

  • ‘Freaky Leaky SMS: Extracting User Locations by Analyzing SMS Timings’

    “Short Message Service (SMS) remains one of the most popular communication channels since its introduction. … In this paper, we demonstrate that merely receiving silent SMS messages regularly opens a stealthy side-channel that allows other regular network users to infer the whereabouts of the SMS recipient. The core idea is that receiving an SMS inevitably generates Delivery Reports whose reception bestows a timing attack vector at the sender. We conducted experiments across various countries, operators, and devices to show that an attacker can deduce the location of an SMS recipient.” Find the paper and the full list of authors at ArXiv.

    Learn more

    ,
  • ‘How Creative Versus Technical Constraints Affect Individual Learning in an Online Innovation Community’

    “Online innovation communities allow for a search for novel solutions within a design space bounded by constraints. Past research has focused on the effect of creative constraints on individual projects, but less is known about how constraints affect learning from repeated design submissions and the effect of the technical constraints that are integral to online platforms. How do creative versus technical constraints affect individual learning in exploring a design space in online communities? … We find that creative constraints lead to high rates of learning only if technical constraints are sufficiently relaxed.” Find the paper and full list of authors…

    Learn more

  • ‘Homo in Machina: Improving Fuzz Testing Coverage via Compartment Analysis’

    “Fuzz testing is often automated, but also frequently augmented by experts who insert themselves into the workflow in a greedy search for bugs. In this paper, we propose Homo in Machina, or HM-fuzzing, in which analyses guide the manual efforts, maximizing benefit. As one example … we introduce compartment analysis. Compartment analysis uses a whole-program dominator analysis to estimate the utility of reaching new code, and combines this with a dynamic analysis indicating drastically under-covered edges guarding that code.” Find the paper and full list of authors in the proceedings of the 2023 IEEE Conference on Software Testing, Verification and…

    Learn more

  • ‘Incremental Non-Gaussian Inference for SLAM Using Normalizing Flows’

    “This article presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables efficient incremental updates similar to iSAM2, albeit in the more challenging non-Gaussian setting.” Find the paper and the full list of authors in the IEEE Transactions on Robotics.

    Learn more

  • ‘A Variational Perspective on Generative Flow Networks’

    “Generative flow networks (GFNs) are a class of probabilistic models for sequential sampling of composite objects, proportional to a target distribution that is defined in terms of an energy function or a reward. GFNs are typically trained using a flow matching or trajectory balance objective, which matches forward and backward transition models over trajectories. … We introduce a variational objective for training GFNs, which is a convex combination of the reverse- and forward KL divergences, and compare it to the trajectory balance objective when sampling from the forward- and backward model, respectively.” Find the paper and full list of authors…

    Learn more

  • ‘String Diagrams with Factorized Densities’

    “A growing body of research on probabilistic programs and causal models has highlighted the need to reason compositionally about model classes that extend directed graphical models. Both probabilistic programs and causal models define a joint probability density over a set of random variables, and exhibit sparse structure that can be used to reason about causation and conditional independence. This work builds on recent work on Markov categories of probabilistic mappings to define a category whose morphisms combine a joint density … with a deterministic mapping from samples to return values.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘Efficient Computation of Quantiles over Joins’

    “We present efficient algorithms for Quantile Join Queries, abbreviated as %JQ. … Our goal is to avoid materializing the set of all join answers, and to achieve quasilinear time in the size of the database, regardless of the total number of answers. Even for basic ranking functions beyond sum, such as min or max over different attributes, so far it is not known whether there is any nontrivial tractable %JQ. In this work, we develop a new approach to solving %JQ.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘Tractable Orders for Direct Access to Ranked Answers of Conjunctive Queries’

    ,

    “We study the question of when we can provide direct access to the k-th answer to a Conjunctive Query (CQ) according to a specified order over the answers in time logarithmic in the size of the database, following a preprocessing step that constructs a data structure in time quasilinear in database size. Specifically, we embark on the challenge of identifying the tractable answer orderings, that is, those orders that allow for such complexity guarantees.” Find the paper and the full list of authors in the ACM Transactions on Database Systems.

    Learn more

  • ‘FibeRed: Fiberwise Dimensionality Reduction of Topologically Complex Data with Vector Bundles’

    “Datasets with non-trivial large scale topology can be hard to embed in low-dimensional Euclidean space with existing dimensionality reduction algorithms. We propose to model topologically complex datasets using vector bundles, in such a way that the base space accounts for the large scale topology, while the fibers account for the local geometry. This allows one to reduce the dimensionality of the fibers, while preserving the large scale topology. We formalize this point of view and … describe a dimensionality reduction algorithm based on topological inference for vector bundles.” Find the paper and full list of authors at Dagstuhl Research Online.

    Learn more

  • ‘Toroidal Coordinates: Decorrelating Circular Coordinates with Lattice Reduction’

    “The circular coordinates algorithm of de Silva, Morozov, and Vejdemo-Johansson takes as input a dataset together with a cohomology class representing a 1-dimensional hole in the data; the output is a map from the data into the circle that captures this hole, and that is of minimum energy in a suitable sense. However, when applied to several cohomology classes, the output circle-valued maps can be ‘geometrically correlated’ even if the chosen cohomology classes are linearly independent. … We identify a formal notion of geometric correlation between circle-valued maps.” Find the paper and full list of authors at Dagstuhl Research Online.

    Learn more

  • ‘Topological Data Analysis of Electroencephalogram Signals for Pediatric Obstructive Sleep Apnea’

    ,

    “Topological data analysis (TDA) is an emerging technique for biological signal processing. TDA leverages the invariant topological features of signals in a metric space for robust analysis of signals even in the presence of noise. In this paper, we leverage TDA on brain connectivity networks derived from electroencephalogram (EEG) signals to identify statistical differences between pediatric patients with obstructive sleep apnea (OSA) and pediatric patients without OSA … and show that TDA enables us to see a statistical difference between the brain dynamics of the two groups.” Find the paper and the full list of authors at ArXiv.

    Learn more

    ,
  • ‘A Video-Based End-to-End Pipeline for Non-Nutritive Sucking Action Recognition and Segmentation in Young Infants’

    , , ,

    “We present an end-to-end computer vision pipeline to detect non-nutritive sucking (NNS)—an infant sucking pattern with no nutrition delivered—as a potential biomarker for developmental delays, using off-the-shelf baby monitor video footage. One barrier to clinical (or algorithmic) assessment of NNS stems from its sparsity, requiring experts to wade through hours of footage to find minutes of relevant activity.” Find the paper and the full list of authors at ArXiv.

    Learn more

    ,
  • ‘StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code’

    “Code LLMs are being rapidly deployed and there is evidence that they can make professional programmers more productive. Current benchmarks for code generation measure whether models generate correct programs given an expert prompt. In this paper, we present a new benchmark containing multiple prompts per problem, written by a specific population of non-expert prompters: beginning programmers. StudentEval contains 1,749 prompts for 48 problems, written by 80 students who have only completed one semester of Python programming.” Find the paper and the full list of authors at ArXiv.

    Learn more

    ,
  • ‘StarCoder: May the Source Be with You!’

    “The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase. … StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process. We fine-tuned StarCoderBase on 35B Python tokens, resulting in the creation of StarCoder. We perform the most comprehensive evaluation of Code LLMs to date and show that StarCoderBase outperforms every open Code LLM that supports multiple programming languages.” Find the paper and the full list of authors at ArXiv.

    Learn more

  • ‘Pretraining Language Models with Human Preferences’

    “Language models (LMs) are pretrained to imitate internet text, including content that would violate human preferences if generated by an LM: falsehoods, offensive comments, personally identifiable information, low-quality or buggy code, and more. Here, we explore alternative objectives for pretraining LMs in a way that also guides them to generate text aligned with human preferences. We benchmark five objectives for pretraining with human feedback across three tasks and study how they affect the trade-off between alignment and capabilities of pretrained LMs.” Read the paper and see the full list of authors in ArXiv.

    Learn more

  • Dean Abowd receives lifetime achievement award for work in human-computer interaction

    Dean Gregory Abowd received the Lifetime Achievement Award from SIGCHI, the premier conference on human-computer interaction. He later re-presented his award acceptance speech at an event on Northeastern’s Boston campus. Abowd hopes “to inspire others to dispel fear of the unknown and unlock their potential,” he says in the presentation abstract. “Life, like research, is best when shared with others whom you can respect and befriend.” Find the recorded speech on YouTube.

    Learn more

    , ,