Using selective profiling, we build DMon, a system that can automatically locate data locality problems in production, identify access patterns that hurt locality, and repair such patterns using targeted optimizations. Professor Veloso has been recognized with multiple honors, including being a Fellow of the ACM, IEEE, AAAS, and AAAI. Compared to a state-of-the-art fuzzer, Fluffy improves the fuzzing throughput by 510× and the code coverage by 2.7× with various optimizations: in-process fuzzing, fuzzing harnesses for Ethereum clients, and semantic-aware mutation that reduces erroneous test cases. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. If the conference registration fee will pose a hardship for the presenter of the accepted paper, please contact conference@usenix.org. A hardware-accelerated thread scheduler makes sub-nanosecond decisions, leading to high CPU utilization and low tail response time for RPCs. Writing a correct operating system kernel is notoriously hard. Authors should email the program co-chairs, osdi21chairs@usenix.org, a copy of the related workshop paper and a short explanation of the new material in the conference paper beyond that published in the workshop version. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. This is the first OSDI in an odd year as OSDI moves to a yearly cadence. All deadline times are 23:59 hrs UTC.
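The privacy-budget idea mentioned above, where each DP-trained model consumes part of a global budget that must not be exceeded, can be illustrated with a minimal sketch. It assumes basic sequential composition, under which the epsilon values of successive training runs simply add up. The class `PrivacyBudget` and its methods are hypothetical names introduced for illustration; they are not taken from any system described here.

```python
class PrivacyBudget:
    """Tracks a global epsilon budget under basic sequential composition.

    Hypothetical helper for illustration only: each training run asks for
    some epsilon, and the request is refused once the total would be exceeded.
    """

    def __init__(self, total_epsilon: float) -> None:
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total_epsilon = total_epsilon
        self.consumed = 0.0

    @property
    def remaining(self) -> float:
        """Budget still available for future training runs."""
        return self.total_epsilon - self.consumed

    def try_consume(self, epsilon: float) -> bool:
        """Reserve `epsilon` for one run; return False if it does not fit."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.consumed + epsilon > self.total_epsilon:
            return False  # the run would push leakage past the global bound
        self.consumed += epsilon
        return True


budget = PrivacyBudget(total_epsilon=1.0)
# Three models request budget in turn; the third no longer fits.
granted = [budget.try_consume(eps) for eps in (0.5, 0.25, 0.5)]
```

Because the budget never replenishes in this sketch, a refused run stays refused until a fresh budget is provisioned, which is what makes privacy different from resources such as CPU time.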
Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources. In contrast, CLP achieves a significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable or even better than Elasticsearch and Splunk Enterprise. We develop a prototype of Zeph on Apache Kafka to demonstrate that Zeph can perform large-scale privacy transformations with low overhead. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. (Jan 2019) Our REPT paper won a best paper award at OSDI '18. (Oct 2018) I will serve on the SOSP '19 PC. Pollux simultaneously considers both aspects. Although SSDs can be simplified under the current ZNS interface, its counterpart LFS must bear segment compaction overhead. The ZNS+ also allows each zone to be overwritten with sparse sequential write requests, which enables the LFS to use threaded logging-based block reclamation instead of segment compaction. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. Mothy joined the Computer Science Department at ETH Zurich in January 2007 and was named Fellow of the ACM in 2013 for contributions to operating systems and networking research. Instead, we propose addressing the root cause of the heuristics problem by allowing software to explicitly specify to the device if submitted requests are latency-sensitive. Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. Used Zotero to organize papers about the stress and diffusion between anode and electrolyte and made a summary. We convert five state-of-the-art PM indexes using Nap.
USENIX, like other scientific and technical conferences and journals, prohibits these practices and may, on the recommendation of a program chair, take action against authors who have committed them. Swapnil Gandhi and Anand Padmanabha Iyer, Microsoft Research. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. There is no explicit limit to the response, but authors are strongly encouraged to keep it under 500 words; reviewers are neither required nor expected to read excessively long responses. The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. USENIX Security '21 has three submission deadlines. The full program will be available in May 2021. If in doubt about whether your submission to OSDI 2021 and your upcoming submission to SOSP are the same paper or not, please contact the PC chairs by email. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. Papers accompanied by nondisclosure agreement forms will not be considered. The 15th USENIX Symposium on Operating Systems Design and Implementation seeks to present innovative, exciting research in computer systems. Owing to the sequential write-only zone scheme of the ZNS, the log-structured file system (LFS) is required to access ZNS solid-state drives (SSDs). We describe Fluffy, a multi-transaction differential fuzzer for finding consensus bugs in Ethereum.
We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. Camera-ready submission (all accepted papers): 15 March 2022. Submission of a response is optional. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14–16, 2021. Lifting predicates and crash framing make the specification easy to use for developers, and logically atomic crash specifications allow for modular reasoning in GoJournal, making the proof tractable despite complex concurrency and crash interleavings. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. Paper Submission Information: All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. We argue that a key-value interface between a file system and an SSD is superior to the legacy block interface by presenting KEVIN. will work with the steering committee to ensure that the symposium program will accommodate presentations for all accepted papers. OSDI '21 accepted 31 papers, and 26 of them participated in the AE, a significant increase in the participation ratio: 84%, compared to OSDI '20 (70%) and SOSP '19 (61%). Our approach outperforms existing file systems on a block SSD by a wide margin, 6.2× on average for metadata-intensive benchmarks. Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable.
This paper presents the design and implementation of CLP, a tool capable of losslessly compressing unstructured text logs while enabling fast searches directly on the compressed data. These are hard deadlines, and no extensions will be given. (Visa applications can take at least 30 working days to process.) The conference papers and full proceedings are available to registered attendees now and will be available to everyone beginning Wednesday, July 14, 2021. Authors may upload supplementary material in files separate from their submissions. She has been recognized with many industry honors including induction into the National Academy of Engineering, the Inventor Hall of Fame, The Internet Hall of Fame, Washington State Academy of Science, and lifetime achievement awards from USENIX and SIGCOMM. Papers must be in PDF format and must be submitted via the submission form. JEL codes: Q18, Q28, Q57. Our approach effectively eliminates high communication and partitioning overheads, and couples it with a new pipelined push-pull parallelism based execution strategy for fast model training. 1 Acknowledgements: Paper prepared for the post-conference workshop on Food for Thought: Economic Analysis in Anticipation of the Next Farm Bill at the Agricultural and Applied Economics Association annual meeting, Austin, TX. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions. Responses should be limited to clarifying the submitted work. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast-initialization. For general conference information, see https://www.usenix.org/conference/osdi22.
Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. The device then "calibrates" its interrupts to completions of latency-sensitive requests. We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14-16, 2021. NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. In 2023 I started another two-year term on the . A graph neural network (GNN) enables deep learning on structured graph data. Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience. While compiler-based techniques have been proposed to improve data locality, they depend on heuristics, which can sometimes hurt performance. Authors of each accepted paper must ensure that at least one author registers for the conference, and that their paper is presented in-person at the conference. In addition, increasing CPU core counts further complicate kernel development. The NAL maintains 1) per-node partial views in PM for serving insert/update/delete operations with failure atomicity and 2) a global view in DRAM for serving lookup operations. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linux's NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system.
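The calibrated-interrupt idea described above, in which software marks latency-sensitive requests and the device aligns its interrupts with their completions, can be sketched as a small simulation. This is a toy model under stated assumptions, not the device logic from the paper: the `urgent` flag, the `Completion` type, the `interrupts_fired` function, and the fixed `batch_size` coalescing threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Completion:
    request_id: int
    urgent: bool  # hypothetical flag set by software at submission time


def interrupts_fired(completions: List[Completion], batch_size: int) -> List[List[int]]:
    """Return the request IDs delivered by each simulated interrupt.

    The simulated device buffers completions and raises an interrupt when
    `batch_size` completions have accumulated, or immediately when an
    urgent completion arrives, whichever happens first.
    """
    interrupts: List[List[int]] = []
    pending: List[int] = []
    for completion in completions:
        pending.append(completion.request_id)
        if completion.urgent or len(pending) >= batch_size:
            interrupts.append(pending)
            pending = []
    if pending:
        interrupts.append(pending)  # final flush, standing in for a timeout
    return interrupts


stream = [
    Completion(0, urgent=False),
    Completion(1, urgent=False),
    Completion(2, urgent=True),   # latency-sensitive: interrupt fires here
    Completion(3, urgent=False),
    Completion(4, urgent=False),
    Completion(5, urgent=False),
    Completion(6, urgent=False),  # fourth buffered completion: batch is full
]
groups = interrupts_fired(stream, batch_size=4)
```

In this run the urgent request is delivered without waiting for the batch to fill, while the remaining completions are still coalesced into a single interrupt.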
We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission. When further combined with a simple caching strategy, our evaluation shows that P3 is able to outperform existing state-of-the-art distributed GNN frameworks by up to 7×. Advisor: You have a past or present association as thesis advisor or advisee. 64 papers accepted out of 341 submitted. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. Tej Chajed, MIT CSAIL; Joseph Tassarotti, Boston College; Mark Theng, MIT CSAIL; Ralf Jung, MPI-SWS; M. Frans Kaashoek and Nickolai Zeldovich, MIT CSAIL. Prior or concurrent publication in non-peer-reviewed contexts, like arXiv.org, technical reports, talks, and social media posts, is permitted. Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and the speed at which it does so, in some cases by more than two orders of magnitude. Sanitizers detect unsafe actions such as invalid memory accesses by inserting checks that are validated during a program's execution.
Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. Further, Vegito can recover from cascading machine failures by using the columnar backup in less than 60 ms. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. First, GNNAdvisor explores and identifies several performance-relevant features from both the GNN model and the input graph, and uses them as a new driving force for GNN acceleration. The NAL eliminates remote PM accesses to hot items without inducing extra local PM accesses. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. Editor in charge: Daniel Petrolia. If you have any questions about conflicts, please contact the program co-chairs. While several new GNN architectures have been proposed, the scale of real-world graphs, in many cases billions of nodes and edges, poses challenges during model training. We present DPF (Dominant Private Block Fairness), a variant of the popular Dominant Resource Fairness (DRF) algorithm that is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. She has a PhD in computer science from MIT. Overall, the OSDI PC accepted 31 out of 165 submissions. Furthermore, by combining SanRazor with an existing sanitizer reduction tool ASAP, we show a synergistic effect by reducing the runtime cost to only 7.0% with a reasonable tradeoff of security. Submitted November 12, 2021. Accepted January 20, 2022.
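The node-replication scheme mentioned above, where a sequential structure is replicated per NUMA node in the spirit of state machine replication, can be illustrated with a single-threaded toy. It assumes a shared, append-only operation log that every replica replays in order. The `Replica` class and its methods are hypothetical, and the sketch deliberately omits the concurrency control a real kernel would need.

```python
from typing import Dict, List, Optional, Tuple

Operation = Tuple[str, int]  # (key, value) write operation


class Replica:
    """One per-node copy of a sequential key-value structure (toy model)."""

    def __init__(self, log: List[Operation]) -> None:
        self.log = log                    # shared operation log
        self.state: Dict[str, int] = {}   # node-local sequential structure
        self.applied = 0                  # number of log entries replayed

    def sync(self) -> None:
        """Replay log entries this replica has not applied yet, in order."""
        while self.applied < len(self.log):
            key, value = self.log[self.applied]
            self.state[key] = value
            self.applied += 1

    def write(self, key: str, value: int) -> None:
        """Append the operation to the shared log, then catch up locally."""
        self.log.append((key, value))
        self.sync()

    def read(self, key: str) -> Optional[int]:
        """Catch up with the log first so the read sees all earlier writes."""
        self.sync()
        return self.state.get(key)


shared_log: List[Operation] = []
node_a = Replica(shared_log)
node_b = Replica(shared_log)

node_a.write("x", 1)
node_b.write("x", 2)          # node_b replays node_a's write, then its own
value_seen_by_a = node_a.read("x")
```

Since every replica applies the same operations in the same log order, the local copies converge to the same state, and the structure itself can stay purely sequential.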
Commonly used log archival and compression tools like Gzip provide a high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. VLDB 2021: Venue Tivoli Hotel & Congress Center Arni Magnussons Gade 2 1577 Copenhagen, Denmark +45 3268 4300 In-person attendees can purchase tickets for the park / gardens with a 15% discount, which is a special offer by Tivoli Hotel & Congress Center to VLDB 2021 attendees. Graph Neural Networks (GNNs) have gained significant attention in the recent past, and have become one of the fastest growing subareas in deep learning. We also show that Marius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single machine with 16 GB of GPU memory and 64 GB of CPU memory. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. EuroSys 2021 Prepublication versions of the accepted papers from the summer submission deadline are available below. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and statically enforces security with modest type annotation overhead, and no run-time cost.
Mingyu Li, Jinhao Zhu, and Tianxu Zhang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Cheng Tan, Northeastern University; Yubin Xia, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Sebastian Angel, University of Pennsylvania; Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Session Chairs: Nadav Amit, VMware Research Group, and Ada Gavrilovska, Georgia Institute of Technology. Stephen Ibanez, Alex Mallery, Serhat Arslan, and Theo Jepsen, Stanford University; Muhammad Shahbaz, Purdue University; Changhoon Kim and Nick McKeown, Stanford University. However, a plethora of recent data breaches show that even widely trusted service providers can be compromised. Professor Veloso is on leave from Carnegie Mellon University as the Herbert A. Simon University Professor in the School of Computer Science, and the past Head of the Machine Learning Department. We will look at various problems and approaches, and for each, see if blockchain would help. (Oct 2018) Awarded an Intel Faculty Grant for Research on automated performance optimization. (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security. Extensive experiments show that GNNAdvisor outperforms the state-of-the-art GNN computing frameworks, such as Deep Graph Library (3.02× faster on average) and NeuGraph (up to 4.10× faster), on mainstream GNN architectures across various datasets.
Kyuhwa Han, Sungkyunkwan University and Samsung Electronics; Hyunho Gwak and Dongkun Shin, Sungkyunkwan University; Jooyoung Hwang, Samsung Electronics. This paper presents Dorylus: a distributed system for training GNNs. Most existing schedulers expect users to specify the number of resources for each job, often leading to inefficient resource use. As increasingly more sensitive data is being collected to gain valuable insights, the need to natively integrate privacy controls in data analytics frameworks is growing in importance. This paper describes the design, implementation, and evaluation of Addra, the first system for voice communication that hides metadata over fully untrusted infrastructure and scales to tens of thousands of users. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources. She also invented the spanning tree algorithm, which transformed Ethernet from a technology that supported a few hundred nodes, to something that can support large networks. Researchers from the Software Systems Laboratory bagged a Best Paper Award at the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021). They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. Main conference program: 5-8 April 2022. He joined Intel Research at Berkeley in April 2002 as a principal architect of PlanetLab, an open, shared platform for developing and deploying planetary-scale services.
Moreover, as of October 2020, a review of the 50 most cited empirical papers that list personality as a keyword indicates that all 50 papers were authored by people with institutional affiliations in the United States, Canada, Germany, the UK, and New Zealand, and only three papers included samples outside of these regions (see Supplementary Indeed, it is a prime target for powerful adversaries such as nation states. Amy Tai, VMware Research; Igor Smolyar, Technion – Israel Institute of Technology; Michael Wei, VMware Research; Dan Tsafrir, Technion – Israel Institute of Technology and VMware Research. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. GoJournal is implemented in Go, and Perennial is implemented in the Coq proof assistant. If you are uncertain about how to anonymize your submission, please contact the program co-chairs, osdi21chairs@usenix.org, well in advance of the submission deadline. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, and Liyan Zheng, Tsinghua University; Yuanzhi Li, Carnegie Mellon University; Kaiyuan Rong and Yuanyong Chen, Tsinghua University; Zhihao Jia, Carnegie Mellon University and Facebook.
Fan Lai, Xiangfeng Zhu, Harsha V. Madhyastha, and Mosharaf Chowdhury, University of Michigan. Sponsored by USENIX in cooperation with ACM SIGOPS. However, with the increasingly speedy transactions and queries thanks to large memory and fast interconnect, commodity HTAP systems have to make a tradeoff between data freshness and performance degradation. Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . We present TEMERAIRE, a hugepage-aware enhancement of TCMALLOC to reduce CPU overheads in the application's code. We present application studies for 8 applications, improving requests-per-second (RPS) by 7.7% and reducing RAM usage by 2.4%. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. First, it enables a caller to push a message to a callee in two hops, using a new way of assigning mailboxes to users that resembles how a post office assigns PO boxes to its customers. Reviews will be available for response on Wednesday, March 3, 2021. DMon's targeted optimizations provide 16.83% speedup on average (up to 53.14%), compared to a baseline that uses the highest level of compiler optimization. We implemented the ZNS+ SSD on an SSD emulator and a real SSD. Moreover, to handle dynamic workloads, Nap adopts a fast NAL switch mechanism. In this paper, we show how to address this inefficiency without requiring pages to be rewritten or browsers to be modified.
PET discovers and applies program transformations that improve computation efficiency but only maintain partial functional equivalence. People often assume that blockchain has Byzantine robustness, so adding it to any system will make that system super robust against any calamity. With an aim to improve time-to-accuracy performance in model training, Oort prioritizes the use of those clients who have both data that offers the greatest utility in improving model accuracy and the capability to run training quickly. Machine learning (ML) models trained on personal data have been shown to leak information about users. See the USENIX Conference Submissions Policy for details. As the emerging trend of graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings). Such centralized engines are in a perfect position to censor content and violate users' privacy, undermining some of the key tenets behind decentralization. Poor data locality hurts an application's performance. Metadata from voice calls, such as the knowledge of who is communicating with whom, contains rich information about people's lives. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support to human decision making. OSDI brings together professionals from academic and industrial backgrounds in a premier forum for discussing the design, implementation, and implications of systems software. Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. DistAI generates data by simulating the distributed protocol at different instance sizes and recording states as samples. Pages should be numbered, and figures and tables should be legible in black and white, without requiring magnification.
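The sample-driven step described for DistAI above, recording protocol states at several instance sizes and keeping only candidate invariants that hold on every sample, can be sketched on a toy lock protocol. Everything here is an illustrative assumption: the protocol, the three hand-written candidate predicates, and the use of exhaustive state exploration in place of simulation (chosen so the result is deterministic). The sketch also stops before the SMT-based inductiveness check, so the surviving predicate is only a candidate invariant.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Tuple[bool, ...]  # state[i] is True when node i holds the lock


def successors(state: State) -> List[State]:
    """Toy protocol: a holder may release; a free lock may be acquired."""
    lock_is_free = not any(state)
    result = []
    for i, holds in enumerate(state):
        if holds:
            result.append(state[:i] + (False,) + state[i + 1:])
        elif lock_is_free:
            result.append(state[:i] + (True,) + state[i + 1:])
    return result


def collect_samples(num_nodes: int) -> Set[State]:
    """Record every state reachable from the initial all-free state."""
    start = (False,) * num_nodes
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


# Hypothetical candidate invariants, kept small on purpose.
CANDIDATES: Dict[str, Callable[[State], bool]] = {
    "at_most_one_holder": lambda s: sum(s) <= 1,
    "some_node_holds": lambda s: any(s),
    "node0_never_holds": lambda s: not s[0],
}

samples: Set[State] = set()
for size in (2, 3):  # different instance sizes
    samples |= collect_samples(size)

# Keep only the candidates that no recorded sample contradicts.
surviving = sorted(
    name for name, predicate in CANDIDATES.items()
    if all(predicate(sample) for sample in samples)
)
```

Filtering against samples is cheap, which is why it is useful to discard false candidates this way before any of them is handed to a solver.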