23andMe produced
genotype- and phenotype-based
health reports
to provide individuals with information
about their predispositions to certain health conditions
and how their DNA and lifestyle choices
might influence their health,
empowering them
to make more informed decisions
about their well-being.
Publications include:
-
Boyoung Yoo (Quinn), Xi (Iris) Yang, Teresa Filshtein Sonmez, Jianan Zhan, Stacey B. Detweiler, James R.
Ashenhurst, Alison L. Chubb, Sahil Hansalia, Akele K. Reed, Theodore M. Wong, Ryanne R. Wu, Bertram L.
Koelsch, Anna Guan, Sarah B. Laskey, Michael V. Holmes.
Development and testing of estimating Biological Age from blood based signatures
.
23andMe White Paper 23-26,
May 2024
PDF
(4.6 MB)
Abstract
Aging is defined as the gradual functional and structural decline of an organism. It is considered a
major risk factor for adverse health outcomes and mortality. While chronological time is a strong
indicator of the aging process, it does not perfectly predict the underlying physiological state or
biological age of an individual. Each individual with the same chronological age (CA) is at a
different risk for developing aging-related diseases and mortality, primarily due to their varying
lifestyles, genetic compositions, and biological aging processes. Over the past few decades,
considerable efforts have been made to develop an informative measure of biological aging. Such a
measurement should meet the following criteria as set out by the American Federation for Aging
Research: (1) predict remaining life expectancy and disease-specific mortality better than CA, (2)
monitor mechanisms underlying the aging process but not a specific disease, (3) be subject to repeated
tests without harming the individual or affecting life expectancy, and (4) be testable in both
laboratory animals and humans.
Blood biomarkers (e.g., total cholesterol or platelet count) are often used to assess the overall health
status of an individual in a clinical setting. Blood biomarker-based biological age (BBA) combines these
blood biomarkers into a single value to facilitate interpretation of biological aging and better predict
lifespan than CA. One of the leading BBA methods is Morgan Levine's PhenoAge (phenotypic age).
Levine et al. used the National Health and Nutrition Examination Survey (NHANES) dataset to fit a model
to estimate BBA using ten predictor variables: albumin, alkaline phosphatase, creatinine, C-reactive
protein, glucose, lymphocyte percentage, mean corpuscular volume, red blood cell distribution width,
white blood cell count, and CA. After adjusting for CA, the authors reported that a one-year increase in
PhenoAge was associated with a 9% increase in mortality risk.
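For intuition, the sketch below shows the general shape of such a model: a weighted combination of the blood biomarkers and chronological age. The coefficients, units, and intercept are round placeholders for illustration, not the published PhenoAge weights, and the mapping of the score onto an age scale through a mortality model is omitted.

```python
# Illustrative sketch of a PhenoAge-style estimator: a weighted combination of
# blood biomarkers plus chronological age (CA). The coefficients and intercept
# below are placeholders, NOT the published PhenoAge weights; PhenoAge further
# maps a score like this onto an age scale via a mortality model (omitted here).
PLACEHOLDER_COEFFS = {
    "albumin": -0.03,
    "alkaline_phosphatase": 0.002,
    "creatinine": 0.01,
    "log_c_reactive_protein": 0.09,
    "glucose": 0.10,
    "lymphocyte_pct": -0.01,
    "mean_corpuscular_volume": 0.03,
    "red_cell_distribution_width": 0.33,
    "white_blood_cell_count": 0.05,
    "chronological_age": 0.08,
}
PLACEHOLDER_INTERCEPT = -19.9

def biological_age_score(biomarkers: dict) -> float:
    """Linear predictor: higher values indicate an 'older' blood profile."""
    return PLACEHOLDER_INTERCEPT + sum(
        coef * biomarkers[name] for name, coef in PLACEHOLDER_COEFFS.items()
    )
```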
In 2023, 23andMe, Inc. started offering blood biomarker testing products, which opened up an opportunity
to provide a measure of BBA to our customers. In this White Paper, we detail the modeling methodologies
and validation procedures used to create the BBA model powering 23andMe's Biological Age feature. We
describe how we expanded on elements of PhenoAge to build 23andMe's Biological Age, a survival model
trained on a large cohort from the UK Biobank (UKB). Since a larger set of blood biomarkers is available
in 23andMe's blood biomarker panel and in UKB than in NHANES, we developed our own model so that we
could perform our own feature engineering and train the model on a larger cohort. The model was then
validated on several ethnically diverse subsets of UKB to show that its performance is consistent across
different ancestries. To further ensure our model's performance outside the somewhat limited CA range
and ancestral background of the UKB participants, we used NHANES as a validation set. We further
computed the effect size of each biomarker per individual to provide a customized explanation of which
biomarkers contribute to an increase or decrease in their BBA.
We show that BBA is a better indicator of mortality risk and aging-related events than CA, particularly
for cardiovascular and metabolic diseases. We also include a sex-specific analysis to show that our
results are consistent for both male and female participants (Supplementary Material). As an example of
the dynamic nature of BBA, we include a longitudinal case study of an individual to demonstrate BBA's
ability to reflect underlying changes to physiological processes arising from environmental change over
time.
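For a linear model of this kind, a per-individual biomarker contribution can be read off as the coefficient times the deviation of the observed value from a reference. A rough sketch with hypothetical names and numbers, not the production BBA model:

```python
# Rough sketch of per-biomarker contributions for a linear model: each
# biomarker's effect is its coefficient times the deviation of the observed
# value from a reference (e.g., a cohort mean). Values are hypothetical.
def biomarker_contributions(values, coeffs, reference_means):
    """Return {biomarker: contribution}; contributions sum to the score's
    deviation from a reference individual."""
    return {
        name: coeffs[name] * (values[name] - reference_means[name])
        for name in coeffs
    }

contribs = biomarker_contributions(
    values={"glucose_mmol_L": 6.8, "log_crp": 1.2},
    coeffs={"glucose_mmol_L": 0.10, "log_crp": 0.09},
    reference_means={"glucose_mmol_L": 5.1, "log_crp": 0.4},
)
# glucose contributes ~ +0.17 and log CRP ~ +0.07 to this individual's score
```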
-
James R. Ashenhurst, Sarah B. Laskey, Jianan Zhan, Aditya Ambati, Rafaela Bagur Quetglas, Anna Guan,
Jingran Wen, Peter Wilton, Iris Yang, Bo Yoo, Sahil Hansalia, Yashashree Kokje, Ishana Raghuram,
Akele Reed, Marcos Torres, 23andMe Research Team, Subarna Sinha, Theodore M. Wong, Bertram L. Koelsch.
A Generalized Method for the Creation and Evaluation of Polygenic Scores: Accounting for Genetic
Ancestry as a Continuum
.
23andMe White Paper 23-25,
December 2023
PDF
(699 KB)
Abstract
Polygenic scores (PGS) strive to estimate the heritable portion of risk for many common diseases and
other traits. Genome-wide association studies (GWAS) frequently identify multiple genetic variants with
small to moderate individual impact on risk for a condition; many of these variants are commonly single
nucleotide polymorphisms (SNPs). To quantify the cumulative impact of these variants on risk, machine
learning methods are used to construct statistical models that generate polygenic scores. Recently,
advances in modeling methodology have enabled massive increases in the number of genetic variants that
can be included in polygenic models, leading to corresponding increases in the proportion of trait
variance that these models explain. As a result, PGS are now being used to estimate heritable risk for a
wide range of conditions and research is ongoing to evaluate their potential utility as part of clinical
decision making.
The key factor that limits researchers' ability to create large polygenic models is the size of the
training cohort. Very large sample sizes are necessary both to identify genetic variants associated with
a disease and to estimate their joint contribution to risk. Additionally, obtaining samples from diverse
populations is necessary to create models that are calibrated to these populations, whether by assessing
how well a model developed using data from a particular ancestry group (usually European) generalizes to
other (often non-European) groups, or by developing models using data from various populations
themselves. With over 14 million kits sold and approximately 80% of customers—including customers
of many different ancestries—consenting to participate in research, 23andMe has a unique ability
to develop large PGS that predict a wide range of health conditions and traits and to optimize and
assess PGS performance across people with diverse ancestral backgrounds. Analyses of the company's
genetic and self-reported health data show that we can replicate GWAS on clinically collected health
information. Over the last several years, 23andMe has used PGS as the basis of dozens of customer
reports on topics ranging from the ability to match a musical pitch to the likelihood of developing
type 2 diabetes.
Here we detail the modeling methodologies and evaluation procedures used to create the PGS behind new
23andMe Health Predisposition and Wellness reports on common health conditions. The Appendices to this
White Paper further summarize the performance and characteristics of each PGS used in recently released
reports (starting in 2024). We intend for this White Paper and the Appendices to be living documents
that will be updated as methodologies change and new PGS-based genetic reports are released. A change
log is provided at the bottom of this document to describe significant updates.
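At its core, a polygenic score is a weighted sum of an individual's risk-allele dosages across the variants in the model. A minimal sketch with made-up SNP IDs and weights:

```python
# Minimal polygenic-score sketch: the score is the dot product of per-variant
# effect weights (from a trained model) with the individual's allele dosages
# (0, 1, or 2 copies of the effect allele). SNP IDs and weights are made up.
weights = {"rs0000001": 0.12, "rs0000002": -0.05, "rs0000003": 0.30}
dosages = {"rs0000001": 2, "rs0000002": 1, "rs0000003": 0}

raw_score = sum(weights[snp] * dosages[snp] for snp in weights)   # 0.19

# In practice, raw scores are typically standardized against a reference
# population before being reported, e.g., as a percentile or as an odds
# ratio per standard deviation of the score.
```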
Prediction of human physical traits and demographic
information from genomic data challenges privacy and data
deidentification in personalized medicine. To explore the
current capabilities of phenotype-based genomic
identification, we applied whole-genome sequencing, detailed
phenotyping, and statistical modeling to predict biometric
traits in a cohort of 1,061 participants of diverse
ancestry. Individually, for a large fraction of the traits,
their predictive accuracy beyond ancestry and demographic
information is limited. However, we have developed a maximum
entropy algorithm that integrates multiple predictions to
determine which genomic samples and phenotype measurements
originate from the same person. Using this algorithm, we have
reidentified an average of >8 of 10 held-out individuals in
an ethnically mixed cohort and an average of 5 of either 10
African Americans or 10 Europeans. This work challenges
current conceptions of personal privacy and may have
far-reaching ethical and legal implications.
Publications include:
-
Ken Yocum, Sean Rowan, Jonathan Lunt, and Theodore M. Wong.
Disdat: Bundle data management for machine learning pipelines
.
In Proceedings of the 2019 USENIX Conference on Operational Machine Learning (OpML 2019),
May 2019
PDF
(1.4 MB)
Abstract
Modern machine learning pipelines can produce hundreds of data artifacts
(such as features,
models,
and
predictions)
throughout their lifecycle.
During that time,
data scientists need to reproduce errors,
update features,
re-train on specific data,
validate / inspect outputs,
and share models and predictions.
Doing so requires the ability to publish,
discover,
and version those artifacts.
This work introduces Disdat,
a system to simplify ML pipelines
by addressing these data management challenges.
Disdat is built on two core data abstractions:
bundles
and contexts.
A bundle is a versioned,
typed,
immutable collection of data.
A context is a sharable set of bundles
that can exist on local and cloud storage environments.
Disdat provides a bundle management API
that we use to extend an existing workflow system
to produce and consume bundles.
This bundle-based approach to data management has simplified
both authoring and deployment of our ML pipelines.
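As a rough illustration of the bundle and context idea only (this is not Disdat's actual API, and the names below are hypothetical), a versioned, immutable bundle and a context that accumulates versions might look like:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any
import uuid

# Illustrative sketch of the bundle/context idea only; NOT Disdat's API.
# A bundle is a versioned, typed, immutable collection of data; a context is
# a named, shareable set of bundles.
@dataclass(frozen=True)
class Bundle:
    name: str
    data: Any
    uuid: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

class Context:
    def __init__(self, name: str):
        self.name = name
        self._bundles = {}            # bundle name -> list of versions

    def publish(self, bundle: Bundle) -> None:
        # Publishing never overwrites: each push appends a new version.
        self._bundles.setdefault(bundle.name, []).append(bundle)

    def latest(self, name: str) -> Bundle:
        return self._bundles[name][-1]

ctx = Context("model-dev")
ctx.publish(Bundle("training_features", data={"rows": 10_000}))
ctx.publish(Bundle("training_features", data={"rows": 12_500}))
print(ctx.latest("training_features").data)   # {'rows': 12500}
```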
-
Christoph Lippert et al.
Identification of individuals by trait prediction using whole-genome sequencing data.
In Proceedings of the National Academy of Sciences (PNAS),
September 2017
PDF
(2.8 MB)
Supporting information PDF
(41 MB)
Abstract
Prediction of human physical traits and demographic
information from genomic data challenges privacy and data
deidentification in personalized medicine. To explore the
current capabilities of phenotype-based genomic
identification, we applied whole-genome sequencing,
detailed phenotyping, and statistical modeling to predict
biometric traits in a cohort of 1,061 participants of
diverse ancestry. Individually, for a large fraction of
the traits, their predictive accuracy beyond ancestry and
demographic information is limited. However, we have
developed a maximum entropy algorithm that integrates
multiple predictions to determine which genomic samples
and phenotype measurements originate from the same
person. Using this algorithm, we have reidentified an
average of >8 of 10 held-out individuals in an ethnically
mixed cohort and an average of 5 of either 10 African
Americans or 10 Europeans. This work challenges current
conceptions of personal privacy and may have far-reaching
ethical and legal implications.
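The matching step can be viewed as a bipartite assignment between genome-based trait predictions and observed phenotype records. The sketch below scores pairs with a Gaussian log-likelihood and solves the assignment with the Hungarian algorithm as a simplified stand-in for the paper's maximum entropy approach; the data are simulated.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.stats import norm

# Simplified stand-in for the matching step (NOT the paper's maximum entropy
# algorithm): score each (genome, phenotype record) pair by the log-likelihood
# of the observed traits under the genome-based predictions, then solve the
# assignment that maximizes the total score.
rng = np.random.default_rng(0)
n_people, n_traits = 10, 5

true_traits = rng.normal(size=(n_people, n_traits))
# Genome-based predictions: noisy versions of the true traits.
predicted = true_traits + rng.normal(scale=0.7, size=(n_people, n_traits))

# score[i, j] = log-likelihood that phenotype record j came from genome i.
score = np.zeros((n_people, n_people))
for i in range(n_people):
    for j in range(n_people):
        score[i, j] = norm.logpdf(true_traits[j], loc=predicted[i], scale=0.7).sum()

row_ind, col_ind = linear_sum_assignment(-score)   # maximize total likelihood
n_correct = int((col_ind == np.arange(n_people)).sum())
print(f"reidentified {n_correct} of {n_people} individuals")
```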
Cognitive Computing is an emerging field with a goal to
develop a coherent, unified, universal mechanism to engineer
the mind. Cognitive computing seeks to implement a unified
computational theory of the mind, taking advantage of the
ability of the brain to integrate ambiguous sensory
information, form spatiotemporal associations and abstract
concepts, and make decisions and initiate sophisticated
coordinated actions.
Our approach to cognitive computing is to develop dedicated
hardware systems for implementing a canonical cortical
circuit that can achieve tremendous gains in power and space
efficiency when compared to traditional von Neumann
circuits. Such efficiency is crucial when scaling such
circuits to the size of a mammalian cortex. Our cortical
circuit is a reconfigurable network of spiking neurons that
is composed of neuron processing elements connected through
synapse memory elements—both akin to the basic building
blocks of the brain.
To validate and verify the configuration of our hardware, we
have developed a simulator that can reproduce hardware
functional behavior when testing circuits at the size of a
mammalian cortex. Such a simulator also doubles as a research
tool for developing and testing new cognitive computing
algorithms for implementation on the hardware.
Publications include:
-
Arnon Amir, Pallab Datta, William P. Risk, Andrew S. Cassidy, Jeffrey A. Kusnitz, Steven K. Esser,
Alexander Andreopoulos, Theodore M. Wong, Myron Flickner, Rodrigo Alvarez, Emmett McQuinn, Ben Shaw, Norm
Pass, and Dharmendra S. Modha.
Cognitive computing programming paradigm:
A Corelet Language for composing networks of neurosynaptic cores.
In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN 2013),
August 2013
PDF
(1.8 MB)
Abstract
The sequential programming paradigm
of the von Neumann architecture
is wholly unsuited for TrueNorth.
Therefore,
as our main contribution,
we develop a new programming paradigm
that permits construction of complex cognitive algorithms
and applications
while being efficient for TrueNorth
and effective for programmer productivity.
The programming paradigm consists of
(a) an abstraction for a TrueNorth program,
named Corelet,
for representing a network of neurosynaptic cores
that encapsulates all details
except external inputs and outputs;
(b) an object-oriented Corelet Language for creating,
composing,
and decomposing corelets;
(c) a Corelet Library
that acts as an ever-growing repository of reusable corelets
from which programmers compose new corelets;
and (d) an end-to-end Corelet Laboratory
that is a programming environment
which integrates with the TrueNorth architectural simulator,
Compass,
to support all aspects of the programming cycle from design,
through development,
debugging,
and up to deployment.
The new paradigm seamlessly scales
from a handful of synapses and neurons
to networks of neurosynaptic cores of progressively increasing size
and complexity.
The utility of the new programming paradigm is underscored by the fact that
we have designed and implemented more than 100 algorithms
as corelets for TrueNorth
in a very short time span.
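A toy sketch of the corelet encapsulation idea (not the actual Corelet Language): a corelet exposes only its external inputs and outputs, and composition wires sub-corelets together while hiding everything internal. Names and pins below are hypothetical.

```python
# Toy sketch of the corelet encapsulation idea; NOT the Corelet Language.
# A corelet hides its internal network of cores and exposes only external
# input and output connectors, so corelets compose hierarchically.
class Corelet:
    def __init__(self, name, inputs, outputs):
        self.name = name
        self.inputs = list(inputs)      # externally visible input pins
        self.outputs = list(outputs)    # externally visible output pins
        self._internal = []             # hidden cores / sub-corelets

    @classmethod
    def compose(cls, name, parts, wiring):
        """Build a corelet from sub-corelets, exposing only unwired pins.

        `wiring` maps an output pin of one part to an input pin of another;
        wired pins become internal, everything else stays external."""
        wired_outputs = set(wiring.keys())
        wired_inputs = set(wiring.values())
        inputs = [p for c in parts for p in c.inputs if p not in wired_inputs]
        outputs = [p for c in parts for p in c.outputs if p not in wired_outputs]
        composed = cls(name, inputs, outputs)
        composed._internal = list(parts)
        return composed

edge = Corelet("edge_detect", inputs=["pixels"], outputs=["edges"])
classify = Corelet("classifier", inputs=["edge_map"], outputs=["label"])
pipeline = Corelet.compose("vision", [edge, classify], wiring={"edges": "edge_map"})
print(pipeline.inputs, pipeline.outputs)    # ['pixels'] ['label']
```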
-
Andrew S. Cassidy, Paul Merolla, John V. Arthur, Steve K. Esser, Bryan Jackson, Rodrigo Alvarez-Icaza,
Pallab Datta, Jun Sawada, Theodore M. Wong, Vitaly Feldman, Arnon Amir, Daniel Ben-Dayan Rubin, Filipp
Akopyan, Emmett McQuinn, William P. Risk, and Dharmendra S. Modha.
Cognitive computing building block:
A versatile and efficient digital neuron model for neurosynaptic cores.
In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN 2013),
August 2013
PDF
(654 KB)
Abstract
Judiciously balancing the dual objectives
of functional capability
and implementation/operational cost,
we develop a simple,
digital,
reconfigurable,
versatile spiking neuron model
that supports one-to-one equivalence
between hardware and simulation
and is implementable using only 1272 ASIC gates.
Starting with the classic leaky integrate-and-fire neuron,
we add:
(a) configurable and reproducible stochasticity
to the input,
the state,
and the output;
(b) four leak modes
that bias the internal state dynamics;
(c) deterministic and stochastic thresholds;
and (d) six reset modes
for rich finite-state behavior.
The model supports a wide variety of computational functions
and neural codes.
We capture 50+ neuron behaviors
in a library for hierarchical composition
of complex computations and behaviors.
Although designed with cognitive algorithms and applications in mind,
serendipitously,
the neuron model can qualitatively replicate
the 20 biologically-relevant behaviors of
a dynamical neuron model.
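For reference, the classic leaky integrate-and-fire update that the model starts from looks roughly like the following; the configurable stochasticity, leak modes, stochastic thresholds, and extra reset modes described above are omitted, and the parameter values are arbitrary.

```python
# Bare-bones leaky integrate-and-fire neuron, the starting point described
# above; the TrueNorth neuron's configurable stochasticity, leak modes,
# stochastic thresholds, and additional reset modes are omitted here.
def lif_step(v, synaptic_input, leak=1.0, threshold=64.0, v_reset=0.0):
    """One discrete time step: integrate input, apply leak, spike on threshold."""
    v = v + synaptic_input - leak
    if v >= threshold:
        return v_reset, True     # membrane potential resets, neuron spikes
    return v, False

v, spikes = 0.0, []
for t in range(100):
    v, spiked = lif_step(v, synaptic_input=3.0)
    spikes.append(spiked)
print(sum(spikes), "spikes in 100 steps")   # net +2 per step -> 3 spikes
```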
-
Steve K. Esser, Alexander Andreopoulos, Rathinakumar Appuswamy, Pallab Datta, Davis Barch, Arnon Amir,
John Arthur, Andrew Cassidy, Myron Flickner, Paul Merolla, Shyamal Chandra, Nicola Basilico, Stefano
Carpin, Tom Zimmerman, Frank Zee, Rodrigo Alvarez-Icaza, Jeffrey A. Kusnitz, Theodore M. Wong, William P.
Risk, Emmett McQuinn, Tapan K. Nayak, Raghavendra Singh, and Dharmendra S. Modha.
Cognitive computing systems:
Algorithms and applications for networks of neurosynaptic cores.
In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN 2013),
August 2013
PDF
(1.4 MB)
Abstract
The
non-von Neumann nature of the
TrueNorth architecture necessitates a novel approach
to efficient system design.
To this end,
we have developed a set of abstractions,
algorithms,
and applications
that are natively efficient for TrueNorth.
First,
we developed repeatedly-used abstractions
that span neural codes
(such as binary,
rate,
population,
and time-to-spike),
long-range connectivity,
and short-range connectivity.
Second,
we implemented ten algorithms
that include convolution networks,
spectral content estimators,
liquid state machines,
restricted Boltzmann machines,
hidden Markov models,
looming detection,
temporal pattern matching,
and various classifiers.
Third,
we demonstrate seven applications
that include speaker recognition,
music composer recognition,
digit recognition,
sequence prediction,
collision avoidance,
optical flow,
and eye detection.
Our results showcase the parallelism,
versatility,
rich connectivity,
spatio-temporality,
and multi-modality of the TrueNorth architecture
as well as compositionality
of the corelet programming paradigm
and the flexibility of the underlying neuron model.
-
Theodore M. Wong, Robert Preissl, Pallab Datta,
Myron Flickner, Raghavendra Singh, Steven K. Esser,
Emmett McQuinn, Rathinakumar Appuswamy, William
P. Risk, Horst D. Simon, Dharmendra
S. Modha.
10¹⁴.
IBM Technical Paper RJ10502,
November 2012
PDF
(221 KB)
Abstract
Since the final submission of our work on the Compass
scalable simulator for the IBM TrueNorth Cognitive
Computing architecture, we have simulated an
unprecedented 2.084 billion neurosynaptic cores
containing 53×10¹⁰ neurons and
1.37×10¹⁴ synapses running only
1542× slower than real time. We attained this scale
by using the Sequoia 96-rack IBM® Blue Gene®/Q
supercomputer at Lawrence Livermore National Labs. By
comparison, the ultimate vision of the DARPA SyNAPSE
program is to build a cognitive computing architecture
with 10¹⁰ neurons and 10¹⁴
synapses, inspired by the following: Shepherd estimates
the number of synapses in the human brain as
0.6×10¹⁴, and Koch estimates the number
of synapses in the human brain as
2.4×10¹⁴.
It is important to clarify that we have not built a
biologically realistic simulation of the complete human
brain. Rather, we have simulated a novel modular,
scalable, non-von Neumann, ultra-low power, cognitive
computing architecture at the DARPA SyNAPSE metric of
10¹⁴ synapses that itself is inspired by the
number of synapses in the human brain. We mathematically
abstract away computation (“neurons”), memory
(“synapses”), and communication
(“axons”, “dendrites”) from
biological detail towards engineering goals of maximizing
function (utility, applications) and minimizing hardware
cost (power, area, delay) and design complexity.
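A quick back-of-the-envelope check of the reported totals, assuming the 256-neuron, 256×256-synapse core configuration described in the related TrueNorth papers (an assumption, since the core size is not stated in this abstract):

```python
# Back-of-the-envelope check of the totals above, assuming 256 neurons and a
# 256 x 256 synaptic crossbar per neurosynaptic core (assumption drawn from
# the companion TrueNorth papers, not from this abstract).
cores = 2.084e9
neurons = cores * 256            # ~5.3e11, i.e. ~53 x 10^10 neurons
synapses = cores * 256 * 256     # ~1.37e14 synapses
print(f"{neurons:.2e} neurons, {synapses:.2e} synapses")
```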
-
Robert Preissl, Theodore M. Wong, Pallab Datta, Raghavendra
Singh, Steven Esser, William Risk, Horst Simon, Myron
Flickner, and Dharmendra S. Modha.
Compass: A Scalable Simulator for an Architecture for Cognitive Computing.
In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and
Analysis (SC 2012),
November 2012
PDF
(1.6 MB)
Abstract
Inspired by the function, power, and volume of the
organic brain, we are developing TrueNorth, a novel
modular, non-von Neumann, ultra-low power, compact
architecture. TrueNorth consists of a scalable network of
neurosynaptic cores, with each core containing neurons,
dendrites, synapses, and axons. To set sail for
TrueNorth, we developed Compass, a multi-threaded,
massively parallel functional simulator and a parallel
compiler that maps a network of long-distance pathways in
the macaque monkey brain to TrueNorth. We demonstrate
near-perfect weak scaling on a 16 rack IBM® Blue
Gene®/Q (262144 CPUs, 256 TB memory), achieving an
unprecedented scale of 256 million neurosynaptic cores
containing 65 billion neurons and 16 trillion synapses
running only 388× slower than real-time with an
average spiking rate of 8.1 Hz. By using emerging PGAS
communication primitives, we also demonstrate 2×
better real-time performance over MPI primitives on a 4
rack Blue Gene/P (16384 CPUs, 16 TB memory).
Distributed, adaptive, hard real-time applications, such as
process control or guidance systems, have requirements that
go beyond those of traditional real-time systems:
accommodation of a dynamic set of applications, autonomous
adaptation as application requirements and system resources
change, and security between applications from different
organizations. Developers need a middleware with features
that support developing and running these applications,
especially as commercial and defense systems become more
network-centric. The Virtual Mission Bus (VMB)
middleware, targeted at both distributed IT systems and
real-time systems, provides the essential basic services to
support these applications and the tools for building more
complex services, all while keeping the middleware kernel
minimal enough for embedded system use. We successfully used
the VMB to prototype a distributed spacecraft cluster system.
Publications include:
-
Theodore M. Wong, Richard A. Golding, Harvey M. Ruback,
Wilfred Plouffe, and Scott A. Brandt.
The Virtual Mission Bus
.
IBM Technical Paper RJ10472,
September 2010
PDF
(257 KB)
Abstract
Distributed, adaptive, hard real-time applications, such
as process control or guidance systems, have requirements
that go beyond those of traditional real-time systems:
accommodation of a dynamic set of applications,
autonomous adaptation as application requirements and
system resources change, and security between
applications from different organizations. Developers need
a middleware with features that support developing and
running these applications, especially as commercial and
defense systems become more
"network-centric". The Virtual Mission Bus
(VMB) middleware, targeted at both distributed IT systems
and real-time systems, provides the essential basic
services to support these applications and the tools for
building more complex services, all while keeping the
middleware kernel minimal enough for embedded system
use. We successfully used the VMB to prototype a
distributed spacecraft cluster system.
-
David M. LoBosco, Glen E. Cameron, Richard A. Golding,
and Theodore M. Wong.
The Pleiades fractionated space system architecture and the future of national security space
.
In Proceedings of the AIAA SPACE 2008 Conference,
September 2008
PDF
(992 KB)
Abstract
Our paper is the first in a series of publications to
document the development of a fractionated space
system. We explain the derivation of the technical
approach for system development and present the
preliminary architecture. We will publish future papers
following the PDR, CDR, and flight demonstration.
Storage systems for large and distributed clusters of compute
servers are themselves large and distributed. Their
complexity and scale make it hard to manage these systems,
and in particular, they make it hard to ensure that
applications using them get good, predictable performance. At
the same time, shared access to the system by multiple
applications and users, together with competition from internal
system activities, creates a need for predictable performance.
The storage quality-of-service project at the UCSC Storage Systems
Research Center investigates ways to improve performance in large
distributed storage systems through mechanisms that integrate the
performance aspects of the path that I/O operations take through
the system, from the application interface on the compute server,
through the network, to the storage servers. We focus on five
parts of the I/O path in a distributed storage system: I/O
scheduling at the storage server, storage server cache
management, client-to-server network flow control,
client-to-server connection management, and client cache
management.
Publications include:
-
Tim Kaldewey, Theodore M. Wong, Richard Golding, Anna
Povzner, Scott Brandt, and Carlos Maltzahn.
Virtualizing disk performance
.
In Proceedings of the 14th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS
2008),
April 2008 (Best student paper)
PDF
(256 KB)
Abstract
Large- and small-scale storage systems frequently serve a
mixture of workloads, an increasing number of which
require some form of performance guarantee. Providing
guaranteed disk performance—the equivalent of a
“virtual disk”—is challenging because disk
requests are non-preemptible and their execution times
are stateful, partially non-deterministic, and can vary
by orders of magnitude. Guaranteeing throughput, the
standard measure of disk performance, requires worst-case
I/O time assumptions orders of magnitude greater than
average I/O times, with correspondingly low performance
and poor control of the resource allocation. We show that
disk time utilization—analogous to CPU utilization
in CPU scheduling and the only fully
provisionable aspect of disk performance—yields
greater control, more efficient use of disk resources,
and better isolation between request streams than
bandwidth or I/O rate when used as the basis for disk
reservation and scheduling.
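The contrast can be made concrete with a small sketch (all numbers hypothetical): admission control on disk time only requires that reserved utilizations sum to at most 1, whereas a throughput guarantee must be sized against worst-case I/O times.

```python
# Sketch contrasting utilization-based and throughput-based disk reservations.
# All numbers are hypothetical.

# Utilization-based admission control: each request stream reserves a fraction
# of disk *time*; a new reservation fits if the reserved fractions sum to <= 1.
def admit(existing_utilizations, requested):
    return sum(existing_utilizations) + requested <= 1.0

print(admit([0.30, 0.25, 0.20], 0.15))    # True: 0.90 of disk time reserved

# A throughput guarantee must assume worst-case I/O times, so only a small
# fraction of the average-case bandwidth can actually be promised.
io_size_kb = 4.0
avg_io_ms, worst_io_ms = 0.5, 50.0
average_bw = io_size_kb / avg_io_ms        # KB/ms achievable on average
guaranteed_bw = io_size_kb / worst_io_ms   # KB/ms that can be promised
print(f"guaranteed / average bandwidth = {guaranteed_bw / average_bw:.1%}")  # 1.0%
```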
-
Anna Povzner, Tim Kaldewey, Scott Brandt, Richard
Golding, Theodore M. Wong, and Carlos Maltzahn.
Efficient guaranteed disk request scheduling with Fahrrad
.
In Proceedings of the ACM SIGOPS/EuroSys European Conference on Computer Systems 2008 (EuroSys
2008),
April 2008
PDF
(400 KB)
Abstract
Guaranteed I/O performance is needed for a variety of
applications ranging from real-time data collection to
desktop multimedia to large-scale scientific
simulations. Reservations on throughput, the standard
measure of disk performance, fail to effectively manage
disk performance due to the orders of magnitude
difference between best-, average-, and worst-case
response times, allowing reservation of less than 0.01%
of the achievable bandwidth. We show that by reserving
disk resources in terms of utilization, it is
possible to create a disk scheduler that supports
reservation of nearly 100% of the disk resources,
provides arbitrarily hard or soft guarantees depending
upon application needs, and yields efficiency as good or
better than best-effort disk schedulers tuned for
performance. We present the architecture of our
scheduler, prove the correctness of its algorithms, and
provide results demonstrating its effectiveness.
-
David O. Bigelow, Suresh Iyer, Tim Kaldewey, Roberto
C. Pineiro, Anna Povzner, Scott A. Brandt, Richard
A. Golding, Theodore M. Wong, and Carlos
Maltzahn.
End-to-end performance management for scalable distributed storage
.
In Proceedings of the Petascale Data Storage Workshop,
November 2007
PDF
(130 KB)
Abstract
Many applications—
for example,
scientific simulation,
real-time data acquisition,
and distributed reservation systems—
have I/O performance requirements,
yet most
large, distributed storage systems
lack the ability to guarantee I/O performance.
We are working on end-to-end performance management
in scalable,
distributed storage systems.
The kinds of storage systems we are targeting
include large high-performance computing (HPC) clusters, which
require both large data volumes
and high I/O rates,
as well as large-scale general-purpose storage systems.
-
Theodore M. Wong, Richard A. Golding, Caixue Lin, and
Ralph A. Becker-Szendy.
Zygaria: Storage performance as a managed resource
.
In Proceedings of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS
2006),
April 2006
PDF
(424 KB)
Abstract
Large-scale storage systems often hold data for
multiple applications and users. A problem in such
systems is isolating applications and users from
each other to prevent their corresponding workloads
from interacting in unexpected ways. Another is
ensuring that each application receives an
appropriate level of performance. As part of the
solution to these problems, we have designed a
hierarchical I/O scheduling algorithm to manage
performance resources on an underlying storage
device. Our algorithm uses a simple allocation
abstraction: an application or user is associated
with a corresponding pool of throughput, and manages
throughput within its pool by opening sessions. The
algorithm ensures that each pool and session
receives at least a reserve rate of throughput and
caps usage at a limit rate, using hierarchical token
buckets and EDF I/O scheduling. Once it has
fulfilled the reserves of all active sessions and
pools, it shares unused throughput fairly among
active sessions and pools such that they tend to
receive the same amount. It thus combines deadline
scheduling with proportional-style resource sharing
in a novel way. We assume that the device performs
its own low-level head scheduling, rather than
modeling the device in detail. Our implementation
shows the correctness of our algorithm, imposes
little overhead on the system, and achieves
throughput nearly equal to that of an unmanaged
device.
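A minimal token-bucket sketch of the reserve/limit idea (not the actual Zygaria scheduler, which layers hierarchical buckets, EDF scheduling, and fair sharing of unused throughput on top); rates and sizes are made up.

```python
import time

# Minimal token-bucket sketch of the reserve/limit idea; not the Zygaria
# scheduler itself. A pool accrues tokens at its reserve rate up to a burst
# cap, and requests are admitted immediately while tokens remain.
class TokenBucket:
    def __init__(self, rate_iops: float, burst: float):
        self.rate = rate_iops           # tokens added per second (reserve rate)
        self.burst = burst              # maximum tokens that can accumulate
        self.tokens = burst
        self.last = time.monotonic()

    def try_consume(self, n: float = 1.0) -> bool:
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True                 # within the pool's reserved rate
        return False                    # over reserve; defer or use spare capacity

pool = TokenBucket(rate_iops=100.0, burst=20.0)
admitted = sum(pool.try_consume() for _ in range(50))
print(f"admitted {admitted} of 50 immediate requests")   # roughly the burst size
```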
Patents include:
-
Ralph A. Becker-Szendy, Richard A. Golding, Caixue Lin,
Theodore M. Wong, and Omer A. Zaki. System and method for
managing storage system performance as a resource. United
States Patent 7,962,563 (granted 14 June 2011)
The growth in the amount of data being stored and
manipulated for commercial, scientific, and intelligence
applications is worsening the manageability and
reliability of data storage systems. The expansion of
such large-scale storage systems into petabyte
capacities puts pressure on cost, leading to systems
built out of many cheap but relatively unreliable
commodity storage servers. These systems are expensive
and difficult to manage—current figures show that
management and operation costs are often several times
purchase cost—partly because of the number of
components to configure and monitor, and partly because
system management actions often have unexpected,
system-wide side effects. Also, these systems are
vulnerable to attack because they have many entry
points, and because there are no mechanisms to contain
the effects either of attacks or of subsystem failures.
Kybos
is a distributed storage system that addresses these
issues. It will provide manageable, available, reliable, and
secure storage for large data collections, including data
that is distributed over multiple geographical sites. Kybos
is self-managing, which reduces the cost of administration by
eliminating complex management operations and simplifying the
model by which administrators configure and monitor the
system. Kybos stores data redundantly across multiple
commodity storage servers, so that the failure of any one
server does not compromise data. Finally, Kybos is built as a
loosely coupled federation of servers, so that the compromise
or failure of some servers will not impede remaining servers
from continuing to take collective action toward system
goals.
Our primary application is the self-management of
federated (but potentially unreliable) clusters of
storage servers, but we anticipate that the algorithms
we have developed (and will implement) will have broad
applicability to the general class of problems involving
the coordination of independent autonomous agents with a
collective set of mission goals.
Publications include:
-
Richard A. Golding and Theodore M. Wong.
Walking toward moving goalposts: agile management for evolving systems
.
In Proceedings of the First Workshop on Hot Topics in Autonomic Computing (HotAC I),
June 2006
PDF
(136 KB)
Abstract
Much of the practical work in the autonomic management of
storage systems has taken the “bolt-on”
approach: take existing systems and add a separate
management system on the side. While this approach can
improve legacy systems, it has several problems,
including scaling to heterogeneous and large systems and
maintaining consistency between the system and the
management model. We argue for a different approach,
where autonomic management is woven throughout a system,
as in the K2 distributed storage system that we are
implementing. This distributes responsibility for
management operations over all nodes according to ability
and security, and stores management state as part of the
entities being managed. Decision algorithms set general
configuration goals and then let many system components
work in parallel to move toward the goals.
-
Winfried W. Wilcke et al.
IBM Intelligent Bricks project—Petabytes and beyond
.
IBM Journal of Research and Development,
50(2/3),
pp. 181–198,
March–May 2006
PDF
(682 KB)
Abstract
This paper provides an overview of the Intelligent Bricks
project in progress at IBM Research. It describes common
problems faced by data center operators and proposes a
comprehensive solution based on brick
architectures. Bricks are hardware building
blocks. Because of certain properties, defined here,
scalable and reliable systems can be built with
collections of identical bricks. An important feature is
that brick-based systems must survive the failure of any
brick without requiring human intervention, as long as
most bricks are operational. This simplifies system
management and allows very dense and very scalable
systems to be built. A prototype storage server in the
form of a 3×3×3 array of bricks, capable of
storing 26 TB, is operational at the IBM Almaden Research
Center. It successfully demonstrates the concepts of the
Intelligent Bricks architecture. The paper describes this
implementation of brick architectures based on newly
developed communication and cooling technologies, the
software developed, and techniques for building very
reliable systems from low-cost bricks, and it discusses
the performance and the future of intelligent brick
systems.
-
Theodore M. Wong, Richard A. Golding, Joseph S. Glider,
Elizabeth Borowsky, Ralph A. Becker-Szendy, Claudio
Fleiner, Deepak R. Kenchammana-Hosekote, and Omer
A. Zaki.
Kybos: Self-management for distributed brick-based storage
.
IBM Technical Paper RJ10356,
August 2005
PDF
(197 KB)
Abstract
Current tools for storage system configuration
management make offline decisions, recovering from,
instead of preventing, performance specification
violations. The consequences are severe in a
large-scale system that requires complex actions to
recover from failures, and can result in a temporary
shutdown of the system. We introduce Kybos, a
distributed storage system that makes online,
autonomous responses to system changes. It runs on
clusters of intelligent bricks, which provide local
enforcement of global performance and reliability
specifications and so isolate both recovery and
application I/O traffic. A management agent within
Kybos translates simple, high-level specifications
into brick-level enforcement targets, invoking
centralized algorithms only when taking actions that
require global state. Our initial implementation
shows that this approach works well.
Patents include:
-
Richard A. Golding, Theodore M. Wong, and Omer A. Zaki.
Computer program and method for managing resources in a
distributed storage system. United States Patent 7,694,082
(granted 6 April 2010)
Modern society has produced a wealth of data to preserve for
the long term. Some data we keep for cultural benefit, in
order to make it available to future generations, while other
data we keep because of legal imperatives. One way to
preserve such data is to store it using survivable storage
systems. Survivable storage is distinct from reliable
storage in that it tolerates confidentiality failures in
which unauthorized users compromise component storage
servers, as well as crash failures of servers. Thus, a
survivable storage system can guarantee both the availability
and the confidentiality of stored data.
Research into survivable storage systems investigates
the use of m-of-n threshold sharing
schemes to distribute data to servers, in which each
server receives a share of the data. Any m
shares can be used to reconstruct the data, but any
m - 1 shares reveal no information
about the data. The central thesis of this dissertation
is that to truly preserve data for the long term, a
system that uses threshold schemes must incorporate
recovery protocols able to overcome server failures,
adapt to changing availability or confidentiality
requirements, and operate in a decentralized manner.
To support the thesis, I present the design and
experimental performance analysis of a verifiable
secret redistribution protocol for threshold
sharing schemes. The protocol redistributes shares of
data from old to new, possibly disjoint, sets of
servers, such that new shares generated by
redistribution cannot be combined with old shares to
reconstruct the original data. The protocol is
decentralized, and does not require intermediate
reconstruction of the data; thus, it does not introduce
a central point of failure or risk the exposure of the
data during execution. The protocol incorporates a
verification capability that enables new servers to
confirm that their shares can be used to reconstruct the
original data.
Publications include:
-
Theodore M. Wong.
Decentralized recovery for survivable storage systems
.
PhD dissertation (Technical Report CMU-CS-04-119),
School of Computer Science,
Carnegie Mellon University,
Pittsburgh, PA, May 2004
PDF
(714 KB)
PostScript
(1642 KB)
Abstract
Modern society has produced a wealth of data to preserve
for the long term. Some data we keep for cultural
benefit, in order to make it available to future
generations, while other data we keep because of legal
imperatives. One way to preserve such data is to store it
using survivable storage systems. Survivable
storage is distinct from reliable storage in that it
tolerates confidentiality failures in which unauthorized
users compromise component storage servers, as well as
crash failures of servers. Thus, a survivable storage
system can guarantee both the availability and the
confidentiality of stored data.
Research into survivable storage systems investigates the
use of m-of-n threshold sharing schemes
to distribute data to servers, in which each server
receives a share of the data. Any m shares can
be used to reconstruct the data, but any
m - 1 shares reveal no information
about the data. The central thesis of this dissertation
is that to truly preserve data for the long term, a
system that uses threshold schemes must incorporate
recovery protocols able to overcome server failures,
adapt to changing availability or confidentiality
requirements, and operate in a decentralized manner.
To support the thesis, I present the design and
experimental performance analysis of a verifiable
secret redistribution protocol for threshold sharing
schemes. The protocol redistributes shares of data from
old to new, possibly disjoint, sets of servers, such that
new shares generated by redistribution cannot be combined
with old shares to reconstruct the original data. The
protocol is decentralized, and does not require
intermediate reconstruction of the data; thus, it does
not introduce a central point of failure or risk the
exposure of the data during execution. The protocol
incorporates a verification capability that enables new
servers to confirm that their shares can be used to
reconstruct the original data.
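The m-of-n property described above is typically realized with Shamir's polynomial scheme: the secret is the constant term of a random degree-(m-1) polynomial over a finite field, each server holds one evaluation point, and any m points interpolate the secret while m-1 reveal nothing. A toy sketch (illustrative field size; this shows only the sharing property, not the dissertation's redistribution protocol):

```python
import random

# Shamir-style m-of-n threshold sharing over a small prime field (toy sizes;
# a real system would use a cryptographically large prime). Illustrates the
# sharing property only, not the redistribution or verification protocols.
P = 2_147_483_647   # prime modulus of the field

def make_shares(secret: int, m: int, n: int):
    """Split `secret` into n shares, any m of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(m - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for k, (xk, _) in enumerate(shares):
            if k != j:
                num = num * (-xk) % P
                den = den * (xj - xk) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(secret=123456789, m=3, n=5)
print(reconstruct(shares[:3]))   # 123456789 from any 3 of the 5 shares
```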
-
Theodore M. Wong, Chenxi Wang, and Jeannette M. Wing.
Verifiable secret redistribution for archive systems
.
In Proceedings of the First International IEEE Security in Storage Workshop (SISW 2002),
December 2002
PDF
(424 KB)
Abstract
We present a new verifiable secret
redistribution protocol for threshold sharing
schemes that forms a key component of a proposed
archival storage system. Our protocol supports
redistribution from (m,n) to (m',n') threshold
sharing schemes without requiring reconstruction of
the original data. The design is motivated by
archive systems for which the added security of
threshold sharing of data must be accompanied by the
flexibility of dynamic shareholder changes. Our
protocol enables the dynamic addition or removal of
shareholders, and also guards against mobile
adversaries. We observe that existing protocols
either cannot be extended readily to allow
redistribution between different access structures,
or have vulnerabilities that allow faulty old
shareholders to distribute invalid shares to new
shareholders. Our primary contribution is that in
our protocol, new shareholders can verify the
validity of their shares after redistribution
between different access structures.
-
Theodore M. Wong and Jeannette M. Wing.
Verifiable secret redistribution
.
Technical Report CMU-CS-01-155,
School of Computer Science,
Carnegie Mellon University,
Pittsburgh, PA, October 2001
PDF
(176 KB)
PostScript
(204 KB)
[I began this research project while interning with the
Storage Systems Program
at Hewlett-Packard Labs.]
Modern high-end disk arrays often have several gigabytes
of cache RAM. Unfortunately, most array caches use
management policies which duplicate the same data blocks
at both the client and array levels of the cache
hierarchy: they are inclusive. Thus, the
aggregate cache behaves as if it were only as big as the
larger of the client and array caches, instead of as
large as the sum of the two. Inclusiveness is wasteful:
cache RAM is expensive.
We explore the benefits of a simple scheme to achieve
exclusive caching, in which a data block is
cached at either a client or the disk array, but not
both. Exclusiveness helps to create the effect of a
single, large unified cache. We introduce a DEMOTE
operation to transfer data ejected from the client to
the array, and explore its effectiveness with simulation
studies. We quantify the benefits and overheads of
demotions across both synthetic and real-life
workloads. The results show that we can obtain useful
(sometimes substantial) speedups.
During our investigation, we also developed some new
cache-insertion algorithms that show promise for
multi-client systems, and report on some of their
properties.
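A toy sketch of the DEMOTE idea: when the client cache evicts a clean block it hands the block down to the array cache instead of discarding it, and a block promoted to the client is dropped from the array, so the two caches stay disjoint. Sizes and the workload are made up.

```python
from collections import OrderedDict

# Toy sketch of exclusive caching with a DEMOTE operation: a block evicted
# from the client is demoted to the array cache rather than dropped, and a
# block promoted to the client is removed from the array, so the two LRU
# caches hold disjoint blocks. Illustrative only.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def insert(self, block):
        """Insert a block; return the evicted (demotable) block, if any."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            victim, _ = self.blocks.popitem(last=False)
            return victim
        return None

    def lookup(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        return False

client, array = LRUCache(4), LRUCache(4)

def read(block):
    if client.lookup(block):
        return "client hit"
    if array.lookup(block):
        del array.blocks[block]        # exclusive: promote out of the array
        hit = "array hit"
    else:
        hit = "disk read"
    demoted = client.insert(block)     # may evict a block from the client
    if demoted is not None:
        array.insert(demoted)          # DEMOTE instead of discarding
    return hit

for b in [1, 2, 3, 4, 5, 6, 1, 2]:
    print(b, read(b))
```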
Publications include:
Patents include:
-
John Wilkes and Theodore M. Wong. Exclusive caching in
computer systems. United States Patent 6,851,024 (granted
1 February 2005)
-
John Wilkes and Theodore M. Wong. Adaptive data insertion
for caching. United States Patent 6,728,837 (granted 27
April 2004)