TY - GEN
T1 - Uncertain〈T〉: A First-Order Type for Uncertain Data
T2 - 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2014
AU - Bornholt, James
AU - Mytkowicz, Todd
AU - McKinley, Kathryn S.
PY - 2014
Y1 - 2014
N2 - Emerging applications increasingly use estimates such as sensor data (GPS), probabilistic models, machine learning, big data, and human data. Unfortunately, representing this uncertain data with discrete types (floats, integers, and booleans) encourages developers to pretend it is not probabilistic, which causes three types of uncertainty bugs. (1) Using estimates as facts ignores random error in estimates. (2) Computation compounds that error. (3) Boolean questions on probabilistic data induce false positives and negatives. This paper introduces Uncertain〈T〉, a new programming language abstraction for uncertain data. We implement a Bayesian network semantics for computation and conditionals that improves program correctness. The runtime uses sampling and hypothesis tests to evaluate computation and conditionals lazily and efficiently. We illustrate with sensor and machine learning applications that Uncertain〈T〉 improves expressiveness and accuracy. Whereas previous probabilistic programming languages focus on experts, Uncertain〈T〉 serves a wide range of developers. Experts still identify error distributions. However, both experts and application writers compute with distributions, improve estimates with domain knowledge, and ask questions with conditionals. The Uncertain〈T〉 type system and operators encourage developers to expose and reason about uncertainty explicitly, controlling false positives and false negatives. These benefits make Uncertain〈T〉 a compelling programming model for modern applications facing the challenge of uncertainty.
KW - Estimates
KW - Probabilistic Programming
KW - Statistics
UR - http://www.scopus.com/inward/record.url?scp=84897803531&partnerID=8YFLogxK
U2 - 10.1145/2541940.2541958
DO - 10.1145/2541940.2541958
M3 - Conference contribution
SN - 9781450323055
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 51
EP - 65
BT - ASPLOS 2014 - 19th International Conference on Architectural Support for Programming Languages and Operating Systems
Y2 - 1 March 2014 through 5 March 2014
ER -