SmartNoise

SmartNoise is a set of tools for creating differentially private reports, dashboards, synopses, and synthetic data releases. It includes a SQL processing layer, supporting queries over Spark and popular database engines, and a collection of synthesizers.

SmartNoise is built on OpenDP.

When to Use

Differential privacy is the gold standard definition of privacy. Use differential privacy when you need to protect your data releases against membership inference, database reconstruction, record linkage, and other privacy attacks.
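To make the idea concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query, written in plain Python. It is illustrative only: the function names (laplace_noise, private_count) and the sample data are hypothetical and are not part of SmartNoise or OpenDP, and naive floating-point sampling like this is not hardened against known attacks on noise generation. For real releases, rely on a vetted library rather than this sketch.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws, each with mean
    `scale`, follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return an epsilon-differentially private count of matching records.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of survey respondents.
ages = [23, 35, 41, 29, 62, 57, 38, 44]
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0)
print(f"noisy count of respondents aged 40+: {noisy:.2f}")
```

A smaller epsilon means a larger noise scale and therefore stronger privacy at the cost of accuracy. Each call draws fresh noise, so repeating the same query consumes additional privacy budget.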

Here are some rules of thumb for when to use which components:

  • Use OpenDP directly, if you are creating Jupyter notebooks and reproducible research, or if you need fine-grained control over processing and the privacy budget.

  • Use SmartNoise SQL, if you are generating reports or data cubes over tabular data stored in SQL databases or Spark, or when your data are very large.

  • Use SmartNoise Synthesizers, if you can’t predict the workload in advance, and want to be able to share “looks like” data with collaborators.
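The "privacy budget" mentioned in the first rule is the total privacy loss you are willing to allow across all releases from a dataset. As a rough illustration, under basic sequential composition the epsilon values of successive queries add up. The sketch below tracks that running sum in plain Python. The PrivacyBudget class is hypothetical, and it is not how OpenDP or SmartNoise account for privacy internally.

```python
class PrivacyBudget:
    """Track epsilon spent under basic sequential composition (hypothetical helper)."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self):
        """Epsilon still available for future queries."""
        return self.total_epsilon - self.spent

    def spend(self, epsilon):
        """Charge one query against the budget, refusing to overspend."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exceeded")
        self.spent += epsilon


budget = PrivacyBudget(total_epsilon=1.0)
budget.spend(0.5)    # first query
budget.spend(0.25)   # second query
print(budget.remaining)  # 0.25
```

Once the remaining budget is too small for the next query, spend raises an error instead of allowing the release, which is the behavior you want from any budget tracker.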

Getting Started

For SmartNoise SQL, run pip install smartnoise-sql and read the SQL documentation.
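As a minimal sketch of what a first query might look like, the function below loads a CSV into pandas and runs a differentially private SQL query through the snsql package that smartnoise-sql installs. The file names PUMS.csv and PUMS.yaml, the table name PUMS.PUMS, the column names, and the wrapper function private_average_age are all placeholders assumed for illustration; your metadata file must describe your own table and column bounds. Check the SQL documentation for the API of the version you install.

```python
def private_average_age(csv_path, meta_path, epsilon=1.0, delta=0.01):
    """Run one differentially private query over a CSV file (illustrative sketch).

    Assumes smartnoise-sql and pandas are installed, and that the metadata
    file at `meta_path` describes a table named PUMS.PUMS with `sex` and
    `age` columns. All of these names are placeholders.
    """
    # Imported inside the function so that defining it does not require
    # the packages to be installed.
    import pandas as pd
    import snsql
    from snsql import Privacy

    privacy = Privacy(epsilon=epsilon, delta=delta)
    data = pd.read_csv(csv_path)
    reader = snsql.from_df(data, privacy=privacy, metadata=meta_path)
    # The table name in the query must match the one declared in the metadata.
    return reader.execute(
        "SELECT sex, AVG(age) AS age FROM PUMS.PUMS GROUP BY sex"
    )


# Example call, with placeholder file names:
# rows = private_average_age("PUMS.csv", "PUMS.yaml")
```

The result contains noisy averages, so repeated runs will return slightly different values.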

For SmartNoise Synthesizers, run pip install smartnoise-synth and read the Synthesizers documentation.

OpenDP is included with SmartNoise. To install it standalone, run pip install opendp and read the OpenDP documentation.

Source Code

The SmartNoise GitHub repository is at https://github.com/opendp/smartnoise-sdk

Getting Help

If you have questions or feedback regarding SmartNoise, you are welcome to post to the SmartNoise section of GitHub Discussions.