Releases: sdv-dev/SDV
v0.3.6 - 2020-07-23
This release introduces a new concept of Constraints, which allow the user to define
special relationships between columns that will not be handled via modeling.
This is done via a new sdv.constraints subpackage which defines some well-known pre-defined
constraints, as well as a generic framework that allows the user to customize the constraints
to their needs as much as necessary.
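As a rough illustration of the idea, the sketch below enforces a relationship between two columns outside of the model by transforming the data before modeling and reversing the transformation after sampling. Everything in it is hypothetical: the class name GreaterThanSketch, its methods, the column names and the naive_sample stand-in model are assumptions made for this example and are not the sdv.constraints API.

```python
import random


class GreaterThanSketch:
    """Hypothetical constraint: column `high` must be >= column `low`."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def is_valid(self, row):
        # True when the row respects the relationship.
        return row[self.high] >= row[self.low]

    def transform(self, row):
        # Replace `high` with its gap above `low`, so the model only
        # ever sees a plain numeric column.
        out = dict(row)
        out[self.high] = row[self.high] - row[self.low]
        return out

    def reverse_transform(self, row):
        # Rebuild `high` from `low` plus a non-negative gap, which
        # guarantees the relationship holds in the output.
        out = dict(row)
        out[self.high] = row[self.low] + abs(row[self.high])
        return out


def naive_sample(rows, n, rng):
    # Stand-in for a real model (an assumption of this sketch): each
    # column is sampled independently from its observed values.
    columns = list(rows[0].keys())
    return [{c: rng.choice([r[c] for r in rows]) for c in columns} for _ in range(n)]


constraint = GreaterThanSketch(low="start_age", high="end_age")
real_rows = [
    {"start_age": 20, "end_age": 25},
    {"start_age": 31, "end_age": 31},
]

transformed = [constraint.transform(r) for r in real_rows]
sampled = naive_sample(transformed, 5, random.Random(0))
synthetic = [constraint.reverse_transform(r) for r in sampled]
```

Because the relationship is restored in reverse_transform, every synthetic row satisfies it even though the stand-in model knows nothing about it.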
New Features
- Support for Constraints - Issue #169 by @csala
v0.3.5 - 2020-07-09
This release introduces a new subpackage sdv.tabular with models designed specifically
for single table modeling, while still providing all the usual conveniences from SDV, such
as:
- Seamless multi-type support
- Missing data handling
- PII anonymization
Currently implemented models are:
- GaussianCopula: Multivariate distributions modeled using copula functions. This is a stronger
version, with more marginal distributions and options, than the one used to model multi-table
datasets.
- CTGAN: GAN-based data synthesizer that can generate synthetic tabular data with high fidelity.
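To make "multivariate distributions modeled using copula functions" more concrete, here is a minimal two-column Gaussian copula sketch using only the Python standard library. It is not how sdv.tabular implements GaussianCopula: the function names, the use of empirical marginals and the toy data are all assumptions chosen for this illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def to_normal_scores(values):
    """Map a column to standard-normal scores through its empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) stays strictly inside (0, 1), as inv_cdf requires.
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def correlation(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF: the observed value at quantile u."""
    index = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


def fit_and_sample(xs, ys, n, seed=0):
    # Fit: the dependence between columns is captured by one correlation
    # coefficient, measured in normal-score space.
    rho = correlation(to_normal_scores(xs), to_normal_scores(ys))
    sorted_x, sorted_y = sorted(xs), sorted(ys)
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Draw a correlated pair of standard normals.
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(max(0.0, 1 - rho ** 2)) * rng.gauss(0, 1)
        # Map back to each column's own marginal distribution.
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(sorted_x, u1), empirical_quantile(sorted_y, u2)))
    return rho, samples


ages = [23, 35, 41, 52, 60, 29]
incomes = [20, 48, 45, 70, 65, 30]
rho, synthetic = fit_and_sample(ages, incomes, n=10)
```

The sketch separates the two ingredients of a copula model: the marginals (here, the empirical distribution of each column) and the dependence structure (here, a single correlation in normal-score space).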
v0.3.4 - 2020-07-04
New Features
- Support for Multiple Parents - Issue #162 by @csala
- Sample by default the same number of rows as in the original table - Issue #163 by @csala
General Improvements
- Add benchmark - Issue #165 by @csala
v0.3.3 - 2020-06-26
General Improvements
- Use SDMetrics for evaluation - Issue #159 by @csala
v0.3.2 - 2020-02-03
General Improvements
- Improve metadata visualization - Issue #151 by @csala @JDTheRipperPC
v0.3.1 - 2020-01-22
New Features
- Add Metadata Validation - Issue #134 by @csala @JDTheRipperPC
- Add Metadata Visualization - Issue #135 by @JDTheRipperPC
General Improvements
- Add path to metadata JSON - Issue #143 by @JDTheRipperPC
- Use new Copulas and RDT versions - Issue #147 by @csala @JDTheRipperPC
v0.3.0 - 2019-12-23
New Features
- Create sdv.models subpackage - Issue #141 by @JDTheRipperPC
v0.2.2 - 2019-12-10
Resolved Issues
- Adapt evaluation to the different data types - Issue #128 by @csala @JDTheRipperPC
- Extend load_demo functionality to load other datasets - Issue #136 by @JDTheRipperPC
v0.2.1 - 2019-11-25
Resolved Issues
- Methods to generate Metadata from DataFrames - Issue #126 by @csala @JDTheRipperPC
v0.2.0 - 2019-11-11
This release introduces a big reorganization of the project and some API changes with a strong focus on simplicity and usability.
New Features
- Ability to pass the data either as CSV files or as DataFrames
- Ability to pass the Metadata either as a JSON file or as a Python dict
- Simplified metadata format
- Fixed Primary Key generation issues
- Added support for Integer Primary Keys
- Added boolean modeling
- Improved categorical distribution modeling
- Fixed incorrect number of children rows modeling
- Fixed incorrect null values modeling
Special thanks to @csala and @JDTheRipperPC for the hard work put into making this release possible!
Resolved Issues
- compatibility with rdt issue 72 - Issue #120 by @csala and @JDTheRipperPC
- Error docstring sampler.__fill_text_columns - Issue #144 by @JDTheRipperPC
- Reach 90% coverage - Issue #112 by @JDTheRipperPC
- Review unittests - Issue #111 by @JDTheRipperPC
- Time required for sample_all function? - Issue #118 by @csala and @JDTheRipperPC
- Primary Key and Foreign Keys have to be integers for it to work - Issue #117 by @csala and @JDTheRipperPC
- Generating samples is taking lot of time. Is there any way to speed up sample generation. - Issue #103 by @csala and @JDTheRipperPC