Release Notes#
scystream-sdk 1.0.0 - Release Notes#
The first version of the scystream-sdk.
It provides the following basic functionality:
Defining entrypoints.
Introducing the cbc.yaml.
Defining settings for entrypoints, parsing ENV-Variables.
Parsing the cbc.yaml and cross-validating it against the actual code definition.
Reading & writing to a postgres database, integrating Apache Spark.
Reading & writing from/to an S3 bucket, currently not using Apache Spark.
scystream-sdk 1.1.0 - Release Notes#
Updated ENV settings.
We have added two predefined classes for Postgres and file inputs and outputs.
These contain default ENV keys such as PG_HOST for defining the host of a Postgres input/output, or S3_ACCESS_KEY for defining the access key of an S3 bucket.
These settings can ultimately be used to configure the corresponding connections.
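To illustrate the idea of predefined ENV keys, here is a minimal sketch that reads a fixed set of keys from the environment. It uses only the standard library and is not the SDK's implementation: the function name load_settings is hypothetical, and only the key names PG_HOST and S3_ACCESS_KEY are taken from the notes above.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def load_settings(
    keys: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect the given ENV keys into a dict (hypothetical helper, not SDK API)."""
    # Fall back to the real process environment when no mapping is passed in.
    source = os.environ if env is None else env
    keys = list(keys)
    missing = [key for key in keys if key not in source]
    if missing:
        raise KeyError(f"missing ENV keys: {', '.join(missing)}")
    return {key: source[key] for key in keys}


# Example values are made up for illustration.
example_env = {"PG_HOST": "db.internal", "S3_ACCESS_KEY": "example-key"}
settings = load_settings(["PG_HOST", "S3_ACCESS_KEY"], example_env)
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.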
scystream-sdk 1.2.0 - Release Notes#
Removed adjustable config path from SDKConfig
The configuration option config_path was removed. Every compute block repository must now contain the configuration file in its root directory, named exactly cbc.yaml.
scystream-sdk 1.3.0 - Release Notes#
Added#
Introduced PandasPostgresOperations for interacting with PostgreSQL using Pandas DataFrames
Added BasePostgresOperations abstraction to unify database operations across backends
Added support for optional PostgreSQL dependencies via extras_require["postgres"]
Introduced table name validation (PostgreSQL 63 character limit)
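The table name validation mentioned above could look roughly like the following sketch. The function name validate_table_name and the constant are hypothetical and not taken from the SDK. PostgreSQL applies its default identifier limit in bytes, so this sketch measures the UTF-8 encoded length; whether the SDK counts bytes or characters is not stated in these notes.

```python
MAX_IDENTIFIER_BYTES = 63  # PostgreSQL default identifier limit


def validate_table_name(name: str) -> str:
    """Return the name unchanged if it fits the limit, else raise ValueError.

    Hypothetical helper for illustration; not the SDK's own validator.
    """
    if not name:
        raise ValueError("table name must not be empty")
    # Count encoded bytes, since multi-byte characters use up the limit faster.
    length = len(name.encode("utf-8"))
    if length > MAX_IDENTIFIER_BYTES:
        raise ValueError(
            f"table name is {length} bytes long, "
            f"the limit is {MAX_IDENTIFIER_BYTES}"
        )
    return name
```

Validating up front matters because PostgreSQL truncates identifiers that are too long instead of rejecting them, which can make two distinct names collide.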
Changed#
Refactored PostgreSQL handling into backend-specific implementations:
SparkPostgresOperations
PandasPostgresOperations
Simplified database interaction API to remove duplicated logic across compute blocks
Updated internal structure to better support future extensibility (e.g. additional backends)
Added DB_NAME to PostgresSettings and PostgresConfig
Changed PG_PORT type to int in PostgresSettings and PostgresConfig
Breaking Changes#
DB_NAME is now a required configuration parameter in both PostgresConfig and PostgresSettings
Removed implicit/default database usage (e.g. defaulting to "postgres")
Users must now explicitly install optional dependencies:
scystream-sdk[postgres] for Pandas-based operations
scystream-sdk 1.4.0 - Release Notes#
Added#
Introduced DSN-based database configuration across the SDK
Added support for generic database backends via SQLAlchemy-compatible DSNs
Introduced DatabaseSettings to replace database-specific settings
Unified database interaction to support multiple backends (e.g. PostgreSQL, MySQL, SQLite)
Changed#
Replaced PostgresSettings with DatabaseSettings, which now requires:
DB_DSN (full database connection string)
DB_TABLE (target table name)
Refactored database operations to be backend-agnostic
Updated Spark and Pandas database integrations to use DSN-based connections
Simplified database configuration by removing individual connection parameters (host, port, user, password)
Breaking Changes#
Removed PostgresConfig entirely
Replaced PostgresSettings with DatabaseSettings
Input/Output type pg_table has been renamed to database_table
Users must now provide a full DSN instead of individual connection parameters
Existing compute blocks using PostgreSQL-specific configuration must be migrated
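When migrating, the former individual connection parameters have to be combined into one DSN. The sketch below shows one way to do that with the standard library. The helper name build_postgres_dsn and all example values are hypothetical. The postgresql:// URL form follows the usual SQLAlchemy convention, and the exact driver suffix a given setup needs (if any) is an assumption left to the reader.

```python
from urllib.parse import quote


def build_postgres_dsn(
    host: str, port: int, user: str, password: str, db_name: str
) -> str:
    """Combine individual parameters into a DSN (hypothetical migration helper)."""
    # Percent-encode the credentials so characters like "@" or "/" cannot
    # be mistaken for URL delimiters.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{db_name}"


# Example values are made up; DB_DSN and DB_TABLE are the keys named above.
dsn = build_postgres_dsn("db.example", 5432, "alice", "p@ss/word", "analytics")
new_env = {"DB_DSN": dsn, "DB_TABLE": "results"}
```

Encoding the credentials matters in practice: an unescaped "@" in a password would otherwise be read as the start of the host part.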
scystream-sdk 1.5.0 - Release Notes#
Added#
Added support for database schemas in database operations
Introduced optional DB_SCHEMA configuration for DatabaseSettings
Schema is now applied automatically to all database read/write operations
Added internal normalization of schema values (e.g. "" and whitespace are treated as None)
Notes#
Schema support is primarily intended for PostgreSQL
Automatic schema creation is currently only supported for PostgreSQL
For other databases (e.g. MySQL, SQLite), schema behavior depends on the underlying backend
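The schema normalization listed under "Added" for 1.5.0 can be sketched as follows. The function names normalize_schema and qualified_table_name are hypothetical, not SDK API. Stripping surrounding whitespace from non-empty values is an assumption that goes beyond what the notes state.

```python
from typing import Optional


def normalize_schema(value: Optional[str]) -> Optional[str]:
    """Map None, "" and whitespace-only strings to None (hypothetical helper)."""
    if value is None:
        return None
    # Assumption: surrounding whitespace is also trimmed from real schema names.
    stripped = value.strip()
    return stripped or None


def qualified_table_name(table: str, schema: Optional[str] = None) -> str:
    """Prefix the table with its schema only when a schema is actually set."""
    normalized = normalize_schema(schema)
    return f"{normalized}.{table}" if normalized else table
```

With this normalization, an unset or blank DB_SCHEMA behaves the same as no schema at all, so the database falls back to its default schema handling.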