Data contract
Provides classes for working with data contracts in Blip's data ecosystem.
Data contracts are the source of truth for data products in Blip's data ecosystem. A contract holds information about domain ownership, column metadata, update frequency, the data availability period, and more. This information is necessary to govern the data ecosystem, and the BlipDataForge library uses it to enforce governance rules during data reads, writes, and transformations.
DataContract
dataclass
Base class for working with data contracts.
Data contracts are the basic output of communication with the Data Contract API. A contract contains the metadata about a data product that is used in governance enforcement.
Attributes:
| Name | Type | Description |
|---|---|---|
| `product_name` | `str` | Identification name of the data contract. |
| `catalog` | `str` | The data product catalog. |
| `schema` | `str` | The data product schema / database. |
| `table` | `str` | The data product table. |
| `columns` | `list` | Raw metadata about the table columns. |
| `owner_email` | `str` | Email of the data product owner. |
| `support_email` | `str` | Email of the person directly responsible for the freshness of the data product. |
| `lifcycle_time` | `str` | Number of days to store the data. |
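To make the attribute table concrete, here is a minimal sketch of a dataclass with the same fields. `DataContractSketch`, the `full_table_name` helper, the sample values, and the shape of the `columns` entries are all illustrative assumptions; this is not the library's `DataContract` class.

```python
from dataclasses import dataclass, field


@dataclass
class DataContractSketch:
    """Illustrative stand-in mirroring the documented attributes."""

    product_name: str
    catalog: str
    schema: str
    table: str
    columns: list = field(default_factory=list)
    owner_email: str = ""
    support_email: str = ""
    lifcycle_time: str = ""  # spelled as in the attribute table

    def full_table_name(self) -> str:
        # Hypothetical helper: joins catalog, schema and table with dots.
        return f"{self.catalog}.{self.schema}.{self.table}"


# Sample values are invented for illustration only.
contract = DataContractSketch(
    product_name="daily_messages",
    catalog="clients",
    schema="sales",
    table="daily_messages",
    columns=[{"name": "id", "type": "string"}],  # assumed column shape
    owner_email="owner@example.com",
    support_email="support@example.com",
    lifcycle_time="365",
)
print(contract.full_table_name())
```

Note that `lifcycle_time` is typed as `str` in the table, so a caller that needs a number of days would have to convert it.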
Source code in blipdataforge/data_contract.py
get_quality_checks()
Returns a quality check string based on the Data Contract attributes.
The quality checks generated by this function cover the schema and expected values.
Returns:
| Type | Description |
|---|---|
| `str` | A check string compatible with the data platform quality engine. |
DataContractRepository
The entry point for communication with the Data Contract API.
This class implements communication with the Data Contract API endpoints.
__init__(api_url, api_auth_token)
Initializes an instance to communicate with the Data Contract API.
If the environment variable API_DATACONTRACT_URL exists, it is used as the base URL of the API. If it does not exist, a predefined value is used.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `api_url` | `str` | The URL used to communicate with the Data Contract API. | required |
| `api_auth_token` | `str` | The token used to authenticate with the API. | required |
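The environment-variable fallback described for `__init__` can be sketched as follows. `resolve_api_url` and `DEFAULT_API_URL` are hypothetical names, and the placeholder URL is invented; the actual predefined value and the precedence relative to the `api_url` argument are not stated on this page.

```python
import os

# Hypothetical fallback; the real predefined value is not given in this page.
DEFAULT_API_URL = "https://datacontract.example.invalid"


def resolve_api_url(default: str = DEFAULT_API_URL) -> str:
    """Return API_DATACONTRACT_URL when it exists, otherwise the fallback."""
    # os.environ.get returns the fallback only when the variable is absent.
    return os.environ.get("API_DATACONTRACT_URL", default)


print(resolve_api_url())
```

Setting `API_DATACONTRACT_URL` before the instance is created is therefore one way to point a job at a different API environment without changing code.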
get_contract(catalog, schema, table)
Request a specific data contract.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `catalog` | `str` | Target catalog. | required |
| `schema` | `str` | Target schema / database. | required |
| `table` | `str` | Target table. | required |
Returns:
| Type | Description |
|---|---|
| `Union[DataContract, None]` | The object with data contract information. |
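Because the return type is `Union[DataContract, None]`, callers should check for `None` before reading attributes. The sketch below uses an in-memory stand-in (`StubRepository`, `StubContract`) that only mimics the `get_contract(catalog, schema, table)` signature; `owner_of` is a hypothetical helper, and treating `None` as "no contract found" is an assumption, since this page does not say when `None` is returned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubContract:
    catalog: str
    schema: str
    table: str
    owner_email: str


class StubRepository:
    """In-memory stand-in mimicking get_contract(catalog, schema, table)."""

    def __init__(self, contracts):
        self._contracts = {(c.catalog, c.schema, c.table): c for c in contracts}

    def get_contract(self, catalog: str, schema: str, table: str) -> Optional[StubContract]:
        # Returns None when no matching contract is stored (assumed meaning).
        return self._contracts.get((catalog, schema, table))


def owner_of(repo, catalog: str, schema: str, table: str) -> str:
    """Look up the owner email, failing loudly when no contract comes back."""
    contract = repo.get_contract(catalog, schema, table)
    if contract is None:
        raise LookupError(f"No data contract for {catalog}.{schema}.{table}")
    return contract.owner_email


repo = StubRepository([StubContract("clients", "sales", "orders", "owner@example.com")])
print(owner_of(repo, "clients", "sales", "orders"))
```

Raising an explicit error at the lookup site keeps a missing contract from surfacing later as an `AttributeError` on `None`.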
list_contracts()
List all data contracts.
Returns:
| Type | Description |
|---|---|
| `Union[list, None]` | A list of dicts, each containing a summary of a data contract. |
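Since `list_contracts()` may return `None` instead of a list, iterating over the result directly can fail. A minimal sketch of normalizing the result first is shown below; `contracts_or_empty` is a hypothetical helper, and the `product_name` key in the sample summary is an assumption, as the keys of each dict are not documented on this page.

```python
from typing import Optional


def contracts_or_empty(result: Optional[list]) -> list:
    """Normalize a Union[list, None] result into a plain list."""
    return [] if result is None else result


# Hypothetical summary; the actual dict keys are not documented here.
summaries = contracts_or_empty([{"product_name": "daily_messages"}])

# .get() avoids a KeyError if a summary lacks the assumed key.
names = [s.get("product_name", "<unnamed>") for s in summaries]
print(names)
```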