Auditing collected data

The dataset collected by the Credolab SDK is secured using AES symmetric encryption. To decrypt this dataset, you'll need to obtain decryption keys from the Credolab team. These keys consist of an AES encryption key and an initialization vector (IV).

To obtain the decryption key pair, submit the dataset collected by the SDK to the designated Credolab API. The API will return the key pair required to decrypt the dataset using the AES encryption algorithm.

The steps below describe how to decrypt the collected dataset:

  1. Obtain API key
  2. Receive encryption keys
  3. Decrypt dataset
  4. Run decryption flow for the dataset collected by the SDK

1. Obtain API key

The API key is what Credolab uses to identify the client requesting the decryption keys for a dataset.

How to get the API key:

  • The API key with Decrypter access should be provided by the Credolab CSM.

2. Receive encryption keys

POST /api/datasets/v1/decryptionkeys

Request Description

Returns the decryption keys for the provided dataset.

Note: requests to this endpoint are rate limited (10 per day).
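
Because each call counts against the daily limit, it can help to avoid requesting the keys for the same dataset twice. The sketch below is a minimal in-memory cache. It assumes that the keys returned for a given dataset do not change between calls, which this guide does not state. `KeyCache` and `fetch_keys` are hypothetical names; `fetch_keys` stands for any function that calls the endpoint and returns the key pair.

```python
import hashlib


class KeyCache:
    """In-memory cache of decryption keys, indexed by a SHA-256 digest of the dataset.

    Hypothetical helper, not part of any Credolab library.
    """

    def __init__(self, fetch_keys):
        # fetch_keys: callable that takes the dataset string and returns (aes, iv)
        self._fetch_keys = fetch_keys
        self._cache = {}

    def get(self, data):
        # hash the dataset so the (potentially large) string is not used as the dict key
        digest = hashlib.sha256(data.encode('utf-8')).hexdigest()
        if digest not in self._cache:
            # only this branch triggers a request that counts against the rate limit
            self._cache[digest] = self._fetch_keys(data)
        return self._cache[digest]
```

The cache lives only as long as the process does; persisting decryption keys to disk would need its own security review.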

Request Headers

Header | Value | Description
Authorization | Bearer $API_KEY | The header is used to authorize the request. Replace $API_KEY with your actual API key from "1. Obtain API key".

Request Parameters

Name | Description
data [string] | The mobile dataset collected by the Credolab SDK

Response Parameters

Name | Description
aes [string] | Base64 encoded AES key required during the decryption process.
iv [string] | Base64 encoded initialization vector required during the decryption process.
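
Before attempting decryption, the response can be sanity-checked locally. The sketch below assumes the parsed response body is a dictionary with the `aes` and `iv` fields described above. It relies on two general AES facts: a key is 16, 24, or 32 bytes long, and a CBC initialization vector is 16 bytes long (CBC is the mode used by the decryption function in this guide). `decode_keys` is a hypothetical helper name.

```python
import base64


def decode_keys(response_body):
    """Decode and validate the key pair from the parsed response body (a dict)."""
    # validate=True rejects characters outside the base64 alphabet
    key = base64.b64decode(response_body['aes'], validate=True)
    iv = base64.b64decode(response_body['iv'], validate=True)

    if len(key) not in (16, 24, 32):
        raise ValueError(f'unexpected AES key length: {len(key)} bytes')
    if len(iv) != 16:
        raise ValueError(f'unexpected IV length: {len(iv)} bytes')
    return key, iv
```

A failure here points to a truncated or mis-copied response rather than to a problem with the dataset itself.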

3. Decrypt dataset

The data collected by Credolab is both compressed (with deflate) and encrypted. To audit or access this data, you will need the decryption keys, which can be obtained through the Credolab API.
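
The compression step can be tried in isolation. The decryption function in this guide decompresses with `zlib.decompress(..., -zlib.MAX_WBITS)`, which treats the input as a raw deflate stream without a zlib or gzip header. The sketch below round-trips a sample JSON string through that format using only the standard library. The helper names and the sample payload are made up for illustration.

```python
import zlib


def raw_deflate(payload_bytes):
    """Compress bytes into a raw deflate stream (no zlib header or checksum)."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    return compressor.compress(payload_bytes) + compressor.flush()


def raw_inflate(compressed_bytes):
    """Decompress a raw deflate stream, as done after AES decryption."""
    return zlib.decompress(compressed_bytes, -zlib.MAX_WBITS)


sample = '{"example": "value"}'.encode('utf-8')  # made-up payload
assert raw_inflate(raw_deflate(sample)) == sample
```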

Below is a Python function that can be used to decrypt the encrypted dataset:

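# requires pycryptodome
import base64
import zlib

from Crypto.Cipher import AES
from Crypto.Util.Padding import unpad


def decrypt(key, iv, data):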
    """
    Decrypts the given data using the provided key and initialization vector (IV).

    This function takes in a key, initialization vector (IV), and encrypted data, 
    and returns the decrypted data as a JSON string.

    Args:
        key (str): base64 encoded encryption key used for decryption (from API response).
        iv (str): base64 encoded initialization vector used for decryption (from API response).
        data (str): data returned by the Credolab SDK.

    Returns:
        str: A JSON string representing the decrypted data.
    """
    # get encrypted data bytes
    encrypted_data_bytes = base64.b64decode(data)

    # get the encrypted payload from the data returned by the SDK
    # 292 is a fixed offset defined by Credolab; the bytes before it hold metadata
    payload_start = 292
    encrypted_payload = encrypted_data_bytes[payload_start:]

    # the data is encrypted with AES, so decryption needs the key and the IV
    # decode both from their base64 encoded strings
    key_bytes = base64.b64decode(key)
    iv_bytes = base64.b64decode(iv)

    # create the cipher for decryption
    cipher = AES.new(key_bytes, AES.MODE_CBC, iv_bytes)

    # decrypt and strip the block padding
    compressed_payload = unpad(cipher.decrypt(encrypted_payload), AES.block_size)

    # the decrypted data is a JSON string compressed with raw deflate,
    # so it has to be decompressed to get the original data
    payload_bytes = zlib.decompress(compressed_payload, -zlib.MAX_WBITS)

    # decode to a JSON string using UTF-8 encoding
    json_data = payload_bytes.decode('utf-8')

    return json_data
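
When decryption fails, it helps to first check that the input has the expected layout. The sketch below splits the base64-decoded dataset at the 292-byte offset used by the function above and verifies that the remaining payload is a non-empty multiple of 16 bytes, the AES block size that CBC mode requires. `split_dataset` is a hypothetical helper; the content of the metadata bytes is not documented here and is left untouched.

```python
import base64

PAYLOAD_START = 292   # offset used by the decrypt function in this guide
AES_BLOCK_SIZE = 16   # AES always operates on 16-byte blocks


def split_dataset(data):
    """Split the SDK output into (metadata_bytes, encrypted_payload_bytes)."""
    raw = base64.b64decode(data)
    metadata, payload = raw[:PAYLOAD_START], raw[PAYLOAD_START:]

    if not payload or len(payload) % AES_BLOCK_SIZE != 0:
        raise ValueError(
            f'encrypted payload is {len(payload)} bytes; '
            'expected a non-empty multiple of 16')
    return metadata, payload
```

A `ValueError` from this check suggests the dataset string was truncated or altered in transit, which is worth ruling out before spending one of the rate-limited key requests.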

4. Run decryption flow for the dataset collected by the SDK

The following Python script demonstrates the process of decrypting a dataset. This includes retrieving the decryption keys from the Credolab API and using them to decrypt the dataset content:

# required dependencies: pycryptodome==3.20.0 and requests

import base64
import zlib
import requests
import json

from Crypto.Cipher import AES
from Crypto.Util.Padding import unpad


def extract_keys(cl_url, auth_key, data):
    """
    Extracts encryption keys from the collected data using Credolab API.
    
    Args:
        cl_url (str): the URL of the Credolab service.
        auth_key (str): Bearer authentication key received from Credolab CSM.
        data (str): data collected by the Credolab SDK.
    
    Returns:
        tuple: the base64 encoded AES key and initialization vector required for decryption.
    """
    # set the authorization header required for receiving the decryption keys for the dataset
    headers = {
        'Authorization': f'Bearer {auth_key}'
    }
    
    # build json payload
    payload = {
        'data': data
    }
    
    response = requests.post(
        url=f'{cl_url}/api/datasets/v1/decryptionkeys', 
        headers=headers,
        json=payload)
    
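    # stop early on an HTTP error status instead of failing later on missing keys
    response.raise_for_status()

    keys = json.loads(response.content)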

    print(keys)
    
    return keys['aes'], keys['iv']


def decrypt(key, iv, data):
    """
    Decrypts the given data using the provided key and initialization vector (IV).

    This function takes in a key, initialization vector (IV), and encrypted data, 
    and returns the decrypted data as a JSON string.

    Args:
        key (str): base64 encoded encryption key used for decryption (from API response).
        iv (str): base64 encoded initialization vector used for decryption (from API response).
        data (str): data returned by the Credolab SDK.

    Returns:
        str: A JSON string representing the decrypted data.
    """
    # get encrypted data bytes
    encrypted_data_bytes = base64.b64decode(data)

    # get the encrypted payload from the data returned by the SDK
    # 292 is a fixed offset defined by Credolab; the bytes before it hold metadata
    payload_start = 292
    encrypted_payload = encrypted_data_bytes[payload_start:]

    # the data is encrypted with AES, so decryption needs the key and the IV
    # decode both from their base64 encoded strings
    key_bytes = base64.b64decode(key)
    iv_bytes = base64.b64decode(iv)

    # create the cipher for decryption
    cipher = AES.new(key_bytes, AES.MODE_CBC, iv_bytes)

    # decrypt and strip the block padding
    compressed_payload = unpad(cipher.decrypt(encrypted_payload), AES.block_size)

    # the decrypted data is a JSON string compressed with raw deflate,
    # so it has to be decompressed to get the original data
    payload_bytes = zlib.decompress(compressed_payload, -zlib.MAX_WBITS)

    # decode to a JSON string using UTF-8 encoding
    json_data = payload_bytes.decode('utf-8')

    return json_data


def main():
    cl_url = 'https://******.credolab.com' # the URL of the Credolab server
    auth_key = '*************************' # API key provided by the Credolab CSM
    data = '*****************************' # the data returned by the Credolab SDK

    aes_key, iv = extract_keys(cl_url, auth_key, data)
    decrypted_data = decrypt(aes_key, iv, data)

    print(decrypted_data)

if __name__ == '__main__':
    main()
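
Since `decrypt` returns a JSON string, an audit usually continues by parsing it. The structure of the decrypted dataset is not described in this guide, so the sketch below makes no assumption about field names. It only reports the top-level keys and the types of their values, assuming the top level is a JSON object. `summarize` is a hypothetical helper.

```python
import json


def summarize(json_data):
    """Return {top-level key: type name of its value} for a JSON object string."""
    parsed = json.loads(json_data)
    if not isinstance(parsed, dict):
        raise ValueError('expected a JSON object at the top level')
    return {key: type(value).__name__ for key, value in parsed.items()}
```

For example, printing `summarize(decrypted_data)` instead of `decrypted_data` in `main()` gives an overview without dumping the full dataset to the console.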