4. Dataset and Model Preparation

Cifer’s FHE framework supports encrypting both data and models for privacy-preserving computation.

  • Data files must be prepared in .npz format (NumPy compressed archive).

  • Models should be saved in .h5 format (Keras model format).

Before encryption, you must prepare your dataset or model appropriately.

4.1 Prepare Dataset in .npz Format

To use your dataset in the system, organize the data into a .npz file. The file should contain numerical arrays stored with keys train_images and train_labels, as shown in the example below:

python
import numpy as np

# X: feature data (e.g., images, tabular features), y: labels
np.savez("datasets/my_dataset.npz", train_images=X, train_labels=y)

Data Requirements:

  • X must be a numeric NumPy array whose first dimension is the number of samples, e.g., (1000, 20) for 1000 samples with 20 features each.

  • y must be a 1D array (vector) with one label per sample, e.g., (1000,).

  • The dataset must not contain missing values or malformed entries.
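The requirements above can be checked before saving. The following is a minimal sketch, not part of Cifer's API: validate_dataset is a hypothetical helper name, and it treats NaN and infinite values as the "missing or malformed" entries to reject.

```python
import numpy as np

def validate_dataset(X, y):
    """Return a list of problems found; an empty list means all checks passed."""
    X = np.asarray(X)
    y = np.asarray(y)
    problems = []

    # X needs a sample dimension plus at least one feature dimension.
    if X.ndim < 2:
        problems.append(f"X should have at least 2 dimensions, got shape {X.shape}")

    # y must be a flat vector with one label per sample.
    if y.ndim != 1:
        problems.append(f"y should be 1D, got shape {y.shape}")
    elif X.ndim >= 1 and X.shape[0] != y.shape[0]:
        problems.append(f"X has {X.shape[0]} samples but y has {y.shape[0]} labels")

    for name, arr in (("X", X), ("y", y)):
        if not np.issubdtype(arr.dtype, np.number):
            problems.append(f"{name} has non-numeric dtype {arr.dtype}")
        elif np.issubdtype(arr.dtype, np.floating) and not np.isfinite(arr).all():
            problems.append(f"{name} contains NaN or infinite values")

    return problems
```

Calling validate_dataset(X, y) just before np.savez and stopping when the returned list is non-empty keeps malformed data out of the .npz file.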

Additional Conversion Examples

The example below converts a CSV file into a .npz file. It assumes the CSV has one header row, contains only numeric values, and stores the label in the last column.

python
import numpy as np

data = np.loadtxt("data/my_data.csv", delimiter=",", skiprows=1)
X = data[:, :-1]  # features
y = data[:, -1]   # labels

np.savez("datasets/my_dataset.npz", train_images=X, train_labels=y)
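After conversion, it can help to reopen the archive and confirm that both expected keys are present. This is a small sketch under that assumption; check_npz is a hypothetical helper, not a Cifer function.

```python
import numpy as np

def check_npz(path, required_keys=("train_images", "train_labels")):
    """Open a .npz archive and return {key: shape}; raise KeyError if a key is missing."""
    with np.load(path) as archive:
        missing = [k for k in required_keys if k not in archive.files]
        if missing:
            raise KeyError(f"{path} is missing keys: {missing}")
        return {k: archive[k].shape for k in required_keys}
```

For example, check_npz("datasets/my_dataset.npz") returns the shapes of the two arrays, which should agree on the number of samples.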

4.2 Prepare Model in .h5 Format for Encryption

Cifer’s FHE framework supports encryption of models saved in Keras’s HDF5 (.h5) format.

Save your trained model using:

python
model.save("trained_model/my_model.h5")

Example: Creating and Saving a Keras Model

python
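import numpy as np
from tensorflow import keras

# Load the dataset prepared in section 4.1.
# y is assumed to hold binary (0/1) labels, matching the sigmoid output below.
data = np.load("datasets/my_dataset.npz")
X, y = data["train_images"], data["train_labels"]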

model = keras.Sequential([
    keras.layers.Input(shape=(X.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid")
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5)

# Save the model
model.save("trained_model/my_model.h5")

Additional Notes on Model Formats

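If your model is in another format (e.g., TensorFlow SavedModel or PyTorch), convert it to .h5 for compatibility.

A TensorFlow SavedModel can be loaded with Keras and saved again as .h5, as explained in the official TensorFlow guide. This relies on the Keras version bundled with TensorFlow 2.15 and earlier; in Keras 3, load_model() no longer accepts SavedModel directories. Keras has no direct loader for PyTorch models, so those must be converted separately; the snippet below covers only the SavedModel case.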

Example snippet:

python
import tensorflow as tf

# Load a TensorFlow SavedModel directory and re-save it in .h5 format
model = tf.keras.models.load_model('saved_model_dir')
model.save('model.h5')
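Before passing a model file on for encryption, a quick sanity check can confirm that it really is an HDF5 file and not, say, a renamed archive. The sketch below uses only the standard library and compares the first 8 bytes against the HDF5 format signature. looks_like_hdf5 is a hypothetical helper, and it assumes the signature sits at offset 0, which is the usual layout for files written by Keras.

```python
# The 8-byte signature that opens an HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

This only inspects the file header. It does not verify that the file holds a valid Keras model, so loading it with Keras remains the definitive check.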

Now your dataset and model files are properly prepared for FHE encryption. Proceed to the next step to perform encryption and integrate them into your federated learning workflow.
