Cifer’s FHE framework supports encrypting both data and models for privacy-preserving computation.
Data files must be prepared in .npz format (a NumPy archive of named arrays).
Models should be saved in .h5 format (the HDF5-based Keras model format).
Before encryption, you must prepare your dataset or model appropriately.
4.1 Prepare Dataset in .npz Format
To use your dataset in the system, organize the data into a .npz file. The file should contain numerical arrays stored with keys train_images and train_labels, as shown in the example below:
python
import numpy as np

# X: feature data (e.g., images, tabular features), y: labels
np.savez("datasets/my_dataset.npz", train_images=X, train_labels=y)
Data Requirements:
X must be a NumPy array whose first dimension is the number of samples, e.g., (1000, 20) for 1000 samples with 20 features each.
y must be a 1D array (vector) of labels with one entry per sample, e.g., (1000,).
The dataset must not contain missing values or malformed entries.
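The requirements above can be checked before encryption. The following is a minimal sketch, not part of Cifer's API: `validate_npz` is a hypothetical helper name, and two of its checks are assumptions drawn from the examples above (that the number of labels must equal the number of samples, and that "missing values" means NaN entries).

```python
import numpy as np


def validate_npz(path):
    """Check a dataset archive against the requirements listed above.

    Hypothetical helper for illustration; returns the two array shapes.
    """
    with np.load(path) as archive:
        missing = {"train_images", "train_labels"} - set(archive.files)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
        X = archive["train_images"]
        y = archive["train_labels"]

    if X.ndim < 1:
        raise ValueError("train_images must have at least one dimension")
    if y.ndim != 1:
        raise ValueError(f"train_labels must be 1D, got shape {y.shape}")
    # Assumption: one label per sample
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"{X.shape[0]} samples but {y.shape[0]} labels")

    for name, arr in (("train_images", X), ("train_labels", y)):
        if not np.issubdtype(arr.dtype, np.number):
            raise ValueError(f"{name} must be numeric, got dtype {arr.dtype}")
        # NaN can only occur in floating-point arrays
        if np.issubdtype(arr.dtype, np.floating) and np.isnan(arr).any():
            raise ValueError(f"{name} contains missing values (NaN)")
    return X.shape, y.shape
```

Calling `validate_npz("datasets/my_dataset.npz")` right after `np.savez` surfaces shape or key problems before the file reaches the encryption step.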
Additional Conversion Examples
The code examples at the end of this section show how to convert raw CSV, image, audio, and video data into .npz files, with one snippet per file type.
4.2 Prepare Model in .h5 Format for Encryption
Cifer’s FHE framework supports encryption of models saved in Keras’s native .h5 format.
Save your trained model by calling model.save() with a path ending in .h5, for example model.save("trained_model/my_model.h5").
Example: Creating and Saving a Keras Model
Additional Notes on Model Formats
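If you have models in other formats (e.g., TensorFlow SavedModel, PyTorch), convert them to .h5 format for compatibility. A TensorFlow SavedModel can be loaded with tf.keras.models.load_model and saved again as .h5.
For PyTorch models, export to ONNX first, then use a converter such as onnx-tf to convert to TensorFlow, and save the result as .h5.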
Now your dataset and model files are properly prepared for FHE encryption. Proceed to the next step to perform encryption and integrate them into your federated learning workflow.
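python
import numpy as np

# CSV: load a comma-separated file, skipping the header row
data = np.loadtxt("data/my_data.csv", delimiter=",", skiprows=1)
X = data[:, :-1]  # features: every column except the last
y = data[:, -1]   # labels: the last column
np.savez("datasets/my_dataset.npz", train_images=X, train_labels=y)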
python
from PIL import Image
import numpy as np
import os

# Images: load every PNG in a directory as a grayscale array
image_dir = "data/images"
image_list = []
for filename in sorted(os.listdir(image_dir)):  # sorted so the order is reproducible
    if filename.endswith(".png"):
        img = Image.open(os.path.join(image_dir, filename)).convert("L")  # grayscale
        img_array = np.array(img)
        image_list.append(img_array)

X = np.stack(image_list)  # all images must have the same size
y = np.array([...])  # corresponding labels, in the same order as the images
np.savez("datasets/my_dataset.npz", train_images=X, train_labels=y)
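The `np.stack` call in the image snippet above fails if the directory holds no matching files or if the images differ in size. As a hedged sketch (the helper name `stack_images` is hypothetical, not part of any library), a small wrapper can turn both cases into readable errors:

```python
import numpy as np


def stack_images(arrays):
    """Stack image arrays into one array, with clear errors for bad input.

    Hypothetical helper for illustration only.
    """
    if not arrays:
        raise ValueError("no images found")
    shapes = {a.shape for a in arrays}
    if len(shapes) != 1:
        raise ValueError(f"images have different shapes: {sorted(shapes)}")
    return np.stack(arrays)  # shape: (num_images, height, width)
```

With this in place, `X = stack_images(image_list)` can stand in for the bare `np.stack(image_list)` call. Resizing mismatched images to a common size is a separate decision that depends on the dataset.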
python
import librosa
import numpy as np

# Audio: extract MFCC features from a single file
audio_path = 'data/audio/example.wav'
signal, sr = librosa.load(audio_path, sr=None)  # sr=None keeps the file's native sample rate

# Extract MFCC features; the result has shape (n_mfcc, frames)
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
# Transpose to shape (frames, features)
mfcc = mfcc.T

# Example: dummy labels for each audio file
labels = np.array([0])  # Replace with actual labels

# Save to npz
np.savez('datasets/audio_dataset.npz', train_images=mfcc, train_labels=labels)
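The audio snippet above stores one row per MFCC frame but a single label for the whole file. If the pipeline expects one label per row, as the data requirements suggest, each clip needs to be reduced to a single feature vector. The sketch below uses mean and standard-deviation pooling over frames. This is one common choice, not necessarily what Cifer expects. The name `summarize_mfcc`, the random stand-in data, and the labels are all hypothetical.

```python
import numpy as np


def summarize_mfcc(mfcc_frames):
    """Collapse a (frames, n_mfcc) matrix into one vector of length 2 * n_mfcc.

    The vector holds the per-coefficient mean followed by the per-coefficient
    standard deviation. Hypothetical helper for illustration only.
    """
    mfcc_frames = np.asarray(mfcc_frames, dtype=np.float64)
    return np.concatenate([mfcc_frames.mean(axis=0), mfcc_frames.std(axis=0)])


# Stand-in MFCC matrices for three clips of different lengths.
# Random data for illustration; real input would come from librosa.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(n_frames, 13)) for n_frames in (40, 55, 32)]
clip_labels = np.array([0, 1, 0])  # hypothetical labels, one per clip

X_audio = np.stack([summarize_mfcc(m) for m in clips])  # shape (3, 26)
assert X_audio.shape[0] == clip_labels.shape[0]
```

Pooling discards the time ordering of frames. If that matters for the model, padding or cropping every clip to a fixed number of frames is an alternative.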
python
import cv2
import numpy as np

# Video: read every frame and convert it to grayscale
video_path = 'data/video/example.mp4'
cap = cv2.VideoCapture(video_path)
frames = []
while True:
    ret, frame = cap.read()
    if not ret:  # no more frames, or the file could not be read
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    frames.append(gray)
cap.release()

frames_array = np.stack(frames)  # Shape: (num_frames, height, width)
# Example: dummy labels
labels = np.array([0])  # Replace with actual labels
np.savez('datasets/video_dataset.npz', train_images=frames_array, train_labels=labels)
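Storing every frame, as the video snippet above does, can produce very large arrays. One option, shown here only as a sketch, is to keep a fixed number of evenly spaced frames per video. The helper name `sample_frames` and the idea of a fixed frame budget are assumptions, not Cifer requirements.

```python
import numpy as np


def sample_frames(frames, num_samples):
    """Pick num_samples evenly spaced frames from a (num_frames, H, W) array.

    If num_samples exceeds the number of frames, some frames are repeated.
    Hypothetical helper for illustration only.
    """
    frames = np.asarray(frames)
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if len(frames) == 0:
        raise ValueError("no frames to sample from")
    # Evenly spaced positions from the first to the last frame, rounded to indices
    idx = np.linspace(0, len(frames) - 1, num=num_samples).round().astype(int)
    return frames[idx]
```

For example, `sample_frames(frames_array, 16)` returns an array of shape `(16, height, width)` regardless of the video's length.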
python
model.save("trained_model/my_model.h5")
python
from tensorflow import keras

# X and y are the feature and label arrays prepared in section 4.1
model = keras.Sequential([
    keras.layers.Input(shape=(X.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5)

# Save the model
model.save("trained_model/my_model.h5")
python
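import tensorflow as tf

# Load a TensorFlow SavedModel directory and re-save it in .h5 format
# Note: load_model reads SavedModel directories under Keras 2; Keras 3 accepts only .keras and .h5 files
model = tf.keras.models.load_model('saved_model_dir')
model.save('model.h5')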
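After saving or converting a model, a quick sanity check is to confirm that the file really is HDF5, since a path ending in .h5 does not guarantee the format. The sketch below uses only the standard library and compares the first eight bytes with the HDF5 format signature. The helper name `looks_like_hdf5` is hypothetical. The check assumes the signature sits at offset 0, which is the usual case for files written by Keras, although HDF5 also permits later offsets.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte signature defined by the HDF5 file format


def looks_like_hdf5(path):
    """Return True if the file begins with the HDF5 signature.

    Hypothetical helper for illustration; it does not verify that the file
    contains a valid Keras model, only that the container format is HDF5.
    """
    with open(path, "rb") as f:
        return f.read(8) == HDF5_SIGNATURE
```

A result of `False` for `looks_like_hdf5("trained_model/my_model.h5")` usually means the model was written in a different format despite the file extension.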