# 4. Configuration

This section covers configuring and setting up the federated learning environment.

## Understanding the Configuration File

Cifer's FedLearn is configured through a `config.json` file in the root directory of the project. This file contains all the settings needed for both local testing and distributed federated learning.

To locate the file:

{% code title="bash" %}

```bash
cd path/to/cifer
ls config.json
```

{% endcode %}

## Configuration Parameters

The `config.json` file includes the following key parameters:

* `server_address`: The gRPC server address and port (e.g., "localhost:8080").
* `num_rounds`: Number of federated learning rounds.
* `strategy`: The federated learning strategy (e.g., "FedAvg" for Federated Averaging).
* `grpc_max_message_length`: Maximum gRPC message size in bytes.
* `mode`: "local" for single-machine simulation, "collaborative" for a distributed setup.
* `num_clients`: Number of simulated clients (local mode only).
* `min_num_clients`: Minimum number of clients required for a round (collaborative mode only).
* `min_sample_size`: Minimum number of samples required from each client.
* `model_name`: Name of the model to be used.
* `dataset`: Name of the dataset to train on (e.g., "mnist").
* `local_epochs`: Number of local training epochs per round.
* `batch_size`: Batch size for local training.
* `learning_rate`: Learning rate for local training.
* `data_dir`: Directory containing the dataset.
* `output_dir`: Directory for saving output files.
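The parameters above can be loaded and sanity-checked with the Python standard library before starting any server or client. A minimal sketch, assuming the key names documented here; the required-key set below is our own choice, not part of the Cifer API:

```python
import json

# Keys this guide documents as essential; adjust to match your deployment.
REQUIRED_KEYS = {"server_address", "num_rounds", "strategy", "mode"}

def load_config(path="config.json"):
    """Load config.json and verify the documented keys are present."""
    with open(path) as f:
        config = json.load(f)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"config.json is missing keys: {sorted(missing)}")
    if config["mode"] not in ("local", "collaborative"):
        raise ValueError(f"unknown mode: {config['mode']!r}")
    return config
```

Running a check like this up front surfaces typos in the configuration early, instead of partway through a training round.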

## Local Mode Configuration

For local testing and development, use these settings:

{% code title="json" %}

```json
{
  "server_address": "localhost:8080",
  "num_rounds": 10,
  "strategy": "FedAvg",
  "grpc_max_message_length": 104857600,
  "mode": "local",
  "num_clients": 3,
  "local_epochs": 5,
  "batch_size": 32,
  "learning_rate": 0.01,
  "model_name": "simple_cnn",
  "dataset": "mnist",
  "data_dir": "./data",
  "output_dir": "./output"
}
```

{% endcode %}

## Collaborative Mode Configuration

For distributed federated learning, modify the configuration as follows:

{% code title="json" %}

```json
{
  "server_address": "0.0.0.0:8080",
  "num_rounds": 50,
  "strategy": "FedAvg",
  "grpc_max_message_length": 209715200,
  "mode": "collaborative",
  "min_num_clients": 3,
  "min_sample_size": 1000,
  "model_name": "resnet18",
  "local_epochs": 5,
  "batch_size": 64,
  "learning_rate": 0.001,
  "data_dir": "./local_data",
  "output_dir": "./collaborative_output"
}
```

{% endcode %}
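The two configurations differ in only a handful of keys, so a collaborative configuration can be derived from a local one programmatically. A sketch using only the standard library; the overridden values mirror the collaborative example above and should be adjusted for your deployment:

```python
def to_collaborative(local_config):
    """Return a collaborative-mode copy of a local-mode configuration."""
    config = dict(local_config)  # shallow copy; the original is untouched
    config.update({
        "server_address": "0.0.0.0:8080",  # bind on all interfaces
        "mode": "collaborative",
        # Reuse the local client count as the minimum for a round.
        "min_num_clients": config.pop("num_clients", 3),
        "num_rounds": 50,
    })
    return config
```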

## Setting Up the Server

1. Choose a machine with reliable network connectivity and sufficient compute and memory resources.
2. Update the `config.json` on the server with appropriate settings (use the collaborative mode configuration as a starting point).
3. Start the server:

{% code title="python" %}

```python
from cifer import fedlearn

# Initialize and start FedLearn server
fedlearn.start_server('config.json')
```

{% endcode %}
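Before starting the server, it can help to confirm that the configured port is actually free on the machine. A standard-library sketch; the address parsing assumes the `host:port` format shown in the examples:

```python
import socket

def port_is_free(server_address):
    """Return True if the host:port from config.json can be bound locally."""
    host, port = server_address.rsplit(":", 1)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            s.bind((host, int(port)))
        return True
    except OSError:
        return False
```

If this returns `False`, another process (often a previous server run) still holds the port.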

## Setting Up Clients

For each participating client:

1. Install Cifer and ensure access to the local dataset.
2. Update the `config.json` on each client, setting `server_address` to the server's reachable hostname or IP address and port (not the server's `0.0.0.0` bind address).
3. Start the client:

{% code title="python" %}

```python
from cifer import fedlearn

# Initialize and start FedLearn client
fedlearn.start_client('config.json')
```

{% endcode %}
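When many clients need the same change, step 2 can be scripted. A sketch using only the standard library; `203.0.113.10` is a placeholder documentation address, not a real server:

```python
import json

def point_client_at_server(config_path, server_host_port):
    """Rewrite server_address in a client's config.json in place."""
    with open(config_path) as f:
        config = json.load(f)
    config["server_address"] = server_host_port
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)

# Example: point_client_at_server("config.json", "203.0.113.10:8080")
```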

## Network and Security Considerations

* Ensure the server's firewall allows incoming connections on the specified port.
* If using secure gRPC, generate and distribute SSL/TLS certificates, and update configurations accordingly.
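A quick reachability test from a client machine can distinguish firewall problems from server problems before a round begins. A standard-library sketch; the timeout value is an arbitrary choice:

```python
import socket

def server_reachable(server_address, timeout=5.0):
    """Try to open a TCP connection to the FedLearn server."""
    host, port = server_address.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result here points to the network or firewall; a `True` result followed by training failures points elsewhere.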

## Data Preparation and Model Selection

* Ensure each client has access to its local dataset in the specified `data_dir`.
* Verify data format consistency across all clients.
* Choose an appropriate `model_name` in the configuration, or implement and specify a custom model.
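The format-consistency check can be partly automated by fingerprinting each client's dataset schema and comparing the results. A minimal sketch that assumes CSV files with header rows in `data_dir`; that file layout is our assumption, not a Cifer requirement:

```python
import csv
import hashlib
from pathlib import Path

def schema_fingerprint(data_dir):
    """Hash the sorted CSV filenames and their header rows.

    Clients whose fingerprints differ are not using the same data format.
    """
    h = hashlib.sha256()
    for path in sorted(Path(data_dir).glob("*.csv")):
        with open(path, newline="") as f:
            header = next(csv.reader(f), [])
        h.update(path.name.encode())
        h.update(",".join(header).encode())
    return h.hexdigest()
```

Each client can print its fingerprint, and any mismatch flags a client to inspect before training starts.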

## Monitoring and Logging

* Monitor server output for overall progress and issues.
* Check individual client logs for local training progress and client-specific problems.
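Client logs are easier to compare when every client writes them the same way. A standard-library `logging` sketch that writes to the configured `output_dir`; the file name and log format here are arbitrary choices:

```python
import logging
from pathlib import Path

def setup_client_logging(output_dir, client_id):
    """Write this client's training log to output_dir/client_<id>.log."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(f"fedlearn.client.{client_id}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(Path(output_dir) / f"client_{client_id}.log")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```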

## Troubleshooting Common Issues

* Connection Problems: Verify network settings and firewall configurations.
* Data Issues: Ensure correct data format and sufficient samples on all clients.
* Resource Constraints: Adjust `batch_size` and `local_epochs` if clients experience memory issues or slow processing.
* Model Compatibility: Verify that the chosen model is compatible with the dataset and task.
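For the resource-constraint case, one common pattern is to retry local training with a halved `batch_size` whenever memory runs out. A hedged sketch; `train_fn` is a placeholder for your actual local training step, not part of the Cifer API:

```python
def train_with_fallback(train_fn, batch_size, min_batch_size=1):
    """Call train_fn(batch_size), halving the batch size on MemoryError."""
    while batch_size >= min_batch_size:
        try:
            return train_fn(batch_size)
        except MemoryError:
            batch_size //= 2  # retry with a smaller batch
    raise MemoryError("training failed even at the minimum batch size")
```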

## Participant Roles and Responsibilities

In a federated learning environment, different participants have distinct roles and responsibilities:

### 1. Server Administrator:

* Set up and maintain the central server
* Configure global parameters (rounds, minimum clients, etc.)
* Monitor the overall federated learning process
* Ensure security and privacy protocols are followed
* Manage model aggregation and distribution

### 2. Client Participants:

* Prepare and maintain local datasets
* Ensure local system meets hardware and software requirements
* Run client software and participate in training rounds
* Protect the privacy of local data
* Report any issues to the server administrator

### 3. Data Scientists/ML Engineers:

* Design and implement the shared model architecture
* Define evaluation metrics for the federated model
* Analyze aggregated results and refine the model as needed
* Collaborate with server administrator on hyperparameter tuning

### 4. Privacy Officers (if applicable):

* Review and approve the federated learning setup
* Ensure compliance with data protection regulations
* Monitor the process for potential privacy risks
* Advise on implementing additional privacy-preserving techniques

### 5. IT Support:

* Assist with network configuration and troubleshooting
* Ensure proper firewall settings and network security
* Help with software installation and updates on client machines

### 6. Project Manager:

* Coordinate between different participants
* Ensure timely execution of federated learning rounds
* Manage communication between technical and non-technical stakeholders

By clearly defining these roles and responsibilities, organizations can ensure a smooth and effective federated learning process while maintaining data privacy and security.
