Troubleshooting Dstack CVM Restarts With MPC Node
Hey guys! Ever run into a snag when trying to restart your dstack CVM and the MPC node just refuses to cooperate? It's a head-scratcher, but don't worry, we're going to dive deep into this issue and figure out how to get those MPC nodes up and running smoothly, even after multiple restarts. So, grab your favorite caffeinated beverage, and let's get started!
Understanding the Problem
Let's talk about restarting a dstack CVM. When you restart your CVM a second time with the launcher, you might encounter a frustrating error. The launcher bravely tries to start the MPC node Docker image with docker run, but it faceplants with an error message. The root cause? A container named mpc_node already exists from the previous run, and Docker refuses to create a second one under the same name. This is like trying to start a race when your car is already on the track: the system gets confused.
The error message typically looks something like this:
docker: Error response from daemon: Conflict. The container name "/mpc_node" is already in use by container "some-long-container-id". You have to remove (or rename) that container to be able to reuse that name.
This message is Docker's way of saying, "Hey, that name's taken!" It's a common issue, especially when dealing with containerized environments.
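The key here is that you need a way to either restart the existing container or remove it and create a new one. But remember, you also want to apply any new parameters you've set in your user.config file. Restarting alone (docker start mpc_node) reuses the environment variables, ports, and volumes the container was originally created with, so new parameters only take effect if the container is removed and recreated. It's a bit like trying to change the tires on a moving car, but we'll figure it out!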
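If you want a script to recognize this specific failure instead of treating every docker run error the same way, the conflict message can be matched with a regular expression. Here's a minimal sketch; the function name parse_name_conflict is made up for illustration, and the pattern is based only on the message wording quoted above, which could differ between Docker versions.

```python
import re

# Pattern based on the wording of the conflict message quoted in this article.
CONFLICT_RE = re.compile(
    r'The container name "/?(?P<name>[^"]+)" is already in use '
    r'by container "(?P<id>[^"]+)"'
)


def parse_name_conflict(stderr_text):
    """Return (name, container_id) if the text reports a name conflict, else None."""
    match = CONFLICT_RE.search(stderr_text)
    if match is None:
        return None
    return match.group("name"), match.group("id")


line = (
    'docker: Error response from daemon: Conflict. The container name "/mpc_node" '
    'is already in use by container "some-long-container-id". You have to remove '
    "(or rename) that container to be able to reuse that name."
)
print(parse_name_conflict(line))
```

The leading slash is stripped so the returned name can be passed straight to commands like docker rm.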
Decoding the Error Logs
When things go south, the logs are your best friends. Let's break down the provided log snippet:
2025-11-03T08:22:18.363382398Z 2025-11-03 08:22:18,363 [INFO] docker cmd docker run --env MPC_IMAGE_HASH=6707e073b7e39c5af0963a788d46a83e447543748bc08e31c93d7b8cd641d94f --env MPC_LATEST_ALLOWED_HASH_FILE=/mnt/shared/image-digest.bin --env DSTACK_ENDPOINT=/var/run/dstack.sock --env MPC_ACCOUNT_ID=sam.test.near --env MPC_LOCAL_ADDRESS=127.0.0.1 --env MPC_SECRET_STORE_KEY=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA --env MPC_CONTRACT_ID=mpc-contract.test.near --env MPC_ENV=mpc-localnet --env MPC_HOME_DIR=/data --env RUST_BACKTRACE=full --env RUST_LOG=mpc=debug,info --env NEAR_BOOT_NODES=ed25519:BGa4WiBj43Mr66f9Ehf6swKtR6wZmWuwCsV3s4PSR3nx@91.134.92.20:24566 --add-host mpc-node-0.service.mpc.consul:35.185.233.54 --add-host mpc-node-1.service.mpc.consul:34.168.117.59 -p 3030:3030 -p 18448:18448 -p 3101:3101 -p 8182:8080 -p 24566:24566 --security-opt no-new-privileges:true -v /tapp:/tapp:ro -v /var/run/dstack.sock:/var/run/dstack.sock -v shared-volume:/mnt/shared -v mpc-data:/data --name mpc_node --detach sha256:6707e073b7e39c5af0963a788d46a83e447543748bc08e31c93d7b8cd641d94f
2025-11-03T08:22:18.441060279Z docker: Error response from daemon: Conflict. The container name "/mpc_node" is already in use by container "d0d839a6699c2a8d8bda62fae3eb887e65e0d429c6f7aac9ea0dc25e3d049533". You have to remove (or rename) that container to be able to reuse that name.
2025-11-03T08:22:18.441157736Z See 'docker run --help'.
2025-11-03T08:22:18.444397888Z Error: ('docker run non-zero exit code %d', 125)
2025-11-03T08:22:18.445065073Z Traceback (most recent call last):
2025-11-03T08:22:18.445070970Z   File "/scripts/launcher.py", line 586, in <module>
2025-11-03T08:22:18.445084922Z     main()
2025-11-03T08:22:18.445089397Z   File "/scripts/launcher.py", line 375, in main
2025-11-03T08:22:18.445094690Z     raise RuntimeError("docker run non-zero exit code %d", proc.returncode)
2025-11-03T08:22:18.445099198Z RuntimeError: ('docker run non-zero exit code %d', 125)
- Docker Run Command: This section displays the exact docker run command that the launcher is trying to execute. It's packed with environment variables, volume mappings, and port configurations. Pay close attention to this, as any discrepancies here could be the source of your woes.
- Error Response: This is the juicy part. It clearly states that the container name "/mpc_node" is already in use. Docker won't allow you to have two containers with the same name.
- Non-Zero Exit Code: The docker run command returned a non-zero exit code (125), which signals that something went wrong. Docker uses 125 when the error comes from the Docker daemon itself rather than from the program inside the container, which here is the conflict in container names.
- Traceback: This is the Python traceback from the launcher.py script. It pinpoints the exact line of code where the error was raised. This is helpful for debugging the launcher script itself, but the primary issue is the Docker conflict.
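One small detail in the traceback is worth a look: the message reads ('docker run non-zero exit code %d', 125) instead of a formatted sentence. That happens because passing several arguments to an exception stores them as a tuple; nothing interpolates the %d. The sketch below reproduces the effect and shows one possible fix. The helper format_docker_failure is hypothetical, not something taken from the real launcher.py.

```python
def format_docker_failure(returncode):
    """Build a readable message for a failed `docker run`."""
    message = f"docker run exited with non-zero code {returncode}"
    if returncode == 125:
        # 125 is the code docker run uses for errors from the Docker daemon itself.
        message += " (reported by the Docker daemon, e.g. a container name conflict)"
    return message


# Two positional arguments are kept as a tuple; the %d is never filled in.
broken = RuntimeError("docker run non-zero exit code %d", 125)
# Formatting the message before raising gives a readable log line.
fixed = RuntimeError(format_docker_failure(125))

print(str(broken))
print(str(fixed))
```

The first print shows the same tuple-style text that appears in the log above, while the second is a plain sentence.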
Crafting a Solution
To address this, we need a strategy that handles the existing container gracefully. Here’s a breakdown of potential solutions:
- Check and Restart (or Create):
  - Before running the docker run command, check if a container named mpc_node already exists.
  - If it exists, stop and remove it.
  - Then, proceed with the docker run command to create a new container with the updated parameters from user.config.
- Docker Compose (Recommended):
  - Use Docker Compose to define your MPC node service.
  - Docker Compose simplifies the process of managing containers and their dependencies.
  - With Compose, you can use the docker-compose up --force-recreate command to update the container with new parameters.
Let's elaborate on each of these solutions.
Solution 1: Check and Restart (or Create)
This approach involves scripting the following steps:
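- Check for Existing Container: Use docker ps -a -f name=mpc_node to check if a container with the name mpc_node exists. The -a flag ensures you see all containers, even stopped ones, and -f name=mpc_node filters the results by name. Keep in mind that this filter matches partial names too, so a container called mpc_node_old would also show up. For an exact check, docker container inspect mpc_node exits with a non-zero status when no container has that name.
- Stop and Remove (if Exists): If the container exists, use docker stop mpc_node to stop it gracefully, followed by docker rm mpc_node to remove it. Important: Be absolutely certain you want to remove the container before executing this command, as this action is irreversible and discards anything written inside the container's own filesystem. Data in named volumes such as mpc-data and shared-volume is not deleted by docker rm.
- Run Docker with Updated Parameters: Finally, execute the docker run command with the updated parameters from your user.config file.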
Here's a sample bash script:
#!/bin/bash
# Abort on the first failing command so we never reach docker run after a failed cleanup
set -euo pipefail
CONTAINER_NAME="mpc_node"
# Check if the container exists (exact name match; non-zero exit status if it is absent)
if docker container inspect "$CONTAINER_NAME" >/dev/null 2>&1; then
  echo "Container $CONTAINER_NAME exists. Stopping and removing..."
  docker stop "$CONTAINER_NAME"
  docker rm "$CONTAINER_NAME"
else
  echo "Container $CONTAINER_NAME does not exist. Creating..."
fi
# Run the docker command with updated parameters
docker run --env MPC_IMAGE_HASH=your_image_hash \
  --env MPC_LATEST_ALLOWED_HASH_FILE=/mnt/shared/image-digest.bin \
  --env DSTACK_ENDPOINT=/var/run/dstack.sock \
  --env MPC_ACCOUNT_ID=sam.test.near \
  --env MPC_LOCAL_ADDRESS=127.0.0.1 \
  --env MPC_SECRET_STORE_KEY=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA \
  --env MPC_CONTRACT_ID=mpc-contract.test.near \
  --env MPC_ENV=mpc-localnet \
  --env MPC_HOME_DIR=/data \
  --env RUST_BACKTRACE=full \
  --env RUST_LOG=mpc=debug,info \
  --env NEAR_BOOT_NODES=ed25519:your_boot_node@91.134.92.20:24566 \
  --add-host mpc-node-0.service.mpc.consul:35.185.233.54 \
  --add-host mpc-node-1.service.mpc.consul:34.168.117.59 \
  -p 3030:3030 -p 18448:18448 -p 3101:3101 -p 8182:8080 -p 24566:24566 \
  --security-opt no-new-privileges:true \
  -v /tapp:/tapp:ro \
  -v /var/run/dstack.sock:/var/run/dstack.sock \
  -v shared-volume:/mnt/shared \
  -v mpc-data:/data \
  --name "$CONTAINER_NAME" --detach your_image_name
Important Notes:
- Replace your_image_hash, your_boot_node, and your_image_name with the actual values from your environment.
- This script assumes that the docker command is available in your PATH.
- Adjust the paths and environment variables according to your specific setup.
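Since the launcher in the traceback is a Python script, the same check, remove, and run flow could also live there. What follows is only a sketch under assumptions: recreate_container and run_cmd are hypothetical names, not functions from the real launcher.py, and the runner parameter exists purely so the logic can be exercised without a Docker daemon.

```python
import subprocess


def run_cmd(args):
    """Run a command and return (returncode, stdout, stderr)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def recreate_container(name, run_args, runner=run_cmd):
    """Remove any existing container called `name`, then start a fresh one.

    `run_args` is the complete `docker run ...` argument list.
    Returns whatever `docker run` printed (the new container ID with --detach).
    """
    # `docker container inspect` exits non-zero when no such container exists.
    code, _, _ = runner(["docker", "container", "inspect", name])
    if code == 0:
        # Stop first so the node can shut down cleanly, then remove it.
        for step in (["docker", "stop", name], ["docker", "rm", name]):
            code, _, err = runner(step)
            if code != 0:
                raise RuntimeError(
                    f"{' '.join(step)} failed with code {code}: {err.strip()}"
                )
    code, out, err = runner(run_args)
    if code != 0:
        raise RuntimeError(f"docker run failed with code {code}: {err.strip()}")
    return out.strip()
```

A call would look like recreate_container("mpc_node", ["docker", "run", "--name", "mpc_node", "--detach", "your_image_name"]), with the real list carrying all the --env, -p, and -v flags from the script above.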
Solution 2: Docker Compose
Docker Compose is a fantastic tool for defining and managing multi-container applications. It uses a docker-compose.yml file to configure your services, networks, and volumes.
Here's an example docker-compose.yml file for your MPC node:
version: "3.8"
services:
  mpc_node:
    image: your_image_name:latest
    container_name: mpc_node
    environment:
      MPC_IMAGE_HASH: your_image_hash
      MPC_LATEST_ALLOWED_HASH_FILE: /mnt/shared/image-digest.bin
      DSTACK_ENDPOINT: /var/run/dstack.sock
      MPC_ACCOUNT_ID: sam.test.near
      MPC_LOCAL_ADDRESS: 127.0.0.1
      MPC_SECRET_STORE_KEY: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      MPC_CONTRACT_ID: mpc-contract.test.near
      MPC_ENV: mpc-localnet
      MPC_HOME_DIR: /data
      RUST_BACKTRACE: full
      RUST_LOG: mpc=debug,info
      NEAR_BOOT_NODES: ed25519:your_boot_node@91.134.92.20:24566
    ports:
      - "3030:3030"
      - "18448:18448"
      - "3101:3101"
      - "8182:8080"
      - "24566:24566"
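    # Same host entries as the --add-host flags in the docker run command
    extra_hosts:
      - "mpc-node-0.service.mpc.consul:35.185.233.54"
      - "mpc-node-1.service.mpc.consul:34.168.117.59"
    security_opt:
      - no-new-privileges:true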
    volumes:
      - /tapp:/tapp:ro
      - /var/run/dstack.sock:/var/run/dstack.sock
      - shared-volume:/mnt/shared
      - mpc-data:/data
    networks:
      - mpc_network
networks:
  mpc_network:
    driver: bridge
volumes:
  shared-volume:
  mpc-data:
Key Points:
- Version: Specifies the Compose file format version. Recent Docker Compose releases treat this field as obsolete and ignore it, so it can be dropped if you run a current version.
- Services: Defines the services that make up your application (in this case, just the mpc_node).
- Image: The Docker image to use for the service.
- container_name: The name of the container.
- Environment: Environment variables to pass to the container.
- Ports: Port mappings between the host and the container.
- Volumes: Volume mappings for persistent storage.
- Networks: Defines the network the container will be attached to.
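To update the container with new parameters, simply modify the docker-compose.yml file and run:
docker-compose up -d --force-recreate
The -d flag starts the container in the background, matching the --detach used in the launcher's docker run command. Compose already recreates a container whose configuration has changed; the --force-recreate flag goes a step further and tells it to stop and remove the existing container and create a new one even when nothing has changed.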
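Compose keeps track of configuration changes for you. If you stay with a hand-rolled launcher instead, one way to approximate that is to compare the environment of the existing container with the values you want. This is a rough sketch, assuming you feed it the JSON printed by docker inspect mpc_node; the function names container_env and needs_recreate are hypothetical, and it only looks at environment variables, not ports or volumes.

```python
import json


def container_env(inspect_json):
    """Extract environment variables from `docker inspect <container>` output."""
    # docker inspect prints a JSON array with one object per container.
    config = json.loads(inspect_json)[0]["Config"]
    env = {}
    for entry in config.get("Env") or []:
        key, _, value = entry.partition("=")
        env[key] = value
    return env


def needs_recreate(inspect_json, desired_env):
    """True if any desired variable is missing or different in the container."""
    current = container_env(inspect_json)
    # Only desired keys are compared, since the image adds variables such as PATH.
    return any(current.get(key) != value for key, value in desired_env.items())


sample = json.dumps(
    [{"Config": {"Env": ["MPC_ENV=mpc-localnet", "RUST_LOG=mpc=debug,info", "PATH=/usr/bin"]}}]
)
print(needs_recreate(sample, {"MPC_ENV": "mpc-localnet"}))
print(needs_recreate(sample, {"MPC_ENV": "mpc-testnet"}))
```

Unconditionally removing and recreating the container is simpler and always applies the new settings; a comparison like this only helps if you want to skip needless restarts.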
User Story and Acceptance Criteria
Let's revisit the user story and acceptance criteria to ensure our solution aligns with the requirements.
User Story:
"As an operator, I'd like to be able to restart my CVM in order to update the MPC node."
Acceptance Criteria:
- Restarting the CVM with or without a new User Config is possible.
- New "User Config" parameters are applied if they are provided.
Both solutions described above satisfy these criteria. By either explicitly removing the existing container or using Docker Compose with the --force-recreate flag, you can ensure that the MPC node is restarted with the latest configuration.
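To make the second criterion concrete, here is a sketch of how new parameters could be layered over defaults before the container is recreated. It assumes user.config holds simple KEY=VALUE lines, which this article doesn't actually confirm, and the helpers parse_user_config and build_env_flags are illustrative names only.

```python
def parse_user_config(text):
    """Parse KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values like mpc=debug,info stay intact.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got: {line!r}")
        values[key.strip()] = value.strip()
    return values


def build_env_flags(defaults, overrides):
    """Merge overrides onto defaults and render them as `--env` arguments."""
    merged = {**defaults, **overrides}
    flags = []
    for key in sorted(merged):
        flags += ["--env", f"{key}={merged[key]}"]
    return flags


defaults = {"MPC_ENV": "mpc-localnet", "MPC_HOME_DIR": "/data"}
user_config = "# new settings\nMPC_ENV=mpc-testnet\nRUST_LOG=mpc=debug,info\n"
print(build_env_flags(defaults, parse_user_config(user_config)))
```

With an empty user.config the defaults pass through unchanged, which covers the restart without a new User Config case. A production launcher would probably also restrict which keys an operator is allowed to override.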
Conclusion
Restarting a dstack CVM and ensuring that the MPC node comes back up with the correct configuration can be tricky, but with the right approach, it's definitely achievable. By either scripting the container removal and recreation process or leveraging the power of Docker Compose, you can automate this task and minimize the risk of errors. Always remember to back up your data and test your solutions thoroughly before deploying them to production. Happy restarting!