🧀 BigCheese.ai

DeepSeek Open-Sources Optimized Parallelism Strategies Across 3 Repos

The GitHub repository deepseek-ai/profile-data publicly shares profiling data from DeepSeek's training and inference frameworks. The traces show how computation is overlapped with communication, with a focus on MoE layers in the DeepSeek infrastructure.

  • Public profiling data for training and inference frameworks.
  • Captured with the PyTorch Profiler.
  • Viewable in Chrome or Edge via chrome://tracing or edge://tracing.
  • Related to the DualPipe and DeepEP projects.
  • Profiles simulate a balanced MoE routing strategy.
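
As a rough illustration of what such a trace can tell you, the sketch below measures how much GPU compute time overlaps with communication time in a Chrome trace-event JSON file, the format the PyTorch Profiler exports. It is a minimal sketch under stated assumptions, not DeepSeek's own tooling: the helper names (`load_events`, `split_kernels`, `overlap_us`) are hypothetical, GPU kernels are assumed to carry `"cat": "kernel"`, and communication kernels are assumed to be identifiable by the substring `nccl` in their name. Adjust `comm_markers` to match the kernel names in the trace you actually download.

```python
import json


def load_events(path):
    """Load complete ("ph": "X") events from a Chrome trace-event JSON file."""
    with open(path) as f:
        data = json.load(f)
    # The format allows either {"traceEvents": [...]} or a bare list.
    events = data["traceEvents"] if isinstance(data, dict) else data
    return [e for e in events if e.get("ph") == "X" and "dur" in e]


def split_kernels(events, comm_markers=("nccl",)):
    """Split GPU kernel events into (compute, comm) lists of (start, end) spans.

    Assumption: kernels have cat == "kernel", and communication kernels
    contain one of comm_markers in their (lowercased) name.
    """
    compute, comm = [], []
    for e in events:
        if e.get("cat") != "kernel":
            continue
        span = (e["ts"], e["ts"] + e["dur"])  # timestamps in microseconds
        name = e.get("name", "").lower()
        if any(marker in name for marker in comm_markers):
            comm.append(span)
        else:
            compute.append(span)
    return compute, comm


def merge(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_us(a, b):
    """Total time (microseconds) during which both interval sets are active."""
    a, b = merge(a), merge(b)
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total
```

For a downloaded trace, a call such as `compute, comm = split_kernels(load_events("trace.json"))` followed by `overlap_us(compute, comm)` gives the overlapped time, where `trace.json` is a placeholder file name. Dividing that figure by the total communication time gives a rough estimate of how much communication is hidden behind computation.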