Code for project: Approximate Edge AI Components for energy-aware adaptive framework
This project collects code for energy-aware training and inference of CNN/DNN and transformer models, together with support for deploying them on edge devices. The models used are primarily perception models. Together, the components form an end-to-end framework for developing, optimizing, and deploying these networks.

The code implements approximate algorithms for model quantization and compression that reduce computational overhead while limiting accuracy loss. Quantization covers both post-training quantization and quantization-aware training, with support for various bit-width configurations. The compression pipeline includes pruning, knowledge distillation, and weight sharing to reduce model size without significant accuracy loss. Graph optimization modules refine the model structure and streamline the computational graph to improve inference performance.

The codebase also includes bias mitigation techniques and bias analysis tools. These evaluate model fairness using sensitivity and selectivity metrics and produce detailed sensitivity reports, with the aim of fair and robust model behavior across diverse datasets and ethical AI deployment.

Model partitioning and resource allocation are handled by an integrated LAP-DTR (Layer-Adaptive Partitioning with Dynamic Task Redistribution) component. It enables energy-aware distributed inference for Vision Transformers across networks of heterogeneous edge devices, using adaptive partitioning strategies and fault-tolerant execution.

Edge deployment capabilities include optimized inference engines, real-time performance monitoring, and dynamic resource management for constrained computing environments.
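To make the idea of post-training quantization with a configurable bit-width concrete, here is a minimal sketch in plain Python. It is not taken from this codebase: the function names `quantize_symmetric` and `dequantize` are hypothetical, and the scheme shown (symmetric, per-tensor, uniform quantization) is only one common choice, assumed here for illustration.

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map float weights to signed integers that share a single scale factor.

    Hypothetical helper for illustration; not part of the project's API.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed representation")
    qmax = 2 ** (bits - 1) - 1  # largest representable magnitude, e.g. 127 for 8 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 when all weights are zero to avoid dividing by zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the symmetric range [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer levels."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    example_weights = [0.5, -1.0, 0.25, 0.0]  # made-up values
    for bits in (8, 4):
        q, scale = quantize_symmetric(example_weights, bits=bits)
        restored = dequantize(q, scale)
        worst = max(abs(w - r) for w, r in zip(example_weights, restored))
        print(f"{bits}-bit levels: {q}, scale: {scale:.6f}, max error: {worst:.6f}")
```

Lowering `bits` shrinks the integer range, so the scale and the worst-case rounding error (about half the scale) grow. This is the accuracy versus cost trade-off that bit-width configuration controls.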
Overall, the codebase is designed for perception applications in autonomous driving, computer vision tasks, and distributed edge computing scenarios, including vehicular networks and IoT systems.