Artificial Intelligence

2603 Submissions

[7] viXra:2603.0138 [pdf] submitted on 2026-03-31 00:30:12

The Environment Layer: Building Infrastructure for Agentic AI Training

Authors: Fei Wang, Eric Wang, Salon Ren
Comments: 45 Pages. (Note by viXra Admin: Please submit article written with AI assistance to ai.viXra.org)

The emergence of Agentic Reinforcement Learning (Agentic RL) has created an urgent need for sophisticated training environments that go beyond traditional RL benchmarks. While conventional LLM-RL operates within single-step MDPs, Agentic RL requires environments supporting multi-turn interactions, tool integration, and verifiable reward signals. This white paper argues that RL environments are the foundational infrastructure for agentic AI---the critical layer that determines what capabilities agents can learn and how reliably they transfer to deployment. We present a comprehensive analysis of the environment layer, organized around three core questions: (1) Environment Design---what makes an effective Agentic RL environment, including observation spaces, action interfaces, and reward mechanisms; (2) Environment Infrastructure---the frameworks, protocols, and tools for building and deploying environments at scale; and (3) Environment Quality---methodologies for evaluating environment fidelity, the sim-to-real gap, and production readiness. We survey the ecosystem of environment frameworks (OpenEnv, GEM, MCP), synthetic environment generation pipelines (Agent World Model, Reasoning Gym), and specialized environments for embodied AI (NVIDIA Cosmos, Isaac Sim). We introduce the Environment Quality Framework (EQF) for systematic environment evaluation and analyze the critical sim-to-real gap through the User-Sim Index. Finally, we present a research agenda for next-generation RL environments that will enable the transition from research prototypes to production-ready agentic systems.
Category: Artificial Intelligence
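The abstract above characterizes Agentic RL environments by three ingredients: multi-turn interaction, tool integration, and a verifiable reward. A minimal sketch of such an environment interface, under the assumption of a Gym-like reset/step loop, might look as follows; all names here (`AgentEnv`, the `calc` tool, the verifier) are illustrative, not APIs from the paper or from OpenEnv/GEM/MCP.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class AgentEnv:
    tools: Dict[str, Callable[[str], str]]           # registry of callable tools
    verifier: Callable[[str], bool]                  # verifiable reward signal
    max_turns: int = 8
    history: List[str] = field(default_factory=list)

    def reset(self) -> str:
        self.history = []
        return "task: compute 2+2 using the calculator tool"

    def step(self, action: str):
        """One turn: either a tool call ('tool:<name>:<arg>') or a final answer."""
        self.history.append(action)
        done = len(self.history) >= self.max_turns
        if action.startswith("tool:"):
            _, name, arg = action.split(":", 2)
            obs = self.tools[name](arg)              # tool output becomes the next observation
            return obs, 0.0, done
        # final answer: reward is 1.0 iff the verifier accepts it (verifiable reward)
        return "", (1.0 if self.verifier(action) else 0.0), True

env = AgentEnv(tools={"calc": lambda a: str(eval(a))},   # eval only for this toy sketch
               verifier=lambda ans: ans.strip() == "4")
obs = env.reset()
obs, r, done = env.step("tool:calc:2+2")   # multi-turn: tool call first
obs, r, done = env.step(obs)               # then answer with the tool result
```

The point of the sketch is the separation of concerns: the action interface (tool call vs. answer), the observation space (tool outputs and task text), and the reward mechanism (an external verifier) are exactly the three design axes the white paper's "Environment Design" question covers.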

[6] viXra:2603.0075 [pdf] submitted on 2026-03-14 10:32:52

Proactive Heuristic Synthesis (PHS): Addressing the Reactive Bottleneck Through Latent Idle Consolidation in LLMs

Authors: Ali Zulfiqar
Comments: 6 Pages.

Current Large Language Model (LLM) architectures are fundamentally reactive, operating within a "Prompt-Response" paradigm that leaves significant computational resources dormant during inter-prompt intervals. This paper introduces Proactive Heuristic Synthesis (PHS), an architectural framework that enables models to transition into a state of Latent Idle Consolidation. Unlike "thought-at-inference" methods (e.g., Quiet-STaR), which impose latency penalties, or static distillation methods (e.g., Fast Quiet-STaR), which lack continuous adaptability, PHS shifts the computational burden to asynchronous idle cycles. The framework utilizes a Regret-Based Replay mechanism, defined by the counterfactual delta between initial inference failure and post-exploration success, to target high-value optimization trajectories. Unlike recent replay methods such as SuRe, which prioritize high-perplexity samples for memory retention, PHS prioritizes high-regret samples for reasoning evolution. By autonomously navigating these associations, the model synthesizes novel heuristics verified via a Dual-Model Consensus engine to prevent generation-verification collapse. This paper establishes the mathematical formulation and architectural blueprint of PHS. We theoretically demonstrate that by shifting from perplexity-driven retention to regret-driven evolution, PHS provides an asymptotically optimal solution to the latency-reasoning trade-off. Furthermore, relying on recent bounds in recursive training, we formalize how our Dual-Model Consensus mathematically mitigates model collapse, offering a rigorous pathway to zero-latency self-evolution in LLMs.
Category: Artificial Intelligence
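The regret-based replay idea in the abstract above (regret as the counterfactual delta between initial inference failure and post-exploration success, with high-regret samples replayed first) can be sketched in a few lines. The sample names and scores are invented for illustration; the paper's actual scoring functions are not specified here.

```python
import heapq

def regret(initial_score: float, post_exploration_score: float) -> float:
    # counterfactual delta: how much better the model did after idle exploration
    return post_exploration_score - initial_score

samples = [
    ("easy_prompt",  0.90, 0.92),   # (id, initial score, post-exploration score)
    ("hard_prompt",  0.20, 0.85),   # large delta: failure that exploration fixed
    ("stuck_prompt", 0.10, 0.15),   # small delta: exploration barely helped
]

# Max-heap by regret (negate, since Python's heapq is a min-heap).
queue = [(-regret(s0, s1), name) for name, s0, s1 in samples]
heapq.heapify(queue)
replay_order = [heapq.heappop(queue)[1] for _ in range(len(queue))]
```

This also makes the contrast with perplexity-driven retention concrete: a perplexity criterion would rank `stuck_prompt` highly (the model remains uncertain there), whereas the regret criterion ranks `hard_prompt` first, because that is where exploration demonstrably improved the outcome.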

[5] viXra:2603.0065 [pdf] submitted on 2026-03-12 12:46:30

Occluded Person Re-identification via Spatio-Semantic Topology Guidance and Geometry-Aware Semantic Alignment

Authors: Xiaohao Xie, Wenhua Jiao, Caoyu Chen
Comments: 10 Pages.

Identifying pedestrians under heavy occlusion (Occluded Re-ID) remains highly challenging, primarily because obstacles inevitably corrupt human structural integrity and induce severe spatial-semantic mismatching. Current approaches either struggle to recover fragmented topological features or blindly trust fragile pose estimators, making them highly vulnerable to complex background interference. To overcome these bottlenecks, we present SSGA, a unified multi-modal enhancement framework that seamlessly couples topology restoration, cross-modal feature calibration, and semantic-driven decoding. Specifically, a Spatial Guided Graph Convolutional Network (SG-GCN) is first formulated to repair corrupted local structures by embedding physical spatial constraints into visual patch representations. Moreover, to tackle cross-modal mismatching, we propose the Spatio-Semantic Dual-Metric Greedy Alignment (SSDA) strategy. By anchoring visual embeddings to reliable skeletal cues under strict geometric boundaries, SSDA effectively eliminates semantic ambiguity such as symmetrical limb confusion. Furthermore, a Geometry-Aware Semantic Matching (GASM) module is designed to employ learnable semantic queries for dynamically extracting part-level features, which forces the network to highlight visible body regions and filter out occlusion noise. Comprehensive evaluations across five standard benchmarks validate the superiority of our SSGA framework, which establishes new state-of-the-art results and yields substantial improvements particularly on the severely occluded Occluded-Duke and Occluded-ReID datasets.
Category: Artificial Intelligence
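The dual-metric greedy alignment described above (matching visual embeddings to skeletal cues by semantic similarity, but only within strict geometric boundaries) can be illustrated as a greedy one-to-one matching gated by a distance threshold. Every array, the radius value, and the greedy order below are assumptions for the sketch, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
patch_feat = rng.standard_normal((6, 16))     # 6 visual patch embeddings
joint_feat = rng.standard_normal((4, 16))     # 4 skeletal keypoint embeddings
patch_xy = rng.uniform(0, 1, (6, 2))          # patch centers (normalized coords)
joint_xy = rng.uniform(0, 1, (4, 2))          # keypoint positions

sim = patch_feat @ joint_feat.T                                     # semantic metric
dist = np.linalg.norm(patch_xy[:, None] - joint_xy[None], axis=-1)  # geometric metric
RADIUS = 0.5                                                        # strict geometric boundary

pairs, used_p, used_j = [], set(), set()
# Greedy: take candidate pairs in order of decreasing semantic similarity,
# but accept a match only if it also satisfies the geometric constraint.
for p, j in sorted(np.ndindex(sim.shape), key=lambda ij: -sim[ij]):
    if p not in used_p and j not in used_j and dist[p, j] < RADIUS:
        pairs.append((p, j)); used_p.add(p); used_j.add(j)
```

The geometric gate is what resolves ambiguities like symmetrical limb confusion in this toy setting: a left-hand patch may be semantically similar to the right-wrist keypoint, but a spatially distant match is rejected outright.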

[4] viXra:2603.0064 [pdf] submitted on 2026-03-12 13:17:51

A Multi-Background Normalization and Dynamic Meta Feature Mining Approach for Person Re-Identification

Authors: Xiaohao Xie
Comments: 10 Pages.

Person re-identification (ReID) aims to retrieve pedestrians across cameras, facing challenges from differences in perspective, background, and lighting, which introduce noise and hinder key feature extraction. Existing methods, often relying on normalization or generative data augmentation, suffer from limitations such as neglecting camera label information or the unreliability of two-stage learning. To address this, we propose a one-stage architecture, M-MBNNet, consisting of MBN (Multi Background Norm) and MetaRep (Meta-Representation for Adaptive Metric) modules. MBN uses a camera-wise Assignment Gate and Multi-aggregation Norm to align and normalize backgrounds, reducing interference and enhancing person-relevant feature robustness. MetaRep bridges representation and metric learning, leveraging mutual information (quality measures) to dynamically adjust asymmetric metrics for consistent multi-task convergence. It also incorporates curriculum learning to dynamically emphasize either inter-class separability or intra-class compactness. M-MBNNet offers a systematic approach to extracting key pedestrian features and resolving cross-camera differences through active alignment and adaptive optimization. We achieve strong results on two baselines—one mainly for representation and one for metric learning—demonstrating the method's scalability.
Category: Artificial Intelligence
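The MBN module above aligns and normalizes features using camera label information. A minimal sketch of the underlying idea, per-camera feature normalization so that background and style offsets between cameras are reduced before matching, is below; the function name, shapes, and statistics rule are illustrative assumptions, not the paper's Assignment Gate or Multi-aggregation Norm.

```python
import numpy as np

def camera_wise_norm(feats: np.ndarray, cams: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each sample with statistics computed from its own camera."""
    out = np.empty_like(feats)
    for c in np.unique(cams):
        idx = cams == c
        mu = feats[idx].mean(axis=0)
        sd = feats[idx].std(axis=0)
        out[idx] = (feats[idx] - mu) / (sd + eps)   # align this camera's distribution
    return out

rng = np.random.default_rng(1)
cams = np.array([0, 0, 0, 1, 1, 1])
# camera 1 features carry a large additive "background" offset
feats = rng.standard_normal((6, 8)) + cams[:, None] * 5.0
aligned = camera_wise_norm(feats, cams)
# after normalization, both cameras' features are centered comparably
```

This is the simplest possible instance of the idea; the paper's one-stage design additionally learns the camera assignment and aggregation rather than hard-coding it as done here.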

[3] viXra:2603.0029 [pdf] submitted on 2026-03-05 18:13:42

Polynomial Feature Engineering for Analytical Ridge Regression: A Case Study in Aerospace Anomaly Detection

Authors: Ansh Mathur, Atrishman Mukherjee, Supratik Dey
Comments: 11 Pages. (Note by viXra Admin: Please submit article written with AI assistance to ai.viXra.org)

We investigate the effectiveness of polynomial feature engineering when combined with analytical ridge regression for multi-class classification tasks. Using the NASA Shuttle dataset as a case study, we demonstrate that degree-4 polynomial features enable closed-form solutions to achieve 99.43% test accuracy in 45 milliseconds of training time. This accuracy matches or exceeds previously reported results while offering substantial computational advantages through elimination of iterative optimization. Our systematic evaluation across six feature configurations reveals that test accuracy improves monotonically from 87.33% with linear features to 99.43% with degree-4 polynomial interactions, representing a 12.10% absolute improvement. Generalization gaps remain below 0.3% across all tested configurations, indicating robust performance despite increased model capacity. These findings suggest that explicit polynomial feature expansion, when properly regularized, provides a computationally efficient alternative to iterative learning methods for problems with polynomial structure. We discuss the applicability of this approach to safety-critical aerospace applications where deterministic training guarantees and rapid model updates are valued.
Category: Artificial Intelligence
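The recipe in the abstract above, expand features polynomially, solve ridge regression in closed form against class targets, and classify by argmax, can be sketched end to end. Degree 2 and the synthetic two-class data stand in for the paper's degree-4 expansion and the NASA Shuttle dataset; the one-hot regression formulation is a common way to get a closed-form multi-class classifier, assumed here rather than taken from the paper.

```python
import numpy as np

def poly2(X: np.ndarray) -> np.ndarray:
    """Degree-2 expansion: bias, linear terms, and all pairwise products."""
    n, d = X.shape
    cross = [(X[:, i] * X[:, j])[:, None] for i in range(d) for j in range(i, d)]
    return np.hstack([np.ones((n, 1)), X] + cross)

def ridge_fit(Phi: np.ndarray, Y: np.ndarray, lam: float = 1e-3) -> np.ndarray:
    # closed-form ridge solution: W = (Phi^T Phi + lam I)^{-1} Phi^T Y
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ Y)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # class depends on a product term
Y = np.eye(2)[y]                          # one-hot targets

W = ridge_fit(poly2(X), Y)
pred = np.argmax(poly2(X) @ W, axis=1)
acc = (pred == y).mean()
```

The toy target is deliberately a product of features, so linear features alone cannot separate it while the degree-2 expansion can, which mirrors the paper's monotonic accuracy improvement from linear to higher-degree features. The single `np.linalg.solve` call is the "elimination of iterative optimization" the abstract highlights.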

[2] viXra:2603.0026 [pdf] submitted on 2026-03-05 18:17:50

Not All Weights Are Equal

Authors: Benjamin Cowherd
Comments: 9 Pages. github.com/orbits64/project-synapse

We test whether a dedicated ternary-weight reasoning component outperforms a homogeneous MLP baseline on multi-step logic problems requiring generalization to unseen entities. In two runs, the reasoning component outperformed the baseline by 12.2% and 19.6% respectively, while the baseline overfit and stalled. We propose a full architecture built on this result, with separate components for reasoning and language.
Category: Artificial Intelligence
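The entry above is built on a ternary-weight component, i.e. weights constrained to {-1, 0, +1}. A standard quantization scheme for producing such weights from dense ones is sketched below; the threshold rule (0.7 times the mean absolute weight) and per-layer scale are a common heuristic from the ternary-network literature, not necessarily the author's exact construction.

```python
import numpy as np

def ternarize(w: np.ndarray):
    """Map dense weights to {-1, 0, +1} plus a per-layer scale factor."""
    thresh = 0.7 * np.abs(w).mean()
    t = np.where(np.abs(w) > thresh, np.sign(w), 0.0)     # ternary mask
    mask = t != 0
    scale = np.abs(w[mask]).mean() if mask.any() else 0.0  # per-layer scale
    return t, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
t, scale = ternarize(w)
approx = scale * t   # ternary approximation of the dense weight matrix
```

The appeal for a dedicated reasoning component is that the discrete weight values act as a strong capacity constraint, which is one plausible reading of why the ternary component in the abstract generalized where the homogeneous MLP baseline overfit.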

[1] viXra:2603.0020 [pdf] submitted on 2026-03-04 21:15:12

Machine Learning Based Credit Card Fraud Detection

Authors: Avinash Chaurasiya
Comments: 24 Pages. (Note by viXra Admin: Please submit article written with AI assistance to ai.viXra.org)

Credit card fraud poses an escalating threat to the global financial ecosystem, causing billions of dollars in annual losses and eroding consumer trust. Effective automated fraud detection must contend with severe class imbalance, evolving attack patterns, and the practical need for explainable, actionable predictions. In this paper, we present a rigorous comparative study of five machine learning classifiers—Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and XGBoost—applied to a dataset of 50,000 credit card transactions exhibiting a realistic fraud rate of 0.34%. We evaluate the impact of two class-imbalance remediation strategies (SMOTE oversampling and random undersampling), conduct threshold optimisation to align classification decisions with business economics, and employ SHAP (SHapley Additive exPlanations) values to provide model-level and instance-level interpretability. Our best model, Gradient Boosting, achieves a ROC-AUC of 0.9995, a PR-AUC of 0.9421, and an F1 score of 0.7805 under a cost-optimised decision threshold of 0.75, translating into an estimated net business benefit of $4,228 per 10,000 transactions compared to a no-model baseline. Feature analysis identifies V27 (importance = 0.397) and V2 (0.213) as the dominant fraud signals among the PCA-derived features. This work demonstrates that ensemble gradient-boosted trees, combined with principled threshold tuning and SHAP explainability, constitute a production-ready solution for real-world fraud detection.
Category: Artificial Intelligence
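The cost-optimised thresholding described above, choosing the decision threshold that maximizes net business benefit rather than a fixed 0.5 cutoff, can be sketched as a simple sweep. The dollar figures, synthetic scores, and fraud rate below are illustrative stand-ins, not the paper's actual cost model.

```python
import numpy as np

def best_threshold(y_true, p_fraud, benefit_tp=50.0, cost_fp=5.0, cost_fn=100.0):
    """Sweep thresholds over predicted fraud probabilities; keep the one
    maximizing net benefit = TP benefit - FP review cost - FN fraud loss."""
    thresholds = np.linspace(0.05, 0.95, 19)
    def net(th):
        pred = p_fraud >= th
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        fn = np.sum(~pred & (y_true == 1))
        return benefit_tp * tp - cost_fp * fp - cost_fn * fn
    return max(thresholds, key=net)

rng = np.random.default_rng(0)
y = (rng.uniform(size=5000) < 0.0034).astype(int)        # ~0.34% fraud rate
p = np.clip(rng.normal(0.10 + 0.70 * y, 0.15), 0, 1)     # imperfect model scores
th = best_threshold(y, p)
```

Because missed fraud (`cost_fn`) is priced far above a false alarm (`cost_fp`), the optimal threshold lands well below what pure accuracy would suggest, which is the economic alignment the abstract's threshold optimisation refers to.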