Conventional topology optimization methods, such as Solid Isotropic Material with Penalization (SIMP) and Bi-directional Evolutionary Structural Optimization (BESO), rely on sensitivity analysis, which requires the mathematical derivations to be repeated for each new structure and constraint set. This reliance not only limits generalization but also leads to slow convergence and high computational cost. To overcome these limitations, this study develops a Reinforcement Learning Cellular Automaton (RLCA) framework that combines reinforcement learning with cellular automaton principles. By treating finite elements as cells and aggregating their behavior into a unified learning agent, the framework achieves efficient optimization without explicit sensitivity calculations. The proposed method introduces novel designs for the state representation, the action space, and an adaptive reward function that balances structural stiffness against material usage. Unlike traditional algorithms, the RLCA framework learns through interaction with its environment, enabling it to adapt and improve across diverse problems. This research highlights the potential of reinforcement learning for structural optimization, demonstrating the RLCA method's model-free learning, efficient convergence, and adaptability.
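To make the cell-agent idea concrete, the following is a minimal, hypothetical sketch rather than the paper's implementation: a single tabular Q-learning agent is shared by all cells of a binary design grid, each cell's state is the solid-material count in its 3x3 neighborhood, its action is to keep or remove material, and a crude local stiffness surrogate (solid neighbors as a stand-in for load paths) replaces the FEM compliance evaluation the actual framework would use. All names, grid sizes, and reward weights here are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy sketch of the RLCA idea: one shared tabular Q-learning
# agent controls every cell (finite element). The real method would couple
# the reward to an FEM-based stiffness measure; a crude local surrogate
# (number of solid neighbours) stands in so the example is self-contained.

rng = np.random.default_rng(0)
N = 16                      # toy design grid (N x N elements)
Q = np.zeros((10, 2))       # state: solid cells in 3x3 neighbourhood (0..9); actions: 0=void, 1=solid
alpha, gamma, eps = 0.1, 0.9, 0.1
w_stiff, w_mat = 1.0, 0.6   # assumed reward weights: stiffness proxy vs. material cost

def neighbour_count(grid, i, j):
    """Local state of a cell: solid cells in its 3x3 neighbourhood (incl. itself)."""
    patch = grid[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
    return int(patch.sum())

def reward(grid, i, j, action):
    """Toy local reward: a solid cell near other solid cells mimics a load
    path (stiffness proxy); every solid cell pays a material cost."""
    stiff = neighbour_count(grid, i, j) / 9.0 if action == 1 else 0.0
    return w_stiff * stiff - w_mat * action

grid = rng.integers(0, 2, size=(N, N))
for episode in range(200):
    for i in range(N):
        for j in range(N):
            s = neighbour_count(grid, i, j)
            # epsilon-greedy action from the Q-table shared by all cells
            a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
            grid[i, j] = a                     # cell applies its action
            r = reward(grid, i, j, a)
            s2 = neighbour_count(grid, i, j)   # next local state
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])

print("material fraction:", grid.mean())
```

Because every cell queries the same Q-table, experience from all elements is pooled into one policy, which is the sense in which the cells' behavior is "aggregated into a unified learning agent"; no sensitivity derivation appears anywhere in the loop, only observed rewards.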