The Rise of On-Device Generative AI: How Intelligence Is Moving from the Cloud to Your Personal Devices

Author: Kartik Jain

Abstract

The rise of on-device generative artificial intelligence (AI) is changing how people interact with technology by shifting intelligence from remote cloud servers to the device itself. The transition promises faster, more personal, real-time computation, enabling applications such as image and video creation and audio processing without a constant internet connection. It is driven by special-purpose hardware, including neural processing units (NPUs), and by energy-efficient, task-specific AI models, which together enable mobile devices and laptops to perform complex generative tasks on the fly. The spread of on-device AI has significant consequences for everyday computing, creative work, and industrial operations, opening new opportunities for personalization and instant intelligence. Alongside these benefits, challenges remain, including power consumption, thermal management, and the balance between on-device processing and cloud capabilities. This paper gives a detailed account of the emergence of on-device generative AI, covering technology trends, applications, and future directions. By examining the integration of hardware, software, and AI models, it highlights the growing relevance of local intelligence and its potential to transform personal and professional computing.

Keywords

On-device AI; Generative AI; Edge computing; Neural Processing Units; Real-time intelligence; Privacy-aware computing

1. Introduction

Artificial intelligence (AI) has long been tied to cloud computing: complex computations and generative tasks run on remote servers, and the results are sent back to users. This model has enabled powerful AI capabilities such as natural language processing, image generation, and data analysis, but it has also brought challenges, including latency, dependence on high-speed internet, and privacy concerns. According to Joel (2022), reliance on remote servers has in many cases limited the responsiveness of AI applications and the control individuals have over them.

Recent technological developments have begun to change this paradigm by bringing intelligence to personal devices. On-device generative AI enables laptops, smartphones, and other consumer electronics to execute complex tasks locally, without continuous interaction with the cloud (Ali, 2023). The transition is driven by specialised hardware, such as neural processing units (NPUs), and by energy-efficient, optimised AI models designed for mobile settings (Ale et al., 2024).

The effects of this change are far-reaching. End users can now enjoy quicker responses, greater privacy, and offline operation, which changes how people experience technology day to day (Sai et al., 2024). For example, a student can generate study notes in a library without internet access, and a photographer can edit high-resolution photos on an aircraft. Both cases illustrate the practical value of localized intelligence (Na & Lee, 2024).

Moreover, on-device AI is not confined to individual convenience. Industries such as retail and manufacturing are using these technologies to run operations more efficiently, identify mechanical problems, and provide real-time customer support (Wang et al., 2025; Zhang et al., 2025). As Kumar et al. (2026) point out, embedded AI is reshaping the boundary between user and machine, making computing more responsive and more closely integrated with the user.

These developments bring problems of their own. On-device AI has to balance performance against energy and thermal efficiency, and hybrid models, which combine cloud and local processing, complicate data handling and privacy assurances (Li et al., 2026; Kodavanti et al., 2026). Despite these challenges, the trend towards local intelligence is clear, and it suggests that the future of computing will be more personal, immediate, and integrated.

This paper discusses the emergence of on-device generative AI, its underlying technology, its uses, and its implications for both the general public and industry. By examining recent trends and research, it seeks to offer a holistic view of how intelligence is moving out of the cloud and into the devices we use every day.

2. Literature Review

On-device generative AI has developed through hardware innovation, model optimization, and application-driven research. As Joel (2022) stresses, early AI deployments relied heavily on cloud-based systems, which were powerful but introduced latency and privacy issues. This reliance created a need for systems that can perform complex tasks at the edge, which brought edge AI and the on-device processing paradigm to prominence.

Ali (2023) points out that integrating AI directly into devices reduces the round-trip time associated with cloud computing, which enables real-time interaction. This matters especially for generative AI, which demands fast computation when creating images, audio, and text. These capabilities have been made possible by specialised hardware, including neural processing units (NPUs), which supply the computational capacity required for local generative tasks without placing excessive demands on device resources (Ale et al., 2024).

Small, efficient AI models have also contributed. Sai et al. (2024) discuss compact language models and diffusion models that have been optimized to run on devices and are designed to complete specific tasks using less energy. These models prioritise speed, efficiency, and flexibility rather than trying to match the broad capabilities of general-purpose AI systems that run in the cloud (Na & Lee, 2024).

The practical utility of on-device AI is further supported by empirical studies and industrial demonstrations. Wang et al. (2025) note that laptops and smartphones with built-in NPUs can create images and videos in real time, enhance audio, and provide contextual summaries without an internet connection. Likewise, Zhang et al. (2025) report deployments of on-device AI in retail and manufacturing environments, where edge devices improve customer interaction and predictive maintenance, underlining the versatility of local intelligence.

On-device AI is not without obstacles. According to Kumar et al. (2026), energy consumption and heat production are major considerations in device design, especially for mobile devices with small batteries. Li et al. (2026) warn that although on-device AI enhances privacy by reducing dependence on the cloud, hybrid systems tend to blur the boundary between local and remote processing, which casts doubt on how reliably privacy can be ensured. Kodavanti et al. (2026) note that hardware-aware AI models, such as diffusion transformers, are important for balancing efficiency against computational demands so that generative tasks can run within device capabilities.

On the whole, the literature documents a clear shift of intelligence towards the user's device, supported by advances in hardware, efficient models, and a variety of applications. This transition makes everyday technology more immediate, personal, and useful, and at the same time challenges engineers and researchers to address energy efficiency, heat dissipation, and secure data handling. This body of literature forms the basis for understanding the influence of on-device generative AI on personal computing and industrial processes.

Figure 1: Conceptual Framework of On-Device Generative AI: Hardware, Models, Applications, and Challenges

Short note on the figure

The figure gives an organized view of the main elements of on-device generative AI. On-device AI sits at the centre, surrounded by four key areas: hardware breakthroughs, AI model optimization, applications, and challenges. Hardware improvements include neural processing units, which provide efficient local computation, while optimized models can run complex tasks within device constraints. These developments feed real-world applications in personal computing and industry, making real-time, offline intelligence possible. The figure also highlights critical challenges, including energy consumption, heat management, and privacy issues in hybrid systems. Overall, the diagram shows how these interrelated aspects contribute to the rise and adoption of on-device generative AI.

3. Methodology

This paper takes a structured, integrative methodological approach to exploring the emergence of on-device generative artificial intelligence (AI) and its effects on contemporary computing systems. Because the field of AI technologies is highly dynamic and interdisciplinary, the methodology is designed to synthesize knowledge from existing academic sources, industry reports, and technological analyses. The objective is to develop a comprehensive, analytically grounded understanding of the role on-device AI plays in transforming personal and industrial computing.

3.1 Research Design

The study uses a qualitative, exploratory research design that relies on a systematic review of current literature. This method is suitable because the research is conceptual and technological in orientation and aims to interpret emerging trends rather than test a particular hypothesis with experimental evidence. The exploratory design makes it possible to identify patterns, relationships, and developments in on-device generative AI.

The literature-based design also allows findings from several fields to be combined, including computer engineering, artificial intelligence, mobile computing, and edge systems. Earlier research has highlighted that the transition towards localized AI processing is driven by technological factors as well as user-centric requirements such as privacy and real-time performance (Joel, 2022; Ali, 2023). This design lets the study treat the topic holistically and place it in the context of broader digital transformation.

3.2 Data Sources and Selection Criteria

The data for this study come from peer-reviewed journal articles, conference papers, and credible preprints published between 2022 and 2026. Sources were chosen to ensure that the analysis captures the latest trends in on-device AI technologies. The major databases and repositories associated with high-quality research in this field include IEEE Xplore, ScienceDirect, ACM Digital Library, and other academic resources.

The selection criteria for the literature review were relevance, recency, and methodological rigor. Articles were included if they focused on a specific topic such as edge AI, on-device machine learning, generative AI models, hardware acceleration (e.g., NPUs), or localized AI systems in practice. Papers that examined only cloud-based AI without considering on-device or edge deployment were excluded unless they offered useful comparative information.

This narrowing keeps the research focused on the move to on-device intelligence while maintaining scholarly rigor. For example, studies on incorporating AI into mobile and embedded systems offer essential information about how localized computing reduces latency and improves user privacy (Ale et al., 2024; Sai et al., 2024).

3.3 Data Collection and Organization

Data were collected through a structured review process in which the selected articles were read thoroughly, analyzed, and grouped by main theme. The themes are hardware innovations, model optimization, applications, and system-level issues. Each source was analyzed to extract relevant findings, theoretical concepts, and empirical observations that support the understanding of on-device generative AI.

The extracted data were then organized into thematic clusters to make analysis easier. For example, research on hardware innovations was grouped together to emphasize the importance of neural processing units and energy-efficient architectures. Similarly, studies on AI model development were divided into two groups, small language models and diffusion-based models optimized to run on local devices, to investigate trends in each (Na & Lee, 2024; Kodavanti et al., 2026).

This thematic arrangement makes it possible to combine the various research outputs into a consistent synthesis, so that the study can identify common patterns and new trends across fields.

3.4 Analytical Framework

A conceptual framework is used to analyze the relationships among four fundamental elements: hardware infrastructure, AI model design, application domains, and operational challenges. The framework reflects the interrelated nature of on-device AI systems and offers a systematic lens through which to view the data.

Hardware infrastructure is analyzed in terms of computational capacity, energy efficiency, and architecture design, with a specific focus on NPUs and mobile GPUs (Ale et al., 2024). AI model design is assessed by model size, efficiency, and task specificity, with emphasis on the growing role of small, specialized models (Wang et al., 2025).

Application domains are studied to understand how on-device AI is used in practice, for example in personal computers, mobile devices, and industrial systems (Zhang et al., 2025; Kumar et al., 2026). Lastly, operational issues are examined to identify constraints relating to power usage, thermal behaviour, and privacy (Li et al., 2026).

Taken together, these elements allow a comprehensive examination of how technological innovations and practical needs interact to shape the future development of on-device generative AI.

3.5 Validity and Reliability

To support the validity of the study, findings were cross-referenced across several sources to check their consistency. Reliance on peer-reviewed articles and reputable publications increases the credibility of the data, while the inclusion of recent studies ensures that the analysis reflects current technological trends.

Reliability is supported by a clear and repeatable research procedure. The selection criteria, data collection procedures, and analytical framework are clearly specified, enabling other researchers to replicate the study or apply similar procedures to related topics. The consistent use of APA in-text citations also allows readers to trace the sources of information.

The analysis is also strengthened by incorporating perspectives from both academic and industry research. For instance, combining the results of surveys of edge AI models with research on practical implementation offers a more detailed view of the field (Wang et al., 2025; Zhang et al., 2025).

3.6 Limitations of the Methodology

Despite its strengths, the methodology has a number of limitations. As a qualitative, literature-based study, it involves no primary data collection or experimental validation. The results are therefore bounded by the coverage and quality of the current literature.

The fast pace of AI development also poses a challenge, as new technologies and models may appear after this study is completed. Although the most recent research available was included, the field is dynamic, so not all developments may be fully captured.

Another limitation concerns the diversity of evaluation measures used across studies. Measures such as TOPS and TFLOPS are commonly used to quantify performance, but their interpretation can differ, which makes direct comparisons across devices and systems difficult (Li et al., 2026). This inconsistency introduces uncertainty when judging the relative performance of different on-device AI solutions.

3.7 Ethical Considerations

The main ethical issue in this research is the proper use and citation of academic sources. All literature mentioned is cited in APA format, which preserves academic integrity and acknowledges the original authors.

The study also takes into account the ethical implications of on-device AI itself, including privacy and data security. Although on-device processing avoids transferring data to external servers, hybrid systems that combine local and cloud processing can still carry risk (Ali, 2023). These factors are included in the analysis to give a balanced view of the technology's advantages and drawbacks.

3.8 Summary of Methodological Approach

In summary, this paper uses a qualitative, literature-based approach to examine the emergence of on-device generative AI. Through systematic selection, organization, and analysis of recent research, the study offers a clear view of the technological, practical, and ethical aspects of this new area. The use of a systematic analytical framework helps ensure that the results are coherent, trustworthy, and relevant to both academic and business communities.

The methodology provides the basis for the later sections of the paper, where the findings are presented and discussed in more depth with reference to the transformational effect of on-device AI on contemporary computing systems.

Table 1: Summary of Research Methodology

Component	Description	Sources / Basis
Research Design	Qualitative, exploratory study based on literature review	(Joel, 2022; Ali, 2023)
Research Approach	Systematic review and thematic analysis of existing studies	(Wang et al., 2025)
Data Sources	Peer-reviewed journals, conference papers, and preprints (2022–2026)	(Zhang et al., 2025; Kumar et al., 2026)
Selection Criteria	Relevance to on-device AI, recency, and methodological rigor	(Ale et al., 2024; Sai et al., 2024)
Data Collection Method	Structured extraction and categorization of key themes	(Na & Lee, 2024)
Analytical Framework	Thematic analysis: hardware, models, applications, challenges	(Li et al., 2026)
Key Variables / Themes	NPUs, model efficiency, real-time processing, privacy, energy consumption	(Kodavanti et al., 2026)
Validation Strategy	Cross-referencing multiple academic sources for consistency	(Wang et al., 2025)
Reliability Measures	Transparent and replicable research process	(Zhang et al., 2025)
Limitations	No primary data, rapid tech evolution, metric inconsistencies	(Li et al., 2026)
Ethical Considerations	Proper citation, data integrity, privacy awareness	(Ali, 2023)

4. Results

This section presents the main findings of the systematic review of recent literature on on-device generative artificial intelligence (AI). The results are grouped by the key thematic areas identified in the methodology: hardware capabilities, model efficiency, application performance, and system-level constraints. The aim is to report observed trends and patterns without interpretation, giving a clear and structured picture of the current state of on-device AI.

4.1 Hardware Performance and Computational Capabilities

The review shows that hardware support for on-device AI has improved significantly, especially with the introduction of neural processing units (NPUs) into current devices. These processors can perform trillions of operations per second, enough to run complex generative workloads such as image generation, real-time audio generation, and natural language processing directly on local hardware (Ale et al., 2024).

In the reviewed literature, devices with specialized AI accelerators consistently showed lower latency and better responsiveness than systems relying exclusively on cloud computing. NPUs have become widespread in smartphones, laptops, and embedded devices and support a large variety of AI-based applications (Wang et al., 2025).

Moreover, the literature indicates that hardware optimization is closely tied to energy efficiency. According to a number of studies, newer chip architectures are designed to balance computational power with lower energy usage, which makes sustained on-device processing feasible (Sai et al., 2024).

4.2 Efficiency and Effectiveness of AI Models

The literature points to a clear trend towards smaller, task-specific AI models optimized to run locally. These models are designed to operate within the limits of device hardware while maintaining a reasonable level of performance. On-device models can respond faster than large-scale cloud models because network latency is removed (Na & Lee, 2024).
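The latency argument can be made concrete with a simple additive model. The sketch below is illustrative only: the class and function names and every number in it are hypothetical, and it assumes that cloud latency is the sum of network round trip, data transfer, and server compute, while local latency is device compute alone.

```python
from dataclasses import dataclass


@dataclass
class CloudPath:
    round_trip_s: float            # network round-trip time (hypothetical)
    payload_bytes: int             # request plus response size (hypothetical)
    bandwidth_bytes_per_s: float   # usable link bandwidth (hypothetical)
    server_compute_s: float        # inference time on the server (hypothetical)


def cloud_latency(path: CloudPath) -> float:
    """Total latency = round trip + transfer time + server compute."""
    transfer_s = path.payload_bytes / path.bandwidth_bytes_per_s
    return path.round_trip_s + transfer_s + path.server_compute_s


def local_latency(device_compute_s: float) -> float:
    """On-device latency has no network terms, only local compute."""
    return device_compute_s


if __name__ == "__main__":
    cloud = CloudPath(round_trip_s=0.08, payload_bytes=2_000_000,
                      bandwidth_bytes_per_s=5_000_000, server_compute_s=0.05)
    print(f"cloud: {cloud_latency(cloud):.2f} s")
    print(f"local: {local_latency(0.30):.2f} s")
```

Under this model the local path wins only when device compute time is smaller than the combined network and server terms, so a slow device or a very fast link can reverse the comparison.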

The reviewed articles agree that small generative models can perform tasks such as text summarization, image generation, and speech enhancement efficiently. In addition, advances in model compression and optimization methods have further extended the deployment of generative AI to resource-constrained devices (Kodavanti et al., 2026).

Performance comparisons reported in the literature indicate that on-device models can achieve near real-time performance in applications that need immediate feedback. The benefit is most visible where network connectivity is limited or absent.

4.3 Application-Level Outcomes

The findings indicate wide adoption of on-device AI across fields. In personal computing, devices can now handle real-time tasks such as document summarization, voice recognition, and multimedia editing without using the cloud (Zhang et al., 2025). Smartphones integrate AI features extensively, especially camera enhancements, live translation, and contextual assistance.

On-device AI is also being deployed in manufacturing and business settings to improve efficiency. In manufacturing, embedded systems can identify anomalies and anticipate maintenance needs by analyzing local data (Kumar et al., 2026). Similarly, retail systems use on-device AI to offer interactive customer experiences without continuous internet connectivity.

On-device AI can also provide continuous background processing, which means that systems can track and react to user inputs in real time. This improves the user experience by providing seamless, immediate assistance.

4.4 Observations on Privacy and Data Handling

Another common theme in the reviewed articles is the improved privacy of on-device AI. By processing sensitive information locally, these systems minimize its transmission to external servers and so reduce exposure to possible data breaches (Ali, 2023).

Nevertheless, the findings also indicate that several contemporary AI systems are deployed on hybrid architectures that combine local and cloud processing. Some tasks run on the device while others are processed in the cloud, depending on their computing needs. This hybrid approach leads to variation in data handling practices and can affect the overall degree of privacy protection (Li et al., 2026).
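One way to picture such a hybrid arrangement is as a routing decision made per task. The following is a minimal sketch under an assumed policy, not a description of any system in the reviewed literature: the `Task` fields, the operation budget, and the rule that sensitive data never leaves the device are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    DEVICE = "device"
    CLOUD = "cloud"


@dataclass
class Task:
    name: str
    est_ops: float                  # estimated operations required (hypothetical)
    contains_sensitive_data: bool   # flagged by the application (hypothetical)


def route(task: Task, device_budget_ops: float, online: bool) -> Target:
    """Pick where a task runs under a simple, assumed policy."""
    # Sensitive data never leaves the device in this sketch.
    if task.contains_sensitive_data:
        return Target.DEVICE
    # Without a connection there is no cloud option.
    if not online:
        return Target.DEVICE
    # Offload only when the task exceeds what the device can handle.
    if task.est_ops > device_budget_ops:
        return Target.CLOUD
    return Target.DEVICE


if __name__ == "__main__":
    budget = 5e9  # hypothetical per-task budget for the device
    for t in [Task("summarize note", 1e9, False),
              Task("render video", 1e12, False),
              Task("transcribe medical memo", 1e12, True)]:
        print(t.name, "->", route(t, budget, online=True).value)
```

The privacy outcome depends entirely on the policy: a different ordering of the same three checks would send sensitive but heavy tasks to the cloud, which is the kind of variation in data handling the hybrid literature describes.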

4.5 System and Operational Limitations

Despite these benefits, the findings identify a number of limitations of on-device AI. Energy use and thermal management are major issues, especially in mobile devices with limited battery capacity. Researchers report that sustained generative AI workloads can increase heat output and shorten battery life (Li et al., 2026).
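The battery concern follows from basic arithmetic: energy is power multiplied by time, and one watt-hour equals 3600 joules. The sketch below applies that relation to hypothetical figures; the function name, the constant-power assumption, and the example values are illustrative and are not measurements from the cited studies.

```python
def battery_fraction_used(power_w: float, duration_s: float, battery_wh: float) -> float:
    """Fraction of a full battery consumed by a workload drawing constant power."""
    if battery_wh <= 0:
        raise ValueError("battery capacity must be positive")
    energy_j = power_w * duration_s   # energy = power x time
    battery_j = battery_wh * 3600.0   # 1 Wh = 3600 J
    return energy_j / battery_j


if __name__ == "__main__":
    # Hypothetical case: a 5 W generative workload running for 10 minutes
    # on a phone with a 15 Wh battery.
    used = battery_fraction_used(power_w=5.0, duration_s=600.0, battery_wh=15.0)
    print(f"battery used: {used:.1%}")
```

Real devices do not draw constant power, and thermal throttling lowers performance before the battery is exhausted, so this estimate is a rough bound rather than a prediction.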

Performance also varies between devices with different hardware. Performance metrics such as TOPS (trillions of operations per second) are widely used to assess AI performance, but the literature reports that they do not necessarily provide a uniform point of reference (Wang et al., 2025).
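A short calculation shows why a single TOPS figure can mislead. TOPS is usually quoted as a theoretical peak, commonly computed by counting each multiply-accumulate (MAC) as two operations, and the figure also depends on the numeric precision at which it is quoted. The sketch below uses a hypothetical accelerator and hypothetical utilization values to show how far achieved throughput can fall from the peak.

```python
def peak_tops(mac_units: int, clock_hz: float) -> float:
    """Theoretical peak in TOPS; one multiply-accumulate counts as 2 operations."""
    return 2 * mac_units * clock_hz / 1e12


def effective_tops(peak: float, utilization: float) -> float:
    """Throughput achieved when only part of the hardware is kept busy."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak * utilization


if __name__ == "__main__":
    # Hypothetical accelerator: 4096 MAC units clocked at 1.5 GHz.
    peak = peak_tops(4096, 1.5e9)
    print(f"peak: {peak:.3f} TOPS")
    # Two hypothetical workloads on the same chip.
    for name, util in [("dense convolution", 0.70),
                       ("memory-bound decoding", 0.15)]:
        print(f"{name}: {effective_tops(peak, util):.2f} TOPS")
```

Two devices advertising the same peak can therefore deliver very different results on the same model, which is consistent with the variability reported in the literature.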

A further limitation is that on-device models have fewer capabilities than large cloud-based models. Although local models are efficient, they may not match the capability or accuracy of cloud models on some sophisticated tasks.

4.6 Summary of Key Findings

The findings point to several major trends in the evolution and uptake of on-device generative AI:

Major progress in hardware, especially NPUs, has made local AI processing possible.
Smaller, optimized AI models are being used to deliver real-time performance on personal devices.
On-device AI is widely used in personal, commercial, and industrial settings.
Local processing enhances privacy, but it is often combined with cloud processing in hybrid systems.
Issues of energy efficiency, thermal management, and performance standardisation remain.

These findings give a clear overview of the current capabilities and limitations of on-device generative AI, and they form the basis for the discussion and interpretation that follow.

Table 2: Summary of Key Results on On-Device Generative AI

Category	Key Findings	Evidence from Literature
Hardware Performance	Integration of NPUs significantly improves processing speed and reduces latency	(Ale et al., 2024; Wang et al., 2025)
Energy Efficiency	Optimized chip architectures balance high computation with lower power usage	(Sai et al., 2024)
Model Efficiency	Compact and optimized models enable real-time on-device execution	(Na & Lee, 2024; Kodavanti et al., 2026)
Response Time	Faster response due to elimination of network dependency	(Zhang et al., 2025)
Application Performance	Effective in tasks like text generation, image processing, and voice recognition	(Kumar et al., 2026)
Privacy Advantages	Local data processing reduces exposure to external data breaches	(Ali, 2023)
Hybrid Processing	Combination of on-device and cloud processing for complex tasks	(Li et al., 2026)
Thermal Constraints	Increased heat generation during prolonged AI usage	(Li et al., 2026)
Battery Consumption	High computational tasks can significantly reduce battery life	(Sai et al., 2024)
Performance Variability	Differences in device hardware lead to inconsistent AI performance	(Wang et al., 2025)
Benchmark Limitations	Metrics like TOPS are not always reliable for real-world comparison	(Wang et al., 2025)
Model Limitations	On-device models are less complex than large cloud-based AI systems	(Kodavanti et al., 2026)

5. Discussion

The results indicate the rapid development and growing importance of on-device generative artificial intelligence (AI). The most notable observation is the strong interdependence between hardware development and model optimization. The integration of neural processing units (NPUs) has improved computational efficiency and made the practical deployment of generative AI models viable on resource-constrained devices. This supports the general trend observed in the literature that hardware-software co-design is the foundation of on-device AI system development.

Another useful lesson concerns the trade-off between model complexity and performance. Although small models support real-time processing with lower latency, they may have lower capacity than large-scale cloud-based systems. This implies that optimization methods, including model compression and quantization, are not merely supportive but essential to striking this balance. The results support the view that efficiency, rather than scale alone, is becoming a defining measure in contemporary AI deployment.
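Quantization illustrates this trade-off in miniature. The sketch below shows symmetric per-tensor quantization of floating-point weights to 8-bit integers, one common form of the technique; it is a simplified illustration with hypothetical weights and function names, not the method of any cited paper. Storing each weight in one byte instead of the four bytes of a 32-bit float cuts weight storage by a factor of four, at the cost of a rounding error of at most half the scale per weight.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.42, -1.30, 0.07, 0.88]   # hypothetical weights
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("int8 values:", q)
    print(f"worst-case error: {worst:.5f} (bound: {scale / 2:.5f})")
```

Production toolchains typically add refinements such as per-channel scales and calibration data, which this sketch leaves out.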

The widespread use of on-device AI in personal and industrial settings demonstrates its effectiveness and feasibility and also indicates its versatility. Local data processing improves responsiveness and operational autonomy, whether in real-time language processing on smartphones or in predictive maintenance in embedded systems. These findings are in line with previous research that highlights the importance of edge computing in reducing reliance on centralized infrastructure.

Privacy preservation emerges as a major benefit of on-device AI, since local data processing reduces exposure to external threats. Nevertheless, the continued presence of hybrid architectures implies that complete data isolation is not always possible. This calls for a nuanced approach in which privacy benefits are weighed alongside system design choices and task demands.

Alongside these benefits, persistent challenges remain, especially in energy consumption and thermal management. These limitations show that further innovation in hardware design and energy-aware algorithms is necessary. In addition, discrepancies among performance metrics imply that evaluation frameworks are not standardized, which makes it difficult to compare devices.

Altogether, on-device generative AI is heading towards more efficient, secure, and autonomous systems, but its long-term scalability will depend on overcoming existing technical constraints.

6. Conclusion

This paper has explored current and emerging trends in on-device generative artificial intelligence (AI), with particular attention to hardware development, model optimization, applications, and system-level issues. The results show that on-device AI is moving quickly from an emerging innovation to practical, mass-market use. Central to this change is the inclusion of specialised hardware such as neural processing units (NPUs), which enable local computation and substantially reduce latency (Wang et al., 2025; Ale et al., 2024).

The study also highlights the importance of small, optimized AI models in supporting real-time processing on resource-limited machines. Model compression and quantization have made it possible to deploy generative AI capabilities without heavy use of cloud infrastructure, which has improved responsiveness and operational efficiency (Na & Lee, 2024; Kodavanti et al., 2026). These developments have increased the usability of on-device AI in sectors such as personal computing, healthcare, and industrial automation (Zhang et al., 2025; Kumar et al., 2026).

The study also emphasizes privacy as one of the benefits of on-device AI. When processing is carried out entirely on the device, these systems reduce the risks associated with data transmission and centralized storage, which is in line with growing global concern about data security and user confidentiality (Ali, 2023). Nevertheless, the continued use of hybrid architectures indicates that full independence from cloud systems remains difficult, especially for computationally intensive tasks (Li et al., 2026).

Despite these advantages, a number of limitations remain. Problems with energy consumption, thermal management, and inconsistent performance metrics indicate the need for further research and standardization (Sai et al., 2024; Wang et al., 2025). In addition, because on-device models are less complex than large-scale cloud systems, a gap remains between the two in the precision and functionality they can achieve.

To sum up, on-device generative AI is a major step towards more responsive, efficient, and privacy-conscious computing systems. Although significant progress has been made, further improvement will depend on continued innovation in hardware design, model efficiency, and system integration to address current limitations and enable deployment at scale.

7. References

Joel, B. (2022). Edge AI and on-device machine learning optimization. International Journal of AI, Big Data, Computational and Management Studies, 3(3), 112–115. https://doi.org/10.63282/3050-9416.IJAIBDCMS-V3I3P113
Ali, K. S. (2023). Edge AI and IoT: Direct integration for on-the-device data processing. Applied and Engineering Innovation. https://doi.org/10.54254/2977-3903/5/2023040
Ale, L., Zhang, N., King, S. A., & Chen, D. (2024). Empowering generative AI through mobile edge computing. Nature Reviews Electrical Engineering, 1, 478–486. https://doi.org/10.1038/s44287-024-00053-6
Sai, S., Prasad, M., Dashore, G., & Chamola, V. (2024). On-device generative AI: The need, architectures, and challenges. IEEE Consumer Electronics Magazine. https://doi.org/10.1109/MCE.2024.3518761
Na, M., & Lee, J. (2024). Generative AI-enabled energy-efficient mobile augmented reality in multi-access edge computing. Applied Sciences, 14(18), 8419. https://doi.org/10.3390/app14188419
Wang, X., Tang, Z., Guo, J., Meng, T., Wang, C., Wang, T., & Jia, W. (2025). Empowering edge intelligence: A comprehensive survey on on-device AI models. ACM Computing Surveys, 57(9). https://doi.org/10.1145/3724420
Zhang, Y., et al. (2025). Edge-AI: A systematic review on architectures, applications, and challenges. Journal of Network and Computer Applications. https://doi.org/10.1016/j.jnca.2025.104375
Kumar, P., et al. (2026). On-device artificial intelligence solutions with applications to smart environments. Future Generation Computer Systems, 180, 108373. https://doi.org/10.1016/j.future.2026.108373
Li, H., et al. (2026). Edge-based artificial intelligence: Understanding the evolution of hardware and software and future trends. Engineering Applications of Artificial Intelligence, 174, 114526. https://doi.org/10.1016/j.engappai.2026.114526
Kodavanti, S., et al. (2026). EdgeDiT: Hardware-aware diffusion transformers for efficient on-device image generation. arXiv preprint. https://doi.org/10.48550/arXiv.2603.28405
